Friday, April 22, 2016

Saving Time During Document Review and Indexing

Few paralegal tasks are more important than document review and indexing, but few can be more mind-numbing.  There are ways to automate some of the repetitive portions of indexing documents that will save you time and your sanity while allowing you to put more focus on the actual substance of the documents themselves.

I will walk through a few different scenarios that you are likely to encounter and how to deal with each.

You Have a Group of Documents with Bates Numbers as the File Names

If you need to index a group of documents in which the file names are the Bates numbers or Bates labels on each document, your index, of course, will need to include the Bates numbers in order to identify each document.  Rather than having to type out each Bates number as you create your index, you can automatically generate a list of the file names using some built-in features of your computer.  On a PC or other computer running some version of Windows, you can create the list of files by using a DOS command.  To do that, take the following steps, which should only take about 5 to 10 minutes:

1)    Create a copy of the directory or folder containing the files to be indexed just in case.

2)    Click on the Windows Start button.

3)    Type CMD in the search box at the bottom of the Start menu window and hit enter.  The search box will look similar to the image below.



4)   This will bring up a window containing a DOS prompt that will look like this:



5)    You will need to go to the drive and directory (or folder) where the documents are saved.  The easiest way to do this is to identify which drive contains the documents.  If they are not on the C drive of your computer, you need to type the correct drive letter followed by a colon.  For example, if the documents were saved on the E drive of your computer, you would need to type “e:” or “E:” (without the quotes) like below.  DOS is not case sensitive so capitalization does not matter.



6)    Once you have selected the correct drive, use the CD command (which stands for Change Directory) to go to the correct directory.  To do this, open a window showing the folder where your documents are saved and left click in the area that is highlighted in blue in the image below to select the full directory path.  Hit Ctrl+C to copy that path.  Go back to the DOS command screen, type “CD “ (again, don’t type the quotes), right click on the DOS command screen, and click on paste in the small window that appears to paste the file path of the documents.  Hit enter.  In versions of Windows older than Windows 10, Ctrl+V won’t work in the DOS window.



7)    The DOS command screen should now show the full path to the documents you need to index.  You can see a list of those documents by typing “dir”. (Guess what?  No quotes when you type.)  To create a file containing that list, type “dir>list.txt”.  (No quotes).  This will create a TXT file named “list.txt” with the list of documents in the same directory as the documents.

8)    You now have all of the file names of the documents you need to index.  There is some additional information you likely will not need or want in your index.  For example, I created a list of the files from the following directory:



The list that was generated from the DOS command and is saved in the list.txt file looks like this:

 Volume in drive C has no label.
 Volume Serial Number is 9870-65BA

 Directory of C:\Users\Bruce\Temp\Test

04/07/2016  02:42 AM    <DIR>          .
04/07/2016  02:42 AM    <DIR>          ..
01/31/2016  12:48 AM            29,793 BATES001.pdf
01/31/2016  12:47 AM            30,788 BATES003.pdf
01/31/2016  12:44 AM            28,395 BATES006.pdf
01/31/2016  12:42 AM            27,601 BATES012.pdf
01/31/2016  12:40 AM            27,196 BATES054.pdf
01/31/2016  12:39 AM            27,426 BATES108.pdf
01/31/2016  12:36 AM            27,094 BATES229.pdf
01/31/2016  12:34 AM            26,990 BATES230.pdf
01/31/2016  12:32 AM            27,088 BATES231.pdf
01/31/2016  12:31 AM            25,452 BATES237.pdf
01/31/2016  12:30 AM            26,613 BATES240.pdf
01/31/2016  12:30 AM            28,857 BATES247.pdf
04/07/2016  02:43 AM                 0 list.txt
              13 File(s)        333,293 bytes
               2 Dir(s)   2,977,239,040 bytes free

9)    To get rid of the unnecessary information, copy only the list of files and paste that information into a blank Word Document.  Keep in mind that the list.txt document itself will be included in the list of files so be sure to exclude that from your final list.  In my example, the list became this:
01/31/2016  12:48 AM            29,793 BATES001.pdf
01/31/2016  12:47 AM            30,788 BATES003.pdf
01/31/2016  12:44 AM            28,395 BATES006.pdf
01/31/2016  12:42 AM            27,601 BATES012.pdf
01/31/2016  12:40 AM            27,196 BATES054.pdf
01/31/2016  12:39 AM            27,426 BATES108.pdf
01/31/2016  12:36 AM            27,094 BATES229.pdf
01/31/2016  12:34 AM            26,990 BATES230.pdf
01/31/2016  12:32 AM            27,088 BATES231.pdf
01/31/2016  12:31 AM            25,452 BATES237.pdf
01/31/2016  12:30 AM            26,613 BATES240.pdf
01/31/2016  12:30 AM            28,857 BATES247.pdf

10)  Now, to be left with only a list of file names, do a find and replace in Word by pressing CTRL+H.  Search for “,^#^#^# “ and replace it with “^t”.  (You guessed it.  Don’t include the quotes.)  This will search for a comma followed by three consecutive digits along with a space and will replace all of that with a tab.  Note that a file smaller than 1,000 bytes has no comma in its size, so its row will not be split and needs to be fixed by hand.  Copy all of the resulting information and paste it into a blank Excel spreadsheet.  The last column in the spreadsheet will contain the file names of your documents.  You can copy this column and paste it into a blank Excel file or a Word document to start your index.

My list of files would look like:
BATES001.pdf
BATES003.pdf
BATES006.pdf
BATES012.pdf
BATES054.pdf
BATES108.pdf
BATES229.pdf
BATES230.pdf
BATES231.pdf
BATES237.pdf
BATES240.pdf
BATES247.pdf

If you would like to remove the “.pdf” from each of the rows of your index, you can do a find and replace in Excel that searches for .pdf and replaces it with nothing.
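
If you are comfortable running a short script, the copy, find and replace, and ".pdf" cleanup steps above can also be sketched in Python.  This is only a rough illustration, not part of the original process: it assumes the list.txt file uses the US English date and time layout shown in my example, and the function name and its parameters are made up for this sketch.

```python
import re

# Matches a file line from `dir` output in the layout shown above, e.g.
# "01/31/2016  12:48 AM            29,793 BATES001.pdf"
# Folder lines show "<DIR>" where the size would be, so they do not match.
FILE_LINE = re.compile(
    r"^\d{2}/\d{2}/\d{4}\s+\d{2}:\d{2} [AP]M\s+[\d,]+ (?P<name>.+)$"
)

def file_names_from_dir_output(text, skip=("list.txt",), strip_ext=".pdf"):
    """Return the file names found in `dir` output, without the extension."""
    names = []
    for line in text.splitlines():
        match = FILE_LINE.match(line)
        if not match:
            continue  # header, <DIR>, and summary lines
        name = match.group("name")
        if name.lower() in skip:
            continue  # leave out the list file itself
        if strip_ext and name.lower().endswith(strip_ext):
            name = name[: -len(strip_ext)]
        names.append(name)
    return names

# A shortened copy of the example output from this post.
sample = """\
 Directory of C:\\Users\\Bruce\\Temp\\Test

04/07/2016  02:42 AM    <DIR>          .
01/31/2016  12:48 AM            29,793 BATES001.pdf
01/31/2016  12:47 AM            30,788 BATES003.pdf
04/07/2016  02:43 AM                 0 list.txt
              13 File(s)        333,293 bytes
"""
print(file_names_from_dir_output(sample))  # ['BATES001', 'BATES003']
```

Unlike the Word search, this pattern does not depend on the comma in the file size, so very small files are picked up as well.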

While this process seems a bit complicated, it can easily be accomplished in 10 minutes.  That’s a lot better than manually typing the file names if you are dealing with more than a few dozen files.  If you have thousands of files, this could save you many hours of work.
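
For very large folders, another option is to skip the DOS window entirely and read the file names with a script.  The sketch below is a minimal Python example under my own assumptions: the function name is hypothetical, and the demonstration runs on a throwaway folder with made-up files rather than on real documents.

```python
import tempfile
from pathlib import Path

def list_document_names(folder, extension=".pdf"):
    """Return the sorted names (without extension) of matching files in folder."""
    return sorted(
        path.stem
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() == extension
    )

# Demonstration on a temporary folder holding three empty, made-up files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("BATES003.pdf", "BATES001.pdf", "list.txt"):
        (Path(tmp) / name).touch()
    print(list_document_names(tmp))  # ['BATES001', 'BATES003']
```

To use it on real documents, pass the directory path you copied in step 6 instead of the temporary folder.  Only files with the chosen extension are returned, so a stray list.txt is left out automatically.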

You Have a Group of Files that You Need to Convert to PDF Files and Bates Label

1)    Create a copy of the files to be converted (this is always a good idea when doing any type of batch conversion or other work on a group of files).

2)    Convert the files to a set of PDF files using a PDF creator like Adobe Acrobat Pro, Nuance eCopy PDF Pro, etc.  Unless there is some reason to give the PDF files some other file names, I would give the PDF files the same names as the original files.  For example, I converted the files in the first image below to the PDFs in the second image below.

 


3)    Again, unless there is a reason not to do so, sort the files by file name in alphabetical order.  This keeps the files in the same order that the Bates-labeling software will use, since it is likely to rearrange them into alphabetical order as it applies the Bates labels.

4)    Create a list of the files using the steps in the first section of this post.  Save the list of file names in an Excel file.

In my example, the relevant portion of my list.txt document looks like this:
01/31/2016  12:42 AM            27,601 Excel File.pdf
01/31/2016  12:44 AM            28,395 Text File.pdf
01/31/2016  12:48 AM            29,793 Word Document 1.pdf
01/31/2016  12:47 AM            30,788 Word Document 2.pdf

After using the find and replace steps explained in the first section above, this list becomes:
Excel File.pdf
Text File.pdf
Word Document 1.pdf
Word Document 2.pdf

5)     Apply Bates numbers to the files using your PDF editor software.  Save the Bates-labeled files in a new directory and have them renamed by their Bates labels.  In my example, the Bates-labeled files are:

6)    Create a list of the Bates-labeled files using the steps from the first section of this post.  Save the list of Bates-labeled files in an Excel file.

My list becomes:
BATES001.pdf
BATES003.pdf
BATES006.pdf
BATES012.pdf

7)    Take the two columns from the Excel spreadsheets and save them into one Excel spreadsheet.  This will give you a list of the original file names in one column and the Bates labels of those files in the other column.  The file name in a given row should correspond to the Bates number(s) of the files in that same row.  The table below shows how this will appear.

Original File Names       Bates-labeled Files
Excel File.pdf            BATES001.pdf
Text File.pdf             BATES003.pdf
Word Document 1.pdf       BATES006.pdf
Word Document 2.pdf       BATES012.pdf

This will give you a good start towards indexing the documents.
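
The step of combining the two columns can also be sketched in a few lines of Python.  This is a rough example under stated assumptions: the function names are hypothetical, and it assumes both lists are already in the same order, so that each original file sits in the same position as its Bates-labeled copy.

```python
import csv
import io

def pair_names(original_names, bates_names):
    """Pair each original file name with the Bates-labeled file in the same position."""
    if len(original_names) != len(bates_names):
        raise ValueError("Both lists must contain the same number of files")
    return list(zip(original_names, bates_names))

def to_csv(rows):
    """Return the paired rows as CSV text that Excel can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Original File Names", "Bates-labeled Files"])
    writer.writerows(rows)
    return buffer.getvalue()

# The two lists from the example in this section.
originals = ["Excel File.pdf", "Text File.pdf", "Word Document 1.pdf", "Word Document 2.pdf"]
bates = ["BATES001.pdf", "BATES003.pdf", "BATES006.pdf", "BATES012.pdf"]
print(to_csv(pair_names(originals, bates)))
```

The length check is there because a missing or extra file would shift every row after it, which is easy to miss in a long index.  It is still worth spot-checking a few rows against the actual documents.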

You Received a Hard Copy Set of Bates-Labeled Production Documents

When you receive a set of production documents in hard copy format, there is not a lot of automation that you can do to save time when indexing the documents, but you can create a list of all of the Bates numbers in a few minutes, even if there are thousands of pages.  Here’s how to do that.

I have created lists of Bates numbers that use “BATES” as a generic prefix.  There are 5 different lists with between 2 digits (e.g., BATES12) and 6 digits (e.g., BATES123456) in the Bates numbers.  These lists are saved as TXT files.  These files include leading zeroes at the beginning of some of the numbers (e.g., BATES00123).

1)    If you only need 2 digits (BATES12), download this file:  https://drive.google.com/file/d/0B6Zj-oo4RdabZ0tQbW52cUI2NXM/view?usp=sharing

2)    If you only need 3 digits (BATES123), download this file:  https://drive.google.com/file/d/0B6Zj-oo4RdabcFhQUGJhcnVXMWc/view?usp=sharing

3)    If you only need 4 digits (BATES1234), download this file:  https://drive.google.com/file/d/0B6Zj-oo4RdabSXpCRFZocVRBQlk/view?usp=sharing

4)    If you only need 5 digits (BATES12345), download this file:  https://drive.google.com/file/d/0B6Zj-oo4RdabdkVsRWlQdWlYczA/view?usp=sharing

5)    If you only need 6 digits (BATES123456), download this file:  https://drive.google.com/open?id=0B6Zj-oo4RdabOFBLcGhMLVRGRnM

6)     After downloading the correct file, do a find and replace in the file in order to change the “BATES” prefix at the beginning of each Bates number to the prefix that is on your documents.  You can access find and replace in a text file by hitting CTRL+H.

7)    Once you have the correct Bates prefixes, copy all of the data from the file by hitting CTRL+A followed by CTRL+C and then paste it into a Word document or Excel spreadsheet to begin indexing the documents.

If you have more than 1 million pages of documents, i.e., your Bates numbers have more than 6 digits, you can create lists of numbers that are longer than 6 digits using Excel, but you definitely need a true document management system for any type of indexing, annotation, searching, etc. of such a large volume of documents.
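
If you would rather generate a list than download one, the same kind of list can be produced with a short script.  The Python sketch below is only an illustration: the function name and the "ABC" prefix are made up, and you would substitute the prefix, range, and number of digits that appear on your documents.

```python
def bates_numbers(prefix, first, last, digits):
    """Return Bates numbers from first to last, zero-padded to the given digits."""
    if first < 0 or last < first:
        raise ValueError("first and last must describe a non-empty, non-negative range")
    if len(str(last)) > digits:
        raise ValueError("last does not fit in the requested number of digits")
    return [f"{prefix}{number:0{digits}d}" for number in range(first, last + 1)]

# Hypothetical prefix and range, padded with leading zeroes to 5 digits.
print(bates_numbers("ABC", 1, 3, 5))  # ['ABC00001', 'ABC00002', 'ABC00003']
```

Joining the result with line breaks and saving it as a TXT file gives the same kind of list as the downloads above, with the correct prefix already in place.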

If you have any questions or have any suggested improvements to this post, please leave a comment or email me at blauney@gmail.com.
