An electronic document production is delivered in a folder, often called a production folder or volume. The production folder or volume is also sometimes referred to, more loosely, as a load file. To ingest a production, use the Load file ingest feature in DISCO.
Load file ingest prerequisites
In order to ingest load-file data into a database, you must first ensure that the load file meets our minimum requirements:
- Encoded in ASCII or UTF-8.
- One of the supported file types:
  - Data load files are either .dat or .csv.
  - Image load files are .opt.
- Formatted properly:
  - Any native, image, or text files referenced by the data load file via relative file path are contained in one folder.
  - Paths referenced in the load file are relative to the root of the production folder.
  - If the load file uses custom delimiters (instead of the default Concordance delimiters), they must be valid UTF-8 characters and be present in the load file itself.
- All included image files are one of the supported file types:
  - TIFF files must be single-page.
  - PDF files must be multi-page.
- Field names are valid:
  - Custom field names can be any combination of letters (a-z, A-Z), digits (0-9), dashes, underscores, spaces, and periods.
- Bates numbers are valid:
  - Bates numbers can be any combination of letters (a-z, A-Z), digits (0-9), dashes, underscores, ampersands, and spaces.
Once certain the load file meets these requirements, you can start ingesting the production folder into the appropriate DISCO database.
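Some of these requirements can be checked locally before you start an ingest. The following is a minimal Python sketch, not a DISCO tool: the function names are hypothetical, the character sets are copied from the list above, and it assumes that empty names are invalid and that file extensions are matched case-insensitively (neither is stated in the requirements).

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Character sets copied from the requirements above.
# Field names: letters, digits, dash, underscore, space, period.
FIELD_NAME_RE = re.compile(r"[A-Za-z0-9_. -]+")
# Bates numbers: letters, digits, dash, underscore, ampersand, space.
BATES_RE = re.compile(r"[A-Za-z0-9_& -]+")

# Supported load file extensions and the kind of load file each one is.
LOAD_FILE_KINDS = {".dat": "data", ".csv": "data", ".opt": "image"}


def is_valid_field_name(name: str) -> bool:
    """True if every character is allowed in a custom field name."""
    return FIELD_NAME_RE.fullmatch(name) is not None


def is_valid_bates(value: str) -> bool:
    """True if every character is allowed in a Bates number."""
    return BATES_RE.fullmatch(value) is not None


def load_file_kind(filename: str) -> Optional[str]:
    """Return "data", "image", or None for an unsupported extension.

    Assumption: extensions are compared case-insensitively.
    """
    return LOAD_FILE_KINDS.get(PurePosixPath(filename).suffix.lower())


def is_ascii_or_utf8(raw: bytes) -> bool:
    """True if the bytes decode as UTF-8 (ASCII is a subset of UTF-8)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

For example, `is_valid_bates("ABC.0001")` returns `False` because a period is allowed in field names but not in Bates numbers. A load file saved in a single-byte encoding such as Latin-1 will typically fail `is_ascii_or_utf8` if it contains the þ quote character.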
Ingesting load files
To ingest load files into a database:
- On the DISCO home page, in the main menu, click Ingest.
- On the Ingest page, click New Ingest and then click Load file.
- On the Name and description page, enter a name and an optional description for the load-file ingest. The ingest name and description will be displayed on the ingest card.
- On the Choose production folder page, navigate to the unzipped production folder (also called the volume) that contains all load files, native files, image files, and text files to be ingested.
- On the Load files page, within the volume you selected, select the data load file and, if present, the image load file. Make sure the file paths in the load files are relative to the folder/volume selected in the previous step. For example, if you selected a folder/volume that organizes all image files under the path images/0001/, then all references to image files in the load file must begin with that path (e.g., images/0001/00001.tiff).
- On the Field mapping page, define your family relationships and map the load-file fields to DISCO’s fields. Not all load-file fields must be mapped, and not all DISCO fields must be used.
  First, select the unique identifier in your load file and the column headers that define your family relationships. The unique identifier is typically a ControlID or BegBates number. To define families, you can select either a single value or separate begin and end values.
  Historically, DISCO required mapping a column to BeginBates in order to ingest a load file. It now requires only a unique ID and a way to define family relationships between the documents. By using a unique ID and a single field to define parents (typically Parent ID), you can load documents with a wide variety of relationships, including documents with multi-generational families or mixed suffixes.
  NOTE: The column headers selected to define your family relationships will not be stored in DISCO unless they are mapped to a DISCO or custom field in the Field Mapping section below. If values in Bates numbers are causing failed validations, you can always map the Bates fields to custom fields and keep family relationships intact through the relationship fields.
- To map a load file field, you can either apply a previously saved mapping or click the placeholder to the right of the field that reads Select or type to add a new field. Then, select or begin typing the DISCO field to associate with that field. If a DISCO field already exists, it will appear below your typing and can be selected. If a DISCO field does not already exist, you will see an option to create a new custom field and choose a data type for that field. If the field name you are trying to create is already in use in DISCO, you will see a warning message and will not be able to create a conflicting field with that name.
  If the field mapping tool does not properly display your data load file fields, or it displays an error message, there is likely a problem with your load file’s delimiters. You can adjust your delimiters in the tool or contact DISCO’s Professional Services team for consulting on how to modify your data. Additional charges may apply.
Common fields that exist in productions can be mapped to the associated DISCO field:

| Common DISCO field | Definition |
| --- | --- |
| BegBates | The first Bates number assigned to a document. |
| EndBates | The last Bates number assigned to a document. |
| BegAttach | The Bates number of the first page of all the attachments in a family. |
| EndAttach | The Bates number of the last page of all the attachments in a family. |
| ParentID | Unique number identifying the Bates number of the parent document in a family relationship. |
| Custodian | Name of the individual or department from whom the document originated. |
| OriginalFilePath | The original file path for the native document. |
| Filename | Filename for the native document. |
| Author | The author of the document. |
| From | The "From" field of an email. |
| To | The "To" field of an email. |
| CC | The "CC" field of an email. |
| BCC | The "BCC" field of an email. |
| Subject | The "Subject" field of an email. |
| DateSent | The date the email was sent. |
| TimeSent | The time the email was sent. |
| DateRcvd | The date the email was received. |
| TimeRcvd | The time the email was received. |
| DateCreated | The date the document was created. |
| TimeCreated | The time the document was created. |
| DateLastMod | The date the document was last modified. |
| TimeLastMod | The time the document was last modified. |
| ImagePath | Relative file path to the document image. |
| NativePath | Relative file path to the native document. |
| LoadFileTextPath | Relative file path to the text file. |
The three types of delimiters are:
- Field separator - A character separating the columns.
- Field quote - The character used to “quote” or group multiple items together, like a phrase.
- Multi-value delimiter - A character used to delimit multiple values within a column (e.g., ,"Kinsley, Dave";"Slovacek, Jerry",)
DISCO will pre-populate the delimiters based on the selected load file type. These values can be changed by selecting a different value from the delimiter dropdown lists.
For more information about mapping fields in load file ingest, see Load-file ingest fields.
- Once you’ve finished mapping all the fields, click Next.
- On the Ingest options page, select the options for handling exceptions and hidden content for native files. Note: Exception handling by DISCO Professional Services is a billable service.
  - Send exceptions to DISCO Professional Services for analysis and remediation: Any errors generated during your ingest will be sent to DISCO Professional Services to be addressed and remediated, if applicable. Additional hourly charges will apply.
  - I will review the exception report and remediate any exceptions as necessary: Any errors generated during your ingest will be your responsibility to address. You can view these errors by downloading an Ingest Exception Report and following DISCO’s suggested remediation steps as outlined in Identifying ingest exceptions and Remediating ingest exceptions.
  - If the Custodian field is not mapped during field mapping, you can define a default custodian for all the rows in your load file. As in native ingests, you can choose from existing custodians or create a new one from the dropdown.
  - The Ingest Options screen also allows you to configure how DISCO should display hidden content from files on the near-natives it generates. By default, DISCO enables all hidden content, but the settings can be changed and set for an entire matter through the Matter Defaults link above.
  - DISCO will automatically render and attach near-natives behind slipsheets for documents that appear to be slipsheeted natives. You can disable this capability in these options. DISCO recommends keeping the capability on because it speeds up review: reviewers do not have to download natives of common file types in order to review them.
  Finally, you can set and restore defaults for these options, including exception handling and hidden content, for ingests across an entire matter through the Matter Defaults in the upper right.
  Load file ingests are not de-duplicated during ingest.
- Review everything on the Summary page and ensure it is correct. When you’re sure the ingest has no mistakes, click Ingest.
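Before clicking Ingest, it can help to run a quick local check of the data load file against the production folder. The Python sketch below is illustrative only and is not part of DISCO. It assumes the commonly used Concordance-style delimiters (ASCII 20 as the field separator, þ as the field quote, and a semicolon as the multi-value delimiter), so confirm these against the delimiter dropdowns on the Field mapping page. It also assumes your load file's columns are literally named BegBates, ParentID, ImagePath, NativePath, and LoadFileTextPath. All function names are hypothetical.

```python
from pathlib import Path, PureWindowsPath
from typing import Dict, List

# Assumed Concordance-style defaults; adjust to match your load file.
FIELD_SEP = "\x14"    # field separator (ASCII 20)
QUOTE = "\u00fe"      # field quote (thorn)
MULTI_VALUE = ";"     # multi-value delimiter


def split_row(line: str) -> List[str]:
    """Naive split: does not handle a separator embedded in a quoted value."""
    return [cell.strip(QUOTE) for cell in line.split(FIELD_SEP)]


def parse_dat(text: str) -> List[Dict[str, str]]:
    """Turn decoded .dat text into one dict per row, keyed by the header."""
    lines = [ln.rstrip("\r") for ln in text.lstrip("\ufeff").split("\n")]
    lines = [ln for ln in lines if ln]
    if not lines:
        return []
    header = split_row(lines[0])
    return [dict(zip(header, split_row(ln))) for ln in lines[1:]]


def split_multi(value: str) -> List[str]:
    """Split a multi-value cell into its individual values."""
    return [v.strip() for v in value.split(MULTI_VALUE) if v.strip()]


def preflight(
    rows: List[Dict[str, str]],
    volume_root: Path,
    id_field: str = "BegBates",
    parent_field: str = "ParentID",
    path_fields=("ImagePath", "NativePath", "LoadFileTextPath"),
) -> List[str]:
    """Report parents that match no unique ID and paths that do not resolve."""
    problems = []
    ids = {row.get(id_field, "") for row in rows}
    for row in rows:
        doc_id = row.get(id_field, "")
        parent = row.get(parent_field, "")
        if parent and parent not in ids:
            problems.append(f"{doc_id}: parent {parent!r} not found in {id_field}")
        for field in path_fields:
            rel = row.get(field, "")
            if not rel:
                continue
            # Productions often use Windows separators; normalize them.
            normalized = rel.replace("\\", "/")
            if normalized.startswith("/") or PureWindowsPath(rel).drive:
                problems.append(f"{doc_id}: {field} {rel!r} is not a relative path")
            elif not (volume_root / normalized).is_file():
                problems.append(f"{doc_id}: {field} {rel!r} not found under the volume")
    return problems
```

A typical use would be `preflight(parse_dat(Path("VOL001/DATA/VOL001.dat").read_text(encoding="utf-8")), Path("VOL001"))`, where the folder and file names are placeholders. An empty result means only that these particular checks passed. It does not replace DISCO's own validation report.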
Monitoring the status of a load-file ingest
In the Ingest section of DISCO, you can follow the status of the ingest on the ingest session card.
If your load file fails validation or something else goes wrong during ingest, the failure will appear on the ingest session card. If your ingest completes without issue, this will also be displayed on the card.
From the ingest session card, you can:
- See the major details (type, custodian, mappings, and size) of the ingest
- Download and read the validation report, if the file failed validation
- Download and read the errors-only ingest report
- Download and read the entire ingest report
For more information, see Load-file ingest validation report.