An electronic document production is contained in a folder often referred to as a production folder or volume. This production folder or volume can also generally be referred to as a load file. To ingest a production, use the Load file ingest feature within DISCO.
Load file ingest prerequisites
In order to ingest load-file data into a database, you must first ensure that the load file meets our minimum requirements:
- Encoded in ASCII or UTF-8 formats.
- One of the supported file types:
  - Data load files are either .dat or .csv.
  - Image load files are .opt.
- Any native, image, or text files referenced by the data load file via relative file path are contained in one folder.
- Paths referenced in the load file are relative to the root of the production folder.
- If the load file uses custom delimiters (instead of default Concordance delimiters), they must be valid UTF-8 characters and be present in the load file itself.
- TIFF files must be single-page.
- PDF files must be multi-page.
- Custom field names can be any combination of [a-z], [A-Z], [0-9], dash, underscore, space, and period.
- Bates numbers can be any combination of [a-z], [A-Z], [0-9], dash, underscore, ampersand, and space.
Once you are certain that the load file meets these requirements, you can start ingesting the production folder into the appropriate DISCO database.
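Several of the requirements above can be checked before you start an ingest. The following is a minimal Python sketch, not a DISCO tool: the function names are hypothetical, the patterns are one reading of the character rules listed above, and it assumes that empty names are not allowed and that file extensions are matched case-insensitively.

```python
import re
from pathlib import Path

# One interpretation of the character rules listed in the prerequisites.
FIELD_NAME_RE = re.compile(r"[A-Za-z0-9_. -]+")   # letters, digits, dash, underscore, space, period
BATES_RE = re.compile(r"[A-Za-z0-9_& -]+")        # letters, digits, dash, underscore, ampersand, space

DATA_EXTENSIONS = {".dat", ".csv"}
IMAGE_EXTENSIONS = {".opt"}


def is_valid_field_name(name: str) -> bool:
    """Return True if every character of a custom field name is allowed."""
    return FIELD_NAME_RE.fullmatch(name) is not None


def is_valid_bates(value: str) -> bool:
    """Return True if every character of a Bates number is allowed."""
    return BATES_RE.fullmatch(value) is not None


def is_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 (plain ASCII always does)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def load_file_kind(filename: str):
    """Classify a file as a 'data' or 'image' load file, or None if unsupported."""
    suffix = Path(filename).suffix.lower()  # assumption: extension case is ignored
    if suffix in DATA_EXTENSIONS:
        return "data"
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    return None
```

For example, is_valid_bates("ABC&D-000123") passes, while is_valid_bates("ABC.000123") fails because a period is allowed in custom field names but not in Bates numbers.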
Ingesting load files
To ingest load files into a database:
- On the DISCO home page, in the main menu, click Ingest.
- On the Ingest page, click New Ingest and then click Load file.
- On the Name and description page, enter a name and an optional description for the load-file ingest. The ingest name and description will be displayed on the ingest card.
- On the Choose production folder page, navigate to the unzipped production folder (also called the volume) that contains all load files, native files, image files, and text files to be ingested.
- On the Load files page, within the volume you selected, select the data load file and, if present, the image load file. Make sure the file paths in the load files are relative to the folder/volume selected in the previous step. For example, if you selected a folder/volume that organizes all image files under the path images/0001/, then all references to image files in the load file must begin with that path (e.g., images/0001/00001.tiff).
- On the Field mapping page, map the load-file fields to DISCO’s fields. Not all load-file fields must be mapped and not all DISCO fields must be used.
First, select the unique identifier in your load file and the column headers that define your family relationships. The unique identifier is typically a ControlID or BegBates number. To define families, you can select either a single value or separate begin and end values. Historically, DISCO required mapping a column to BeginBates in order to ingest a load file, but it now only requires selecting a unique ID and a way to define family relationships between the documents.
NOTE: The column headers selected to define your family relationships will not be stored in DISCO unless they are mapped to a DISCO or custom field in the Field Mapping section below.
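To illustrate the two ways of defining families, here is a minimal Python sketch that groups load-file rows either by a single column or by a begin/end pair. It is an illustration only: the column names (BegBates, BegAttach, EndAttach) and the function name are hypothetical, and it does not describe how DISCO resolves families internally.

```python
from collections import defaultdict


def group_families(rows, id_key, family_keys):
    """Group document IDs by the value(s) of the family column(s).

    rows        -- list of dicts, one per load-file record
    id_key      -- column holding the unique identifier
    family_keys -- one column name (single value) or two (begin and end values)
    """
    families = defaultdict(list)
    for row in rows:
        # Documents that share the same family value(s) belong together.
        key = tuple(row[k] for k in family_keys)
        families[key].append(row[id_key])
    return dict(families)


# Hypothetical records: an email with one attachment, then a standalone document.
rows = [
    {"BegBates": "A-001", "BegAttach": "A-001", "EndAttach": "A-003"},
    {"BegBates": "A-002", "BegAttach": "A-001", "EndAttach": "A-003"},
    {"BegBates": "A-004", "BegAttach": "A-004", "EndAttach": "A-004"},
]
by_range = group_families(rows, "BegBates", ("BegAttach", "EndAttach"))
by_single = group_families(rows, "BegBates", ("BegAttach",))
```

Both calls produce two families here: one containing A-001 and A-002, and one containing only A-004.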
To map a load file field, click on the placeholder to the right of the field that reads Select or type to add a new field. Then, select or begin typing the DISCO field to associate with that field. If a DISCO field already exists, it will appear below your typing and can be selected. If a DISCO field does not already exist, you will see an option to create a new custom field. If the field name you are trying to create is already in use in DISCO, you will see a warning message and not be able to create a conflicting field with that name.
If the field mapping tool does not properly display your data load file fields or it displays an error message, then there is something wrong with your load file’s delimiters. You can adjust your delimiters in the tool or contact DISCO’s Professional Services team for consulting on how to modify your data. Additional charges may apply.
Common fields that exist in productions can be mapped to the associated DISCO field.
Common DISCO Fields
This is the first Bates Number assigned to a document.
This is the last Bates Number assigned to a document.
The Bates number of the first page of all the attachments in a family.
The Bates number of the last page of all the attachments in a family.
Unique number that identifies the parent's Bates Number of a family relationship.
Name of the individual or department from whom the document originated.
The original file path for the Native document.
Filename for the Native document.
The author of the document.
The "From" field of an email.
The "To" field of an email.
The "CC" field of an email.
The "BCC" field of an email.
The "Subject" field of an email.
The date the email was sent.
The time the email was sent.
The date the email was received.
The time the email was received.
The date the document was created.
The time the document was created.
The date the document was last modified.
The time the document was last modified.
Hyperlink to the document image.
Hyperlink to the Native document.
Relative path to the text file.
Relative file path to the text file for conversation indexing using provided text.
The three types of delimiters are:
- Field separator - A character separating the columns.
- Field quote - The character used to “quote” or group multiple items together, like a phrase.
- Multi-value delimiter - A character used to delimit multiple values within a column (e.g., the semicolon in ,"Kinsley, Dave";"Slovacek, Jerry",).
DISCO will pre-populate the delimiters based on the selected load file type. These values can be changed by selecting a different value from the delimiter dropdown lists.
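To make the three delimiter roles concrete, the following Python sketch splits one record into fields and then splits a field into multiple values. It is a simplified illustration rather than DISCO's parser: the function names are hypothetical, the default separator (0x14) and quote (þ) are the values commonly used in Concordance .dat files and are assumed here, and escaped quote characters are not handled.

```python
# Assumed defaults, as commonly seen in Concordance .dat files.
CONCORDANCE_FIELD_SEP = "\x14"
CONCORDANCE_QUOTE = "\u00fe"  # the thorn character


def split_record(line, field_sep=CONCORDANCE_FIELD_SEP, quote=CONCORDANCE_QUOTE):
    """Split one record into fields, ignoring separators inside quoted text."""
    fields, current, in_quote = [], [], False
    for ch in line:
        if ch == quote:
            in_quote = not in_quote      # quote characters are dropped
        elif ch == field_sep and not in_quote:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


def split_multi_value(value, multi_sep=";"):
    """Split a single field into its individual values."""
    return [part.strip() for part in value.split(multi_sep) if part.strip()]


# The CSV-style example from the list above: comma separator, double-quote field quote.
fields = split_record('DOC-1,"Kinsley, Dave";"Slovacek, Jerry",x', field_sep=",", quote='"')
names = split_multi_value(fields[1])
```

Here the commas inside the quoted names do not start new columns, and the semicolon separates the two names within the second column.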
For more information about mapping fields in load file ingest, see Load-file ingest fields.
Once you’ve finished mapping all the fields, click Next.
- On the Ingest options page, select the options for handling exceptions and hidden content for native files. Note: Exception handling by DISCO Professional Services is a billable service.
Load file ingests are not de-duplicated during ingest.
- Review everything on the Summary page and ensure it is correct. When you’re sure the ingest has no mistakes, click Ingest.
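Because the paths in the load files must be relative to the selected production folder, it can help to confirm before ingesting that every referenced path resolves to a file inside that folder. The Python sketch below shows one way to do that. It is not part of DISCO: the function name is hypothetical, and it assumes you have already extracted the path values from the load file.

```python
import re
from pathlib import Path

_DRIVE_RE = re.compile(r"^[A-Za-z]:")  # e.g. "C:" at the start of a Windows path


def find_path_problems(root, referenced_paths):
    """Return (path, reason) pairs for references that do not resolve under root."""
    root = Path(root).resolve()
    problems = []
    for raw in referenced_paths:
        # Load files often use Windows separators, so normalize them first.
        normalized = raw.replace("\\", "/")
        if normalized.startswith("/") or _DRIVE_RE.match(normalized):
            problems.append((raw, "absolute path"))
            continue
        candidate = (root / normalized).resolve()
        if root not in candidate.parents:
            problems.append((raw, "outside the production folder"))
        elif not candidate.is_file():
            problems.append((raw, "file not found"))
    return problems
```

Calling find_path_problems with the production folder and the list of referenced paths returns an empty list when every reference is relative and points to an existing file.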
Monitoring the status of a load-file ingest
In the Ingest section of DISCO, you can follow the status of the ingest on the ingest session card.
If your load file fails validation or something else goes wrong during ingest, the failure will appear on the ingest session card. If your ingest completes without issue, this will also be displayed on the card.
From the ingest session card, you can:
- See the major details (type, custodian, tags, and size) of the ingest
- Download and read the validation report, if the file failed validation
- Download and read the errors-only ingest report
- Download and read the entire ingest report
For more information, see Load-file ingest validation report.