
Incoming Production Standard Formats

Preferred Composition of Digital Data for DISCO

This document provides guidelines and best practices for delivering digital data produced by other parties for ingestion into DISCO.

Documents produced as images require a separate, image-only load file in addition to the comprehensive load file described below. DISCO prefers an Opticon (.opt or .log) file or a similar format that shows the document boundaries and the file path to each image on the delivery media. [1]

Bates ranges, metadata, and file paths to natives and text should be included in a comprehensive load file. For this, DISCO prefers a DAT file with standard Concordance delimiters. Opticon or similar cross-reference files are still required for images. It is highly recommended that the load file contain fields for the Bates number ranges in order to populate both the “BeginBates” and “EndBates” fields in DISCO. To ensure the most efficient ingest, we strongly encourage adherence to these guidelines.
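For reference, a DAT file with standard Concordance delimiters typically separates columns with ASCII 20, wraps each value in þ (ASCII 254), and marks line breaks inside a value with ® (ASCII 174). The following is a minimal Python sketch of reading such a file. It assumes those default delimiters and assumes the file has already been decoded to text. The function names and sample values are illustrative and not part of DISCO.

```python
# Default Concordance delimiters (an assumption; confirm against the
# producing party's export settings before relying on them).
FIELD_SEP = "\x14"  # ASCII 20, separates columns
QUOTE = "\xfe"      # ASCII 254 (þ), wraps each value
NEWLINE = "\xae"    # ASCII 174 (®), stands in for a line break inside a value


def parse_dat_line(line: str) -> list[str]:
    """Split one DAT line into its column values."""
    values = []
    for raw in line.rstrip("\r\n").split(FIELD_SEP):
        # Drop the surrounding qualifier characters when both are present.
        if len(raw) >= 2 and raw.startswith(QUOTE) and raw.endswith(QUOTE):
            raw = raw[1:-1]
        values.append(raw.replace(NEWLINE, "\n"))
    return values


def parse_dat(lines: list[str]) -> list[dict[str, str]]:
    """Treat the first line as the header row; return one dict per record."""
    header = parse_dat_line(lines[0])
    return [
        dict(zip(header, parse_dat_line(line)))
        for line in lines[1:]
        if line.strip("\r\n")
    ]


def make_line(values: list[str]) -> str:
    """Build a delimited line (used here only to create sample data)."""
    return FIELD_SEP.join(f"{QUOTE}{v}{QUOTE}" for v in values)


# Hypothetical three-column sample, built in memory rather than read from disk.
sample = [
    make_line(["BeginBates", "EndBates", "Custodian"]),
    make_line(["ABC0000001", "ABC0000003", "Doe, Jane"]),
]
records = parse_dat(sample)
```

Because values are wrapped in a qualifier and split on a non-printing separator, commas inside a value (as in the custodian name) need no special handling.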

Image File Production Specifications [2]

Requirement | Description
File Image Format / TIFF Production

Document images will be provided in whole-document PDF or single-page TIFF format, using Group 4 compression at a resolution of at least 300 dots per inch (“dpi”). Images may be reduced by up to 10% to leave dedicated space for page numbering and other endorsements. Images will be in black and white unless color is necessary to understand the meaning of the document.

Load File

A cross-reference load file in an Opticon (.opt or .log) or other similar file format will accompany the images, showing the document boundaries and the correlation between the unique page identifier of the document (i.e., “Bates Number”) and the location of the file on the delivery media.

Unitization

Each page of a document will be electronically saved into an image file. If a document is more than one page, the unitization of the document and any attachments will be maintained as it existed in the original form and reflected in the load file. The parties will make their best efforts to unitize documents correctly.
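To illustrate how a cross-reference load file expresses document boundaries, the sketch below groups page-level rows into documents. It assumes the common seven-column Opticon layout (page identifier, volume, image path, document break, folder break, box break, page count), where a "Y" in the fourth column marks the first page of a document. The class name, function name, and sample rows are hypothetical.

```python
import csv
from dataclasses import dataclass, field


@dataclass
class ImageDocument:
    """One document reconstructed from consecutive Opticon rows."""
    pages: list = field(default_factory=list)  # (page identifier, image path) pairs

    @property
    def begin_bates(self):
        return self.pages[0][0]

    @property
    def end_bates(self):
        return self.pages[-1][0]


def group_pages(opt_lines):
    """Group page rows into documents using the document-break column."""
    docs = []
    for row in csv.reader(opt_lines):
        if not row:
            continue  # skip blank lines
        row = row + [""] * (7 - len(row))  # pad short rows to seven columns
        page_id, _volume, image_path, doc_break = row[:4]
        # A "Y" starts a new document; the first row always starts one.
        if doc_break.strip().upper() == "Y" or not docs:
            docs.append(ImageDocument())
        docs[-1].pages.append((page_id, image_path))
    return docs


# Hypothetical rows: a two-page document followed by a one-page document.
sample = [
    r"ABC0000001,VOL001,IMAGES\001\ABC0000001.tif,Y,,,2",
    r"ABC0000002,VOL001,IMAGES\001\ABC0000002.tif,,,,",
    r"ABC0000003,VOL001,IMAGES\001\ABC0000003.tif,Y,,,1",
]
docs = group_pages(sample)
```

The first and last page identifiers of each group correspond to the values expected in the BeginBates and EndBates fields.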

 

[1], [2] “Model Stipulated Production Specifications” (2016). Legal Technology Professionals Institute. https://www.legaltechpi.org.

Comprehensive Load File Column Specifications

The metadata of electronic document collections should be extracted and provided in a DAT file using the field names and formatting described below. Fields not listed here may be mapped as custom fields in the matter, in consultation with DISCO technical services.

Field Name | Content Specifications
Author

Author field extracted from the metadata of a non-email document
(Note: this does not include the sender of an email. See the “From” field.)

BCC

BCC or blind carbon copy field extracted from an email message

BeginAttachmentBates

Unique number identifying the first page or first document of a document's attachment(s)

BeginBates

Beginning Bates number of document

CC

CC or carbon copy field extracted from an email message

CreateDate

Date that a file was created (mm/dd/yyyy format)

CreateTime

Time that a file was created

Custodian

Name of the custodian of the file(s) produced (last name, first name)

DuplicateCustodians

Identifies duplicate custodian sources for files excluded from production based on MD5 or SHA-1 hash de-duplication

DuplicateFilenames

If collected from multiple sources, the name of each additional file

DuplicateOriginalFilepath

If collected from multiple sources, the filepath of each additional file

EndAttachmentBates

Unique number identifying the last page or last document of a document's attachment(s)

EndBates

Ending Bates number of document

Filename

Name of the original digital file

From

From field extracted from an email message

Hash

MD5 or SHA-1 hash of the file: a 32- or 40-character hexadecimal value, respectively.
A "digital file fingerprint".

ImageFilename

Filename of the produced PDF image

ImagePath

Path to produced PDF image

LastModifiedDate

Modification date(s) of a non-email document

LastModifiedTime

Modification time(s) of a non-email document

NativeFilename

Filename of the produced native file

NativePath

Path to produced native file

OCRPath

Path to OCR text file

OCRTextFilename

Filename of the OCR text file

OriginalFilepath

Original filepath of the document

PageCount

Number of pages in the document

ParentID

ID of the parent of the document

ReceivedDate

Received date of an email message (mm/dd/yyyy format)

ReceivedTime

Received time of an email message

ReferenceID

Cross-reference identifier (if needed)

ReviewID

Another identifier (if needed)

SendDate

Sent date of an email message (mm/dd/yyyy format)

SendTime

Sent time of an email message

Subject

Subject (or "re" line) of an email

Tags

Tags or codes added by users

To

To or Recipient field extracted from an email message
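Before delivery, a producing party may want to check records against the formats listed above. The following is a minimal Python sketch that checks the three date fields documented as mm/dd/yyyy, the length and character set of the Hash field, and the presence of BeginBates and EndBates. The function name and messages are hypothetical, and the checks are illustrative rather than a description of DISCO's own ingest validation.

```python
import re
from datetime import datetime

# Fields whose format is stated above as mm/dd/yyyy.
DATE_FIELDS = ("CreateDate", "ReceivedDate", "SendDate")
DATE_SHAPE = re.compile(r"\d{2}/\d{2}/\d{4}")
HEX_DIGITS = re.compile(r"[0-9a-fA-F]+")


def check_record(record: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems found in one load file record."""
    problems = []

    # Bates fields are needed to populate BeginBates and EndBates.
    for name in ("BeginBates", "EndBates"):
        if not record.get(name):
            problems.append(f"{name} is missing")

    # Dates must have the mm/dd/yyyy shape and be real calendar dates.
    for name in DATE_FIELDS:
        value = record.get(name, "")
        if not value:
            continue  # empty is allowed, e.g. email fields on a non-email file
        try:
            if not DATE_SHAPE.fullmatch(value):
                raise ValueError("wrong shape")
            datetime.strptime(value, "%m/%d/%Y")
        except ValueError:
            problems.append(f"{name} is not a valid mm/dd/yyyy date: {value!r}")

    # MD5 digests are 32 hex characters; SHA-1 digests are 40.
    digest = record.get("Hash", "")
    if digest and not (len(digest) in (32, 40) and HEX_DIGITS.fullmatch(digest)):
        problems.append(f"Hash is not a 32- or 40-character hex value: {digest!r}")

    return problems


# Hypothetical records for illustration.
good = {
    "BeginBates": "ABC0000001",
    "EndBates": "ABC0000003",
    "CreateDate": "01/15/2020",
    "Hash": "d41d8cd98f00b204e9800998ecf8427e",
}
bad = {"BeginBates": "ABC0000004", "CreateDate": "02/30/2020", "Hash": "abc"}
good_problems = check_record(good)
bad_problems = check_record(bad)
```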
