Ingest Reports

DISCO's Ingest Report gives users full visibility into the files that have been processed and loaded into their matter. The report identifies exceptions, duplicates, file sizes, and other key information. For example, use the Processing Details column to identify failed items such as password-protected or corrupt files. Detecting and resolving failed items ensures that no processing gaps affect your review or production.

To view your Ingest Report, go to Menu > Data > Ingest. The Ingest screen shows details about each ingest session, including:

  • Type - Indicates the type of files that were ingested
  • Custodians - Indicates the names of the custodian(s) assigned to that session
  • Tags Applied - Indicates which, if any, tags were assigned to the documents in that session
  • File Size - The pre-compressed file size of the data

To access ingest reports for All Ingests or Selected Ingest Sessions, use the Download Report button at the top of the Ingest screen.

 

To access a complete ingest report (Everything) or an Errors only report for an individual session, use the Download Report button at the top right corner of the Ingest Session details.
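
Once a report has been downloaded, a short script can help triage it. The Python sketch below is illustrative only: it assumes the report has been saved as CSV with a header row that uses the column names described in this article, and that the Processing Status values are spelled "Partial" and "Failure". The function name, the sample rows, and the "Success" label are hypothetical, so check them against an actual report before relying on this.

```python
import csv
import io

# Assumption: status labels that mean a file may need follow-up work.
# The exact spelling in an exported report may differ; adjust as needed.
NEEDS_ATTENTION = {"partial", "failure"}


def rows_needing_attention(report_file):
    """Yield report rows whose Processing Status is partial or failure."""
    for row in csv.DictReader(report_file):
        status = (row.get("Processing Status") or "").strip().lower()
        if status in NEEDS_ATTENTION:
            yield row


# Hypothetical sample data standing in for a downloaded report.
sample = io.StringIO(
    "File Name,Processing Status,Processing Details\n"
    "a.docx,Success,\n"
    "b.pdf,Failure,Password protected\n"
    "c.pst,Partial,Corrupt item\n"
)

flagged = [
    (row["File Name"], row["Processing Details"])
    for row in rows_needing_attention(sample)
]
for name, details in flagged:
    print(f"{name}: {details}")
```

To run this against a real export, replace the in-memory sample with an open file handle, for example `open("report.csv", newline="")`, where the file name is whatever you saved the report as.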

Below is a list of each column provided in DISCO's Ingest Report along with a detailed description of what that column means.

Column Name - Description
Instance Id - Unique ID assigned to each file ingested into DISCO. The numbers to the left of the decimal become the DISCO ID, which is displayed to end users. The numbers to the right refer to the number of duplicate files (instances) that have been ingested.
Instance Hash - An alphanumeric value that uniquely identifies each file in an ingest session.
DeDup Hash - A DISCO-computed value that is used to deduplicate instances within a matter.
Ingest Time - The time a file was ingested into DISCO.
Custodian - The name of the person or entity from whom the data was collected. For example, the custodian of an email is the owner of the inbox that contains the message. Custodians are assigned prior to ingest.
File Length - The file's size in bytes.
File Path - The file, folder, or directory structure from which the document was collected.
ContainerPath - The path within the parent container from which the document was collected.
Processing Status - Primary indicator of the overall processing outcome of an ingested file. Files categorized as partial or failure may require additional processing work.
Processing Details - The reason why an ingested file received its processing status.
Ingested - Indicates whether a processed file was ingested into the matter database.
Processed as Native - Indicates that some aspect of processing was unsuccessful. DISCO still creates a record in the matter with a link to the native file.
Image - Indicates whether a file was processed as an image. An image is sometimes also referred to as a "near native" or "PDF."
Image Page Limit Exceeded - Indicates whether DISCO failed to produce a near native image because the native file exceeded the supported number of pages. The page limit is set to 30,000 pages.
Search Text - Indicates whether the record contains searchable text. This is set to "No" when items are processed as native.
Text Limit Exceeded - Indicates that the ingested file exceeded the maximum amount of text allowed. This limit is configured per matter. The default limit is 100 MB.
OCR - Indicates whether any of the extracted text was derived via OCR.
Input File - Indicates whether this was an object viewable in the file system when it was received by DISCO for processing. Items within containers have an N in this column because they cannot be viewed as a “file system object”.
Object Type - Indicates one of four primary file types:
  • Containers - A file that contains other files, such as PST, NSF, TAR, or RAR. Note that DISCO does not create records within the matter for container files.
  • Loose Files - A file with no family relationships (not a parent or an attachment).
  • Family Head - A file that is identified as the top member of the family.
  • Attachment - A file within a family that is not the family head.
Container Member - Indicates whether the file is a member of a container. Containers are TAR, RAR, PST, NSF, and MSG files that do not generate records in the DISCO matter when fully and successfully processed.
Partial Container - Indicates that a container could not be fully processed. DISCO creates a record within the matter with a link to the native container file so that it can be retrieved for additional processing.
Slipsheet Identified - Indicates that DISCO identified a "slipsheet" produced in conjunction with a native file for load file ingests. DISCO creates a near native image from the native file and appends it to the slipsheet in the Document Viewer.
Missing Native - Indicates that the native file was not supplied. If an image is supplied, DISCO uses it to populate the native file link. If no native or image is supplied, DISCO creates a near native image from the supplied text and uses that to populate the native link.
Ingest Type - Indicates what type of files were ingested. Two types of data delivery can be processed:
  • Native - Files delivered as they were maintained during the normal course of business.
  • Loadfiles - Files (either images only or images with natives) accompanied by a load file that supplies family relationship and metadata information.
Hidden Text - Indicates whether the file contains hidden content that cannot currently be searched or viewed on the near native image. Examples of file types that can contain hidden content include Word, Excel, and PowerPoint.
Hidden Type - Indicates the type(s) of hidden content contained within the file. Types include Revisions, Hidden Sheets, Very Hidden Sheets, Comments, and Notes.
Wrong Extension - Indicates that the extension of an ingested file is inconsistent with the determined type.
Extension - The extension of the ingested file. Note: this field is blank when no extension is available.
ContentType - The normalized file type assigned by DISCO. Examples of content types in DISCO are: Excluded, Unknown, Text, Email, PDF, Word, Excel, PowerPoint, PST, HTML, Image, RichText, ZIP, LoadFile, Audio, Video, Appointment, Contact, Cad, Project, Xps, Vcard, Visio, OpenXml, ISO, Mbox, RAR.
File Name - The name of the ingested file.
Container Name - The name of the container from which the file was extracted.
Detected Email - Indicates files that DISCO identified as emails during processing, based on an examination of the file's text or OCR.
Image Size - The size, in bytes, of the image when downloaded to a computer. Note: the original image will be either the near native image created by DISCO or a production image ingested via load file delivery.
Parent Instance Id - The instance ID of the ingested file's immediate parent. Note: the immediate parent is not always the family head.
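
Two of the columns above lend themselves to a short worked example: Instance Id, which combines the DISCO ID and an instance number, and Parent Instance Id, which points only to the immediate parent. The Python sketch below is a rough illustration under stated assumptions: Instance Id values look like "1001.1", family heads carry the Object Type value "Family Head", and rows are held in a dictionary keyed by Instance Id. The function names and sample rows are hypothetical and are not part of DISCO.

```python
def split_instance_id(instance_id):
    """Split an Instance Id into (DISCO ID, instance number) as strings.

    The parts are kept as strings because parsing "1001.10" as a float
    would drop the trailing zero.
    """
    disco_id, _, instance = instance_id.strip().partition(".")
    return disco_id, instance


def find_family_head(instance_id, rows_by_id):
    """Follow Parent Instance Id links upward until a Family Head row.

    Returns the Instance Id of the family head, or None if the chain ends
    (unknown ID, blank parent, or a loop) before one is found.
    """
    seen = set()
    current = instance_id
    while current and current in rows_by_id and current not in seen:
        seen.add(current)
        row = rows_by_id[current]
        # Assumption: the report spells this value exactly "Family Head".
        if row.get("Object Type") == "Family Head":
            return current
        current = (row.get("Parent Instance Id") or "").strip()
    return None


# Hypothetical rows: a family head, its attachment, and a nested attachment.
rows_by_id = {
    "1001.1": {"Object Type": "Family Head", "Parent Instance Id": ""},
    "1002.1": {"Object Type": "Attachment", "Parent Instance Id": "1001.1"},
    "1003.1": {"Object Type": "Attachment", "Parent Instance Id": "1002.1"},
}

disco_id, instance = split_instance_id("1001.2")
head = find_family_head("1003.1", rows_by_id)
```

Walking the chain rather than reading Parent Instance Id once reflects the note above that the immediate parent is not always the family head.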

 

 
