Downloading and understanding your ingest report

Ingest reports show what happened to each file that was ingested into your database. The reports include exceptions, duplicates, file sizes, failed ingests (for example, a document that was password-protected or corrupt), and other key pieces of information.

Use ingest reports to confirm that no processing gaps affect your review or production.

Downloading an ingest report

To download an ingest report:

In the DISCO main menu, click Ingest.
On the card for the ingest session whose report you want, click the ellipsis, and then click Download ingest report.

The report is delivered to the Reports page.
To view the ingest report, click Reports in the main menu.
Locate the report you want to download. If the list is long, use the Ingest Reports option in the left menu to show only ingest reports.
Click Download. The report will be downloaded as an Excel file.
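
Once the report is downloaded, it can help to work with its rows programmatically. The following is a minimal Python sketch, assuming you have first saved the Excel file as a CSV file from your spreadsheet application (reading the Excel file directly would need a third-party library, which is not shown here). The function name load_report_rows and the sample values, including the "Success" status, are hypothetical and for illustration only.

```python
import csv
import io


def load_report_rows(fileobj):
    """Return the report rows as a list of dicts keyed by column name."""
    reader = csv.DictReader(fileobj)
    return [dict(row) for row in reader]


# Tiny in-memory stand-in for a report saved as CSV (all values are made up).
sample = io.StringIO(
    "Instance Id,File Name,Processing Status\n"
    "1001.1,contract.docx,Success\n"
    "1002.1,locked.pdf,Failure\n"
)

rows = load_report_rows(sample)
print(len(rows), rows[1]["Processing Status"])  # prints: 2 Failure
```

With a real file, you would pass an open file handle instead, for example open("ingest_report.csv", newline="", encoding="utf-8"), where the file name is whatever you chose when saving the CSV.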

Understanding an ingest report

Your ingest report will have the following columns:

Column Name	Description
Instance Id	A unique ID assigned to each file ingested into DISCO. The digits to the left of the decimal point become the DISCO ID, which is displayed to end users. The digits to the right refer to the number of duplicate files (instances) that have been ingested.
Instance Hash	An alphanumeric value that uniquely identifies each file in an ingest session.
DeDup Hash	A DISCO-computed value that is used to deduplicate instances within a database.
Ingest Time	The time the file was ingested into DISCO.
Custodian	The name of the person or entity from whom the data was collected. For example, the custodian of an email is the owner of the inbox that contains the message. Custodians are assigned prior to ingest.
File Length	The file's size in bytes.
File Path	The file, folder, or directory structure from which the document was collected.
ContainerPath	The path within the parent container from which the document was collected.
Processing Status	The primary indicator of the overall processing outcome of an ingested file. Files categorized as partial or failure may require additional processing work.
Processing Details	The reason why an ingested file received its processing status.
Ingested	Indicates if a processed file was ingested into the database.
Processed as Native	Indicates that some aspect of processing was unsuccessful. However, DISCO will create a record in the database with a link to the native file.
Image	Indicates if a file was processed as an image or not. An image is sometimes also referred to as a near native or PDF.
Image Page Limit Exceeded	Indicates if DISCO failed to produce a near-native image due to the native file exceeding the supported number of pages. The page limit is set to 30,000 pages.
Search Text	Indicates if the record contains searchable text. "No" indicates that items are processed as native.
Text Limit Exceeded	Indicates that the ingested file exceeded the maximum amount of text allowed. This limit is configured per database. The default limit is 100 MB.
OCR	Indicates whether or not any of the extracted text was derived via OCR.
Input File	Indicates whether this was an object viewable in the file system when it was received by DISCO for processing. Items within containers will have an N in this column as they cannot be viewed as a file system object.
Object Type	Indicates one of four primary file types: Containers – Files that contain other files, such as PST, NSF, TAR, or RAR. Note that DISCO does not create records in the database for container files. Loose Files – Files with no family relationships (neither a parent nor an attachment). Family Head – A file that is identified as the top member of a family. Attachment – A file within a family that is not the family head.
Container Member	Indicates whether or not the file is a member of a container. Container files, such as TAR, RAR, PST, NSF, and MSG files, do not generate records in the DISCO database when successfully processed.
Partial Container	Indicates that a container could not be fully processed. DISCO will create a record within the database and have a link to the native container file so that it can be retrieved for additional processing.
Slipsheet Identified	Indicates that DISCO identified a slip sheet produced in conjunction with a native file in a load file ingest. DISCO will create a near-native image from the native file and append it to the slip sheet in the document viewer.
Missing Native	Indicates when the native file was not supplied. If an image is supplied, DISCO will use that to populate the native file link. If no native or image is supplied, DISCO will create a near-native image from the supplied text and use that to populate the native link.
Ingest Type	Indicates what type of files were ingested. There are two types of data deliveries that can be processed: Native – Files delivered as they were maintained during the normal course of business. Load files – Files (either images only or images with natives) accompanied by a load file that supplies family relationship and metadata information.
Hidden Text	Indicates whether the file contains hidden content that cannot currently be searched or viewed on the near-native image. Examples of files that contain hidden content are Word, Excel, and PowerPoint files.
Hidden Type	Indicates the type of hidden content contained within the file. Types include revisions, hidden sheets, very hidden sheets, comments, and notes.
Wrong Extension	Indicates that the extension of an ingested file is inconsistent with the determined type.
Extension	Contains the extension of the ingested file. This field will be blank when there is no extension available.
Content Type	The content type is a normalized file type. Examples of content types in DISCO are Excluded, Unknown, Text, Email, PDF, Word, Excel, PowerPoint, PST, HTML, Image, RichText, ZIP, LoadFile, Audio, Video, Appointment, Contact, Cad, Project, Xps, Vcard, Visio, OpenXml, ISO, Mbox, and RAR.
File Name	The name of the ingested file.
Container Name	The name of the container from which the file was extracted.
Detected Email	Indicates files that DISCO has identified as emails during processing, based on an examination of the file's text and/or OCR.
Image Size	The size, in bytes, of the image when downloaded to a computer. The original image will either be the near-native image created by DISCO or a production image ingested via load file delivery.
Parent Instance Id	The instance ID of the ingested file's immediate parent. The immediate parent is not always the family head.
Detected Language	The primary language identified in the document.
Native View Support	Indicates files that DISCO is able to process and show in the DISCO document viewer in their native format.
Created Overflow	Indicates that the file reached the maximum instance count of 200 and that DISCO created a new document for the additional instances.
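
The two-part structure of the Instance Id column described above can be illustrated with a short Python sketch. It assumes the value has the form digits, a decimal point, then digits (for example "1001.3"), which is an assumption about the exact cell format rather than something the report guarantees. The names InstanceId and parse_instance_id are hypothetical. The value is handled as text because a numeric spreadsheet cell can drop trailing zeros, so "1001.10" would otherwise be read as 1001.1.

```python
from typing import NamedTuple


class InstanceId(NamedTuple):
    disco_id: int   # digits left of the decimal point (the DISCO ID)
    instance: int   # digits right of the decimal point (the instance part)


def parse_instance_id(value: str) -> InstanceId:
    """Split an Instance Id string such as '1001.3' into its two parts."""
    text = value.strip()
    left, sep, right = text.partition(".")
    # Assumed format: both sides must be plain digits around one decimal point.
    if sep != "." or not (left.isdigit() and right.isdigit()):
        raise ValueError(f"unexpected Instance Id format: {value!r}")
    return InstanceId(int(left), int(right))


parsed = parse_instance_id("1001.3")
print(parsed.disco_id, parsed.instance)  # prints: 1001 3
```

Grouping rows by disco_id then gives one group per DISCO ID, with each row in the group being one ingested instance.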

For information about ingest exceptions, see Remediating ingest exceptions.
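
As a starting point for finding exceptions in a large report, the rows can be tallied by Processing Status and the partial and failure rows pulled out for review. The Python sketch below assumes each row is a dict keyed by the report's column names and that the status cells contain the words "partial" or "failure" in some capitalization. That spelling, the "Success" value, the sample rows, and the function names are all assumptions for illustration.

```python
from collections import Counter

# Assumed spellings of the statuses that may need additional processing work.
FOLLOW_UP_STATUSES = {"partial", "failure"}


def normalized_status(row):
    """Lower-case, trimmed Processing Status for one report row."""
    return row.get("Processing Status", "").strip().lower()


def summarize_statuses(rows):
    """Count how many rows fall under each processing status."""
    return Counter(normalized_status(row) for row in rows)


def needs_follow_up(rows):
    """Return the rows whose status is partial or failure."""
    return [row for row in rows if normalized_status(row) in FOLLOW_UP_STATUSES]


# Made-up rows standing in for a real report.
report_rows = [
    {"File Name": "contract.docx", "Processing Status": "Success", "Processing Details": ""},
    {"File Name": "archive.pst", "Processing Status": "Partial", "Processing Details": "example detail"},
    {"File Name": "locked.pdf", "Processing Status": "Failure", "Processing Details": "example detail"},
]

summary = summarize_statuses(report_rows)
flagged = [row["File Name"] for row in needs_follow_up(report_rows)]
print(flagged)  # prints: ['archive.pst', 'locked.pdf']
```

The Processing Details value on each flagged row is the natural next thing to read, since it gives the reason the file received its status.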
