DISCO uses Westlaw and Lexis-style search syntax with Boolean search operators. These can be used in conjunction with a field, such as in privilegeNote("attorney-client" or "work-product"), or in a fieldless search [1] in the DISCO search bar.
This guide explains how to search in DISCO.
Download the DISCO Search Quick Reference guide.
CONTENTS
- Search basics
- Order of operations
- Standard document fields
- Numeric searches
- Tag fields
- Tag prediction
- Date fields
- Redactions
- Custom fields
Search basics
DISCO removes most punctuation and non-alphanumeric characters from a search query. Periods, colons, semicolons, and apostrophes within a word are not removed. As an example, periods in a name or email address are indexed and searchable.
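The clean-up rule above can be approximated in a few lines of Python. This is only a sketch of the described behavior, not DISCO's actual indexing code: the function name `normalize_query_term` is hypothetical, it handles plain terms only (not operators such as `&` or `!`), and it assumes ASCII letters and digits.

```python
import re

# Characters the text above says survive when they sit inside a word.
KEPT_INSIDE_WORD = ".:;'"

def normalize_query_term(term: str) -> str:
    """Approximate the described clean-up for plain search terms (hypothetical helper)."""
    # Replace everything except letters, digits, and the kept characters with a space.
    cleaned = re.sub(r"[^A-Za-z0-9.:;']", " ", term)
    words = []
    for word in cleaned.split():
        # The kept characters only count inside a word, so trim them from the edges.
        word = word.strip(KEPT_INSIDE_WORD)
        if word:
            words.append(word)
    return " ".join(words)

print(normalize_query_term("john.smith"))
print(normalize_query_term("(contract),"))
```

Under these assumptions, a period inside a name is preserved while surrounding brackets and commas are dropped.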
Operators | Description | Example |
---|---|---|
&, and | Includes results with both terms | |
[space], or | Includes results with either term or field | |
%, not | Excludes term or field from results | |
/n | Proximity search, searching within n words, in any order | |
+n | Proximity search, searching within n words in prescribed order | |
" " | Exact phrase intended [2] | |
! | Truncation search or root expander; can be used at the beginning or end of a term | |
* | Wildcard search for single character | |
~ | Fuzzy or approximate word search [4] | |
. | Period in a name or email address | |
( ) | Grouping syntax | |
sample(n, search) | Returns n documents randomly selected from results of search. If n is less than 1, this number is treated as a percentage. | Returns 50% and 700 of the search results for contract at random, respectively |
field(terms) | Field searching (see below for standard DISCO fields) | |
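To make the proximity and sample operators more concrete, here is a small Python sketch of how they could behave on a list of words or documents. It is an illustration under stated assumptions, not DISCO's implementation: the helper names `within` and `sample` are hypothetical, "within n words" is read as word positions differing by at most n, and a value of n below 1 is read as a fraction of the result count.

```python
import random

def within(words, first, second, n, ordered=False):
    """True if `first` and `second` occur within n words of each other.

    ordered=False mirrors /n (any order); ordered=True mirrors +n
    (first must come before second). Assumed semantics, for illustration.
    """
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    for i in first_positions:
        for j in second_positions:
            gap = j - i
            if ordered and 0 < gap <= n:
                return True
            if not ordered and 0 < abs(gap) <= n:
                return True
    return False

def sample(n, results, rng=random):
    """Pick documents at random; n below 1 is read as a fraction of the results."""
    count = int(len(results) * n) if n < 1 else min(int(n), len(results))
    return rng.sample(results, count)

words = "the contract was signed before payment".split()
print(within(words, "contract", "payment", 4))                # any order
print(within(words, "payment", "contract", 4, ordered=True))  # prescribed order
print(len(sample(0.5, list(range(10)))))
```

Asking for more documents than the search returned simply yields all of them in this sketch; how DISCO treats that case is not stated in this guide.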
Order of operations
1. Exact phrases: " "
2. Groupings: ( )
3. Proximity: /n, +n
4. &, and, %, not
5. [space], or
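The effect of this ordering can be seen with a toy parser. The sketch below is not DISCO's parser: the function `parse` is hypothetical, it covers only quoted phrases, parentheses, &/and, %/not, and [space]/or, and it leaves out proximity operators, field syntax, and error handling for unbalanced input.

```python
import re

# Quoted phrases first, then single-character operators, then bare terms.
TOKEN = re.compile(r'"[^"]*"|[()&%]|[^\s()&%]+')

def parse(query):
    """Parse a query into nested tuples, applying the precedence listed above."""
    tokens = TOKEN.findall(query)
    pos = 0

    def peek():
        return tokens[pos].lower() if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def atom():
        token = advance()
        if token == "(":
            node = either()
            advance()  # consume the closing ")"
            return node
        return token  # a bare term or a quoted phrase

    def both():
        # & / and / % / not bind more tightly than or
        node = atom()
        while peek() in ("&", "and", "%", "not"):
            op = "and" if advance().lower() in ("&", "and") else "not"
            node = (op, node, atom())
        return node

    def either():
        # An explicit "or", or just a space between terms, means OR.
        node = both()
        while peek() is not None and peek() != ")":
            if peek() == "or":
                advance()
            node = ("or", node, both())
        return node

    return either()

print(parse("contract payment & invoice"))
print(parse("(contract payment) & invoice"))
```

In this sketch the first query groups as contract or (payment and invoice), while the parentheses in the second force (contract or payment) and invoice.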
Standard document fields
Search queries that do not specify a field are run against document text, document notes, custodians, authors, and user defined fields.
Command | Description | Example |
---|---|---|
batesNumber | Field of Bates numbers applied to documents (either from DISCO production or from elsewhere) | |
batesPrefix | The English portion of the Bates stamp | |
DeDupHash | The unique identifier used to deduplicate documents at the time of ingest | |
hashes | Searches across deduphash, sha1hash, md5hash, and objecthash | |
id | An identification number assigned by DISCO that is unique to each document in the database | |
ingestSessionId | A number assigned to each group of documents ingested into DISCO | |
md5hash | MD5 hash of the file binary; 32 characters | |
objecthash [5] | Hash of the file document, without considering the parent document; 40 characters | objecthash(hpsrxok9xug43sr17g7ypheyiurrlu7fx4dqj4wt) returns the document with the hpsrxok9xug43sr17g7ypheyiurrlu7fx4dqj4wt object hash |
production | A name assigned by the user to each set of documents produced using DISCO | |
referenceid | Optional identification number users can assign (contact DISCO Client Success to do so) | |
sha1hash | SHA-1 hash of the file binary; 40 characters | |
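The character counts quoted for md5hash and sha1hash match the usual hexadecimal form of those digests. The Python sketch below computes both for a file's bytes with the standard `hashlib` module. The function name `file_hashes` is hypothetical, and hexadecimal output is an assumption: DISCO may encode its stored hashes differently (the objecthash example above, for instance, is not hexadecimal).

```python
import hashlib

def file_hashes(data: bytes) -> dict:
    """Return MD5 and SHA-1 hex digests of a file's bytes (hypothetical helper)."""
    return {
        "md5hash": hashlib.md5(data).hexdigest(),    # 32 hex characters
        "sha1hash": hashlib.sha1(data).hexdigest(),  # 40 hex characters
    }

hashes = file_hashes(b"example file contents")
print(len(hashes["md5hash"]), len(hashes["sha1hash"]))
```

A value computed this way could then be tried in a query such as md5hash(...) or sha1hash(...), assuming the matter stores hashes in the same encoding.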
User-created information
Command | Description | Example |
---|---|---|
documentNote | Document note field (edited manually by DISCO users) | |
field | Custom metadata field added by admin-level users that is used to capture additional information | |
folder | Folder created in DISCO by a user | |
privilegeNote | A text field that will appear on privilege logs | |
redactionReason | Optional user-created text that can overlay redactions | |
stage | A set of documents created by users for linear review | |
stageBatched | Documents in stages that have been pulled into batches | |
stageComplete | Documents in stages that have been marked "reviewed" and checked-in | |
stageReviewed | Documents in stages that have been marked "reviewed" | |
tag | Fields appended to documents by users | |
taggroup | Fields from a specific group appended to documents by users | |
Document metadata fields [6]
Command | Description | Example |
---|---|---|
author | Identifies the author or creator of a file (does not include the sender of an email) | |
bcc | A recipient that was blind copied (or blind carbon copied) on an email | |
cc | A recipient that was copied (or carbon copied) on an email | |
custodian | Typically the individual or location from which a document was collected | |
custodianExact | An exact custodian name | |
domain | An email domain (typically, the text after the @ symbol) | |
folderpath | Any folder in which a document was saved (such as a computer or network drive) | |
from | Identifies the sender of an email | |
recipient | Searches the combined email metadata fields: to, cc, and bcc (i.e., anyone who was the recipient of an email) | |
subject | Searches the subject (or "re:" line) of an email | |
subjectNormalized | Searches for exact match against the subject without "Re:" or "Fwd:" | |
text | The OCR (extracted text) information in an image | |
to | Identifies to whom an email was sent (excludes the cc and bcc fields) | |
Document attributes
Command | Description | Example |
---|---|---|
billingSize | Returns all documents with the indicated size in bytes relative to the review database's billing statement | |
childCount | Returns all parent documents that have the specified number of children [7] | |
comment | Returns all documents with the indicated text left inline in the original file | |
company | Returns all documents created by the same company | |
conversationCount | Returns all emails belonging to conversations [8] that contain the number or range of emails specified | |
extension | Returns all documents with the specified file extension | |
familyConsistentTag | Returns documents where every family member has the specified tag. Limited to documents in families. | |
familyInconsistentTag | Returns documents that do not have the specified tag, where at least one of their family members has the specified tag. Limited to documents in families. | |
fileLength | Returns files corresponding to the specified size of the file (in bytes) | |
filename | The filename of any document | |
hasDetectedSlipsheet | DISCO identifies when a slipsheet is produced in conjunction with a native file for load file ingests, and will create a near-native image from the native file and append it to the slipsheet in the document viewer | |
hasDocumentNote | Returns documents that include a note | |
hasHiddenType | Returns documents with hidden content detected | |
hasImage | Indicates if an item was processed as an image or not. An image is sometimes also referred to as a "near native" or "PDF." | |
hasLanguage | Documents that have an identified language | |
hasNative | Indicates if the native file was not supplied. If an image is supplied, DISCO will use that to populate the native file link. If no native or image is supplied, DISCO will create a near-native image from the supplied text and use that to populate the native link. | |
hasOCR | Returns documents for which any of the extracted text was derived via OCR | |
hasPrivilege | Returns documents that include a privilege tag | |
hasSearchText | Returns documents that include searchable text | |
hasWorkProduct | Documents that have any tags, folders, notes, fields, or have been reviewed within a stage or produced | |
hasWrongExtension | Indicates if the extension of an ingested item is inconsistent with the determined type | |
hiddenText | Hidden text is text that has been set to be hidden in the original file | |
hiddentype | Identifies whether hidden data is in a file, including comments, revisions, notes, hidden sheets, or very hidden sheets | |
imageSize | Returns documents with the indicated size in bytes of the document's image file | |
ingestType | Indicates which of the two types of data deliveries can be processed | |
invisibleText | Invisible text is text that is the same color as its background | |
isDetectedEmail | Files identified as emails during processing based on an examination of the file's text/OCR | |
isInclusive | Returns emails with unique content | |
isProcessedAsNative | Indicates that some aspect of processing was unsuccessful. However, DISCO will create a record in the matter with a link to the native file. | |
language | Searches for documents with specific languages (English, German, French, Portuguese, Spanish, Chinese, Japanese, Korean, or Undetermined) | |
pageCount | Returns documents containing the specified number of pages in an image | |
parentCount | Returns documents with the specified number or range of parents | |
path | Location from which the document was collected | |
prediction | Tag predictions are useful for finding documents that DISCO predicts are likely or unlikely to receive a specific tag | |
primaryLanguage | Returns documents that have the specified primary language | |
processingDetails | Returns documents with the specified processing reason | |
processingStatus | Returns documents with the specified processing outcome | |
speakerNote | Speaker notes are slide-specific notes that are hidden from the audience, but are visible to the presenter while editing | |
tagcount | Useful for figuring out whether a document has been tagged or not | |
textLength | Returns documents with the specified number or range of characters in the text of an image (including spaces) | |
title | Returns documents with the specified document title | |
type | The type of a document in DISCO is the file type (e.g., email, Word, PDF, audio, Excel, video, or text) | |
unfoldered | Folders are used for organizing documents | |
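The familyConsistentTag and familyInconsistentTag descriptions above amount to simple set logic over families. The Python sketch below illustrates that logic on in-memory data. The function names and data shapes are hypothetical, and treating a "family" as a group with more than one member is an assumption.

```python
def family_consistent(families, tags, tag):
    """Documents in families where every member carries `tag`."""
    hits = set()
    for members in families:
        # Assumption: a single document on its own is not a family.
        if len(members) > 1 and all(tag in tags.get(doc, set()) for doc in members):
            hits.update(members)
    return hits

def family_inconsistent(families, tags, tag):
    """Untagged documents that have at least one family member carrying `tag`."""
    hits = set()
    for members in families:
        tagged = {doc for doc in members if tag in tags.get(doc, set())}
        if tagged:
            hits.update(set(members) - tagged)
    return hits

# Hypothetical data: two families and one standalone document.
families = [["email1", "att1", "att2"], ["email2", "att3"], ["solo"]]
tags = {
    "email1": {"privileged"}, "att1": {"privileged"}, "att2": {"privileged"},
    "email2": {"privileged"}, "solo": {"privileged"},
}
print(sorted(family_consistent(families, tags, "privileged")))
print(sorted(family_inconsistent(families, tags, "privileged")))
```

Here the first family is fully tagged, so all three members are "consistent", while att3 is flagged as inconsistent because its parent email carries the tag and it does not.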
Numeric searches
For the following, you can search for an exact document, a range of documents, or documents greater than or less than the indicated value (for example, id(1234), id(1 to 1234), and id(<1234), respectively).
id( )
batesNumber( )
billingSize( )
childCount( )
conversationCount( )
fileLength( )
imageSize( )
instanceCount( )
pageCount( )
parentCount( )
similarCount( )
tagcount( )
textLength( )
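The three numeric forms described above can be generated with small string helpers, which may be convenient when preparing many queries outside DISCO. This is a sketch with hypothetical function names. The `>` form is assumed to mirror the `<` form shown in the example.

```python
def exact(field, value):
    """Exact value, e.g. id(1234)."""
    return f"{field}({value})"

def between(field, low, high):
    """Inclusive-looking range form, e.g. id(1 to 1234)."""
    return f"{field}({low} to {high})"

def compare(field, op, value):
    """Less-than or greater-than form, e.g. id(<1234)."""
    if op not in ("<", ">"):
        raise ValueError("op must be '<' or '>'")
    return f"{field}({op}{value})"

print(exact("id", 1234))
print(between("id", 1, 1234))
print(compare("id", "<", 1234))
```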
Tag fields
Tagging decisions can also be searched by dates applied and users applying them, using the following syntaxes:
Syntax | Description |
---|---|
tag(by reviewer@lawfirm.com) | Documents tagged by reviewer@lawfirm.com |
tag(responsive & by reviewer@lawfirm.com) | Documents tagged responsive by reviewer@lawfirm.com |
tag(non-responsive % by reviewer@csdisco.com) | Documents tagged non-responsive by someone other than reviewer@csdisco.com |
sample(10, tag(by reviewer@csdisco.com)) | Ten random documents tagged by reviewer@csdisco.com |
tag(by reviewer@csdisco.com & on 10/16/2015) | Documents tagged by reviewer@csdisco.com on 10/16/2015 |
tag(responsive & by reviewer@csdisco.com & before 10/16/2015) | Documents tagged responsive by reviewer@csdisco.com before 10/16/2015 |
removedTag(responsive) | Documents from which the responsive tag was removed (by anyone) |
removedTag(by reviewer@lawfirm.com) | Documents from which any tags were removed by reviewer@lawfirm.com |
taggroup(Issue) | Documents with at least one tag from the Issue tag group |
* % tag(!) | Documents that have zero tags associated |
Tag prediction
Tag prediction uses AI to predict a tag or tags that should be applied to a document. Tag prediction search syntax is displayed below. Search syntax can accommodate multiple tags, e.g., prediction("tag name1" & "tag name2", >20).
Syntax | Description |
---|---|
prediction("tag name", Likeliness value) | Documents at the given likeliness level for the tag: highly likely, likely, neither likely nor unlikely, unlikely, or highly unlikely to be tagged (for example, Attorney-Client) |
prediction("tag name", score range) | Documents whose likelihood score for the tag falls in the given range (for example, a score in the range of -20 to 20, or a score over 20, to be tagged Attorney-Client) |
prediction("tag name", exact score) | Documents whose likelihood score for the tag equals the given value (for example, a score of 20 to be tagged Attorney-Client) |
Date fields
Dates can be searched by exact date, a range of dates, and before or after a date (for example, createDate(5/4/09), createDate(5/4/09 to 5/10/09), and createDate(before 5/10/09), respectively).
Dates can be formatted as date certain (12/10/2015), month and year (9/2016), or year (2012).
allDates( )
conversationdate( ) [10]
createDate( )
date( ) [11]
familydate( ) [12]
lastaccesseddate( )
lastmodifieddate( )
lastprinteddate( )
loaddate( )
receiveddate( )
sendDate( )
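As a rough illustration of the date forms described above, the Python sketch below classifies a date string as a date certain, a month and year, or a year, and builds the exact, range, and before/after query forms. The function names are hypothetical, the patterns are a guess at what is accepted (including two-digit years, as in 5/4/09), and the `after` keyword is assumed to mirror `before`.

```python
import re

# Assumed patterns for the three date forms named in the text.
DATE_FORMS = [
    ("day", re.compile(r"^\d{1,2}/\d{1,2}/(\d{2}|\d{4})$")),
    ("month", re.compile(r"^\d{1,2}/\d{4}$")),
    ("year", re.compile(r"^\d{4}$")),
]

def date_granularity(text):
    """Classify a date string as 'day', 'month', or 'year'."""
    for name, pattern in DATE_FORMS:
        if pattern.match(text):
            return name
    raise ValueError(f"unrecognized date form: {text!r}")

def date_search(field, start, end=None, relation=None):
    """Build an exact, range, or before/after date query string."""
    if relation is not None:
        if relation not in ("before", "after"):
            raise ValueError("relation must be 'before' or 'after'")
        return f"{field}({relation} {start})"
    if end is not None:
        return f"{field}({start} to {end})"
    return f"{field}({start})"

print(date_granularity("9/2016"))
print(date_search("createDate", "5/4/09", "5/10/09"))
print(date_search("createDate", "5/10/09", relation="before"))
```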
Redactions
Syntax | Description | Example |
---|---|---|
hasRedaction | Searches for documents that have redactions, do not have redactions, or have redactions in a specific location. Value options are yes, no, onDocument, onMetadata, and any metadata fields. | |
hasRedactionReason | Searches for documents that have redaction reasons, do not have redaction reasons, or have redaction reasons in a specific location | |
hasRedactionWithoutReason | Searches for documents that have a redaction, but no corresponding redaction reason. Value options are yes, no, onDocument, onMetadata, and any metadata field. | |
Custom fields
DISCO supports user defined fields created in the product, and custom fields ingested from a load file. The search syntax works similarly for both types of custom fields.
Example | Description |
---|---|
Deposition(!) | Documents with any contents in the Deposition field |
Deposition(Important) | Documents with the word Important in the Deposition field |
"My notes"(!) | Documents with any contents in the My notes field |
"My notes"("Review again") | Documents with Review again in the My notes field |
There are two additional searches for user defined fields.
Example | Description |
---|---|
hasFields(true) | Documents with contents in any user defined field |
field("Red Team") | Documents with Red Team in any user defined field |
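The custom field examples above follow a pattern: names or values that contain a space are wrapped in quotation marks, and `!` stands for "any contents". The Python sketch below reproduces that pattern. The helper names are hypothetical, and quoting only when a space is present is an inference from the examples rather than a documented rule.

```python
def quote_if_needed(text):
    """Wrap text in double quotes when it contains a space (inferred rule)."""
    return f'"{text}"' if " " in text else text

def custom_field_query(field_name, value=None):
    """Build a custom field search; value=None means 'any contents' (the ! form)."""
    terms = "!" if value is None else quote_if_needed(value)
    return f"{quote_if_needed(field_name)}({terms})"

print(custom_field_query("Deposition"))
print(custom_field_query("My notes", "Review again"))
```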
For a training video about DISCO's search and review features, see DISCO 101: Search & Review.
1. A fieldless (keyword) search covers the document text, document notes, subject lines, custodians, authors, and user defined fields.
2. Word operators (and, or, and not) can be searched if placed in quotation marks, e.g., "contract and payment"
3. Because the @ symbol is not indexed, to search for an email address, the address must be contained in quotes.
4. Fuzzy searching allows for the addition, deletion, or substitution of up to two letters in a word.
5. For non-email files, objecthash is the same as sha1hash. For email files, objecthash is computed by extracting and hashing parts of an email including sent date, sender, message body, and a few more.
6. Metadata is data about a document that is not part of the document's content. Metadata can include things like the creation date, author, sending and receiving information, file name, and file path.
7. A child is typically either an attachment to an email or an embedded file in a document. A parent is the email containing the attachment or the document containing the file.
8. Conversations include the various responses, replies, and forwards of an email chain.
9. Choices include revisions, hidden sheets, very hidden sheets, comments, and notes.
10. For all documents in an email conversation, the conversation date is the sent date of the first email in the conversation. Emails have a sent date, but other document types do not.
11. date( ) refers to the sent date for emails, or the last modified date for other files.
12. For all documents in a family, the family date is the sent date of the head of the family.
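The note on fuzzy searching (the addition, deletion, or substitution of up to two letters) describes what is commonly measured as edit distance. The Python sketch below uses the standard Levenshtein distance to illustrate the idea. Whether DISCO uses exactly this measure is an assumption, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-letter additions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a letter from a
                current[j - 1] + 1,      # add a letter to a
                previous[j - 1] + cost,  # substitute (or keep) a letter
            ))
        previous = current
    return previous[-1]

def fuzzy_match(term, word, max_edits=2):
    """True if `word` is within max_edits letter changes of `term`."""
    return edit_distance(term.lower(), word.lower()) <= max_edits

print(edit_distance("contract", "contact"))
print(fuzzy_match("kitten", "sitting"))
```

In this sketch "contact" is one deletion away from "contract", so it would count as a fuzzy match, while "sitting" is three edits from "kitten" and would not.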