DISCO uses Westlaw and Lexis-style search syntax with boolean search operators. These can be used in a fieldless search or with a field such as tag(Responsive or "Further Review"). This guide explains how to search in DISCO. For a training video about DISCO's search and review features, see DISCO 101: Search & Review.
Search basics
Operators | Description | Example |
---|---|---|
&, and | Includes results with both terms | |
[space], or, \| | Includes results with either term or field | |
%, not | Excludes term or field from results | |
+family | Includes family members as search hits for the entire query or any portion of a query | Returns documents tagged Responsive and documents whose family members are tagged Responsive, but removes documents tagged Attorney-Client and family members of documents tagged Attorney-Client. |
“ ” | Exact phrase intended | Be sure to include quotes when searching for an email address. *Note that not all DISCO fields support exact-match phrase searches; this depends on the field type. |
! | Truncation search or root expander; can be used at the beginning of a term, end of a term, or both. Can also be used to return all documents. | |
* | Wildcard search for single character | |
/n, w/n | Proximity search, searching within n words, in any order | |
+n, w+n | Proximity search, searching within n words in prescribed order | |
~ | Fuzzy or approximate word search. Fuzzy searching allows for the addition, deletion, or substitution of up to two letters in a word. | |
. | Period in a name or email address | |
( ) | Grouping syntax | |
sample(n, search) | Returns n documents randomly selected from results of search. If n is less than 1, this number is treated as a percentage. | sample(0.5, contract) and sample(700, contract): return 50% and 700 of the search results for contract at random, respectively |
field(terms) | Field searching (see below for standard DISCO fields) | |
DISCO ignores most punctuation and non-alphanumeric characters when searching. Periods, colons, semicolons, and apostrophes within a word are not removed. For example, periods in a name or email address are indexed and searchable. The word operators (and, or, not) can be searched as literal text when placed inside quotation marks, as in “contract and payment”.
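The sample(n, search) rule from the operator table can be illustrated with a small Python sketch. The function name `sample_size` is hypothetical, and two behaviors are assumptions that this guide does not state: fractional results round down, and a request larger than the result set is capped at the result set size.

```python
import math


def sample_size(n: float, total_hits: int) -> int:
    """Estimate how many documents sample(n, search) draws from total_hits results.

    Assumptions (not confirmed by the DISCO guide): fractional sizes round
    down, and a request larger than the result set is capped at its size.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if n < 1:
        # A value below 1 is treated as a percentage of the search results.
        return math.floor(n * total_hits)
    # A value of 1 or more is treated as a document count.
    return min(int(n), total_hits)
```

For a search with 1,000 hits, `sample_size(0.5, 1000)` gives 500 and `sample_size(700, 1000)` gives 700, matching the two cases described for sample.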
Order of operations
DISCO performs a search using the following order of operations:
- Term modifiers: !, *, ~
- Exact phrases: “ ”
- Groupings: ( )
- Proximity: /n, +n
- Family subsearch: +family
- &, and, %, not
- [space], or
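The last two levels of this list mean that `contract or payment & invoice` is read as `contract or (payment & invoice)`. The Python sketch below makes that grouping explicit for a flat list of tokens. It is only an illustration: the function name is hypothetical, it requires `or` to be written out rather than implied by a space, and it ignores term modifiers, phrases, existing parentheses, proximity, and +family.

```python
def group_by_precedence(tokens: list[str]) -> str:
    """Add explicit parentheses to a flat query of terms and operators.

    &, and, %, and not bind before or, so the token list is first split
    on "or" (the lowest-precedence operator) and each remaining run of
    terms and and/not operators is wrapped in parentheses.
    """
    segments: list[list[str]] = []
    current: list[str] = []
    for tok in tokens:
        if tok.lower() == "or":
            segments.append(current)
            current = []
        else:
            current.append(tok)
    segments.append(current)

    parts = []
    for seg in segments:
        text = " ".join(seg)
        # Only multi-token runs need parentheses; a lone term stays bare.
        parts.append(f"({text})" if len(seg) > 1 else text)
    return " or ".join(parts)
```

When in doubt about how a query will be grouped, adding the parentheses yourself removes the ambiguity.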
Reviewing & searching document families
DISCO has two powerful features to help you navigate and search families of documents: family inclusion mode and family subsearch.
- Family mode allows reviewers to review documents in the context of their entire family by including family members in the search results list.
- Family subsearch is a search syntax capability allowing you to include family members as search hits. +family can be added to any valid query, or any portion of a query that would be a valid standalone query.
Document fields
A fieldless, or keyword, search covers the document text, document notes, subject lines, custodians, authors, and user defined fields.
User-created information
Field | Description | Example |
---|---|---|
documentNote | A notes field for each document | |
field | Custom metadata field added by admin-level users that is used to capture additional information | |
folder | Folder created in DISCO by a user | |
privilegeNote | A privilege notes field for each document | |
production | A name assigned by the user to each set of documents produced using DISCO | |
redactionReason | Optional user-created text that can describe redactions | |
referenceid | Optional identification number users can assign (contact DISCO Support to do so) | |
searchTermReport | Documents matching the search criteria within a search term report | searchTermReport("Responsive Terms"): documents matching the search scope and search terms within the Responsive Terms search term report |
stage | A set of documents created by users for linear review | |
stageBatched | Documents in stages that have been pulled into batches | |
stageComplete | Documents in stages that have been marked "reviewed" and checked-in | |
stageReviewed | Documents in stages that have been marked "reviewed" | |
tag | Fields appended to documents by users | |
tagGroup | Fields from a specific group appended to documents by users | |
Document metadata & attributes
Metadata is data about a document that is not part of the document’s content. Metadata can include things like the creation date, author, sending and receiving information, file name, and file path.
Field | Description | Example |
---|---|---|
author | Identifies the author or creator of a file | |
batesNumber | Bates numbers applied to documents, either from a DISCO production or elsewhere | |
batesPrefix | The English portion of the Bates stamp | |
bcc | A recipient that was blind carbon copied on an email | |
billingSize | Returns all documents with the indicated size in bytes relative to the review database's billing statement | |
cc | A recipient that was carbon copied on an email | |
childCount | Returns all parent documents that have the specified number of children. A child is typically an attachment to an email or an embedded file in a document. A parent is the document containing the attachment or file. | |
comment | Returns all documents with the indicated text left inline in the original file | |
company | Returns all documents created by the same company | |
conversationCount | Returns all emails belonging to conversations that contain the number or range of emails specified. Conversations include the various responses, replies, and forwards of an email chain. | |
custodian | Typically the individual or location from which a document was collected | |
dataSpace | Searches for documents in a specific data space (see this feature spotlight for more information) | |
dedupHash | The unique identifier used to deduplicate documents at the time of ingest | |
domain | An email domain (typically, the text after the @ symbol) | See more detailed feature notes with examples here. |
domainCount | Domain count is the unique count of email domains found within the from, to, cc, and bcc fields on email messages. | See more detailed feature notes with examples here. |
extension | Returns all documents with the specified file extension | |
familyConsistentTag | Returns documents where every family member has the specified tag. Limited to documents in families. | |
familyInconsistentTag | Returns documents that do not have the specified tag but have at least one family member that does. Limited to documents in families. | |
fileLength | Returns files of the specified size (in bytes) | |
filename | The filename of any document | |
folderpath | Any folder in which a document was saved (such as a computer or network drive) | |
from | Identifies the sender of an email | |
hasDetectedSlipsheet | DISCO identifies when a slipsheet is produced in conjunction with a native file for load file ingests, and will create a near-native image from the native file | |
hasDocumentNote | Returns documents that include a document note | |
hashes | Searches across deduphash, sha1hash, md5hash, and objecthash | Document with a hash starting with xyonqy |
hasHiddenType | Returns documents with hidden content detected. Hidden content can include comments, hidden sheets, notes, hidden rows, etc. | |
hasImage | Indicates if an item was processed as an image or not. An image is sometimes also referred to as a “near native” or “PDF.” | |
hasLanguage | Documents that have an identified language | |
hasNative | Indicates if the native file was not supplied. If an image is supplied, DISCO will use that to populate the native file link. If no native or image is supplied, DISCO will create a near-native image from the supplied text and use that to populate the native link. | |
hasOCR | Returns documents for which any of the extracted text was derived via OCR | |
hasPrivilege | Returns documents that include a privilege tag | |
hasPrivilegeNote | Returns documents that include a privilege note | |
hasRedaction | Searches for documents that have redactions, do not have redactions, or have redactions in a specific location | |
hasRedactionReason | Searches for documents that have, or do not have, redaction reasons | |
hasRedactionWithoutReason | Searches for documents that have a redaction, but no corresponding redaction reason | |
hasSearchText | Returns documents that include searchable text | |
hasWrongExtension | Indicates if the extension of an ingested item is inconsistent with the determined type | |
hiddenText | Hidden text is text that has been set to be hidden in the original file | |
hiddentype | Identifies whether hidden data is in a file, including comments, revisions, notes, hidden sheets, or very hidden sheets | |
id | An identification number assigned by DISCO that is unique to each document in the database | |
imageSize | Returns documents with the indicated size in bytes of the document’s image file | |
ingestSessionId | A number assigned to each group of documents ingested into DISCO | |
ingestType | Indicates which of the two types of data deliveries can be processed | |
invisibleText | Invisible text is text that is the same color as its background | |
isDetectedEmail | Files identified as emails during processing based on an examination of the file's text/OCR | |
isInclusive | Returns emails with unique content | |
isProcessedAsNative | Indicates that some aspect of processing was unsuccessful. However, DISCO will create a record in the matter with a link to the native file. | |
md5Hash | MD5 hash of the file binary; 32 characters | |
objectHash | Hash of the file document, without considering the parent document; 40 characters. For non-email files, objecthash is the same as sha1hash. For email files, objecthash is computed by extracting and hashing parts of an email including sent date, sender, message body, and a few more. | objecthash(hpsrxo!): document with an object hash starting with hpsrxo |
pageCount | Returns documents containing the specified number of pages | |
parentCount | Returns documents with the specified number or range of parents | |
participant | Searches the from, to, cc, and bcc fields on email messages | See more detailed feature notes with examples here. |
participantCount | Participant count is the unique count of email participants in the from, to, cc, and bcc fields on email messages | See more detailed feature notes with examples here. |
path | Location from which the document was collected | |
prediction | Tag predictions are useful for finding documents that DISCO predicts are likely or unlikely to receive a specific tag | |
primaryLanguage | Returns documents that have the specified primary language | |
processingDetails | Returns documents with the specified processing reason | |
processingStatus | Returns documents with the specified processing outcome | |
recipient | Searches the combined email metadata fields: to, cc, and bcc (i.e., anyone who was the recipient of an email) | See more detailed feature notes with examples here. |
recipientCount | Recipient count is the unique count of recipients in the to, cc, and bcc fields of an email | See more detailed feature notes with examples here. |
relationshipStatus | Finds documents based on their family relationships. | |
search | Searches using the criteria of a saved search | |
sha1hash | SHA-1 hash of the file binary; 40 characters | |
similarCount | Returns documents with the specified number of similar documents | |
speakerNote | Speaker notes are slide-specific notes that are hidden from the audience, but are visible to the presenter while editing | |
subject | Searches the subject (or "re:" line) of an email | |
text | The extracted text in a document image; does not search metadata | |
to | Identifies to whom an email was sent | |
tagcount | Documents with a specific count of tags | |
textLength | Returns documents with the specified number of characters in the text, including spaces | |
title | Returns documents with the specified document title | |
type | The type of a document in DISCO is the file type such as email, Word, or PDF | |
unfoldered | Documents that are, or are not, within a folder | |
Date fields
Dates can be searched by exact date, a range of dates, and before or after a date. Dates can be formatted as date certain (12/10/2015), month and year (9/2016), or year (2012). For example:
- Exact:
date(5/4/09)
- Range:
date(5/4/09 to 5/10/09)
- Before:
date(before 5/10/09)
- After:
date(after 5/10/09)
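As a rough illustration of the three accepted date formats and the before/after forms, the Python sketch below classifies a date string by its precision and assembles a date( ) query. Both function names are hypothetical helpers for building query text outside DISCO; they are not part of the product.

```python
from typing import Optional


def date_granularity(value: str) -> str:
    """Classify a date string as day (12/10/2015), month (9/2016), or year (2012)."""
    parts = value.strip().split("/")
    if len(parts) > 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unrecognized date: {value!r}")
    return {3: "day", 2: "month", 1: "year"}[len(parts)]


def date_query(value: str, relation: Optional[str] = None) -> str:
    """Build date(value), date(before value), or date(after value)."""
    date_granularity(value)  # raises ValueError on malformed input
    if relation is None:
        return f"date({value})"
    if relation not in ("before", "after"):
        raise ValueError("relation must be 'before' or 'after'")
    return f"date({relation} {value})"
```

The same pattern applies to the other date fields listed in the table that follows, by swapping the field name for date.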
Field | Description |
---|---|
conversationDate | For all documents in an email conversation, the conversation date is the send date of the first email in the conversation. |
createDate | Created date is the date the file was created. |
date | Date searches the send date of emails and the last modified date of all other document types. |
familyDate | For all documents in a family, the family date is the sent date (for emails) or last modified date (for non-emails) of the family head document. |
lastAccessedDate | The last accessed date is the date the file was last accessed. |
lastModifiedDate | The last modified date is the date changes were last made to a file. |
lastPrintedDate | The last printed date is the date the file was last printed. |
loadDate | The load date is the date documents were ingested into DISCO. |
receivedDate | The received date is the date an email was received. |
sendDate | The send date is the date an email was sent. |
Tag predictions
Tag prediction uses AI to predict a tag or tags that should be applied to a document. Tag prediction search syntax is displayed below. Search syntax can accommodate multiple tags, e.g., prediction("tag name1" & "tag name2", >20).
Field | Description |
---|---|
prediction("tag name", Likeliness value) | |
prediction("tag name", score range) | Documents that have a likelihood score in the range of -20 to 20 to be tagged Attorney-Client; documents that have a likelihood score over 20 to be tagged Attorney-Client |
prediction("tag name", exact score) | Documents that have a likelihood score of 20 to be tagged Attorney-Client |
Tag decisions
Tagging decisions can also be searched by dates applied and users applying them, using the following syntaxes:
Field | Description |
---|---|
tag(by "reviewer@lawfirm.com") | Documents tagged by reviewer@lawfirm.com |
tag(responsive & by "reviewer@lawfirm.com") | Documents tagged responsive by reviewer@lawfirm.com |
tag(non-responsive % by "reviewer@csdisco.com") | Documents tagged non-responsive by someone other than reviewer@csdisco.com |
sample(10, tag(by "reviewer@csdisco.com")) | Ten random documents tagged by reviewer@csdisco.com |
tag(by "reviewer@csdisco.com" & on 10/16/2015) | Documents tagged by reviewer@csdisco.com on 10/16/2015 |
tag(responsive & by "reviewer@csdisco.com" & before 10/16/2015) | Documents tagged responsive by reviewer@csdisco.com before 10/16/2015 |
removedTag(responsive) | Documents from which the responsive tag was removed (by anyone) |
removedTag(by "reviewer@lawfirm.com") | Documents from which any tags were removed by reviewer@lawfirm.com |
Numeric searches
The following numeric fields can search for an exact document, a range of documents, or documents greater than or less than the indicated value. For example:
id(1234)
id(1 to 1234)
id(>1234)
id(<1234)
Numeric fields
id( )
billingSize( )
childCount( )
conversationCount( )
fileLength( )
imageSize( )
pageCount( )
parentCount( )
similarCount( )
tagcount( )
textLength( )
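The four numeric forms shown above (exact, range, greater than, less than) can be modeled as a predicate over a document's numeric value. The Python sketch below is only a reading aid with a hypothetical function name. It assumes that a range such as `1 to 1234` includes both endpoints, which this guide does not state.

```python
import re
from typing import Callable


def numeric_predicate(expr: str) -> Callable[[int], bool]:
    """Turn the text inside a numeric field, e.g. id(...), into a test function."""
    expr = expr.strip()

    # Range form: "1 to 1234" (assumed inclusive at both ends).
    m = re.fullmatch(r"(\d+)\s+to\s+(\d+)", expr)
    if m:
        low, high = int(m.group(1)), int(m.group(2))
        return lambda v: low <= v <= high

    # Comparison form: ">1234" or "<1234".
    m = re.fullmatch(r"([<>])\s*(\d+)", expr)
    if m:
        bound = int(m.group(2))
        if m.group(1) == ">":
            return lambda v: v > bound
        return lambda v: v < bound

    # Exact form: "1234".
    if re.fullmatch(r"\d+", expr):
        exact = int(expr)
        return lambda v: v == exact

    raise ValueError(f"unrecognized numeric expression: {expr!r}")
```

For example, `numeric_predicate(">1234")` returns a function that is true for 1235 and false for 1234.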
Metadata redactions
Redaction fields can search for redactions on specific metadata, including:
- Any metadata:
hasRedaction("onmetadata")
- File name:
hasRedaction("filename")
- Path:
hasRedaction("path")
- Custodian:
hasRedaction("custodian")
- Subject:
hasRedaction("subject")
- From:
hasRedaction("from")
- To:
hasRedaction("to")
- Cc:
hasRedaction("cc")
- Bcc:
hasRedaction("bcc")
- Send date:
hasRedaction("senddate")
- Received date:
hasRedaction("receiveddate")
- Author:
hasRedaction("author")
- Created date:
hasRedaction("createddate")
- Modified date:
hasRedaction("lastmodifieddate")
- Printed date:
hasRedaction("lastprinteddate")
- Accessed date:
hasRedaction("lastaccesseddate")
- Company:
hasRedaction("company")
- Title:
hasRedaction("title")
Please see Finding Redacted Records for more metadata redaction search examples.
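When several metadata locations need checking at once, the query text can be generated rather than typed by hand. The Python sketch below joins hasRedaction terms with the or operator described under Search basics. The function name and the validation set are illustrative helpers, not part of DISCO.

```python
# Location values taken from the list in this section.
REDACTABLE_METADATA = {
    "onmetadata", "filename", "path", "custodian", "subject", "from", "to",
    "cc", "bcc", "senddate", "receiveddate", "author", "createddate",
    "lastmodifieddate", "lastprinteddate", "lastaccesseddate", "company",
    "title",
}


def redaction_query(*locations: str) -> str:
    """Build a query matching documents redacted in any of the given locations."""
    if not locations:
        raise ValueError("at least one location is required")
    unknown = [loc for loc in locations if loc not in REDACTABLE_METADATA]
    if unknown:
        raise ValueError(f"unknown redaction location(s): {unknown}")
    return " or ".join(f'hasRedaction("{loc}")' for loc in locations)
```

Calling `redaction_query("from", "to")` produces `hasRedaction("from") or hasRedaction("to")`, which can be pasted into the search bar.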
Custom fields
DISCO supports user defined fields created in the product, and custom fields ingested from a load file. The search syntax works similarly for both types of custom fields.
Example | Description |
---|---|
Deposition(!) | Documents with any contents in the Deposition field |
Deposition(Important) | Documents with the word Important in the Deposition field |
"My notes"(!) | Documents with any contents in the My notes field |
"My notes"("Review again") | Documents with Review again in the My notes field |
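The pattern in the table above is that a field name or value containing a space is wrapped in quotation marks, and `!` stands for any contents. A minimal Python sketch of that pattern follows; the helper names are hypothetical, and quoting only on spaces is an assumption drawn from the examples rather than a documented rule.

```python
from typing import Optional


def quote_if_needed(text: str) -> str:
    """Wrap text in double quotes when it contains a space."""
    return f'"{text}"' if " " in text else text


def custom_field_query(field_name: str, value: Optional[str] = None) -> str:
    """Build a custom field search; with no value, match any contents via !."""
    name = quote_if_needed(field_name)
    if value is None:
        return f"{name}(!)"
    return f"{name}({quote_if_needed(value)})"
```

With these helpers, `custom_field_query("My notes", "Review again")` reproduces the last row of the table.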
There are two additional searches for user defined fields.
Example | Description |
---|---|
hasFields(true) | Documents with contents in any user defined field |
field("Red Team") | Documents with Red Team in any user defined field |