Overview
DISCO uses AI to automatically create an index of the topical content within documents, cluster similar topics together, and organize topic clusters into an easily navigable list. The list of topics is a table of contents for your document universe that can help you quickly understand your data. You can browse broad topics or explore more specific topics to quickly find individual documents. Each topic cluster is represented as a row within the topic list.
Topic clusters can accelerate document review and help you find relevant documents:
- During early case assessment, you can explore the contents and characteristics of topic clusters to identify key custodians and date ranges to inform your review strategy.
- When organizing a document review, you can identify relevant or irrelevant topic clusters to prioritize or exclude from your review workflow.
- When looking for evidence, you can explore interesting topic clusters to quickly find key documents.
Identifying topics in documents
DISCO automatically identifies topic clusters as documents are ingested - no setup or administration is needed. A minimum of 20,000 eligible review documents is required to start topic clustering. Some documents may not be eligible for topic clustering, such as an image file or a spreadsheet with only numeric data.
Within a document, topics are determined at the sentence level rather than the document level. This approach significantly improves the quality of topic clusters because a single document often includes multiple topics. It’s common for an individual document to appear in multiple topic clusters. Tagging decisions, which inform AI tag predictions, do not affect topic clusters.
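The effect of sentence-level assignment can be illustrated with a small sketch. This is not how DISCO identifies topics: the keyword table, the naive sentence splitter, and every function name below are hypothetical stand-ins chosen only to show why one document can land in several topic clusters.

```python
import re
from collections import defaultdict

# Hypothetical keyword-to-topic table standing in for a real topic model.
TOPIC_KEYWORDS = {
    "pricing": "Pricing",
    "invoice": "Billing",
    "merger": "Mergers",
}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def topics_for_sentence(sentence):
    """Return the topics whose keyword appears in the sentence."""
    lowered = sentence.lower()
    return {topic for keyword, topic in TOPIC_KEYWORDS.items() if keyword in lowered}

def cluster_documents(documents):
    """Map each topic to the set of document IDs with at least one matching sentence."""
    clusters = defaultdict(set)
    for doc_id, text in documents.items():
        for sentence in split_sentences(text):
            for topic in topics_for_sentence(sentence):
                clusters[topic].add(doc_id)
    return dict(clusters)

documents = {
    "DOC-1": "We should revisit pricing next quarter. The merger closes in May.",
    "DOC-2": "Please resend the invoice.",
}
clusters = cluster_documents(documents)
```

Here DOC-1 contains two sentences on different subjects, so it appears under both the Pricing and Mergers clusters, while DOC-2 appears only under Billing.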
Searching with topic clusters
Topics are available in search visualization and search filters. You can easily combine topics with any other search criteria - including date ranges, document text, and any other searchable document attribute. Searching a topic cluster returns the documents that represent that topical content.
Topics start broad and become more specific deeper in the topic list. Broad topic clusters include the more specific topic clusters beneath them - you can expand a topic cluster to see the specific topic clusters it contains.
The document count within each histogram bar indicates the number of review documents assigned to that topic cluster. The percentage value quantifies a topic cluster's document count relative to the number of documents returned in the search results.
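The percentage described above is a simple ratio. The sketch below shows the arithmetic only; the function name and the sample counts are made up for illustration and are not taken from DISCO.

```python
def topic_share(topic_doc_count, search_result_count):
    """Return a topic cluster's share of the search results as a percentage."""
    if search_result_count <= 0:
        # No results to compare against, so report zero rather than divide by zero.
        return 0.0
    return 100.0 * topic_doc_count / search_result_count

# Example: 1,250 documents in the topic cluster out of 5,000 search results.
share = topic_share(1_250, 5_000)
print(f"{share:.1f}%")  # prints "25.0%"
```

Because a single document can appear in multiple topic clusters, the percentages of different clusters would not necessarily add up to 100.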
Navigating topic clusters
In addition to browsing the topic list, you can enter a keyword or phrase to find all matching topic clusters. For example, entering the word market limits the topic clusters to those that include that word, and DISCO maintains the hierarchy of topics so you can continue browsing.
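One way to picture a keyword filter that keeps the hierarchy intact is a recursive prune of a topic tree. The sketch below is an assumption-laden illustration, not DISCO's behavior: the node shape (a dict with "name" and "children"), the sample topic names, and the rule that only matching topics and their ancestors are kept are all hypothetical.

```python
def filter_topics(node, keyword):
    """Return a pruned copy of the topic tree keeping matches and their ancestors.

    A node is a dict with a "name" string and a "children" list (hypothetical shape).
    Returns None when neither the node nor any descendant matches the keyword.
    """
    kept_children = [
        pruned
        for child in node.get("children", [])
        if (pruned := filter_topics(child, keyword)) is not None
    ]
    if keyword.lower() in node["name"].lower() or kept_children:
        return {"name": node["name"], "children": kept_children}
    return None

# Made-up topic hierarchy for illustration.
topics = {
    "name": "Finance",
    "children": [
        {
            "name": "Market analysis",
            "children": [
                {"name": "Emerging market risk", "children": []},
                {"name": "Quarterly forecasts", "children": []},
            ],
        },
        {"name": "Payroll", "children": []},
    ],
}

result = filter_topics(topics, "market")
```

In this sketch, "Finance" is kept even though it does not contain the keyword, because it is the parent of a matching topic; "Payroll" and "Quarterly forecasts" are dropped.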
Maintaining topic clusters
The document corpus may change as documents are ingested or removed. Newly ingested documents are automatically assessed against the existing topic clusters and may be assigned to those topic clusters. You can filter for documents added through a specific ingest if you want to quickly understand the topics of those documents.
The document corpus may change significantly over time and require a new topic list. DISCO AI continually assesses the corpus to identify when a new topic list should be created. While a new topic list is being generated, the current topic list remains available with a notification. If needed, you can always use folders to preserve specific sets of documents. When the new topic list is available, it becomes active and the previous topic list is discarded.