Similar documents (also known as near duplicates) are documents that share nearly all of the same text. This feature is not enabled by default in DISCO, but can be turned on for a database by contacting DISCO customer support.
Documents that are at least 95% similar to another document show the similar documents icon in the search results, along with the number of similar documents found.
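To make the 95% threshold concrete, here is a minimal sketch of how a similarity cutoff can be applied to two pieces of text. It is an illustration only: it uses Python's standard `difflib.SequenceMatcher` as a stand-in, and the names `similarity` and `is_near_duplicate` are hypothetical. It does not describe how DISCO computes similarity.

```python
from difflib import SequenceMatcher

# The 95% cutoff described above, expressed as a ratio between 0 and 1.
SIMILARITY_THRESHOLD = 0.95


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio for two texts.

    Assumption: difflib's ratio is used purely for illustration;
    it is not DISCO's similarity measure.
    """
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def is_near_duplicate(a: str, b: str, threshold: float = SIMILARITY_THRESHOLD) -> bool:
    """Return True when the two texts meet or exceed the threshold."""
    return similarity(a, b) >= threshold


if __name__ == "__main__":
    original = "This agreement is made between the buyer and the seller on the first day of March."
    revised = original.replace("March", "Marsh")  # a one-character difference
    print(is_near_duplicate(original, revised))
    print(is_near_duplicate(original, "An unrelated memo about office parking."))
```

Two documents that differ by only a character or two score very close to 1.0 and pass the cutoff, while unrelated text scores far below it.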
When viewing a document with near duplicates, click the Similar button on the left sidebar to see a list of all similar documents. The following information is displayed in the sidebar: DISCO ID, filename, and an excerpt of the first few words of the parent document. Click any of the listed documents to navigate to that document for reviewing, tagging, etc. Click the Back button in the upper left corner of the document viewer to return to the original document.
To search for similar documents, use the following syntax:
- similarcount(__ to __): replace the blanks with the lowest and highest number of similar documents to search for.
- * % similarcount(0): this would return any document that has at least one similar document.
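As an illustration of filling in the blanks in the syntax above, the following sketch builds the query strings as plain text that can be pasted into the search bar. The helper `similarcount_range` and the constant `ANY_SIMILAR` are hypothetical names invented for this example, not part of DISCO, and the exact spacing DISCO accepts inside the parentheses is an assumption.

```python
# Query from the list above: every document except those with zero similar documents.
ANY_SIMILAR = "* % similarcount(0)"


def similarcount_range(low: int, high: int) -> str:
    """Build a similarcount range query string (hypothetical helper).

    Assumption: both bounds are non-negative whole numbers and
    low is not greater than high.
    """
    if low < 0 or high < low:
        raise ValueError("expected 0 <= low <= high")
    return f"similarcount({low} to {high})"


if __name__ == "__main__":
    print(similarcount_range(1, 5))
    print(ANY_SIMILAR)
```

For example, `similarcount_range(1, 5)` produces the text `similarcount(1 to 5)`.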