With DISCO sampling, you can pull a randomized subset of documents from your entire database, or from the results of any search, to get an idea of what they contain. Here’s how it works:
In the DISCO search bar, enter one of the following new queries:
Sampling search syntax
- sample(0.1, "any query") – Returns a random 10% of the search result set
- sample(500, "any query") – Returns a maximum of 500 random documents from the search result set
- sample(10%, "any query") – Returns a random 10% of the search result set
- sample(10, tag(by price@csdisco.com)) – Returns ten random documents tagged by price@csdisco.com
Within the parentheses, enter a valid search query. For example, to sample 10% of your entire database, enter the search sample(0.1,"!"), where the exclamation point is a wildcard that instructs DISCO to search all documents. To get a randomized 10% of all documents containing the word flowers, change the search to sample(0.1,flowers). To get a random 10% sample of documents that contain the keywords budget, fraud, or inquiry, have the custodian Jeff Skilling, and fall within the date range 1/1/2000 - 1/1/2002, use sample(10%,("budget" "fraud" "inquiry") & custodian("Jeff Skilling") & date(1/1/2000 to 1/1/2002)).
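If you assemble these searches in a script before pasting them into the search bar, a small helper can keep the syntax consistent. The sketch below is illustrative only: build_sample_query is a hypothetical name, not part of DISCO, and it does nothing more than format the text of the query shown above.

```python
def build_sample_query(size, query):
    """Format a sample(...) search string.

    Hypothetical helper, not a DISCO API. `size` is written out as given:
    a fraction such as 0.1, a document count such as 500, or a percentage
    string such as "10%". `query` is any valid search query, already
    quoted the way you would type it in the search bar.
    """
    return f"sample({size}, {query})"


# Sample 10% of the entire database ("!" is the all-documents wildcard).
print(build_sample_query(0.1, '"!"'))

# Sample at most 500 documents containing the word flowers.
print(build_sample_query(500, "flowers"))

# Sample 10% of a compound search.
print(build_sample_query(
    "10%",
    '("budget" "fraud" "inquiry") & custodian("Jeff Skilling") '
    "& date(1/1/2000 to 1/1/2002)",
))
```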
Some important items to note:
- If you rerun the search or refresh the page, you get a new random subset of documents.
- If you request more documents than the search returns, you get all of the search results.
- If you use the sample query, sort order for search results is disabled.
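To make the sizing rules concrete, here is a minimal Python sketch that mimics the behavior described above on an ordinary list. It is not DISCO's implementation. The function name sample_results, the rounding of fractional sizes, and the treatment of any number of 1 or more as a document count are all assumptions made for illustration.

```python
import random


def sample_results(size, results, rng=None):
    """Mimic the documented sample() sizing rules on a list (illustrative only).

    size: a fraction between 0 and 1 (0.1), a percentage string ("10%"),
          or a maximum document count (500).
    """
    rng = rng or random.Random()
    total = len(results)

    if isinstance(size, str) and size.endswith("%"):
        # "10%" means 10% of the result set. Rounding is an assumption.
        k = round(total * float(size[:-1]) / 100)
    elif isinstance(size, float) and 0 < size < 1:
        # 0.1 also means 10% of the result set.
        k = round(total * size)
    else:
        # Anything else is treated as a maximum document count (assumption).
        k = int(size)

    # Asking for more than the search returned yields every result.
    k = max(0, min(k, total))

    # A fresh random draw each call, like rerunning the search.
    return rng.sample(results, k)


documents = list(range(1000))  # stand-in for a search result set
print(len(sample_results(0.1, documents)))
print(len(sample_results("10%", documents)))
print(len(sample_results(500, documents)))
print(len(sample_results(5000, documents)))
```

Because each call draws a new random subset, two runs with the same arguments will usually return different documents, which matches the refresh behavior noted above.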