Initial Run Through
With DISCO sampling, you can search for a randomized subset of documents within your entire set or any subset to get an idea of what they contain. Here’s how it works:
From the DISCO search bar, enter one of the following queries:
Sampling Search Syntax
- sample(0.1, "any query") - gives you 10% of the search results set
- sample(500, "any query") - gives you a maximum of 500 documents from the search results set
- sample(10%, "any query") - gives you 10% of the search results set
- sample(10, tag(by email@example.com)) - gives you ten random documents tagged by email@example.com
Within the parentheses, enter any valid search query. For example, to sample 10% of your entire database, enter sample(0.1,"!"); the exclamation point is a wildcard that means “search all documents”. To get a random 10% of all documents containing the word “flowers” instead, change the search to sample(0.1,flowers). To get a random 10% sample of documents that contain the keywords budget, fraud, or inquiry, belong to the custodian Jeff Skilling, and fall within the date range 1/1/2000 - 1/1/2002, use sample(10%,("budget" "fraud" "inquiry") & custodian("Jeff Skilling") & date(1/1/2000 to 1/1/2002)).
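If you put sample queries together in a script before pasting them into the search bar, the pattern above can be sketched in Python. This is only an illustration: build_sample_query is a hypothetical helper, not part of DISCO, and it does nothing more than format text in the shapes listed under Sampling Search Syntax.

```python
def build_sample_query(size, query):
    """Format a sample(...) search string (hypothetical helper, not a DISCO API).

    size:  a float between 0 and 1 (fraction), a positive int (document
           count), or a string such as "10%" (percentage).
    query: the inner search query, passed through unchanged.
    """
    if isinstance(size, bool):
        raise TypeError("size must be a number or a percentage string")
    if isinstance(size, str):
        size_text = size.strip()
        # Accept only whole-number percentages such as "10%".
        if not (size_text.endswith("%") and size_text[:-1].isdigit()):
            raise ValueError("percentage must look like '10%'")
    elif isinstance(size, int):
        if size <= 0:
            raise ValueError("document count must be positive")
        size_text = str(size)
    elif isinstance(size, float):
        if not 0 < size < 1:
            raise ValueError("fraction must be between 0 and 1")
        size_text = str(size)
    else:
        raise TypeError("size must be a number or a percentage string")
    return f"sample({size_text}, {query})"


# Example: rebuild the queries used in this article.
all_docs = build_sample_query(0.1, '"!"')
flowers = build_sample_query("10%", "flowers")
tagged = build_sample_query(10, "tag(by email@example.com)")
```

The helper only validates the size argument; whether the inner query is valid is still decided by DISCO when you run the search.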
Some important items to note:
- If you rerun the search or refresh the page, you get a new random subset of documents.
- If you request more documents than the search returns, you get all of the search results.
- If you use a sample query, sorting of the search results is disabled.
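The first two notes above can be mimicked with a short Python sketch. It is not DISCO's implementation: resolve_size and draw_sample are hypothetical names, and rounding a fractional sample to the nearest whole document is an assumption, since the rounding behaviour is not described here.

```python
import random


def resolve_size(size, total_results):
    """Turn a requested sample size into a document count."""
    if isinstance(size, float):
        # A float is treated as a fraction of the result set (0.1 means 10%).
        # Assumption: round to the nearest whole document.
        wanted = round(size * total_results)
    else:
        wanted = size
    # Asking for more than the search returned yields every result.
    return min(wanted, total_results)


def draw_sample(doc_ids, size, rng=None):
    """Return a random subset of doc_ids, drawn without replacement."""
    if rng is None:
        # A fresh generator per call, so repeated calls give new subsets,
        # much like rerunning the search or refreshing the page.
        rng = random.Random()
    return rng.sample(doc_ids, resolve_size(size, len(doc_ids)))


# Example with 200 made-up document IDs.
results = [f"DOC-{n:04d}" for n in range(1, 201)]
ten_percent = draw_sample(results, 0.1)   # 20 documents
capped = draw_sample(results, 500)        # all 200 documents, shuffled
```

Running draw_sample twice with the same arguments will usually return different documents, which is why a sample you want to keep should be tagged or saved before you refresh.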