Viewing and understanding Auto Review results and metrics – DISCO

Accessing Auto Review results

DISCO stores all tag suggestions (including historical runs) and the most recent reasons for each suggestion. DISCO does not maintain suggestions or reasons for deleted tags.

You can access tag suggestions and their reasons from the document list, the document viewer, filters, and search. To avoid confusion during review, historical suggestions are accessible only via search syntax, not through the viewer or document list.

From the document list

Ensure that the columns with Auto Review results are visible in your configurable doc list viewer.

Auto Review's most recent suggestions can be added to your custom document list view by selecting the fields Suggested as likely and Suggested as unlikely under Work Product/Tags (below). (See also: Creating custom views.)

Once you’ve added the columns to your view, Auto Review results will appear in the columns and bear the green or gray star icons, with suggestions accessible by hovering over the tag. If a human reviewer selects a tag, it will have a filled background to help visually track disagreements with Auto Review.

You can also export these fields to an XLSX file via DISCO's document list export feature. The resulting export will have a column containing the tag name, the suggestion (either yes or no), and the suggestion reason, separated by semicolons. If multiple tags have been Auto Reviewed for a document, double semicolons will separate each tag grouping.
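If you post-process the export outside DISCO, the cell format described above can be split back into its parts. The following is a minimal sketch, not a DISCO-provided utility: the function name, the `Suggestion` type, and the sample cell are hypothetical, and it assumes that tag names and suggestion values contain no semicolons and that reasons contain no double semicolons.

```python
from typing import NamedTuple


class Suggestion(NamedTuple):
    tag: str
    suggested: bool
    reason: str


def parse_auto_review_cell(cell: str) -> list[Suggestion]:
    """Split one exported cell into (tag, suggestion, reason) entries.

    Assumed layout, taken from the description above:
    tag;yes-or-no;reason, with ';;' between tag groupings.
    """
    suggestions = []
    for group in cell.split(";;"):
        group = group.strip()
        if not group:
            continue  # ignore empty cells or stray separators
        # Split on the first two semicolons only, so that a single
        # semicolon inside the reason text is left intact.
        parts = [part.strip() for part in group.split(";", 2)]
        if len(parts) != 3:
            raise ValueError(f"unexpected tag grouping: {group!r}")
        tag, decision, reason = parts
        suggestions.append(Suggestion(tag, decision.lower() == "yes", reason))
    return suggestions


# Hypothetical cell contents, for illustration only.
sample = "Hot;yes;Discusses the transaction in detail;;LJM;no;LJM is not mentioned"
for entry in parse_auto_review_cell(sample):
    print(entry.tag, entry.suggested, entry.reason)
```

Check a few real exported cells before relying on this, since the exact spacing and capitalization in the export may differ from the assumptions here.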

From the document viewer

You can see Auto Review's most recent suggestions and the reasons for these suggestions by looking for the star icon while reviewing a document. If the icon is green, Auto Review has suggested the tag; if it is gray, Auto Review did not suggest the tag; and if it is not present, Auto Review did not evaluate the tag. You can view the suggestion reason by hovering over the icon.

Example: For the document below, Auto Review's suggestions are as follows:

Hot: There is no Auto Review job with the tag Hot for this document (no Auto Review suggestion / no icon).
Arthur Anderson: Auto Review suggests that this tag does not apply to this document (negative Auto Review suggestion / gray star).
Special Purpose Vehicles: Auto Review suggests that this tag does apply to this document (positive Auto Review suggestion / green star).

Review stage batches

In review stage batches, Auto Review results operate similarly to tag predictions, showing both suggested and not suggested tags. In the example below, Raptor was positively suggested (green icon), while California Energy Market and LJM were reviewed but not suggested (gray icons). The user is hovering over the green star icon next to Raptor to see Auto Review's tag explanation.

Ad hoc review

When you use the review decisions ("ad hoc review") panel, its layout differs slightly depending on whether a human reviewer applied a given tag. Tags applied by a human, including those that were also evaluated by Auto Review, appear in the Tags box. Tags that Auto Review suggested but that a human did not apply appear in the Auto Review Suggestions box. To reduce clutter, Auto Review Suggestions displays only positively suggested tags; tags that were evaluated but not suggested can be viewed by clicking into the Tags box.

Via filters

You can access Auto Review results across all jobs by using the Filters pane. Results are organized by tag group. When you select multiple checkboxes within the "Auto Review Tag Suggestions" filter group, the selections are connected with "OR". For example, in the screenshot below, the search will show 19 documents: any document where Auto Review considers the Hot tag to be Likely OR any document with an Auto Review suggestion (positive or negative) for any tag in the Privilege group.

Via searching

Finally, you can find all tag suggestions via searching, either by using the links from Auto Review jobs or by leveraging search syntax.

Auto Review job links

To use the links, go to Menu > Workflow > Auto Review to open the Auto Review section. Click Open at the top right of any Auto Review job to see the results from that job.

This will open the Metrics tab for that particular job. For every tag, click the numbers under Suggested likely or Disagreements to open the documents where Auto Review suggestions are present for that tag. You can also click the numbers under the Decision comparison column (see below for more information).

In the screenshot below, Auto Review suggests that the tag Arthur Anderson should apply to 3 documents that users have also tagged with Arthur Anderson. There are 5 documents where Auto Review thinks Arthur Anderson should apply, but users either disagree or have not reviewed the documents, because the Arthur Anderson tag is not applied.

Search syntax

To enter the syntax directly, type aitagdecision("Tag Name", Y, "job identifier") into the search bar, where "Tag Name" is the name of the tag in quotes, Y or N indicates whether the tag was suggested, and "job identifier" is the unique identifier that Auto Review creates for the job (not the user-assigned job name).

For example, if you are looking for documents that Auto Review suggested for the Arthur Anderson tag in the job with the identifier 5e817751-ff83-4c51-83a2-ecbabebe623a, you would enter the syntax:

aitagdecision("Arthur Anderson", Y, 5e817751-ff83-4c51-83a2-ecbabebe623a)
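If you need to assemble many of these searches, for example one per tag in a job, the string can be built programmatically. The helper below is a hypothetical sketch, not part of DISCO. It mirrors the format of the example above, with the tag name in double quotes and the job identifier written without quotes, and it does not attempt to handle tag names that themselves contain double quotes.

```python
def build_aitagdecision_query(tag_name: str, suggested: bool, job_id: str) -> str:
    """Return an aitagdecision(...) search string in the format shown above."""
    if '"' in tag_name:
        # Quoting rules for such names are not covered here, so refuse
        # rather than guess.
        raise ValueError("tag names containing double quotes are not handled")
    decision = "Y" if suggested else "N"
    return f'aitagdecision("{tag_name}", {decision}, {job_id})'


query = build_aitagdecision_query(
    "Arthur Anderson", True, "5e817751-ff83-4c51-83a2-ecbabebe623a"
)
print(query)
# prints: aitagdecision("Arthur Anderson", Y, 5e817751-ff83-4c51-83a2-ecbabebe623a)
```

Passing False for the second argument produces the N form, which finds documents that Auto Review evaluated but did not suggest for the tag.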

Accessing Auto Review metrics

Open an Auto Review job to see the metrics and instructions (case background and tag descriptions) for that job.

Metrics

The Metrics tab allows you to access overall review metrics and compare the suggestions provided by Auto Review to tags that have been applied by human reviewers.

Job details

Each job includes the number of documents submitted, start and end times, the person who ran the job, and the total runtime. Click Job details on the top right to access this information.

Documents overview

The Documents overview section contains clickable tallies that open a search for the documents that were successfully reviewed, that failed, and that were skipped.

We recommend manual review of documents that failed or were skipped. Failed documents could not be processed by Auto Review. Skipped documents contain no text characters and therefore could not be reviewed.

Overall metrics

The Overall metrics section provides the metrics for all of the tags that were Auto Reviewed in that job. Values that are averages are calculated by adding the value from each tag and dividing by the number of tags. Detailed definitions of each metric are below; you can also get information about each metric by hovering over the Info icon next to it.

Agreement rate – This is the percentage of Auto Review’s decisions that agree with your team's decisions. This is an average value from all of the tags reviewed in this job.
Prevalence (user) – The percentage of documents for which a human reviewer believes at least one tag should apply. This is NOT an average value.
Prevalence (Auto Review) – The percentage of documents for which Auto Review believes at least one tag should apply. This is NOT an average value.
Precision – This is the rate at which Auto Review was correct when it positively suggested a tag. A suggestion counts as correct when a human reviewer has applied the suggested tag. This is an average value from all of the tags reviewed in this job.
Recall – This is the rate at which Auto Review suggested a tag that a human reviewer has applied to a document. This is an average value from all of the tags reviewed in this job.
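The difference between the averaged values and the two prevalence values can be illustrated with a short sketch. This is an illustration of the definitions above, not DISCO's implementation: the function names and the sample numbers are hypothetical.

```python
def average_across_tags(per_tag_values: list[float]) -> float:
    """Average a per-tag metric: add the value from each tag, divide by the tag count."""
    if not per_tag_values:
        raise ValueError("at least one tag is required")
    return sum(per_tag_values) / len(per_tag_values)


def overall_prevalence(tags_per_document: list[set[str]]) -> float:
    """Percentage of documents that have at least one tag (not an average)."""
    if not tags_per_document:
        raise ValueError("at least one document is required")
    with_any_tag = sum(1 for tags in tags_per_document if tags)
    return 100.0 * with_any_tag / len(tags_per_document)


# Hypothetical per-tag precision values for a job with three tags.
print(average_across_tags([0.90, 0.60, 0.75]))

# Hypothetical tagging for four documents: two have at least one tag.
documents = [{"Hot"}, set(), {"Hot", "LJM"}, set()]
print(overall_prevalence(documents))
```

Because the averaged values weight every tag equally, a tag evaluated on only a few documents influences the overall figure as much as a tag evaluated on thousands.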

Tag metrics

This section shows the results of Auto Review for each tag in a job. Note that these metrics assume that the documents in this job have also been accurately reviewed and tagged manually for the purposes of comparison. For example, if the tag California Energy Market is not applied to a document, the metrics treat this as a human reviewer having reviewed the document and determined that the tag should not apply.

For each tag, this section provides an error matrix and five percentages to help you understand and defend the effectiveness of your review.

Under Decision comparison, the error matrix compares Auto Review's suggestions with the currently applied tagging. All values in the red and green boxes are links that let you review the agreements and disagreements between Auto Review and human reviewers. Reviewing disagreements is particularly helpful for refining your tag descriptions or adjusting your review process.

TP: true positives — documents where Auto Review suggested a tag for the document and the tag has been applied to the document
FP: false positives — documents where Auto Review suggested that a tag applies to the document but the tag has not been applied to the document
TN: true negatives — documents where Auto Review suggested a tag does not apply to the document and the tag has not been applied to the document
FN: false negatives — documents where Auto Review suggested a tag does not apply to the document but the tag has been applied to the document
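The four categories above can be tallied from two parallel lists, one recording whether Auto Review suggested the tag for each document and one recording whether the tag is currently applied. This is a minimal sketch of the definitions, with a hypothetical function name and made-up inputs; it is not how DISCO computes the matrix internally.

```python
def decision_comparison(suggested: list[bool], applied: list[bool]) -> dict[str, int]:
    """Count TP, FP, TN, and FN for one tag across a set of documents."""
    if len(suggested) != len(applied):
        raise ValueError("both lists must cover the same documents")
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for was_suggested, is_applied in zip(suggested, applied):
        if was_suggested and is_applied:
            counts["TP"] += 1  # suggested and applied
        elif was_suggested:
            counts["FP"] += 1  # suggested but not applied
        elif is_applied:
            counts["FN"] += 1  # not suggested but applied
        else:
            counts["TN"] += 1  # not suggested and not applied
    return counts


# Hypothetical decisions for five documents.
auto_review = [True, True, False, False, True]
human = [True, False, False, True, True]
print(decision_comparison(auto_review, human))
```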

This section also provides five percentages that are standard metrics for evaluating a review's accuracy, as detailed further below.

Prevalence (user) – This is the number of documents to which a human reviewer applied the tag, divided by the total number of documents and expressed as a percentage.
Prevalence (Auto Review) – This is the number of documents for which Auto Review suggested the tag, divided by the total number of documents and expressed as a percentage.
Precision – This measures how often Auto Review was correct for the documents in which it positively suggested the tag. This is calculated by dividing the agreement value in the top left (the true positives) by the sum of the top row (the true positives and the false positives).
Recall – This measures how often Auto Review suggested the tag within the set of documents that have the tag. This is calculated by dividing the true positives in the top left by the sum of the left column (the true positives and the false negatives).
F-1 score – This is the harmonic mean of the precision and the recall. This is calculated by adding the reciprocals of the precision and the recall, dividing by 2, and then taking the reciprocal.
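The five percentages can be reproduced from the four values in the error matrix. The sketch below follows the definitions above and returns fractions (multiply by 100 for percentages). The function name and the sample counts are hypothetical, and returning 0.0 when a denominator is zero is an assumption of this sketch, since how DISCO displays undefined values is not described here.

```python
def tag_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute per-tag metrics from error-matrix counts, as fractions in [0, 1]."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("the error matrix is empty")

    # Assumption: report 0.0 when there is nothing to divide by.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    if precision > 0 and recall > 0:
        # Harmonic mean: add the reciprocals, divide by 2, take the reciprocal.
        f1 = 1 / ((1 / precision + 1 / recall) / 2)
    else:
        f1 = 0.0

    return {
        "prevalence_user": (tp + fn) / total,         # tag applied by a human
        "prevalence_auto_review": (tp + fp) / total,  # tag suggested by Auto Review
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts: 3 true positives, 5 false positives,
# 11 true negatives, 1 false negative.
for name, value in tag_metrics(tp=3, fp=5, tn=11, fn=1).items():
    print(f"{name}: {value:.1%}")
```

The harmonic mean penalizes imbalance: a tag with high recall but low precision (or the reverse) gets an F-1 score closer to the lower of the two values than a simple average would give.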

Instructions

The Instructions tab provides a record of the specific case background and tag descriptions that were used for that job. This allows you to compare the instructions used in your iteration process.