Searchable Audio and Video Transcriptions – DISCO

Overview

Unlock the content of your audio and video files with our new automated transcription feature. Upon ingest, spoken words are converted into searchable text, allowing you to find key information within AV files just as easily as with text documents—no extra steps needed. This feature integrates seamlessly into your review with keyword highlighting and the option to include transcripts in productions, helping you understand multimedia evidence faster and more efficiently. Furthermore, once transcribed, this text becomes fully queryable by Cecilia Q&A and is leveraged by our other AI-driven tools, including topic clustering, similar document detection, and predictive tagging, enriching your ability to uncover insights across all your data.

This feature is in Walk.

How to use searchable transcriptions

If you have permission to ingest files, add audio and video files to your review database.
Once your ingest has completed, your audio and video files are automatically transcribed into searchable text.
- Transcription happens after ingest, so you'll need to wait until it has finished.
While a transcription is not yet ready, the Excerpt column for that file is blank.
Once the transcription is ready for searching, the Excerpt column is populated with text.
For example, if you search for the word 'deposition', DISCO returns the videos whose transcripts contain that word. If you open one up:
- Notice the transcription pane on the right. To see your search terms, click the text tab, where the search word 'deposition' is highlighted in the raw text version of the transcript.
In the regular view you can see the transcription pane:
- You'll notice that the transcript automatically scrolls to keep pace with the video.
You can click the timestamp associated with each speaker ID to jump the video or audio file to that portion of the media. When you do so, automatic syncing of the transcript to the video is paused.
To re-activate it, use the toggle at the top of the transcription pane; the transcript will then resume syncing with playback.
Otherwise, searching AV files works exactly as it does for text-based documents in the Ediscovery platform.
This also means that all AI features now work with audio and video files, including AI Predictive Tagging, Topic Clustering, Similar Documents, and Cecilia Q&A. Let's take a look at how searchable transcriptions integrate with our generative AI in particular.
Open up a video in the document viewer on a database that has Cecilia skills enabled:
Change to the Cecilia summary tab and click 'Summarize this document'.
- The AI-generated summary of the video transcript is now available.
Now change to the Cecilia single doc Q&A tab and ask a question:
- You can use Cecilia to ask questions about videos to help streamline your early case assessment of such media.
You can also ask Cecilia Q&A from Search & Review a question about the contents of video and audio files as shown below:
- In this case we asked "Are Beagles good dogs?" and Cecilia has answered the question affirmatively and provided source citations which we'll open and probe further:
Searching through videos and using the power of DISCO's generative AI Cecilia skills can be a transformative way to rapidly perform data analytics in early case assessment.
You can also translate AV transcriptions into any language:
Now switch to the translations area and translate it!

Searchable Audio and Video Transcriptions FAQ

Q: How are transcriptions generated?
A: They are generated by AI.

Q: What is the additional cost for this?
A: There is no additional cost; see the limits below.

Q: What limits are there for a review database with transcriptions?
A: Every review database can transcribe up to 250,000 minutes cumulatively. As you approach this limit, someone from DISCO will reach out about next steps. We strive to avoid charging customers for overages here.
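
To gauge how close a review database is to this cumulative limit, a rough tally like the one below may help. It is a minimal sketch, not a DISCO API: the function name and the list of per-file durations are hypothetical, and only the 250,000-minute figure comes from the answer above.

```python
# Hypothetical helper; only the 250,000-minute cap comes from this article.
DATABASE_LIMIT_MINUTES = 250_000


def remaining_transcription_minutes(durations_minutes):
    """Return the minutes left under the cumulative per-database cap.

    `durations_minutes` is an assumed list of media durations (in minutes)
    for every AV file already ingested or about to be ingested.
    A negative result means the total would exceed the cap.
    """
    used = sum(durations_minutes)
    return DATABASE_LIMIT_MINUTES - used


# Example: 1,000 recordings of 90 minutes each.
print(remaining_transcription_minutes([90] * 1000))  # 160000
```

Durations would need to come from your own media inventory; this article does not describe a way to export them from DISCO.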

Q: How do I know if my audio/video file has been transcribed or is still processing?
A: You'll see status indicators for transcription progress. In Search & Review, the 'Excerpt' column will initially be blank for an AV file that is still transcribing; this column will then populate with text once the transcription is complete and ready for searching. Within the document viewer, if you open an AV file while it's processing, the system will also indicate that transcription is in progress.

Q: What events automatically generate an AV transcription on a file?
A: A transcription is generated when an AV file is ingested, or when an existing AV document is overlaid with a new native file.

Q: Can transcribed AV text be translated?
A: Yes.

Q: Are machine transcriptions included in batch printing?
A: Yes. Any machine transcriptions are included in a batch print if “Include text files” is checked.

Q: Are machine transcriptions included in productions?
A: Any machine transcriptions are included in a production if “Include text for audio/video files” is checked. This is a new checkbox option on the create-production page under “Other options”. It’s unchecked by default.

Q: Specifically, what text is produced?
A: We produce the index text, not the transcribed text. The index text is the text in the text tab in the doc viewer that is searchable and producible.

Q: What permissions are in place for mass transcriptions?
A: If you can ingest a file, you can transcribe it. If you can kick off productions, you'll also have permission to opt in or out of including AV files with transcriptions. If you have any batch print permissions, you'll also be able to batch print AV files with their transcribed text.

Q: Any additional search syntax for transcribed audio and video files?
A: No. Search syntax works the same as it does for a regular PDF.

Q: What limits are there for individual transcriptions?
A: Maximum file length is 4 hours.
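
Before ingest, it can be useful to flag recordings that exceed this per-file maximum. The snippet below is a hypothetical pre-ingest check, not part of DISCO: the function name and the input mapping are assumptions, and this article does not say how DISCO handles files over the limit.

```python
# Hypothetical pre-ingest check; only the 4-hour maximum comes from this article.
MAX_FILE_MINUTES = 4 * 60


def files_over_limit(durations_by_name):
    """Return the names of files longer than the 4-hour per-file maximum.

    `durations_by_name` is an assumed mapping of file name -> duration in minutes.
    """
    return sorted(
        name
        for name, minutes in durations_by_name.items()
        if minutes > MAX_FILE_MINUTES
    )


print(files_over_limit({"depo_day1.mp4": 310, "voicemail.wav": 2}))
# ['depo_day1.mp4']
```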

Q: What data spaces is this available in?
A: ECA and Active Review. Both of them!

Q: How are overlays and load file text handled?
A: When handling text for your audio and video (AV) files, our system prioritizes human-provided information. Any human-transcribed overlaid text or text included in a load file will take precedence over machine-generated transcriptions for searchability; specifically, load file text becomes the primary searchable content. While the machine-generated text will still be available for review in the dedicated transcription panel, if load file text exists for an AV file, it will be displayed first in the general 'text' tab, followed by any machine-transcribed text.

Q: If I provide text for an AV file through a load file during ingest, will DISCO still transcribe it? Which text will be searchable?
A: If you provide text via a load file for an AV file at ingest, that load file text will become the primary searchable content (Index Text). DISCO will prioritize your supplied text. While a machine transcription may still be generated and available in the transcription panel for reference, the load file text will be what's used for searching, keyword highlighting in the main text view, and productions (if you opt to include text).

Q: What happens if I ingest an AV file without any text in a load file?
A: If an AV file is ingested without accompanying load file text, DISCO will automatically initiate a machine transcription. Once completed, this transcription will become the searchable Index Text for that document.

Q: If an AV file is machine-transcribed during ingest, and I later overlay it with a load file that only contains new text (no new native AV file), what becomes the searchable text?
A: In this scenario, the text you provide in the overlay load file will replace the previous machine transcription as the searchable Index Text. The system prioritizes human-provided or explicitly updated text.

Q: If I overlay an existing AV file with a new native AV file, will it be re-transcribed? What if that overlay also includes load file text?
A: Yes, if you overlay an existing AV document with a new native AV file, a new machine transcription will be generated from the new native file. If that same overlay operation also includes text provided via a load file, then that load file text will be prioritized and become the searchable Index Text, while the new machine transcription will be available in the transcription panel. If the overlay with a new native does not include load file text, the new machine transcription will become the searchable Index Text.

Q: You mentioned that if I provide load file text, it becomes the searchable Index Text. Is there any way to force DISCO to use its machine transcription for searching instead if load file text already exists?
A: Currently, if load file text is provided for a document (either at ingest or via overlay), that load file text is treated as the authoritative text for searching, and there isn't a way to force the system to index the machine transcription instead for that document. The machine transcription would generally still be viewable in the transcription panel, but the load file text would be what's searched against.

Q: If I have OCR text, load file text, and a machine transcription for an AV document (perhaps through a series of updates or a complex ingest), which one will DISCO make searchable?
A: DISCO has a clear order of precedence: human-provided load file text is prioritized first for the searchable Index Text. If no load file text exists, then machine transcription (for AV files) or OCR text (for imaged documents) would be used. The system aims to use the most reliable or explicitly provided text source for search.
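
The order of precedence described above can be summarized as a small decision function. This is an illustrative model of the stated rules only, not DISCO's implementation; the function and parameter names are hypothetical.

```python
from typing import Optional


def searchable_index_text(
    load_file_text: Optional[str],
    machine_transcription: Optional[str],
    ocr_text: Optional[str],
    is_av_file: bool,
) -> Optional[str]:
    """Pick the searchable Index Text using the precedence stated in this FAQ.

    Illustrative model only; names and signature are assumptions.
    """
    # Human-provided load file text wins whenever it exists.
    if load_file_text:
        return load_file_text
    # Otherwise fall back to the machine-generated source for the file type:
    # machine transcription for AV files, OCR text for imaged documents.
    if is_av_file:
        return machine_transcription
    return ocr_text


# An AV file with both load file text and a machine transcription:
print(searchable_index_text("human transcript", "machine transcript", None, True))
# human transcript
```

In this model the machine transcription is never discarded; it is simply not chosen for search when load file text is present, matching the note that it remains viewable in the transcription panel.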

Q: How does DISCO handle text if I perform an image overlay on a document that was originally an AV file and had a machine transcription?
A: This is a specific scenario with distinct outcomes depending on your choice during the image overlay. Here's what happens:
- If you perform an image overlay and select "Replace document text with image OCR text" (AKA: Force Text Update): This is straightforward. The new OCR text generated from the image you just added will become the searchable text for the document. Any previous machine transcription from the AV file will be replaced by this new image text for search purposes.
- If you perform an image overlay and do NOT select "Replace document text with image OCR text" (AKA: Retain Text): For an AV file that was previously machine-transcribed, choosing Retain Text when adding an image layer works a bit differently than it might for other document types.
What happens: The system will ensure that the original machine transcription (the words spoken in your AV file) is preserved and remains the primary searchable text.
What this means for you: Even after overlaying an image and choosing Retain Text, you will continue to search against the content of the original audio recording. The system prioritizes the spoken words in this scenario to ensure that valuable information from the audio is not lost or unintentionally replaced when an image representation is added.
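
The two image overlay outcomes can be sketched as a single branch. This is a hedged illustration of the behavior described in this answer, not DISCO's code; the function name and the boolean flag standing in for the "Replace document text with image OCR text" checkbox are assumptions.

```python
def text_after_image_overlay(machine_transcription, image_ocr_text, replace_with_ocr):
    """Return the searchable text for a previously transcribed AV file
    after an image overlay. Illustrative model only.

    `replace_with_ocr` stands in for the "Replace document text with
    image OCR text" option (Force Text Update when True, Retain Text when False).
    """
    if replace_with_ocr:
        # Force Text Update: the new image OCR text replaces the transcription.
        return image_ocr_text
    # Retain Text: the original machine transcription stays searchable.
    return machine_transcription


print(text_after_image_overlay("spoken words", "ocr words", replace_with_ocr=False))
# spoken words
```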

Q: What video file formats are supported?
A: See Supported video formats.

Q: What audio file formats are supported?
A: See Supported audio formats.

Q: How do I get my matter in DISCO working with this?
A: File a ticket with DISCO Desk and they'll be happy to help migrate your matter for you!

Q: I've read this far and I really want to get in touch with DISCO product and engineering about this. How can I do that?
A: Our DMs are always open to you, and we are with you in every case. Please contact freedman@csdisco.com; we would love to hear directly from you!
