How are hash values used in deduplication?

The hash value of a document is a fixed-length numeric value computed from the document's contents that, for practical purposes, uniquely identifies that data. During ingestion, DISCO uses SHA-1 hashes to identify identical documents at the byte level. For a document to be deduplicated, all of its metadata must also be the same. If two or more instances have an identical hash value, they are deduplicated.
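
To illustrate the idea, here is a minimal Python sketch of byte-level matching with SHA-1. It is not DISCO's implementation: the function names (`sha1_hex`, `group_duplicates`) and the sample file names are hypothetical, and the sketch compares file bytes only, without modeling the metadata requirement described above.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of the given bytes as a 40-character hex string."""
    return hashlib.sha1(data).hexdigest()


def group_duplicates(documents: Dict[str, bytes]) -> List[List[str]]:
    """Group document names whose contents have the same SHA-1 hash.

    Hypothetical helper for illustration only: it looks at raw bytes
    and ignores metadata entirely.
    """
    groups = defaultdict(list)
    for name, content in documents.items():
        groups[sha1_hex(content)].append(name)
    # Keep only the hashes shared by more than one document.
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical sample data: a.txt and b.txt are byte-for-byte identical,
# while c.txt differs by a single trailing space.
docs = {
    "a.txt": b"Quarterly report",
    "b.txt": b"Quarterly report",
    "c.txt": b"Quarterly report ",
}
duplicates = group_duplicates(docs)
```

In this sketch, `a.txt` and `b.txt` end up in the same group, while `c.txt` does not, because even a one-byte difference produces a different hash value.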
