Deduplication - What hash values are used?


Question: I know that Disco uses both SHA and MD5 hashes. Can you confirm where each one is used? I think one is for dedupe and one is for productions, but which is which?

Answer: SHA-1 is the only hash used within our system. The reason is that SHA-1 produces fewer collisions than MD5 (a collision is when two different files end up with the same hash value, so the key is no longer unique).

According to our CTO, MD5 is supported solely for compatibility with other systems.
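
To illustrate the difference, here is a minimal Python sketch using the standard hashlib module. It is not our actual implementation; the function names (file_digests, group_duplicates) and the sample documents are hypothetical and exist only for this example. It computes both hashes for a document and groups documents with identical content by their SHA-1 value, with the MD5 value available for handing to a system that expects it.

```python
import hashlib


def file_digests(content: bytes) -> dict:
    """Return the SHA-1 and MD5 hex digests of a document's bytes."""
    return {
        "sha1": hashlib.sha1(content).hexdigest(),  # 160-bit digest, 40 hex chars
        "md5": hashlib.md5(content).hexdigest(),    # 128-bit digest, 32 hex chars
    }


def group_duplicates(documents: dict) -> dict:
    """Group document names by the SHA-1 of their content.

    Documents whose bytes are identical share a SHA-1 value,
    so they land in the same group.
    """
    groups = {}
    for name, content in documents.items():
        key = hashlib.sha1(content).hexdigest()
        groups.setdefault(key, []).append(name)
    return groups


# Hypothetical sample data: two identical documents and one different one.
sample_documents = {
    "a.txt": b"quarterly report",
    "b.txt": b"quarterly report",
    "c.txt": b"meeting notes",
}

duplicate_groups = group_duplicates(sample_documents)
```

In this sketch, a.txt and b.txt fall into one group because their contents are byte-for-byte identical, while c.txt gets a group of its own.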

