
Optimized Min Heap Based Similarity Detection For Delta Encoding
Similarity-based delta encoding has been used before, but the algorithm described here uses a unique variant of the MinHash technique to compute a hash-based Similarity Index value for each data chunk. That value can then be compared with the values of other chunks to detect similar chunks.
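The abstract does not spell out the variant itself, so the following is only a minimal sketch of the generic idea, not the publication's algorithm. It assumes a bottom-k form of MinHash in which a heap-based selection keeps the k smallest hashes of a chunk's overlapping byte windows. The function names, the window width of 8 bytes, and k = 16 are all hypothetical choices made for illustration.

```python
import hashlib
import heapq


def shingle_hashes(chunk: bytes, width: int = 8):
    """Yield a 64-bit hash for every overlapping `width`-byte window.

    The window width is an assumed parameter, not taken from the publication.
    """
    for i in range(max(1, len(chunk) - width + 1)):
        digest = hashlib.blake2b(chunk[i:i + width], digest_size=8).digest()
        yield int.from_bytes(digest, "big")


def similarity_sketch(chunk: bytes, k: int = 16, width: int = 8) -> tuple:
    """Return the k smallest distinct window hashes in ascending order.

    heapq.nsmallest performs the heap-based selection; the resulting
    tuple plays the role of a per-chunk similarity value in this sketch.
    """
    return tuple(heapq.nsmallest(k, set(shingle_hashes(chunk, width))))


def estimated_similarity(a: tuple, b: tuple, k: int = 16) -> float:
    """Estimate Jaccard similarity of two chunks from their bottom-k sketches."""
    union_bottom = heapq.nsmallest(k, set(a) | set(b))
    if not union_bottom:
        return 0.0
    shared = set(a) & set(b)
    # Fraction of the k smallest hashes of the union that both chunks contain.
    return sum(1 for h in union_bottom if h in shared) / len(union_bottom)
```

Under these assumptions, two chunks whose sketches score close to 1.0 would be candidates for delta encoding against each other, while a score near 0.0 suggests storing the chunk independently. The threshold separating the two cases is a tuning decision this sketch leaves open.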
Publication Date
12 March 2013
Tags
data de-duplication, MinHash, delta encoding
What are we trying to do?
We are attempting to mobilize the creativity and innovative capacity of the Linux and broader open source community to codify the universe of preexisting inventions in defensive publications. Upon publication in the IP.COM database, these publications immediately serve as effective prior art that prevents anyone from obtaining a patent that claims inventions already documented in a defensive publication. In addition to creating a vehicle for this highly effective form of IP rights management for known inventions, we hope the community will use defensive publications to codify future inventions when the inventors prefer not to make their invention the subject of a patent disclosure and application.