Posted to common-dev@hadoop.apache.org by "elhoim gibor (JIRA)" <ji...@apache.org> on 2009/05/08 15:13:45 UTC

[jira] Created: (HADOOP-5793) High speed compression algorithm like BMDiff

High speed compression algorithm like BMDiff
--------------------------------------------

                 Key: HADOOP-5793
                 URL: https://issues.apache.org/jira/browse/HADOOP-5793
             Project: Hadoop Core
          Issue Type: New Feature
            Reporter: elhoim gibor
            Priority: Minor


Add a high-speed compression algorithm like BMDiff.
It gives speeds of ~100MB/s for writes and ~1000MB/s for reads, compressing 2.1 billion web pages from 45.1TB down to 4.2TB.

References:
http://norfolk.cs.washington.edu/htbin-post/unrestricted/colloq/details.cgi?id=437
Jeff Dean's 2005 talk about Google's architecture - BMDiff is discussed around the 46:00 mark.

http://feedblog.org/2008/10/12/google-bigtable-compression-zippy-and-bmdiff/

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=755678

A reference implementation exists in HyperTable.
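
To fit into Hadoop, such a codec would have to implement the existing org.apache.hadoop.io.compress.CompressionCodec interface. A rough skeleton could look like this (the BMDiffCodec name, the .bmdiff extension and the unimplemented stream wrappers are only placeholders of mine, not an existing implementation):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionInputStream;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.io.compress.Compressor;
import org.apache.hadoop.io.compress.Decompressor;

// Hypothetical skeleton only - class name, extension and stream wrappers are placeholders.
public class BMDiffCodec implements CompressionCodec {

  public CompressionOutputStream createOutputStream(OutputStream out) throws IOException {
    // Would wrap 'out' in a stream that emits BMDiff copy/literal operations (not implemented here).
    throw new UnsupportedOperationException("BMDiff encoder not implemented yet");
  }

  public CompressionOutputStream createOutputStream(OutputStream out, Compressor compressor)
      throws IOException {
    return createOutputStream(out);
  }

  public Class<? extends Compressor> getCompressorType() {
    return null;  // stream-based sketch: no pluggable Compressor
  }

  public Compressor createCompressor() {
    return null;
  }

  public CompressionInputStream createInputStream(InputStream in) throws IOException {
    // Would wrap 'in' in a stream that replays BMDiff copy/literal operations (not implemented here).
    throw new UnsupportedOperationException("BMDiff decoder not implemented yet");
  }

  public CompressionInputStream createInputStream(InputStream in, Decompressor decompressor)
      throws IOException {
    return createInputStream(in);
  }

  public Class<? extends Decompressor> getDecompressorType() {
    return null;  // stream-based sketch: no pluggable Decompressor
  }

  public Decompressor createDecompressor() {
    return null;
  }

  public String getDefaultExtension() {
    return ".bmdiff";  // placeholder extension
  }
}

Registering such a class via the io.compression.codecs configuration property would then make it selectable like the existing codecs, e.g. for SequenceFile compression and intermediate map output.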

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.