Posted to dev@hbase.apache.org by "Alex Newman (JIRA)" <ji...@apache.org> on 2009/01/14 19:01:59 UTC
[jira] Created: (HBASE-1126) LZO COMPRESSION support
LZO COMPRESSION support
-----------------------
Key: HBASE-1126
URL: https://issues.apache.org/jira/browse/HBASE-1126
Project: Hadoop HBase
Issue Type: New Feature
Environment: All
Reporter: Alex Newman
It would be interesting to compare the performance of LZO-compressed column families against normal zlib block compression. Based on some very preliminary performance profiling that I have done, as long as the cells are of reasonable size (> 4k), zlib compression really dominates the overhead in random read/scanning situations.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-1126) Enable choice of codec; i.e. at a
minimum enable LZO COMPRESSION support
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699250#action_12699250 ]
stack commented on HBASE-1126:
------------------------------
See http://code.google.com/p/hadoop-gpl-compression/.
[jira] Resolved: (HBASE-1126) Enable choice of codec; i.e. at a
minimum enable LZO COMPRESSION support
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack resolved HBASE-1126.
--------------------------
Resolution: Fixed
Fix Version/s: 0.20.0
Ryan made this work, and he doc'd it: http://wiki.apache.org/hadoop/UsingLzoCompression
[jira] Updated: (HBASE-1126) Enable choice of code; i.e. at a
minimum enable LZO COMPRESSION support
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-1126:
-------------------------
Summary: Enable choice of code; i.e. at a minimum enable LZO COMPRESSION support (was: LZO COMPRESSION support)
Broadened the issue so that users can choose the codec.
Over in http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation#0_19_0, I tested the DefaultCodec (zlib, using the native encoder/decoder) and found that random reads are horrid, writes are just a bit slower, and scans are about the same. I wonder how LZO would change this? Perhaps scans and writes would run as fast as with uncompressed data, and random reads would come close to uncompressed speeds? I did notice that block compression made for fewer regions, about half, and this was with the PE data, which does its best to foil good compression.
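As a rough illustration of why random reads suffer under block compression, here is a minimal sketch in plain Python using only the standard-library zlib module. It is not HBase code and it does not measure LZO. The 4 KB cell size echoes the "> 4k" cells mentioned in the issue; the 16-cells-per-block layout, the half-random cell contents, and all function names are assumptions made up for the example.

```python
import os
import time
import zlib

CELL_SIZE = 4 * 1024      # assumed cell size, echoing the "> 4k" cells in the issue
CELLS_PER_BLOCK = 16      # assumed layout: 16 cells, so 64 KB per block


def make_block(cells=CELLS_PER_BLOCK, cell_size=CELL_SIZE):
    """Build one block; half of every cell is random bytes so it compresses poorly."""
    half = cell_size // 2
    return b"".join(
        os.urandom(half) + b"\x00" * (cell_size - half) for _ in range(cells)
    )


def read_cell(block, index, cell_size=CELL_SIZE):
    """Slice one cell out of an uncompressed block."""
    start = index * cell_size
    return block[start:start + cell_size]


def read_cell_compressed(compressed_block, index, cell_size=CELL_SIZE):
    """A random read has to inflate the whole block before slicing out one cell."""
    return read_cell(zlib.decompress(compressed_block), index, cell_size)


def time_reads(read_fn, block, reads=1000):
    """Time `reads` single-cell lookups, cycling through the cells of the block."""
    start = time.perf_counter()
    for i in range(reads):
        read_fn(block, i % CELLS_PER_BLOCK)
    return time.perf_counter() - start


if __name__ == "__main__":
    block = make_block()
    compressed = zlib.compress(block)
    print("compressed/raw size ratio:", len(compressed) / len(block))
    print("raw reads (s):        ", time_reads(read_cell, block))
    print("compressed reads (s): ", time_reads(read_cell_compressed, compressed))
```

The point of the sketch is only that each single-cell lookup pays for inflating the full block. Absolute timings depend on the machine and say nothing about how HBase or an LZO codec would actually behave.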
[jira] Updated: (HBASE-1126) Enable choice of codec; i.e. at a
minimum enable LZO COMPRESSION support
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-1126:
-------------------------
Summary: Enable choice of codec; i.e. at a minimum enable LZO COMPRESSION support (was: Enable choice of code; i.e. at a minimum enable LZO COMPRESSION support)
[jira] Commented: (HBASE-1126) Enable choice of codec; i.e. at a
minimum enable LZO COMPRESSION support
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711284#action_12711284 ]
stack commented on HBASE-1126:
------------------------------
See HBASE-1379