Posted to issues@hbase.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/08/20 00:31:47 UTC

[jira] [Commented] (HBASE-8201) OrderedBytes: an ordered encoding strategy

    [ https://issues.apache.org/jira/browse/HBASE-8201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703853#comment-14703853 ] 

ASF GitHub Bot commented on HBASE-8201:
---------------------------------------

GitHub user spanchamiamapr opened a pull request:

    https://github.com/apache/drill/pull/122

    DRILL-3492 - Add support for encoding of Drill data types into byte-ordered format

    Description:
    This change adds encoding and decoding between the 'double', 'float', 'bigint', and 'int' data types and the OrderedBytes format. It also allows OrderedBytes-encoded row-keys to be stored in ascending as well as descending order.
    
    The following JIRA added the OrderedBytes encoding to HBase:
    https://issues.apache.org/jira/browse/HBASE-8201
    
    This encoding scheme preserves the sort order of the native data type when values are stored as sorted byte arrays on disk.
    The HBase storage plugin can therefore take advantage of row-keys that have been encoded in OrderedBytes format.
    
    This functionality allows us to prune the scan ranges and thus read much less data from the server.
    
    Testing Done:
    Added a new unit-test class, TestOrderedBytesConvertFunctions.java, which derives from the TestConvertFunctions class. Also added new test cases to the TestHBaseFilterPushDown class that check whether filters are pushed down correctly and whether the results are correct.
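
To illustrate the order-preserving property the description above relies on, here is a minimal Java sketch for a signed 'int'. It does not reproduce the actual OrderedBytes wire format from HBASE-8201 and is not code from this pull request; the class and method names are hypothetical. It only shows the general idea: flip the sign bit and write big-endian so that unsigned byte-wise comparison matches numeric order, and invert all bits to get descending order.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration only; not the OrderedBytes format itself.
final class OrderPreservingIntSketch {

    /** Encodes a signed int so that unsigned byte-wise order equals numeric order. */
    static byte[] encode(int value, boolean descending) {
        // Flipping the sign bit maps MIN_VALUE..MAX_VALUE onto 0x00000000..0xFFFFFFFF.
        int bits = value ^ 0x80000000;
        // Inverting every bit reverses the resulting sort order.
        if (descending) {
            bits = ~bits;
        }
        // ByteBuffer writes big-endian by default, most significant byte first.
        return ByteBuffer.allocate(4).putInt(bits).array();
    }

    /** Reverses encode(). */
    static int decode(byte[] encoded, boolean descending) {
        int bits = ByteBuffer.wrap(encoded).getInt();
        if (descending) {
            bits = ~bits;
        }
        return bits ^ 0x80000000;
    }

    /** Lexicographic comparison that treats each byte as unsigned. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        int[] values = {7, -1, Integer.MIN_VALUE, 0, Integer.MAX_VALUE, -42};
        byte[][] keys = new byte[values.length][];
        for (int i = 0; i < values.length; i++) {
            keys[i] = encode(values[i], false);
        }
        // Sort the encoded keys as raw bytes, then decode to see the numeric order.
        Arrays.sort(keys, OrderPreservingIntSketch::compareUnsigned);
        for (byte[] key : keys) {
            System.out.println(decode(key, false));
        }
    }
}
```

Sorting the encoded keys as raw bytes and decoding them yields the values in ascending numeric order; passing 'true' for the descending flag reverses that order.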


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/spanchamiamapr/drill master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/122.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #122
    
----
commit e3642de28a5f03c702433fe581819acced7847a7
Author: spanchamia <sp...@maprtech.com>
Date:   2015-07-29T23:59:31Z

    DRILL-3364: Prune scan range if the filter is on the leading field with
    byte comparable encoding
    
    The change adds support to perform row-key range pruning when the
    row-key prefix is interpreted as UINT4_BE, TIMESTAMP_EPOCH_BE,
    TIME_EPOCH_BE, DATE_EPOCH_BE, or UINT8_BE encoded.
    
    Testing Done: Added unit tests for the new feature, and also ran all
    existing unit tests to make sure there is no regression.
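
The range pruning described in the commit message above can be sketched roughly as follows for a 4-byte unsigned big-endian row-key prefix. This is a hedged illustration, not the Drill implementation: the class and method names are hypothetical, and it assumes a scan with an inclusive start row and an exclusive stop row, where an empty stop row means "no upper bound".

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; names are not taken from the Drill code base.
final class PrefixScanRangeSketch {

    static final long UINT4_MAX = 0xFFFFFFFFL;

    /** Big-endian 4-byte encoding of an unsigned 32-bit value held in a long. */
    static byte[] uint4be(long value) {
        if (value < 0 || value > UINT4_MAX) {
            throw new IllegalArgumentException("not an unsigned 32-bit value: " + value);
        }
        return ByteBuffer.allocate(4).putInt((int) value).array();
    }

    /** Returns {startRow, stopRow} for the filter "prefix == value". */
    static byte[][] rangeForEquals(long value) {
        byte[] start = uint4be(value);
        // The stop row is exclusive, so the next prefix value bounds the range.
        // For the maximum value there is no next prefix: leave the range open.
        byte[] stop = (value == UINT4_MAX) ? new byte[0] : uint4be(value + 1);
        return new byte[][] {start, stop};
    }

    /** Returns {startRow, stopRow} for the filter "prefix >= value". */
    static byte[][] rangeForGreaterOrEqual(long value) {
        return new byte[][] {uint4be(value), new byte[0]};
    }

    /** Renders bytes as lowercase hex for display. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[][] range = rangeForEquals(258);
        System.out.println("start=" + hex(range[0]) + " stop=" + hex(range[1]));
    }
}
```

Because the prefix bytes compare in the same order as the numbers they encode, rows outside the computed start and stop rows cannot match the filter and never need to be read.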

commit 70e1f3b2ca0410748b9872535bb205651e86d6c9
Author: spanchamia <sp...@maprtech.com>
Date:   2015-07-30T05:53:04Z

    DRILL-3492: Add support for encoding/decoding to/from OrderedBytes
    format
    
    Description:
    This change allows encoding/decoding of data from/to 'double', 'float',
    'bigint', 'int' and 'utf8' data types to/from OrderedBytes format.
    It also allows for OrderedByte encoded row-keys to be stored in
    ascending as well as descending order.
    
    The following JIRA added the OrderedBytes encoding to HBase:
    https://issues.apache.org/jira/browse/HBASE-8201
    
    This encoding scheme will preserve the sort-order of the native
    data-type when it is stored as sorted byte arrays on disk.
    Thus, it will help the HBase storage plugin if the row-keys have been
    encoded in OrderedBytes format.
    
    This functionality allows us to prune the scan ranges and thus read
    much less data from the server.
    
    Testing Done:
    Added a new unit-test class TestOrderedBytesConvertFunctions.java which
    derives from TestConvertFunctions.java class.
    Also added new test cases to TestHBaseFilterPushDown class that will test
    if we were able to push-down filters correctly and if the results are
    correct.

commit c9f8622b5cc0cf87dcdf88d73e608039556fedcb
Author: Smidth Panchamia <sp...@mapr.com>
Date:   2015-08-19T21:51:36Z

    Merge remote-tracking branch 'apache/master'
    
    Conflicts:
    	contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/CompareFunctionsProcessor.java
    	contrib/storage-hbase/src/test/java/org/apache/drill/hbase/HBaseTestsSuite.java
    	contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseFilterPushDown.java
    	contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestTableGenerator.java

commit 2daacad4ca62e753bbcad7f3637512ca810ea491
Author: Smidth Panchamia <sp...@mapr.com>
Date:   2015-08-19T22:18:33Z

    DRILL-3492 -
    * Remove repeated allocations of byte arrays and PositionedByteRange objects on the heap (as suggested by Jason).
    * Remove OrderedBytes encode/decode operations on UTF8 types.
    Reasons -
    1. These operations are slow and incur many heap allocations.
    2. UTF8 types maintain their natural sort order when stored as binary arrays.
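
The second reason above can be checked with a small Java sketch. It is an illustration under assumptions, not code from the pull request, and the class and method names are hypothetical. It compares strings by Unicode code point and compares their UTF-8 encodings as unsigned bytes, and the two orderings agree. Note that "natural order" is taken here to mean code point order, which can differ from String.compareTo for characters outside the Basic Multilingual Plane.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only.
final class Utf8OrderSketch {

    /** Lexicographic comparison that treats each byte as unsigned. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** Compares two strings by Unicode code point rather than by UTF-16 code unit. */
    static int compareByCodePoint(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                return Integer.compare(ca, cb);
            }
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // One string is a prefix of the other: the shorter one sorts first.
        return Integer.compare(a.length(), b.length());
    }

    /** True when raw UTF-8 byte order and code point order agree for the pair. */
    static boolean ordersAgree(String a, String b) {
        byte[] ua = a.getBytes(StandardCharsets.UTF_8);
        byte[] ub = b.getBytes(StandardCharsets.UTF_8);
        return Integer.signum(compareUnsigned(ua, ub))
                == Integer.signum(compareByCodePoint(a, b));
    }

    public static void main(String[] args) {
        String[] words = {"zebra", "apple", "\u00e9clair", "Zulu", "apples", "\uD83D\uDE00", "\uFF21"};
        for (String a : words) {
            for (String b : words) {
                System.out.println(a + " vs " + b + ": agree=" + ordersAgree(a, b));
            }
        }
    }
}
```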

commit 71b053006b587f39a47025302e7d3de8dcac482d
Author: Smidth Panchamia <sp...@mapr.com>
Date:   2015-08-19T22:27:02Z

    DRILL-3492 - Remove test code that creates test tables with UTF8 OrderedByte encoding.

----


> OrderedBytes: an ordered encoding strategy
> ------------------------------------------
>
>                 Key: HBASE-8201
>                 URL: https://issues.apache.org/jira/browse/HBASE-8201
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Client
>            Reporter: Nick Dimiduk
>            Assignee: Smidth Panchamia
>             Fix For: 0.98.0, 0.95.2
>
>         Attachments: 0001-HBASE-8201-OrderedBytes-order-preserving-encoding.patch, 0001-HBASE-8201-OrderedBytes-order-preserving-encoding.patch, 0001-HBASE-8201-OrderedBytes-order-preserving-encoding.patch, 0001-HBASE-8201-OrderedBytes-order-preserving-encoding.patch, 0001-HBASE-8201-OrderedBytes-order-preserving-encoding.patch, 0001-HBASE-8201-OrderedBytes-order-preserving-encoding.patch, 0001-HBASE-8201-OrderedBytes-order-preserving-encoding.patch, 0001-HBASE-8201-OrderedBytes-provides-order-preserving-se.patch, 0001-HBASE-8201-OrderedBytes-provides-order-preserving-se.patch, 0001-HBASE-8201-OrderedBytes-provides-order-preserving-se.patch, 0001-HBASE-8201-OrderedBytes-provides-order-preserving-se.patch, 0001-HBASE-8201-OrderedBytes-provides-order-preserving-se.patch, 0001-HBASE-8201-OrderedBytes-provides-order-preserving-se.patch, 0001-HBASE-8201-OrderedBytes-provides-order-preserving-se.patch, 0001-HBASE-8201-OrderedBytes-provides-order-preserving-se.patch, 0001-HBASE-8201-OrderedBytes-provides-order-preserving-se.patch, 0001-HBASE-8201-OrderedBytes-provides-order-preserving-se.patch, 0001-HBASE-8201-OrderedBytes-provides-order-preserving-se.patch
>
>
> Once the spec is agreed upon, it must be implemented.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)