Posted to issues@hbase.apache.org by "Nick Dimiduk (JIRA)" <ji...@apache.org> on 2013/09/17 00:19:51 UTC

[jira] [Updated] (HBASE-9549) KeyValue#parseColumn(byte[]) does not handle empty qualifier

     [ https://issues.apache.org/jira/browse/HBASE-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Dimiduk updated HBASE-9549:
--------------------------------

    Component/s:     (was: shell)
                 util
    Description: 
HTable allows a user to interact directly with a KeyValue that has an empty qualifier, yet {{KeyValue#parseColumn(byte[])}} treats a column spec with an empty qualifier as a reference to the whole column family. A missing qualifier delimiter ({{f1}}) and an empty qualifier ({{f1:}}) are handled identically:

{code}
    if (index == -1) {
      // If no delimiter, return array of size 1
      return new byte [][] { c };
    } else if(index == c.length - 1) {
      // Only a family, return array size 1
      byte [] family = new byte[c.length-1];
      System.arraycopy(c, 0, family, 0, family.length);
      return new byte [][] { family };
    }
    ...
{code}
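
To make the ambiguity concrete, here is a minimal standalone sketch that mirrors the two branches quoted above. It is not the HBase source: the class name {{ParseColumnAmbiguity}}, the helper {{indexOfDelimiter}}, and the final family/qualifier split (the part elided as "..." above) are assumptions made only so the example compiles and runs on its own.

```java
import java.nio.charset.StandardCharsets;

// Standalone sketch, not the HBase source. Only the first two branches
// mirror the excerpt above; everything else is assumed for illustration.
class ParseColumnAmbiguity {
  static final byte DELIMITER = (byte) ':';

  static byte[][] parseColumn(byte[] c) {
    int index = indexOfDelimiter(c);
    if (index == -1) {
      // No delimiter: the whole input is treated as the family.
      return new byte[][] { c };
    } else if (index == c.length - 1) {
      // Trailing delimiter: the empty qualifier is silently dropped.
      byte[] family = new byte[c.length - 1];
      System.arraycopy(c, 0, family, 0, family.length);
      return new byte[][] { family };
    }
    // Assumed general case: split into family and qualifier.
    byte[] family = new byte[index];
    System.arraycopy(c, 0, family, 0, index);
    byte[] qualifier = new byte[c.length - index - 1];
    System.arraycopy(c, index + 1, qualifier, 0, qualifier.length);
    return new byte[][] { family, qualifier };
  }

  // Hypothetical helper: position of the first delimiter, or -1 if absent.
  static int indexOfDelimiter(byte[] c) {
    for (int i = 0; i < c.length; i++) {
      if (c[i] == DELIMITER) {
        return i;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    for (String spec : new String[] { "f1", "f1:", "f1:bar" }) {
      byte[][] parts = parseColumn(spec.getBytes(StandardCharsets.UTF_8));
      System.out.println(spec + " -> " + parts.length + " part(s)");
    }
  }
}
```

With this sketch, both {{f1}} and {{f1:}} come back as a one-element array, so a caller cannot tell a whole-family request from a request for the empty qualifier.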

This ambiguity breaks external interfaces that depend on {{parseColumn}}, for instance the shell:

{noformat}

# shell interactions with KV with an empty qualifier

hbase(main):001:0> create 'foo', 'f1'
0 row(s) in 1.4130 seconds

=> Hbase::Table - foo
hbase(main):002:0> put 'foo', 'rk1', 'f1:', 'empty?'
0 row(s) in 0.0750 seconds # <= put works

hbase(main):003:0> put 'foo', 'rk1', 'f1:bar', 'value'
0 row(s) in 0.0070 seconds

# attempt to retrieve just the kv with empty qualifier

hbase(main):004:0> get 'foo', 'rk1', 'f1:'
COLUMN                                              CELL
 f1:                                                timestamp=1379363480020, value=empty?
 f1:bar                                             timestamp=1379363546360, value=value
2 row(s) in 0.0360 seconds # <= returns more than expected!

hbase(main):005:0> get 'foo', 'rk1', 'f1'
COLUMN                                              CELL
 f1:                                                timestamp=1379363480020, value=empty?
 f1:bar                                             timestamp=1379363546360, value=value
2 row(s) in 0.0120 seconds

hbase(main):006:0> delete 'foo', 'rk1', 'f1:'
0 row(s) in 0.0290 seconds # <= delete works

hbase(main):007:0> get 'foo', 'rk1', 'f1:'
COLUMN                                              CELL
 f1:bar                                             timestamp=1379363546360, value=value
1 row(s) in 0.0260 seconds

hbase(main):008:0> get 'foo', 'rk1', 'f1'
COLUMN                                              CELL
 f1:bar                                             timestamp=1379363546360, value=value
1 row(s) in 0.0080 seconds

# restore the empty qual kv for HTable test

hbase(main):011:0> put 'foo', 'rk1', 'f1:', 'empty?'
0 row(s) in 0.0950 seconds

hbase(main):010:0> get 'foo', 'rk1', 'f1:'
COLUMN                                              CELL
 f1:                                                timestamp=1379365262555, value=empty?
 f1:bar                                             timestamp=1379365134135, value=value
2 row(s) in 0.0290 seconds

hbase(main):011:0> get 'foo', 'rk1', 'f1'
COLUMN                                              CELL
 f1:                                                timestamp=1379365262555, value=empty?
 f1:bar                                             timestamp=1379365134135, value=value
2 row(s) in 0.0080 seconds

hbase(main):012:0> hconf = org.apache.hadoop.hbase.HBaseConfiguration.create()
=> #<Java::OrgApacheHadoopConf::Configuration:0x208e2fb5>
hbase(main):013:0> t = org.apache.hadoop.hbase.client.HTable.new(hconf,'foo')
=> #<Java::OrgApacheHadoopHbaseClient::HTable:0x437d51a6>

# create a Get requesting the empty qualifier only, works

hbase(main):014:0> g1 = org.apache.hadoop.hbase.client.Get.new(org.apache.hadoop.hbase.util.Bytes.toBytes('rk1'))
=> #<Java::OrgApacheHadoopHbaseClient::Get:0x796523ab>
hbase(main):015:0> g1.addColumn(org.apache.hadoop.hbase.util.Bytes.toBytes('f1'), nil)
=> #<Java::OrgApacheHadoopHbaseClient::Get:0x796523ab>
hbase(main):016:0> t.get(g1).toString()
=> "keyvalues={rk1/f1:/1379365262555/Put/vlen=6/mvcc=0}"

# create a Get requesting the whole family, works

hbase(main):017:0> g2 = org.apache.hadoop.hbase.client.Get.new(org.apache.hadoop.hbase.util.Bytes.toBytes('rk1'))
=> #<Java::OrgApacheHadoopHbaseClient::Get:0x52e5376a>
hbase(main):018:0> g2.addFamily(org.apache.hadoop.hbase.util.Bytes.toBytes('f1'))
=> #<Java::OrgApacheHadoopHbaseClient::Get:0x52e5376a>
hbase(main):019:0> t.get(g2).toString()
=> "keyvalues={rk1/f1:/1379365262555/Put/vlen=6/mvcc=0, rk1/f1:bar/1379365134135/Put/vlen=5/mvcc=0}"
{noformat}
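
One possible way out, sketched below under stated assumptions, is for the parser to return a two-element array with a zero-length qualifier when the delimiter is trailing, so that a caller such as the shell can choose between {{addFamily}} and {{addColumn}} based on the array length. The class {{ColumnSpecDispatch}} and the methods {{parseColumnDistinct}} and {{describe}} are hypothetical names; the sketch only returns a description of the request rather than calling the HBase client, and it is not presented as the fix adopted for this issue.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: a parser variant that keeps the empty qualifier,
// plus a caller that dispatches on the number of parts returned.
class ColumnSpecDispatch {

  // Assumed behavior: "f1" -> { family }, "f1:" -> { family, empty qualifier }.
  static byte[][] parseColumnDistinct(byte[] c) {
    int index = -1;
    for (int i = 0; i < c.length; i++) {
      if (c[i] == ':') {
        index = i;
        break;
      }
    }
    if (index == -1) {
      // No delimiter: family only.
      return new byte[][] { c };
    }
    byte[] family = Arrays.copyOfRange(c, 0, index);
    // Zero-length array when the delimiter is the last byte.
    byte[] qualifier = Arrays.copyOfRange(c, index + 1, c.length);
    return new byte[][] { family, qualifier };
  }

  // Describes which Get method a caller would use for the given spec.
  static String describe(String spec) {
    byte[][] parts = parseColumnDistinct(spec.getBytes(StandardCharsets.UTF_8));
    String family = new String(parts[0], StandardCharsets.UTF_8);
    if (parts.length == 1) {
      return "addFamily(" + family + ")";
    }
    String qualifier = new String(parts[1], StandardCharsets.UTF_8);
    return "addColumn(" + family + ", '" + qualifier + "')";
  }

  public static void main(String[] args) {
    for (String spec : new String[] { "f1", "f1:", "f1:bar" }) {
      System.out.println(spec + " -> " + describe(spec));
    }
  }
}
```

Under these assumptions {{f1}} maps to a whole-family request while {{f1:}} maps to a single-column request with an empty qualifier, which matches the distinction the HTable calls in the transcript above already make.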

  was:
The shell is commonly used as a simple way to validate assumptions about the hbase data model. In this particular case, the shell behaves slightly differently from the HTable interface.

{noformat}

# shell interactions with KV with an empty qualifier

hbase(main):001:0> create 'foo', 'f1'
0 row(s) in 1.4130 seconds

=> Hbase::Table - foo
hbase(main):002:0> put 'foo', 'rk1', 'f1:', 'empty?'
0 row(s) in 0.0750 seconds # <= put works

hbase(main):003:0> put 'foo', 'rk1', 'f1:bar', 'value'
0 row(s) in 0.0070 seconds

# attempt to retrieve just the kv with empty qualifier

hbase(main):004:0> get 'foo', 'rk1', 'f1:'
COLUMN                                              CELL
 f1:                                                timestamp=1379363480020, value=empty?
 f1:bar                                             timestamp=1379363546360, value=value
2 row(s) in 0.0360 seconds # <= returns more than expected!

hbase(main):005:0> get 'foo', 'rk1', 'f1'
COLUMN                                              CELL
 f1:                                                timestamp=1379363480020, value=empty?
 f1:bar                                             timestamp=1379363546360, value=value
2 row(s) in 0.0120 seconds

hbase(main):006:0> delete 'foo', 'rk1', 'f1:'
0 row(s) in 0.0290 seconds # <= delete works

hbase(main):007:0> get 'foo', 'rk1', 'f1:'
COLUMN                                              CELL
 f1:bar                                             timestamp=1379363546360, value=value
1 row(s) in 0.0260 seconds

hbase(main):008:0> get 'foo', 'rk1', 'f1'
COLUMN                                              CELL
 f1:bar                                             timestamp=1379363546360, value=value
1 row(s) in 0.0080 seconds

# restore the empty qual kv for HTable test

hbase(main):011:0> put 'foo', 'rk1', 'f1:', 'empty?'
0 row(s) in 0.0950 seconds

hbase(main):010:0> get 'foo', 'rk1', 'f1:'
COLUMN                                              CELL
 f1:                                                timestamp=1379365262555, value=empty?
 f1:bar                                             timestamp=1379365134135, value=value
2 row(s) in 0.0290 seconds

hbase(main):011:0> get 'foo', 'rk1', 'f1'
COLUMN                                              CELL
 f1:                                                timestamp=1379365262555, value=empty?
 f1:bar                                             timestamp=1379365134135, value=value
2 row(s) in 0.0080 seconds

hbase(main):012:0> hconf = org.apache.hadoop.hbase.HBaseConfiguration.create()
=> #<Java::OrgApacheHadoopConf::Configuration:0x208e2fb5>
hbase(main):013:0> t = org.apache.hadoop.hbase.client.HTable.new(hconf,'foo')
=> #<Java::OrgApacheHadoopHbaseClient::HTable:0x437d51a6>

# create a Get requesting the empty qualifier only, works

hbase(main):014:0> g1 = org.apache.hadoop.hbase.client.Get.new(org.apache.hadoop.hbase.util.Bytes.toBytes('rk1'))
=> #<Java::OrgApacheHadoopHbaseClient::Get:0x796523ab>
hbase(main):015:0> g1.addColumn(org.apache.hadoop.hbase.util.Bytes.toBytes('f1'), nil)
=> #<Java::OrgApacheHadoopHbaseClient::Get:0x796523ab>
hbase(main):016:0> t.get(g1).toString()
=> "keyvalues={rk1/f1:/1379365262555/Put/vlen=6/mvcc=0}"

# create a Get requesting the whole family, works

hbase(main):017:0> g2 = org.apache.hadoop.hbase.client.Get.new(org.apache.hadoop.hbase.util.Bytes.toBytes('rk1'))
=> #<Java::OrgApacheHadoopHbaseClient::Get:0x52e5376a>
hbase(main):018:0> g2.addFamily(org.apache.hadoop.hbase.util.Bytes.toBytes('f1'))
=> #<Java::OrgApacheHadoopHbaseClient::Get:0x52e5376a>
hbase(main):019:0> t.get(g2).toString()
=> "keyvalues={rk1/f1:/1379365262555/Put/vlen=6/mvcc=0, rk1/f1:bar/1379365134135/Put/vlen=5/mvcc=0}"
{noformat}

        Summary: KeyValue#parseColumn(byte[]) does not handle empty qualifier  (was: shell get command does not distinguish between family scan and empty qualifier)
    
> KeyValue#parseColumn(byte[]) does not handle empty qualifier
> ------------------------------------------------------------
>
>                 Key: HBASE-9549
>                 URL: https://issues.apache.org/jira/browse/HBASE-9549
>             Project: HBase
>          Issue Type: Bug
>          Components: util
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>            Priority: Minor
