Posted to issues@hbase.apache.org by "Randy Hu (JIRA)" <ji...@apache.org> on 2016/02/17 23:02:19 UTC
[jira] [Created] (HBASE-15287)
org.apache.hadoop.hbase.mapreduce.RowCounter returns incorrect result with
binary row key inputs
Randy Hu created HBASE-15287:
--------------------------------
Summary: org.apache.hadoop.hbase.mapreduce.RowCounter returns incorrect result with binary row key inputs
Key: HBASE-15287
URL: https://issues.apache.org/jira/browse/HBASE-15287
Project: HBase
Issue Type: Bug
Components: mapreduce, util
Affects Versions: 1.1.1
Reporter: Randy Hu
org.apache.hadoop.hbase.mapreduce.RowCounter takes optional start and end keys as input (the -range option). This works only when the bytes of the row key are identical to the bytes of the string typed on the command line. When the row key is binary, its string representation looks like "\x00\x01", which the current implementation incorrectly interprets as a literal 8-character string rather than as two bytes:
https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java
To fix that, we need to change how the value is converted from the command-line input:
Change
scan.setStartRow(Bytes.toBytes(startKey));
to
scan.setStartRow(Bytes.toBytesBinary(startKey));
Apply the same conversion to the end key as well.
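
To illustrate the difference between the two conversions without depending on the HBase jars, here is a minimal standalone sketch. The class BinaryKeyDemo and its methods literalBytes and decodeEscaped are hypothetical names made up for this example. literalBytes mirrors what a plain UTF-8 string-to-bytes conversion does, and decodeEscaped is a simplified stand-in for the \xNN decoding that Bytes.toBytesBinary performs; it is not the HBase implementation and assumes ASCII input.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BinaryKeyDemo {

    // Literal conversion: each character is encoded as-is, so the
    // backslash, 'x', and hex digits all become separate bytes.
    static byte[] literalBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Simplified, hypothetical decoder: turns each \xNN escape into the
    // single byte 0xNN and copies every other (ASCII) character unchanged.
    static byte[] decodeEscaped(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            boolean isEscape = c == '\\'
                    && i + 3 < s.length()
                    && s.charAt(i + 1) == 'x'
                    && Character.digit(s.charAt(i + 2), 16) >= 0
                    && Character.digit(s.charAt(i + 3), 16) >= 0;
            if (isEscape) {
                out.write(Integer.parseInt(s.substring(i + 2, i + 4), 16));
                i += 4;
            } else {
                out.write(c); // only the low 8 bits are kept
                i++;
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // The key as it would arrive from the command line: 8 characters.
        String startKey = "\\x00\\x01";
        // prints [92, 120, 48, 48, 92, 120, 48, 49]
        System.out.println(Arrays.toString(literalBytes(startKey)));
        // prints [0, 1]
        System.out.println(Arrays.toString(decodeEscaped(startKey)));
    }
}
```

With the literal conversion the scan would start at an 8-byte key that does not match any binary row key, which is consistent with the incorrect counts described above.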
The issue was discovered when the utility was used to calculate the row distribution across regions of a table with binary row keys. The hbase:meta table stores the start key of each region in the format shown in the example above.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)