Posted to issues@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2019/02/07 18:19:00 UTC

[jira] [Commented] (HBASE-21852) Cannot get rows from hbase-rest when the rowkey contains any bytes above 0x7f

    [ https://issues.apache.org/jira/browse/HBASE-21852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762942#comment-16762942 ] 

Andrew Purtell commented on HBASE-21852:
----------------------------------------

I believe at some past point an upgrade of the Jetty dependencies changed how things work in this area. Note the exception:
{noformat}
WARN [qtp1473981203-37561] util.URIUtil: /emps/%00%00%00%00%00%00%03%FF/ org.eclipse.jetty.util.Utf8Appendable$NotUtf8Exception: Not valid UTF8! byte Ff in state 0
{noformat}
The REST gateway was developed back when everything was org.mortbay.jetty....

So it sounds like binary key component handling needs to be redone end to end.
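To illustrate why the second key from the report trips a UTF-8 check while the first does not, here is a minimal sketch. It uses the JDK's strict {{CharsetDecoder}}, not Jetty's {{Utf8Appendable}}, so it only approximates what the server does; the class and method names ({{Utf8Check}}, {{isValidUtf8}}, {{bigEndianLong}}) are made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase or Jetty.
final class Utf8Check {

    // Returns true only if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Encodes a long as 8 big-endian bytes (ByteBuffer's default byte order).
    static byte[] bigEndianLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    public static void main(String[] args) {
        // 1024 -> 00 00 00 00 00 00 04 00 : every byte is ASCII, so it passes.
        System.out.println(isValidUtf8(bigEndianLong(1024L)));
        // 1023 -> 00 00 00 00 00 00 03 FF : 0xFF can never appear in UTF-8.
        System.out.println(isValidUtf8(bigEndianLong(1023L)));
    }
}
```

Under these assumptions the first key passes and the second fails, which matches the symptom in the report: a row key is arbitrary binary, and treating it as UTF-8 text anywhere along the path can reject it.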

> Cannot get rows from hbase-rest when the rowkey contains any bytes above 0x7f
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-21852
>                 URL: https://issues.apache.org/jira/browse/HBASE-21852
>             Project: HBase
>          Issue Type: Bug
>          Components: REST
>            Reporter: Travis Hegner
>            Priority: Major
>
> I have a table that stores its records with big-endian long (8 byte integer) rowkeys. I'd like to access this data via the hbase-rest api, but have come across an issue where I can't access every row that exists. For example:
> {{$ curl -v -H "Accept: application/json" "http://hbase-rest:8080/emps/%00%00%00%00%00%00%04%00/"}}
> Returns the expected row without issue. However,
> {{$ curl -v -H "Accept: application/json" "http://hbase-rest:8080/emps/%00%00%00%00%00%00%03%FF/"}}
> Returns a {{404 Not Found}}, though I'm certain the record exists. The broken query also generates a log message on the REST server like this:
> {{WARN [qtp1473981203-37561] util.URIUtil: /emps/%00%00%00%00%00%00%03%FF/ org.eclipse.jetty.util.Utf8Appendable$NotUtf8Exception: Not valid UTF8! byte Ff in state 0}}
> Some troubleshooting and testing suggest that the error happens when any query contains an encoded byte above {{0x7f}}.
> I've [read|https://stackoverflow.com/a/31772127/2639647] that hbase-rest supports hex-escaped representation, like the shell, but that has not worked for me, and when looking through {{RowSpec.java}}, I don't see any indication that the {{parseRowKeys()}} method is attempting to parse the hex-escaped representation. Am I missing something here? Is the REST server supposed to support hex-escaped representation, and I'm not querying correctly?
> I've looked at version 0.98, and the current master branch, and the {{RowSpec.java}} source looks largely the same, so I don't believe this to even be a regression.
> I believe the error to be caused by {{java.net.URLDecoder}}. I can only speculate, but would it be more appropriate to have a generic function that converts {{%XX}} strings directly to bytes, not relying on a specific {{Charset}}? Or perhaps some logic should be put into the parser to truly support the hex-escaped representation, perhaps with a URL parameter to indicate parsing as such, much like the shell requires using double quotes to indicate byte parsing.
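As a rough illustration of the "generic function" idea in the description, the sketch below converts {{%XX}} escapes straight to bytes without going through a {{Charset}}. It is not taken from HBase; {{PercentBytes}} and {{decode}} are hypothetical names, and encoding unescaped characters as UTF-8 is an assumption made for this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase.
final class PercentBytes {

    // Decodes each %XX escape to one raw byte. Characters that are not
    // escaped are written as their UTF-8 bytes (an assumption of this sketch).
    static byte[] decode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            if (s.charAt(i) == '%') {
                if (i + 3 > s.length()) {
                    throw new IllegalArgumentException("Truncated escape at index " + i);
                }
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Bad hex digits at index " + i);
                }
                out.write((hi << 4) | lo); // writes the low 8 bits as one byte
                i += 3;
            } else {
                int cp = s.codePointAt(i);
                byte[] b = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                out.write(b, 0, b.length);
                i += Character.charCount(cp);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // The row key from the failing request, taken from the URL path segment.
        byte[] key = decode("%00%00%00%00%00%00%03%FF");
        // Interpreted as a big-endian long, 0x03FF is 1023.
        System.out.println(ByteBuffer.wrap(key).getLong());
    }
}
```

Because no character decoding happens on escaped bytes, a value such as {{%FF}} survives as the single byte {{0xFF}}. Whether the REST server can use something like this depends on the raw, still-encoded path being available to it, which the Jetty warning above suggests may not be the case by default.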


