Posted to user@hbase.apache.org by Ben West <bw...@yahoo.com> on 2011/10/02 22:15:39 UTC

Spaces disappear in HBase?

Hey all,

I'm running the standalone HBase server (0.90.4) and REST client (version 0.0.2). When I POST data and then GET it back, the data is changed; in particular, the spaces seem to be removed. Does anyone know what's going on?

Here is a python script replicating my problem; I have a table named 'eipi' with a column family 'eipi':

#!/usr/bin/python

import sys
import urllib2
import simplejson


def getData(name, val):
    cell = { 'Row':
        {'@key' : 'foo',
         'Cell': [{'@column': 'eipi:%s' % name,
                   '$': val }]
        }
    }
    return simplejson.dumps(cell)

def sendData(key, colName, colVal):
    opener = urllib2.build_opener()
    url = 'http://localhost:8081/eipi/%s/eipi:%s' % (key, colName)
    print colVal
    req = urllib2.Request(url,
                          headers = { 'Content-Type': 'application/json' },
                          data = getData(colName, colVal))
    f = opener.open(req)
    f.read()

def printData(key):
    opener = urllib2.build_opener()
    url = 'http://localhost:8081/eipi/%s' % key
    req = urllib2.Request(url,
                          headers = { 'Accept': 'application/json' })
    f = opener.open(req)
    parsed = simplejson.load(f)
    print(parsed['Row'][0]['Cell'][0]['$'])

sendData('test','eipi:test','some stuff')
printData('test')


result:
> python getHBase.py 
some stuff
somestuf

(The space was removed, as well as a trailing 'f'...)

Thanks!
-Ben


Re: Spaces disappear in HBase?

Posted by Andrew Purtell <ap...@apache.org>.
Keys and values must be base64 encoded when using the JSON or XML representations. This is documented in the XML schema at least. I thought it was documented elsewhere too, but if not, I agree there should be a clear discussion of it.

In the body of a request using the XML or JSON representation, the row, column, and value elements/attributes must be base64 encoded, and they will be base64 encoded in a response.

If using binary representation (content-type or accept of application/octet-stream) then the raw bytes should be passed in and will be passed out.
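
As a rough illustration of the binary case, the Python 3 sketch below builds (but does not send) a request that carries the raw value bytes. The host, port, table, and column names are borrowed from the script earlier in the thread; the helper name build_binary_put and the choice of the PUT method are assumptions, not something confirmed in this thread.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = 'http://localhost:8081'  # assumed: same REST endpoint as the original script


def build_binary_put(table, row, column, value):
    """Build (without sending) a request that stores raw bytes in one cell."""
    # Row and column sit in the URI, so they are URL-encoded, not base64 encoded.
    url = '%s/%s/%s/%s' % (
        BASE_URL,
        quote(table, safe=''),
        quote(row, safe=''),
        quote(column, safe=':'),  # keep the family:qualifier separator readable
    )
    return urllib.request.Request(
        url,
        data=value,  # raw bytes, passed through as-is
        headers={'Content-Type': 'application/octet-stream'},
        method='PUT',  # assumption: PUT is used to store a single cell
    )


req = build_binary_put('eipi', 'test', 'eipi:test', b'some stuff')
print(req.get_method(), req.full_url)
print(req.data)
```

Passing the request to urllib.request.urlopen would actually send it; that step is left out so the sketch runs without a server.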

If using protobufs representation (content-type or accept of application/x-protobuf), then the row, column, and value should be passed in to the builder as-is. 


In the request URI, the row or column should be URL-encoded instead, because this is HTTP...
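
To make the JSON case concrete, here is a minimal Python 3 sketch using only the standard library. It mirrors the body shape of the script from the original post and the response shape that script indexes into; both shapes, and the helper names b64, unb64, build_cell_body, cell_url, and first_cell_value, are assumptions for illustration and not part of any HBase client.

```python
import base64
import json
from urllib.parse import quote


def b64(text):
    """Base64-encode a string and return the result as ASCII text."""
    return base64.b64encode(text.encode('utf-8')).decode('ascii')


def unb64(text):
    """Reverse of b64(): decode base64 text back into a string."""
    return base64.b64decode(text).decode('utf-8')


def build_cell_body(row, column, value):
    """JSON request body with row key, column, and value base64 encoded."""
    body = {'Row': {'@key': b64(row),
                    'Cell': [{'@column': b64(column), '$': b64(value)}]}}
    return json.dumps(body)


def cell_url(base, table, row, column):
    """Request URI: row and column are URL-encoded here, not base64 encoded."""
    return '%s/%s/%s/%s' % (base, quote(table, safe=''),
                            quote(row, safe=''), quote(column, safe=':'))


def first_cell_value(response_text):
    """Decode the first cell value of a JSON response (shape assumed)."""
    parsed = json.loads(response_text)
    return unb64(parsed['Row'][0]['Cell'][0]['$'])


print(build_cell_body('test', 'eipi:test', 'some stuff'))
print(cell_url('http://localhost:8081', 'eipi', 'my row', 'eipi:test'))

# Stand-in for a server response, so the sketch runs without HBase.
fake_response = json.dumps({'Row': [{'Cell': [{'$': b64('some stuff')}]}]})
print(first_cell_value(fake_response))
```

The point of the split is that the same row name is encoded two different ways depending on where it travels: base64 inside the body, percent-encoding inside the URI.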


Best regards,


       - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


>________________________________
>From: Ben West <bw...@yahoo.com>
>To: "user@hbase.apache.org" <us...@hbase.apache.org>; Andrew Purtell <ap...@apache.org>
>Sent: Monday, October 10, 2011 6:04 AM
>Subject: Re: Spaces disappear in HBase?
>
>Thanks Andy!
>
>I do not see this in the wiki anywhere (http://wiki.apache.org/hadoop/Hbase/Stargate) - could we put it in? I'm not certain I know what exactly needs to be encoded: just values when you're inserting? How about the row names when you're scanning? (I've been having trouble with this.)
>
>-Ben

Re: Spaces disappear in HBase?

Posted by Ben West <bw...@yahoo.com>.
Thanks Andy!

I do not see this in the wiki anywhere (http://wiki.apache.org/hadoop/Hbase/Stargate) - could we put it in? I'm not certain I know what exactly needs to be encoded: just values when you're inserting? How about the row names when you're scanning? (I've been having trouble with this.)

-Ben


----- Original Message -----
From: Andrew Purtell <ap...@apache.org>
To: "user@hbase.apache.org" <us...@hbase.apache.org>; Ben West <bw...@yahoo.com>
Cc: 
Sent: Monday, October 3, 2011 6:50 PM
Subject: Re: Spaces disappear in HBase?

Keys and values need to be base64 encoded in all non-binary representations, XML and JSON currently.
 

Re: Spaces disappear in HBase?

Posted by Andrew Purtell <ap...@apache.org>.
Keys and values need to be base64 encoded in all non-binary representations, XML and JSON currently.
 
Best regards,


   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)

