Posted to user@hbase.apache.org by mrconk <ev...@gmail.com> on 2012/01/17 17:53:32 UTC

Hbase Images

How can I insert images into HBase? Is it possible to use HBase to serve
images? For example, a user uploads an image to HBase via our PHP
application (REST/Stargate interface). Also, what is the best way to serve
images from HBase via an img src path or URI? Is it possible to publish a URI
when we want to retrieve the image? In what format would you store the image
in HBase (binary, etc.)? I am looking for examples or detailed instructions. I
know this is possible but would like a tutorial or some sort of reference.
Any help would be greatly appreciated.
-- 
View this message in context: http://old.nabble.com/Hbase-Images-tp33155569p33155569.html
Sent from the HBase User mailing list archive at Nabble.com.


Re: Hbase Images

Posted by Jack Levin <ma...@gmail.com>.
Images (JPEGs) are just bytes; there is no difference. You just need to add
the appropriate HTTP headers using nginx or any other proxy of your choice,
and put it on top of the HBase REST API.

-Jack

On Tue, Jan 17, 2012 at 10:11 AM, shashwat shriparv
<dw...@gmail.com> wrote:
> You can not store image as such rather you need to convert the image into
> bytes and then you can store it into hbase. One more thing you can do is
> that put the image information in hbase and put the image on hdfs file
> system.
>
> On Tue, Jan 17, 2012 at 11:37 PM, Stack <st...@duboce.net> wrote:
>
>> On Tue, Jan 17, 2012 at 8:53 AM, mrconk <ev...@gmail.com> wrote:
>> >
>> > How can I insert images into Hbase? Is it possible to use HBase to serve
>> > images? For example: A user uploads an image to Hbase via our PHP
>> > application (REST/Stargate interface). Also, what is the best way to
>> serve
>> > images in Hbase via img src path or URI? Is it possible to publish a URI
>> > when we want to retrieve the image? In what format would you store the
>> image
>> > in Hbase (binary etc). I am looking for examples or detailed
>> instructions. I
>> > know this is possible but would like a tutorial or some sort of
>> reference.
>> > Any help would be greatly appreciated.
>>
>> HBase takes arrays of bytes.  Just stuff the raw image bytes into an hbase
>> cell.
>>
>> Can't you serve out of hbase via hbase REST interface?  (This is what
>> the imageshack/yfrog folks do.  They have architecture that has
>> varnish caching the hbase images and inline w/ the request they'll
>> have imagemagick do size transforms:
>> http://www.slideshare.net/jacque74/hug-hbase-presentation)
>>
>> At SU we serve thumbnails from hbase.
>>
>> St.Ack
>>
>
>
>
> --
> Shashwat Shriparv

Re: HBase 0.92rc3 rest performance

Posted by Andrew Purtell <ap...@apache.org>.
Thanks Ben.

See https://issues.apache.org/jira/browse/HBASE-5228

Best regards,


  - Andy


Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


>________________________________
> From: Ben West <bw...@yahoo.com>
>To: "user@hbase.apache.org" <us...@hbase.apache.org>; Andrew Purtell <ap...@apache.org> 
>Sent: Wednesday, January 18, 2012 7:18 AM
>Subject: Re: HBase 0.92rc3 rest performance
> 
>Thanks for the quick responses!
>
>No higher CPU or memory usage as far as I can tell. No WARNs. We're using the default Jersey (1.4) and Jetty (6.1.26).
>
>Yes, you only need to start up some number of concurrent GETs. Here is my test script, if that helps. It is quite simple (in ksh):
>
>maxId=274894038
>for i in {1..1000}
>do
>                # get a random number
>                hex=`dd if=/dev/urandom bs=1 count=8 2>/dev/null |
>                        od -tx1 | head -1 | cut -d' ' -f2- |
>                        tr -d ' ' | tr '[a-f]' '[A-F]'`
>                # convert from hexadecimal to decimal:
>                dec=`echo "ibase=16; $hex" | bc`
>                # echo >&2 "DEBUG: hex=<$hex>; dec=<$dec>"
>                dec=$(( dec % maxId))
>                #echo "$dec"   
>                start=$(date +%s%N)
>                curl --silent http://server.com:8080/table/$dec > /dev/null
>                elapsed=$(($(date +%s%N) - $start))
>                echo $elapsed
>done
>
>I have another script like
>
>for i in {1..100}
>do
>   ksh myScript >> logfile &
>done
>
>
>I will try jstack to see if I can find anything, thanks for the suggestion.
>
>
>----- Original Message -----
>From: Andrew Purtell <ap...@apache.org>
>To: "user@hbase.apache.org" <us...@hbase.apache.org>
>Cc: 
>Sent: Tuesday, January 17, 2012 5:48 PM
>Subject: Re: HBase 0.92rc3 rest performance
>
>jstacks could help point out if the REST server has some internal lock or monitor contention.
>
>Internal to the REST server is just the HBase Java client. REST uses a HTablePool to manage a small pool of HTable instances that interact with the cluster according to the request at hand. Client changes could show up in REST but it would be odd (and possibly a REST bug) to see them only there.
>
>Other things to check:
>
>  - What version of Jetty?
>
>  - What version of Jersey?
>
>  - Any WARNs in logs from the REST server?
>
>Reproducing this would be as simple as starting up 50 or so concurrent fetches?
>
>Best regards,
>
>
>      - Andy
>
>Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>
>
>----- Original Message -----
>> From: Jean-Daniel Cryans <jd...@apache.org>
>> To: user@hbase.apache.org
>> Cc: 
>> Sent: Tuesday, January 17, 2012 1:45 PM
>> Subject: Re: HBase 0.92rc3 rest performance
>> 
>> This seems to single out the REST server since thrift and native
>> clients stayed the same.
>> 
>> Can you provide us your test so we can do testing on our side too?
>> 
>> Maybe doing a few jstacks on the REST server could point out the
>> obvious bottlenecks.
>> 
>> J-D
>> 
>> On Tue, Jan 17, 2012 at 11:25 AM, Ben West <bw...@yahoo.com> 
>> wrote:
>>>  Hi all
>>> 
>>>  We're trying out .92rc3 instead of .90.4, and for the most part 
>> everything seems fine. But we have a simple test of REST performance which is 
>> basically a large number of cURL jobs getting random rows, and this test is 
>> running *a lot* slower under .92.
>>> 
>>>  When we run just a single client doing REST GETs, the performance is fine. 
>> But once I have dozens or hundreds of clients, performance is ~20x worse than 
>> under .90.4 (response time is 7-800ms instead of 40-50ms).
>>> 
>>>  YCSB has pretty much the same performance under both versions, as do other 
>> internal tools measuring Thrift and native performance, so I don't feel like 
>> this is a problem with HBase core setup (although it could be). I don't see 
>> anything suspicious in any logs, IO and CPU utilization are both low. Has anyone 
>> run into this or have thoughts on how to troubleshoot?
>>> 
>>>  Thanks!
>>>  Ben
>>
>
>
>
>

Re: HBase 0.92rc3 rest performance

Posted by Ben West <bw...@yahoo.com>.
Thanks for the quick responses!

No higher CPU or memory usage as far as I can tell. No WARNs. We're using the default Jersey (1.4) and Jetty (6.1.26).

Yes, you only need to start up some number of concurrent GETs. Here is my test script, if that helps. It is quite simple (in ksh):

maxId=274894038
for i in {1..1000}
do
                # get a random number
                hex=`dd if=/dev/urandom bs=1 count=8 2>/dev/null |
                        od -tx1 | head -1 | cut -d' ' -f2- |
                        tr -d ' ' | tr '[a-f]' '[A-F]'`
                # convert from hexadecimal to decimal:
                dec=`echo "ibase=16; $hex" | bc`
                # echo >&2 "DEBUG: hex=<$hex>; dec=<$dec>"
                dec=$(( dec % maxId))
                #echo "$dec"   
                start=$(date +%s%N)
                curl --silent http://server.com:8080/table/$dec > /dev/null
                elapsed=$(($(date +%s%N) - $start))
                echo $elapsed
done

I have another script like

for i in {1..100}
do
   ksh myScript >> logfile &
done


I will try jstack to see if I can find anything, thanks for the suggestion.
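
A minimal sketch of how the timings collected in logfile could be summarized, assuming each line holds one elapsed time in nanoseconds as printed by the test script above (date +%s%N is a GNU date extension). The function name summarize_latency is hypothetical and not part of the original test.

```shell
#!/bin/sh
# summarize_latency is a hypothetical helper, not part of the original test.
# Input: a file with one elapsed time in nanoseconds per line.
# Output: sample count, mean and maximum latency in milliseconds.
summarize_latency() {   # usage: summarize_latency <logfile>
    awk '
        # Only count lines that are plain integers; skip anything else.
        /^[0-9]+$/ { n++; sum += $1; if ($1 > max) max = $1 }
        END {
            if (n == 0) { print "no samples"; exit }
            printf "samples=%d mean_ms=%.1f max_ms=%.1f\n", n, sum / n / 1000000, max / 1000000
        }
    ' "$1"
}
```

Running summarize_latency logfile once against 0.90.4 and once against 0.92rc3 would give two directly comparable lines.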


----- Original Message -----
From: Andrew Purtell <ap...@apache.org>
To: "user@hbase.apache.org" <us...@hbase.apache.org>
Cc: 
Sent: Tuesday, January 17, 2012 5:48 PM
Subject: Re: HBase 0.92rc3 rest performance

jstacks could help point out if the REST server has some internal lock or monitor contention.

Internal to the REST server is just the HBase Java client. REST uses a HTablePool to manage a small pool of HTable instances that interact with the cluster according to the request at hand. Client changes could show up in REST but it would be odd (and possibly a REST bug) to see them only there.

Other things to check:

  - What version of Jetty?

  - What version of Jersey?

  - Any WARNs in logs from the REST server?

Reproducing this would be as simple as starting up 50 or so concurrent fetches?

Best regards,


      - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


----- Original Message -----
> From: Jean-Daniel Cryans <jd...@apache.org>
> To: user@hbase.apache.org
> Cc: 
> Sent: Tuesday, January 17, 2012 1:45 PM
> Subject: Re: HBase 0.92rc3 rest performance
> 
> This seems to single out the REST server since thrift and native
> clients stayed the same.
> 
> Can you provide us your test so we can do testing on our side too?
> 
> Maybe doing a few jstacks on the REST server could point out the
> obvious bottlenecks.
> 
> J-D
> 
> On Tue, Jan 17, 2012 at 11:25 AM, Ben West <bw...@yahoo.com> 
> wrote:
>>  Hi all
>> 
>>  We're trying out .92rc3 instead of .90.4, and for the most part 
> everything seems fine. But we have a simple test of REST performance which is 
> basically a large number of cURL jobs getting random rows, and this test is 
> running *a lot* slower under .92.
>> 
>>  When we run just a single client doing REST GETs, the performance is fine. 
> But once I have dozens or hundreds of clients, performance is ~20x worse than 
> under .90.4 (response time is 7-800ms instead of 40-50ms).
>> 
>>  YCSB has pretty much the same performance under both versions, as do other 
> internal tools measuring Thrift and native performance, so I don't feel like 
> this is a problem with HBase core setup (although it could be). I don't see 
> anything suspicious in any logs, IO and CPU utilization are both low. Has anyone 
> run into this or have thoughts on how to troubleshoot?
>> 
>>  Thanks!
>>  Ben
>


Re: HBase 0.92rc3 rest performance

Posted by Andrew Purtell <ap...@apache.org>.
A few jstack dumps could help show whether the REST server has some internal lock or monitor contention.

Internal to the REST server is just the HBase Java client. REST uses an HTablePool to manage a small pool of HTable instances that interact with the cluster according to the request at hand. Client changes could show up in REST, but it would be odd (and possibly a REST bug) to see them only there.

Other things to check:

  - What version of Jetty?

  - What version of Jersey?

  - Any WARNs in logs from the REST server?

Reproducing this would be as simple as starting up 50 or so concurrent fetches?

Best regards,


      - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
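
A rough sketch of one way to read a jstack dump for contention, assuming the dump was saved to a file first (for example with jstack <pid> > dump.txt, where <pid> is the REST server's process id). The function name thread_states is hypothetical. It only counts the java.lang.Thread.State lines that jstack prints for each thread; many BLOCKED threads would be a hint to look at which monitor they are waiting on.

```shell
#!/bin/sh
# thread_states is a hypothetical helper: it tallies thread states
# found in a saved jstack dump, most frequent state first.
thread_states() {   # usage: thread_states <jstack-dump-file>
    grep 'java.lang.Thread.State:' "$1" |
        # Keep only the state name, dropping detail such as "(on object monitor)".
        sed 's/.*java\.lang\.Thread\.State: *//; s/ .*//' |
        sort | uniq -c | sort -rn
}
```

Taking several dumps a few seconds apart while the concurrent GETs are running, and comparing the counts, should make a persistent pile-up easier to spot than a single snapshot.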


----- Original Message -----
> From: Jean-Daniel Cryans <jd...@apache.org>
> To: user@hbase.apache.org
> Cc: 
> Sent: Tuesday, January 17, 2012 1:45 PM
> Subject: Re: HBase 0.92rc3 rest performance
> 
> This seems to single out the REST server since thrift and native
> clients stayed the same.
> 
> Can you provide us your test so we can do testing on our side too?
> 
> Maybe doing a few jstacks on the REST server could point out the
> obvious bottlenecks.
> 
> J-D
> 
> On Tue, Jan 17, 2012 at 11:25 AM, Ben West <bw...@yahoo.com> 
> wrote:
>>  Hi all
>> 
>>  We're trying out .92rc3 instead of .90.4, and for the most part 
> everything seems fine. But we have a simple test of REST performance which is 
> basically a large number of cURL jobs getting random rows, and this test is 
> running *a lot* slower under .92.
>> 
>>  When we run just a single client doing REST GETs, the performance is fine. 
> But once I have dozens or hundreds of clients, performance is ~20x worse than 
> under .90.4 (response time is 7-800ms instead of 40-50ms).
>> 
>>  YCSB has pretty much the same performance under both versions, as do other 
> internal tools measuring Thrift and native performance, so I don't feel like 
> this is a problem with HBase core setup (although it could be). I don't see 
> anything suspicious in any logs, IO and CPU utilization are both low. Has anyone 
> run into this or have thoughts on how to troubleshoot?
>> 
>>  Thanks!
>>  Ben
> 

Re: HBase 0.92rc3 rest performance

Posted by Jean-Daniel Cryans <jd...@apache.org>.
This seems to single out the REST server since thrift and native
clients stayed the same.

Can you provide us your test so we can do testing on our side too?

Maybe doing a few jstacks on the REST server could point out the
obvious bottlenecks.

J-D

On Tue, Jan 17, 2012 at 11:25 AM, Ben West <bw...@yahoo.com> wrote:
> Hi all
>
> We're trying out .92rc3 instead of .90.4, and for the most part everything seems fine. But we have a simple test of REST performance which is basically a large number of cURL jobs getting random rows, and this test is running *a lot* slower under .92.
>
> When we run just a single client doing REST GETs, the performance is fine. But once I have dozens or hundreds of clients, performance is ~20x worse than under .90.4 (response time is 7-800ms instead of 40-50ms).
>
> YCSB has pretty much the same performance under both versions, as do other internal tools measuring Thrift and native performance, so I don't feel like this is a problem with HBase core setup (although it could be). I don't see anything suspicious in any logs, IO and CPU utilization are both low. Has anyone run into this or have thoughts on how to troubleshoot?
>
> Thanks!
> Ben

Re: HBase 0.92rc3 rest performance

Posted by lars hofhansl <lh...@yahoo.com>.
Do you see different CPU/Memory utilization?
Where do things differ? Client or server?

Thanks.

-- Lars



----- Original Message -----
From: Ben West <bw...@yahoo.com>
To: "user@hbase.apache.org" <us...@hbase.apache.org>
Cc: 
Sent: Tuesday, January 17, 2012 11:25 AM
Subject: HBase 0.92rc3 rest performance

Hi all

We're trying out .92rc3 instead of .90.4, and for the most part everything seems fine. But we have a simple test of REST performance which is basically a large number of cURL jobs getting random rows, and this test is running *a lot* slower under .92.

When we run just a single client doing REST GETs, the performance is fine. But once I have dozens or hundreds of clients, performance is ~20x worse than under .90.4 (response time is 7-800ms instead of 40-50ms).

YCSB has pretty much the same performance under both versions, as do other internal tools measuring Thrift and native performance, so I don't feel like this is a problem with HBase core setup (although it could be). I don't see anything suspicious in any logs, IO and CPU utilization are both low. Has anyone run into this or have thoughts on how to troubleshoot?

Thanks!
Ben


HBase 0.92rc3 rest performance

Posted by Ben West <bw...@yahoo.com>.
Hi all

We're trying out 0.92rc3 instead of 0.90.4, and for the most part everything seems fine. But we have a simple test of REST performance, which is basically a large number of cURL jobs getting random rows, and this test is running *a lot* slower under 0.92.

When we run just a single client doing REST GETs, the performance is fine. But once I have dozens or hundreds of clients, performance is ~20x worse than under 0.90.4 (response time is 700-800ms instead of 40-50ms).

YCSB has pretty much the same performance under both versions, as do other internal tools measuring Thrift and native performance, so I don't feel like this is a problem with HBase core setup (although it could be). I don't see anything suspicious in any logs, and IO and CPU utilization are both low. Has anyone run into this, or does anyone have thoughts on how to troubleshoot?

Thanks!
Ben

Re: Hbase Images

Posted by shashwat shriparv <dw...@gmail.com>.
You cannot store an image as such; rather, you need to convert the image
into bytes, and then you can store it in HBase. Another option is to put
the image information in HBase and the image itself on the HDFS file
system.

On Tue, Jan 17, 2012 at 11:37 PM, Stack <st...@duboce.net> wrote:

> On Tue, Jan 17, 2012 at 8:53 AM, mrconk <ev...@gmail.com> wrote:
> >
> > How can I insert images into Hbase? Is it possible to use HBase to serve
> > images? For example: A user uploads an image to Hbase via our PHP
> > application (REST/Stargate interface). Also, what is the best way to
> serve
> > images in Hbase via img src path or URI? Is it possible to publish a URI
> > when we want to retrieve the image? In what format would you store the
> image
> > in Hbase (binary etc). I am looking for examples or detailed
> instructions. I
> > know this is possible but would like a tutorial or some sort of
> reference.
> > Any help would be greatly appreciated.
>
> HBase takes arrays of bytes.  Just stuff the raw image bytes into an hbase
> cell.
>
> Can't you serve out of hbase via hbase REST interface?  (This is what
> the imageshack/yfrog folks do.  They have architecture that has
> varnish caching the hbase images and inline w/ the request they'll
> have imagemagick do size transforms:
> http://www.slideshare.net/jacque74/hug-hbase-presentation)
>
> At SU we serve thumbnails from hbase.
>
> St.Ack
>



-- 
Shashwat Shriparv

Re: Hbase Images

Posted by Stack <st...@duboce.net>.
On Tue, Jan 17, 2012 at 8:53 AM, mrconk <ev...@gmail.com> wrote:
>
> How can I insert images into Hbase? Is it possible to use HBase to serve
> images? For example: A user uploads an image to Hbase via our PHP
> application (REST/Stargate interface). Also, what is the best way to serve
> images in Hbase via img src path or URI? Is it possible to publish a URI
> when we want to retrieve the image? In what format would you store the image
> in Hbase (binary etc). I am looking for examples or detailed instructions. I
> know this is possible but would like a tutorial or some sort of reference.
> Any help would be greatly appreciated.

HBase takes arrays of bytes.  Just stuff the raw image bytes into an hbase cell.

Can't you serve out of hbase via hbase REST interface?  (This is what
the imageshack/yfrog folks do.  They have architecture that has
varnish caching the hbase images and inline w/ the request they'll
have imagemagick do size transforms:
http://www.slideshare.net/jacque74/hug-hbase-presentation)

At SU we serve thumbnails from hbase.

St.Ack
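
A minimal sketch of what "stuff the raw image bytes into a cell" could look like over the REST interface with curl. The base URL, the table name images, and the column img:data are assumptions made up for illustration; the sketch relies on the REST gateway accepting and returning single cell values as application/octet-stream, which is worth verifying against the REST documentation for the HBase version in use.

```shell
#!/bin/sh
# Assumed values for illustration only: REST_BASE, the table "images",
# and the column "img:data" do not come from the thread.
REST_BASE="${REST_BASE:-http://localhost:8080}"

# Build the REST URL of one cell: <base>/<table>/<row>/<family:qualifier>
cell_url() {   # usage: cell_url <table> <row> <column>
    printf '%s/%s/%s/%s\n' "$REST_BASE" "$1" "$2" "$3"
}

# Store a file's raw bytes in a cell; --data-binary sends them unmodified.
put_image() {   # usage: put_image <row> <file>
    curl --silent --show-error -X PUT \
         -H 'Content-Type: application/octet-stream' \
         --data-binary "@$2" \
         "$(cell_url images "$1" img:data)"
}

# Fetch the raw bytes of the cell back into a local file.
get_image() {   # usage: get_image <row> <outfile>
    curl --silent --show-error \
         -H 'Accept: application/octet-stream' \
         -o "$2" \
         "$(cell_url images "$1" img:data)"
}
```

With this layout, put_image cat.jpg ./cat.jpg would upload a file and get_image cat.jpg ./copy.jpg would read it back. A proxy in front of the REST server would still need to set an image Content-Type (such as image/jpeg) before a browser can use the cell URL in an img src.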

Re: Hbase Images

Posted by Doug Meil <do...@explorysmedical.com>.
Hi there-

You probably want to start with this...

http://hbase.apache.org/book.html#supported.datatypes




On 1/17/12 11:53 AM, "mrconk" <ev...@gmail.com> wrote:

>
>How can I insert images into Hbase? Is it possible to use HBase to serve
>images? For example: A user uploads an image to Hbase via our PHP
>application (REST/Stargate interface). Also, what is the best way to serve
>images in Hbase via img src path or URI? Is it possible to publish a URI
>when we want to retrieve the image? In what format would you store the
>image
>in Hbase (binary etc). I am looking for examples or detailed
>instructions. I
>know this is possible but would like a tutorial or some sort of reference.
>Any help would be greatly appreciated.
>
>