Posted to user@hbase.apache.org by "leiwangouc@gmail.com" <le...@gmail.com> on 2014/08/11 13:14:20 UTC
How to get specific rowkey from hbase
Hi,
I have an input of about 10M records; each record is a rowkey in HBase.
How can I fetch the corresponding rows from HBase with a MapReduce job?
Thanks,
Lei
leiwangouc@gmail.com
Re: Re: How to get specific rowkey from hbase
Posted by Esteban Gutierrez <es...@cloudera.com>.
You can do that via HTable.get(List<Get>):
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#get(java.util.List).
Basically, you point the HTable (via setTable in TableInputFormat) at the
table that holds the new users for the time range you are looking at, and use
the result to build the list that will be fed into HTable.get(List<Get>).
Instead of reading any data from the input split, you will be fetching data
via this list of new users. The same approach should be needed for updating
the rows, via HTable.put(List<Put>).
regards,
esteban.
--
Cloudera, Inc.
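The batching idea in the reply above can be sketched in plain Java. This is only an illustration under stated assumptions: the class BatchedLookup, its methods, and the in-memory Map that stands in for the HBase table are all hypothetical. In a real job, each batch would presumably be turned into a list of Get objects and sent with the HTable.get(List<Get>) call mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: groups row keys into batches so that each batch
// could be fetched in a single multi-get round trip.
final class BatchedLookup {

    /** Splits rowKeys into consecutive batches of at most batchSize keys. */
    static List<List<String>> partition(List<String> rowKeys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < rowKeys.size(); i += batchSize) {
            int end = Math.min(i + batchSize, rowKeys.size());
            batches.add(new ArrayList<>(rowKeys.subList(i, end)));
        }
        return batches;
    }

    /**
     * Stand-in for one multi-get call. The Map plays the role of the table
     * (an assumption for this sketch); keys missing from it are skipped.
     */
    static Map<String, String> multiGet(Map<String, String> store, List<String> batch) {
        Map<String, String> found = new HashMap<>();
        for (String key : batch) {
            String value = store.get(key);
            if (value != null) {
                found.put(key, value);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put("u1", "a");
        store.put("u3", "c");
        List<String> keys = Arrays.asList("u1", "u2", "u3", "u4", "u5");
        // Each mapper would run a loop like this over the keys in its input split.
        for (List<String> batch : partition(keys, 2)) {
            System.out.println(batch + " -> " + multiGet(store, batch));
        }
    }
}
```

The batch size is a tuning knob: larger batches mean fewer round trips but more memory held per call.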
Re: Re: How to get specific rowkey from hbase
Posted by "leiwangouc@gmail.com" <le...@gmail.com>.
Actually, I mean how to do random gets in MapReduce, not a scan.
Let me give a detailed description of my requirement:
There is an HBase table that contains all the users (about 2G) we have collected, and the rowkey is the user id.
Every hour, some user info arrives (5M~10M).
For every incoming user, get (HBase Get) the info from HBase, merge it with the current hour's info, and put it back to HBase. (If the user does not exist in HBase, just use this hour's info.)
Right now the get step runs on one machine; I want to distribute it with MapReduce.
leiwangouc@gmail.com
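The get, merge, put cycle described above can be sketched as follows. This is a minimal illustration, not the poster's actual code: the class HourlyMerge is hypothetical, nested Maps stand in for the HBase table, and the merge rule (this hour's columns overwrite stored columns with the same name) is an assumption, since the thread does not say how the merge works.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the hourly update: user id -> (column -> value).
final class HourlyMerge {

    /**
     * Merges one user's stored columns with this hour's columns.
     * Assumed rule: values from this hour win on conflict. If the user is
     * not stored yet (stored == null), only this hour's info is used.
     */
    static Map<String, String> merge(Map<String, String> stored, Map<String, String> hourly) {
        if (stored == null) {
            return new HashMap<>(hourly);
        }
        Map<String, String> merged = new HashMap<>(stored);
        merged.putAll(hourly);
        return merged;
    }

    /** Applies one hour of incoming user info to the stand-in table. */
    static void apply(Map<String, Map<String, String>> table,
                      Map<String, Map<String, String>> hourlyBatch) {
        for (Map.Entry<String, Map<String, String>> e : hourlyBatch.entrySet()) {
            Map<String, String> stored = table.get(e.getKey());   // the "get" step
            table.put(e.getKey(), merge(stored, e.getValue()));   // the "put" step
        }
    }
}
```

Because each user is handled independently of the others, the incoming records can be split across mappers by user id, which is what makes the step straightforward to distribute.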
Re: How to get specific rowkey from hbase
Posted by Shahab Yunus <sh...@gmail.com>.
You can use the utility classes that are already provided. Note that this
won't be very fast, and you might want to try bulk import as well (especially
if it is a one-time or rare occurrence). It depends on your use case. Check
out the documentation below:
For the HBase MapReduce utilities:
http://hbase.apache.org/book/mapreduce.example.html
http://bigdataprocessing.wordpress.com/2012/07/27/hadoop-hbase-mapreduce-examples/
For HBase bulk import:
http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/
Regards,
Shahab