Posted to user@hbase.apache.org by Deepa Jayaveer <de...@tcs.com> on 2016/02/05 08:43:14 UTC

HBase --aggregation using MR

Hi,
   I am using HBase 0.98, and my use case is to aggregate data fetched
from an HBase table. I need to fetch only the records that match a
filter determined by the business scenario.

Say I am storing
  store_number - product_number - week - sales information
in an HBase table. The row key is storeNumber-productNumber-weekId,
for example:
  store1 - product1 - week1 - $100
  ...
  store1000 - product100000 - week52 - $200

Use case: I need to aggregate the store sales data for selected stores,
selected products, and selected weeks.

I thought of writing an HBase MapReduce job, but the challenge is how to
filter on the selected stores/products/weeks. The columns to filter on
are part of the row key. Is RowFilter a good option in terms of
performance?

Or shall we go with the normal HBase Java API to fetch the records?

Can you please help me resolve this?
=====-----=====-----=====
Notice: The information contained in this e-mail
message and/or attachments to it may contain 
confidential or privileged information. If you are 
not the intended recipient, any dissemination, use, 
review, distribution, printing or copying of the 
information contained in this e-mail message 
and/or attachments to it are strictly prohibited. If 
you have received this communication in error, 
please notify us by reply e-mail or telephone and 
immediately and permanently delete the message 
and any attachments. Thank you
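
One point about the row key layout above: HBase orders rows by comparing
keys byte by byte, so with variable-width components "store10" sorts
before "store2", and the product and week fields start at a different
offset in each key. A common workaround is to zero-pad each component to
a fixed width. The following is a minimal sketch of that idea in plain
Java. The RowKeys class, the encode method, and the widths (4, 6, and 2
digits, guessed from the example values store1000, product100000, and
week52) are assumptions for illustration; they are not part of HBase or
of the original schema.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of HBase: builds fixed-width row keys.
final class RowKeys {
    private RowKeys() {}

    // Widths are assumptions taken from the example values in the post:
    // up to 1000 stores, 100000 products, and 52 weeks.
    static String encode(int store, int product, int week) {
        return String.format(Locale.ROOT, "%04d-%06d-%02d", store, product, week);
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        keys.add(encode(10, 1, 1));
        keys.add(encode(2, 1, 1));
        keys.add(encode(1, 100000, 52));
        // For equal-width ASCII digits, lexicographic order equals numeric order.
        Collections.sort(keys);
        for (String key : keys) {
            System.out.println(key);
        }
    }
}
```

With fixed widths, the store, product, and week always occupy the same
byte positions in the key, which is what position-based filters rely on.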



Re: HBase --aggregation using MR

Posted by Ted Yu <yu...@gmail.com>.
FuzzyRowFilter is performant.

I am not familiar with using Pig.

Going with MapReduce should be fine.
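
To make the FuzzyRowFilter suggestion concrete: the filter is given a
row key pattern plus a mask of the same length, where a mask byte of 0
means the row key byte at that position must equal the pattern byte and
1 means any byte is accepted. The sketch below reproduces only that
matching rule in plain Java so that it runs without an HBase cluster.
FuzzyMatchSketch and its methods are hypothetical names, and the key
format 0001-000123-01 (zero-padded store, product, week) is an assumed
layout, not the poster's actual schema.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: mimics the matching rule, not the HBase implementation.
final class FuzzyMatchSketch {
    private FuzzyMatchSketch() {}

    // mask[i] == 0: rowKey[i] must equal pattern[i]; mask[i] == 1: any byte is accepted.
    static boolean matches(byte[] rowKey, byte[] pattern, byte[] mask) {
        if (rowKey.length < pattern.length || pattern.length != mask.length) {
            return false;
        }
        for (int i = 0; i < pattern.length; i++) {
            if (mask[i] == 0 && rowKey[i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }

    // Builds a mask from a template in which '?' marks a wildcard position.
    static byte[] maskFor(String template) {
        byte[] mask = new byte[template.length()];
        for (int i = 0; i < mask.length; i++) {
            mask[i] = (byte) (template.charAt(i) == '?' ? 1 : 0);
        }
        return mask;
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String template = "0001-??????-01"; // store 1, any product, week 1
        byte[] pattern = ascii(template);
        byte[] mask = maskFor(template);
        System.out.println(matches(ascii("0001-000123-01"), pattern, mask));
        System.out.println(matches(ascii("0002-000123-01"), pattern, mask));
    }
}
```

In HBase itself the pattern and mask would be handed to FuzzyRowFilter
as a pair, and the filter set on the Scan that feeds the MapReduce job.
Selecting several stores or weeks would then need one pattern/mask pair
per combination.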

On Sun, Feb 7, 2016 at 11:26 PM, Deepa Jayaveer <de...@tcs.com>
wrote:

> Thanks, Ted, but how does FuzzyRowFilter perform?
> Is it fine to go with Java MapReduce or Pig to get the desired output?

Re: HBase --aggregation using MR

Posted by Deepa Jayaveer <de...@tcs.com>.
Thanks, Ted, but how does FuzzyRowFilter perform?
Is it fine to go with Java MapReduce or Pig to get the desired output?





From:   Ted Yu <yu...@gmail.com>
To:     "user@hbase.apache.org" <us...@hbase.apache.org>
Date:   06-02-2016 00:55
Subject:        Re: HBase --aggregation using MR



Here is the javadoc for RowFilter:

 * This filter is used to filter based on the key. It takes an operator
 * (equal, greater, not equal, etc) and a byte [] comparator for the row,
 * and column qualifier portions of a key.

I guess you want flexibility in comparing part(s) of the row key.

Please take a look at FuzzyRowFilter and its related unit test to see
whether it is a better fit.



Re: HBase --aggregation using MR

Posted by Ted Yu <yu...@gmail.com>.
Here is the javadoc for RowFilter:

 * This filter is used to filter based on the key. It takes an operator
 * (equal, greater, not equal, etc) and a byte [] comparator for the row,
 * and column qualifier portions of a key.

I guess you want flexibility in comparing part(s) of the row key.

Please take a look at FuzzyRowFilter and its related unit test to see
whether it is a better fit.
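
As a rough illustration of comparing parts of the row key and then
aggregating, the sketch below splits each key into store, product, and
week, keeps only the rows whose parts are in the selected sets, and sums
sales per store. It is plain Java over an in-memory map, standing in for
what a mapper (filter, then emit store and sales) and a reducer (sum)
would do. The class name, the zero-padded key format, and whole-dollar
amounts stored as long are all assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical stand-in for the filter-and-sum logic; uses no HBase classes.
final class SalesAggregationSketch {
    private SalesAggregationSketch() {}

    // rows maps a row key such as "0001-000001-01" to a sales amount.
    static Map<Integer, Long> sumByStore(Map<String, Long> rows,
                                         Set<Integer> stores,
                                         Set<Integer> products,
                                         Set<Integer> weeks) {
        Map<Integer, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> row : rows.entrySet()) {
            String[] parts = row.getKey().split("-");
            int store = Integer.parseInt(parts[0]);
            int product = Integer.parseInt(parts[1]);
            int week = Integer.parseInt(parts[2]);
            // Keep the row only if every key component was selected.
            if (stores.contains(store) && products.contains(product) && weeks.contains(week)) {
                totals.merge(store, row.getValue(), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Long> rows = new LinkedHashMap<>();
        rows.put("0001-000001-01", 100L);
        rows.put("0001-000002-01", 50L);
        rows.put("0002-000001-01", 70L);
        rows.put("1000-100000-52", 200L);

        Set<Integer> stores = new HashSet<>(Arrays.asList(1, 1000));
        Set<Integer> products = new HashSet<>(Arrays.asList(1, 100000));
        Set<Integer> weeks = new HashSet<>(Arrays.asList(1, 52));
        System.out.println(sumByStore(rows, stores, products, weeks));
    }
}
```

This version inspects every row, which is also how a RowFilter-style
comparison behaves over the scanned range. It shows the result to
compute, not the fastest way to compute it.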
