Posted to user@pig.apache.org by Gayatri Rao <rg...@gmail.com> on 2011/08/15 19:57:54 UTC

Help regarding HBaseStorage

Hi All,

I am trying to join some HBase tables in Pig, and I am using
HBaseStorage to load the data from HBase into Pig.

I was able to load my data using HBaseStorage, but I have one problem. My
HBase tables are large and contain historical data, so I want to load only
the data from the last hour or day. Is there a way to do this? I read up
on HBaseStorage but couldn't find a way to achieve it.

Kindly suggest if this can be done.

Thanks
Gayatri

Re: Help regarding HBaseStorage

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
That would have to be *pre*pending the timestamp, which causes problems
(hotspots) on load.

It might be better to use timestamps (support work is underway) or to
design your schema such that you have separate columns for separate
epochs, and scan ranges of columns.
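
To make the hotspot concern concrete, here is a rough sketch in Python (not Pig or HBase code). It models a table as a sorted key space cut into regions at fixed split points and checks which regions a batch of new writes lands in. The split points, ids, and key layouts are all made up for illustration, and real region assignment is more involved than this.

```python
import bisect

def region_for(key, splits):
    """Index of the region holding `key`, given sorted region split points."""
    return bisect.bisect_right(splits, key)

# Hypothetical ids written during the same hour (2011-08-15 19:00).
ids = ["a17", "k02", "z99", "m45"]

# Layout 1: timestamp prepended, "<yyyyMMddHH>_<id>".
# Assumed split points for a table that has been filling up over time.
time_splits = ["2011081400", "2011081500", "2011081512"]
prefixed_regions = {region_for(f"2011081519_{i}", time_splits) for i in ids}

# Layout 2: id first, "<id>_<yyyyMMddHH>", with assumed splits on the id.
id_splits = ["g", "n", "t"]
id_first_regions = {region_for(f"{i}_2011081519", id_splits) for i in ids}

print(prefixed_regions)   # every new key sorts after the last split
print(id_first_regions)   # new keys are spread over several regions
```

With the timestamp first, every current write falls in the last region, which is the hotspot. With the id first, writes spread out, but a row-key range no longer selects a time window.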

D

On Mon, Aug 15, 2011 at 11:26 AM, Gayatri Rao <rg...@gmail.com> wrote:
> Thanks Bill.
>
> My rowkeys are currently IDs (alphanumeric).
> By rowkeys being time-based, did you mean appending the timestamp to the
> keys?
>
> Thanks for pointing out the JIRA issue; I will check that as well.
>
> -Gayatri

Re: Help regarding HBaseStorage

Posted by Gayatri Rao <rg...@gmail.com>.
Thanks Bill.

My rowkeys are currently IDs (alphanumeric).
By rowkeys being time-based, did you mean appending the timestamp to the
keys?

Thanks for pointing out the JIRA issue; I will check that as well.

-Gayatri
On Mon, Aug 15, 2011 at 11:39 PM, Bill Graham <bi...@gmail.com> wrote:

> If your rowKeys are time-based you can filter on them in the constructor
> with the -lt and -gt params. If instead you want to filter by cell
> timestamp, PIG-2114 is currently underway to support that, but it's not
> there yet.

Re: Help regarding HBaseStorage

Posted by Bill Graham <bi...@gmail.com>.
If your rowKeys are time-based you can filter on them in the constructor
with the -lt and -gt params. If instead you want to filter by cell
timestamp, PIG-2114 is currently underway to support that, but it's not
there yet.
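
As a rough illustration of the row-key approach, the sketch below (Python, used here only as a helper to generate the Pig statement) computes the -gt and -lt bounds for the last hour and places them in the HBaseStorage option string. It assumes row keys start with a fixed-width UTC timestamp such as "201108151957_<id>". The key layout, the table name 'mytable', and the column 'cf:value' are hypothetical, so adjust them to your schema.

```python
from datetime import datetime, timedelta, timezone

# Assumed (hypothetical) key layout: "<yyyyMMddHHmm>_<id>", UTC, fixed width,
# so that lexicographic key order matches time order.
KEY_TIME_FORMAT = "%Y%m%d%H%M"

def rowkey_bounds(now, window):
    """Return (lower, upper) key prefixes for the window ending at `now`."""
    start = now - window
    return start.strftime(KEY_TIME_FORMAT), now.strftime(KEY_TIME_FORMAT)

def hbase_storage_options(now, window):
    """Build the option string for HBaseStorage's second constructor argument."""
    lower, upper = rowkey_bounds(now, window)
    return f"-gt {lower} -lt {upper}"

now = datetime(2011, 8, 15, 19, 57, tzinfo=timezone.utc)
options = hbase_storage_options(now, timedelta(hours=1))

# 'mytable' and 'cf:value' are placeholders, not names from this thread.
pig_statement = (
    "recent = LOAD 'hbase://mytable' USING "
    "org.apache.pig.backend.hadoop.hbase.HBaseStorage("
    f"'cf:value', '{options}');"
)
print(pig_statement)
```

Because a key like "201108151857_abc" sorts after the bare prefix "201108151857", the -gt bound keeps rows from that minute onward, while -lt excludes rows from the final minute and later.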


On Mon, Aug 15, 2011 at 10:57 AM, Gayatri Rao <rg...@gmail.com> wrote:

> Hi All,
>
> I am trying to join some HBase tables in Pig, and I am using
> HBaseStorage to load the data from HBase into Pig.
>
> I was able to load my data using HBaseStorage, but I have one problem. My
> HBase tables are large and contain historical data, so I want to load only
> the data from the last hour or day. Is there a way to do this? I read up
> on HBaseStorage but couldn't find a way to achieve it.
>
> Kindly suggest if this can be done.
>
> Thanks
> Gayatri
>