Posted to user@hbase.apache.org by Ninad Raut <hb...@gmail.com> on 2009/04/21 05:17:46 UTC

Using Yahoo Pig to do ad hoc querying on HBase

Hi,

As there is no easy way to query HBase, can Pig be used to query HBase tables?
If so, can anyone give me an example of how to use it?


Regards,
Ninad.

Re: Using Yahoo Pig to do ad hoc querying on HBase

Posted by stack <st...@duboce.net>.
On using Pig with HBase, this is the issue:
https://issues.apache.org/jira/browse/PIG-6.  It was committed a while
back.  Have you tried referencing HBase in your Pig Latin?
St.Ack

On Wed, Apr 22, 2009 at 9:34 AM, stack <st...@duboce.net> wrote:

> What do you need, Ninad?
>
> There are general notes on running HBase MR jobs in the mapred package:
> http://hadoop.apache.org/hbase/docs/r0.19.1/api/org/apache/hadoop/hbase/mapred/package-summary.html.
> It discusses the CLASSPATH on your Hadoop cluster.
>
> Thereafter, once the setup suggested above is done, running the
> BuildTableIndex job looks like this:
>
> ./bin/hadoop org.apache.hadoop.hbase.mapred.BuildTableIndex
>
> It will probably barf and tell you the command-line options you are missing --
> probably the target directory for the Lucene indices, the table to index, and
> the columns in the table to consider.
>
> No harm in studying the source code in the absence of better documentation.
>
> IIRC, this job will build as many Lucene indices as there are reducers.
>
> Start with a small table and a single reducer.  See how it goes.
>
> You'll have to copy the index out of HDFS to use it, I'd imagine.
>
> St.Ack

Re: Using Yahoo Pig to do ad hoc querying on HBase

Posted by stack <st...@duboce.net>.
What do you need, Ninad?

There are general notes on running HBase MR jobs in the mapred package:
http://hadoop.apache.org/hbase/docs/r0.19.1/api/org/apache/hadoop/hbase/mapred/package-summary.html.
It discusses the CLASSPATH on your Hadoop cluster.

Thereafter, once the setup suggested above is done, running the
BuildTableIndex job looks like this:

./bin/hadoop org.apache.hadoop.hbase.mapred.BuildTableIndex

It will probably barf and tell you the command-line options you are missing --
probably the target directory for the Lucene indices, the table to index, and
the columns in the table to consider.

No harm in studying the source code in the absence of better documentation.

IIRC, this job will build as many Lucene indices as there are reducers.

Start with a small table and a single reducer.  See how it goes.

You'll have to copy the index out of HDFS to use it, I'd imagine.

St.Ack
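
The remark about one index per reducer can be pictured with a small
sketch. It is a plain Python model, not the BuildTableIndex code: the
CRC32-based partition function is only a stand-in for whatever
partitioner the real job uses, and the names partition and shard_keys
are hypothetical. It shows why a single reducer yields a single index
to copy out, while several reducers spread the rows over several
index shards.

```python
import zlib
from typing import Dict, Iterable, List


def partition(row_key: str, num_reducers: int) -> int:
    """Pick a reducer for a row key (CRC32 is an assumed stand-in hash)."""
    return zlib.crc32(row_key.encode("utf-8")) % num_reducers


def shard_keys(row_keys: Iterable[str], num_reducers: int) -> Dict[int, List[str]]:
    """Group row keys by the reducer, and hence the index shard, they land in."""
    shards: Dict[int, List[str]] = {r: [] for r in range(num_reducers)}
    for key in row_keys:
        shards[partition(key, num_reducers)].append(key)
    return shards


if __name__ == "__main__":
    keys = [f"row-{i:03d}" for i in range(10)]
    # One reducer: every row ends up in the same (single) shard.
    print(len(shard_keys(keys, 1)))
    # Three reducers: the same rows are split across three shards.
    for reducer, members in shard_keys(keys, 3).items():
        print(reducer, members)
```

With several shards, a search has to consult every shard or merge them
first, which is one reason to begin with a single reducer.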

On Tue, Apr 21, 2009 at 11:12 PM, Ninad Raut <hb...@gmail.com>wrote:

> Thanks JD.
>
> Can you help me with this, with a small code snippet as an example?
> There is not much discussion on this, so I will require your help
> with it.

Re: Using Yahoo Pig to do ad hoc querying on HBase

Posted by Ninad Raut <hb...@gmail.com>.
Thanks JD.

Can you help me with this, with a small code snippet as an example?
There is not much discussion on this, so I will require your help
with it.

On 4/21/09, Jean-Daniel Cryans <jd...@apache.org> wrote:
> Ninad,
>
> There is no index apart from the primary key so you are right, out of
> the box HBase doesn't directly provide such a facility. You can either
> index your table by hand, preferably using a MapReduce job, or use the
> BuildTableIndex provided in mapred. See
> http://hadoop.apache.org/hbase/docs/r0.19.1/api/org/apache/hadoop/hbase/mapred/BuildTableIndex.html
>
> J-D

Re: Using Yahoo Pig to do ad hoc querying on HBase

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Ninad,

There is no index apart from the primary key so you are right, out of
the box HBase doesn't directly provide such a facility. You can either
index your table by hand, preferably using a MapReduce job, or use the
BuildTableIndex provided in mapred. See
http://hadoop.apache.org/hbase/docs/r0.19.1/api/org/apache/hadoop/hbase/mapred/BuildTableIndex.html

J-D
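
As an illustration of the "index by hand" option, here is a minimal
sketch in plain Python. It is an in-memory model, not the HBase client
API and not what BuildTableIndex does (that job builds Lucene indices):
the table contents, the Status:value column name, and the build_index
helper are all hypothetical. The idea is to invert the table once, so
that a value maps to the row keys holding it, and then answer lookups
from the index instead of rereading the table.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical stand-in for an HBase table: row key -> {"family:qualifier": value}
Row = Dict[str, str]
TABLE: Dict[str, Row] = {
    "doc-001": {"Status:value": "UNANALYSED"},
    "doc-002": {"Status:value": "ANALYSED"},
    "doc-003": {"Status:value": "UNANALYSED"},
}


def build_index(table: Dict[str, Row], column: str) -> Dict[str, List[str]]:
    """Invert one column: map each distinct value to the row keys that hold it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for row_key in sorted(table):
        value = table[row_key].get(column)
        if value is not None:  # rows without the column are simply not indexed
            index[value].append(row_key)
    return dict(index)


if __name__ == "__main__":
    index = build_index(TABLE, "Status:value")
    # A lookup now touches only the index entry, not every row.
    print(index.get("UNANALYSED", []))
    print(len(index.get("UNANALYSED", [])))
```

The index has to be rebuilt or updated whenever the table changes,
which is the usual cost of maintaining one by hand.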

On Tue, Apr 21, 2009 at 5:04 AM, Ninad Raut <hb...@gmail.com> wrote:
> Suppose I have a column family called Status: which holds a status
> value. If I want to find how many rows have
> Status=UNANALYSED, I cannot do so using the hbase shell. Is there any
> other way?

Re: Using Yahoo Pig to do ad hoc querying on HBase

Posted by Ninad Raut <hb...@gmail.com>.
Suppose I have a column family called Status: which holds a status
value. If I want to find how many rows have
Status=UNANALYSED, I cannot do so using the hbase shell. Is there any
other way?
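
For reference, without any index a count like this amounts to reading
every row and comparing the value on the client side. The following is
a minimal sketch of that idea as an in-memory Python model, not the
HBase client API; the table contents, the Status:value column name,
and the count_status helper are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical stand-in for an HBase table: row key -> {"family:qualifier": value}
Row = Dict[str, str]
TABLE: Dict[str, Row] = {
    "doc-001": {"Status:value": "UNANALYSED"},
    "doc-002": {"Status:value": "ANALYSED"},
    "doc-003": {"Status:value": "UNANALYSED"},
    "doc-004": {"Status:value": "FAILED"},
}


def scan(table: Dict[str, Row]) -> Iterable[Tuple[str, Row]]:
    """Yield (row key, row) pairs in row-key order, as a full table scan would."""
    for key in sorted(table):
        yield key, table[key]


def count_status(table: Dict[str, Row], column: str, wanted: str) -> int:
    """Count rows whose `column` equals `wanted` by reading every row."""
    return sum(1 for _, row in scan(table) if row.get(column) == wanted)


if __name__ == "__main__":
    print(count_status(TABLE, "Status:value", "UNANALYSED"))
```

The cost grows with the size of the table, which is why a MapReduce
job or a separate index becomes attractive for large tables.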

On 4/21/09, Billy Pearson <sa...@pearsonwholesale.com> wrote:
> There are multiple ways to query HBase:
> HBase shell
> Thrift
> REST
> Java API
> and I think a few more.
>
> The easiest, without having to write any code, would be the HBase shell,
> if you just want to check manually whether something is there or what its value is.
>
> Billy

Re: Using Yahoo Pig to do ad hoc querying on HBase

Posted by Billy Pearson <sa...@pearsonwholesale.com>.
There are multiple ways to query HBase:
HBase shell
Thrift
REST
Java API
and I think a few more.

The easiest, without having to write any code, would be the HBase shell,
if you just want to check manually whether something is there or what its value is.

Billy

"Ninad Raut" <hb...@gmail.com> 
wrote in message 
news:4d371b590904202017n400e09adte017208f79e2b164@mail.gmail.com...
> Hi,
>
> As there is no easy way to query HBase, can Pig be used to query HBase
> tables?
> If so, can anyone give me an example of how to use it?
>
>
> Regards,
> Ninad.
>