
Posted to user@hbase.apache.org by Vijay <vi...@gmail.com> on 2009/04/27 22:17:25 UTC

Hbase Analytical function

Hi,
I want to do some calculations on a huge table that will have millions of
rows. What is the optimal way to do it? Should I just pass this to a
MapReduce job and let it take care of it? Will the results be stored in
memcached too? What about the rows that I am initially querying (usually the
query is for all the data in a date range)? I am trying to use Chukwa with
HBase as the database.

Regards,
</VJ>

Re: Hbase Analytical function

Posted by Vijay <vi...@gmail.com>.

Thanks, Stack. That helps.
Regards,
</VJ>

On Tue, Apr 28, 2009 at 5:06 PM, stack <st...@duboce.net> wrote:

> Are you talking about running the function inside HBase?  To do that you'd
> need to subclass HBase.  See the transactional HBase for an example.
>
> But I'd suggest doing the work outside, in a MapReduce job, first.  If that
> proves too slow, then do something radical like the subclassing.
>
> St.Ack

Re: Hbase Analytical function

Posted by stack <st...@duboce.net>.

Are you talking about running the function inside HBase?  To do that you'd
need to subclass HBase.  See the transactional HBase for an example.

But I'd suggest doing the work outside, in a MapReduce job, first.  If that
proves too slow, then do something radical like the subclassing.

St.Ack

On Tue, Apr 28, 2009 at 3:52 PM, Vijay <vi...@gmail.com> wrote:

> Thanks Stack,
> Any suggestions on an alternative to MR (that is, to using MR after fetching
> the data)? I mean, can custom functions such as sum or max run inside HBase,
> on the region server while it fetches the rows, instead of fetching the whole
> data set and then applying the function?
> Regards,
> </VJ>

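To make the trade-off in the reply above concrete, here is a minimal sketch of
why sum and max are easy to compute either in a MapReduce job or closer to the
data: both can be computed as small partial results per chunk of rows and then
merged. This is plain Java with no HBase dependency. The names SumMax,
RegionAggregationSketch, aggregateChunk and mergeAll are hypothetical and are
not part of any HBase API; each long[] merely stands in for the values read
from one region.

```java
import java.util.ArrayList;
import java.util.List;

/** Mergeable running aggregate: partial results can be combined in any order. */
final class SumMax {
    long sum = 0;
    long max = Long.MIN_VALUE; // stays MIN_VALUE until a value is added
    long count = 0;

    /** Fold a single value into this aggregate. */
    void add(long value) {
        sum += value;
        if (value > max) {
            max = value;
        }
        count++;
    }

    /** Fold another partial aggregate into this one. */
    void merge(SumMax other) {
        sum += other.sum;
        max = Math.max(max, other.max);
        count += other.count;
    }
}

final class RegionAggregationSketch {
    /** "Map" side: reduce one chunk of rows to a single small partial result. */
    static SumMax aggregateChunk(long[] values) {
        SumMax partial = new SumMax();
        for (long v : values) {
            partial.add(v);
        }
        return partial;
    }

    /** "Reduce" side: combine the partial results into the final answer. */
    static SumMax mergeAll(List<SumMax> partials) {
        SumMax total = new SumMax();
        for (SumMax p : partials) {
            total.merge(p);
        }
        return total;
    }

    public static void main(String[] args) {
        // Each inner array is an assumed stand-in for one region's values.
        long[][] regions = { {3, 9, 4}, {12, 1}, {7} };
        List<SumMax> partials = new ArrayList<>();
        for (long[] region : regions) {
            partials.add(aggregateChunk(region));
        }
        SumMax total = mergeAll(partials);
        // prints: sum=36 max=12 count=6
        System.out.println("sum=" + total.sum + " max=" + total.max
                + " count=" + total.count);
    }
}
```

Only the small SumMax objects need to travel between the two stages, not the
rows themselves. That is the saving being asked about, and a MapReduce job
already gets much of it because the map step runs over each chunk of the
table separately.
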
Re: Hbase Analytical function

Posted by Vijay <vi...@gmail.com>.

Thanks Stack,
Any suggestions on an alternative to MR (that is, to using MR after fetching
the data)? I mean, can custom functions such as sum or max run inside HBase,
on the region server while it fetches the rows, instead of fetching the whole
data set and then applying the function?
Regards,
</VJ>


Re: Hbase Analytical function

Posted by stack <st...@duboce.net>.

MR = MapReduce.
St.Ack


On Tue, Apr 28, 2009 at 2:21 PM, Vijay <vi...@gmail.com> wrote:

> Hi,
> What is MR? I am trying to read the logs collected by Chukwa, parse them,
> and store them all in an HBase table. When the user wants to see the data, I
> will run an aggregation function that uses the map/reduce framework. I was
> thinking of putting the aggregated data back into memcached so that later
> queries do not need to go to HBase at all.
>
> But the question is: is there any framework where I can create my own
> aggregation function that runs in the MapReduce job while it is fetching the
> data? As you know, that would be faster.
>
> If there is no alternative, I was thinking of having HBase return the result
> and then using MapReduce to kick off further processing.
>
> Any advice will be helpful. I am new to HBase and still trying to learn it.
>
> Regards,
> </VJ>

Re: Hbase Analytical function

Posted by Vijay <vi...@gmail.com>.

Hi,
What is MR? I am trying to read the logs collected by Chukwa, parse them,
and store them all in an HBase table. When the user wants to see the data, I
will run an aggregation function that uses the map/reduce framework. I was
thinking of putting the aggregated data back into memcached so that later
queries do not need to go to HBase at all.

But the question is: is there any framework where I can create my own
aggregation function that runs in the MapReduce job while it is fetching the
data? As you know, that would be faster.

If there is no alternative, I was thinking of having HBase return the result
and then using MapReduce to kick off further processing.

Any advice will be helpful. I am new to HBase and still trying to learn it.

Regards,
</VJ>

On Mon, Apr 27, 2009 at 9:12 PM, stack <st...@duboce.net> wrote:

> Try MR.  Read and write to HBase.  Results will be persisted to HDFS.
>
> Tell us more.  Hooking up Chukwa and HBase sounds interesting.
>
> St.Ack

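The caching idea in the message above (compute the aggregate once, then serve
later queries from memcached) is usually called cache-aside. A minimal sketch
follows, assuming a cache keyed by metric name and date range. It is plain
Java: the in-memory map only stands in for memcached, and the names
AggregateCacheSketch, key and getOrCompute are hypothetical, not part of any
memcached or HBase client.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

final class AggregateCacheSketch {
    // Stand-in for memcached; a real client would add expiry and networking.
    private final Map<String, Long> cache = new ConcurrentHashMap<>();

    // Number of times the expensive aggregation actually ran.
    int misses = 0;

    /** Build a cache key from the metric and the queried date range. */
    static String key(String metric, String fromDate, String toDate) {
        return metric + ":" + fromDate + ":" + toDate;
    }

    /** Return the cached aggregate, or compute it once and remember it. */
    long getOrCompute(String key, LongSupplier expensiveAggregation) {
        Long cached = cache.get(key);
        if (cached != null) {
            return cached; // served from cache, the backing store is not touched
        }
        misses++;
        long value = expensiveAggregation.getAsLong(); // e.g. the MapReduce job
        cache.put(key, value);
        return value;
    }

    public static void main(String[] args) {
        AggregateCacheSketch sketch = new AggregateCacheSketch();
        String k = key("errors", "2009-04-01", "2009-04-27");
        // The lambda is an assumed placeholder for the real aggregation.
        long first = sketch.getOrCompute(k, () -> 36L);
        long second = sketch.getOrCompute(k, () -> 36L);
        // prints: 36 36 misses=1
        System.out.println(first + " " + second + " misses=" + sketch.misses);
    }
}
```

One design point to settle early is invalidation: if new log rows keep arriving
for a date range that is already cached, the cached aggregate goes stale, so
ranges that are still being written to need either a short expiry or no
caching at all.
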
Re: Hbase Analytical function

Posted by stack <st...@duboce.net>.

Try MR.  Read and write to HBase.  Results will be persisted to HDFS.

Tell us more.  Hooking up Chukwa and HBase sounds interesting.

St.Ack

On Mon, Apr 27, 2009 at 1:17 PM, Vijay <vi...@gmail.com> wrote:

> Hi,
> I want to do some calculations on a huge table that will have millions of
> rows. What is the optimal way to do it? Should I just pass this to a
> MapReduce job and let it take care of it? Will the results be stored in
> memcached too? What about the rows that I am initially querying (usually the
> query is for all the data in a date range)? I am trying to use Chukwa with
> HBase as the database.
>
> Regards,
> </VJ>
>