Posted to user@flume.apache.org by Mohit Anchlia <mo...@gmail.com> on 2012/08/04 23:34:22 UTC

Real time reads

I am prototyping Flume NG as a real-time sink to HBase and HDFS for
clickstream data. I was wondering whether Flume provides something that can
read from HBase too. That is, an API request comes to our web server, and
after we receive the request we use Flume to retrieve the result from HBase.
Is Flume NG meant for this type of scenario?

Re: Real time reads

Posted by Hari Shreedharan <hs...@cloudera.com>.
Let me correct what I said. You could write a source that implements the PollableSource interface to actually pull data. We do have a source that pulls data, the ExecSource, though it pulls from the output stream of a local process. You could write a source that connects to HBase and pulls data out. The implementation can follow the lines of the ExecSource, except that you would replace the polling logic.
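
To make the pull-style idea concrete, here is a minimal sketch of the polling contract in plain Java. It does not use the Flume or HBase libraries: Status and PollingSource are hypothetical stand-ins modeled loosely on Flume's PollableSource (a runner calls process() repeatedly and backs off when nothing was read), the fetcher stands in for an HBase scan, and the channel stands in for a Flume channel. Treat every name here as an assumption for illustration, not the actual Flume API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Demo driver; listed first so the file can be launched directly.
final class PollingSketch {
    // Calls process() up to maxPolls times, sleeping after each empty poll,
    // roughly the way a runner thread would drive a pollable source.
    // Returns the number of polls that delivered data.
    static int drive(PollingSource source, int maxPolls, long backoffMillis)
            throws InterruptedException {
        int readyPolls = 0;
        for (int i = 0; i < maxPolls; i++) {
            if (source.process() == Status.READY) {
                readyPolls++;
            } else {
                Thread.sleep(backoffMillis);
            }
        }
        return readyPolls;
    }

    public static void main(String[] args) throws InterruptedException {
        // Two canned batches stand in for rows returned by successive scans.
        Queue<List<String>> batches = new ArrayDeque<>();
        batches.add(List.of("row1", "row2"));
        batches.add(List.of("row3"));

        List<String> delivered = new ArrayList<>();
        PullingSource source = new PullingSource(
                () -> batches.isEmpty() ? List.<String>of() : batches.poll(),
                delivered::add);

        int readyPolls = drive(source, 4, 10L);
        System.out.println(readyPolls + " non-empty polls, delivered " + delivered);
    }
}

// Hypothetical result of one poll: poll again at once, or wait first.
enum Status { READY, BACKOFF }

// Hypothetical stand-in for the pollable-source contract.
interface PollingSource {
    Status process();
}

// Pulls a batch from the fetcher and hands each row to the channel.
final class PullingSource implements PollingSource {
    private final Supplier<List<String>> fetcher; // stand-in for an HBase scan
    private final Consumer<String> channel;       // stand-in for a Flume channel

    PullingSource(Supplier<List<String>> fetcher, Consumer<String> channel) {
        this.fetcher = fetcher;
        this.channel = channel;
    }

    @Override
    public Status process() {
        List<String> rows = fetcher.get();
        if (rows.isEmpty()) {
            return Status.BACKOFF; // nothing new, tell the runner to wait
        }
        rows.forEach(channel);
        return Status.READY;
    }
}
```

In a real source the fetcher would be replaced by the HBase read and the channel by the source's channel processor. Note that a loop like this only covers periodic pulling; it does not answer a web request with a result.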

Hope this helps.


Thanks,
Hari

-- 
Hari Shreedharan


On Saturday, August 4, 2012 at 3:37 PM, Mohit Anchlia wrote:

> 
> 
> On Sat, Aug 4, 2012 at 3:28 PM, Patrick Wendell <pwendell@gmail.com> wrote:
> > Mohit,
> > 
> > This sounds like something where you would want your web tier to
> > directly access HBase through its existing client API.
> > 
> > Do you need functionality not offered there?
>  
> Yes that's how I am currently testing but I just thought it would be nice to use one framework for both read and write.
> > 
> > - Patrick
> > 
> > On Sat, Aug 4, 2012 at 2:49 PM, Hari Shreedharan
> > <hshreedharan@cloudera.com> wrote:
> > > Mohit,
> > >
> > > Flume NG does not really have an HBase source. More importantly our sources
> > > do not actually pull data. It waits for data to be pushed to it, at least
> > > that is what all the current sources do. I don't know if it is a good idea
> > > to have a source proactively pull stuff from a source. But Flume is designed
> > > to make practically everything pluggable, so you can write your own source
> > > that does this.
> > >
> > >
> > > Hari
> > >
> > > --
> > > Hari Shreedharan
> > >
> > > On Saturday, August 4, 2012 at 2:34 PM, Mohit Anchlia wrote:
> > >
> > > I am prototyping flume ng for real time sink to hbase and hdfs for
> > > clickstream. I was wondering if Flume provides something that can read from
> > > HBase too. That is API request comes to our web server and after we receive
> > > the request we use flume to retrieve the result from HBase. Is flume ng
> > > meant for this type of scenario?
> > >
> > >
> 


Re: Real time reads

Posted by Mohit Anchlia <mo...@gmail.com>.
On Sat, Aug 4, 2012 at 3:28 PM, Patrick Wendell <pw...@gmail.com> wrote:

> Mohit,
>
> This sounds like something where you would want your web tier to
> directly access HBase through its existing client API.
>
> Do you need functionality not offered there?


Yes, that's how I am currently testing, but I just thought it would be nice
to use one framework for both reads and writes.

>
> - Patrick
>
> On Sat, Aug 4, 2012 at 2:49 PM, Hari Shreedharan
> <hs...@cloudera.com> wrote:
> > Mohit,
> >
> > Flume NG does not really have an HBase source. More importantly our
> sources
> > do not actually pull data. It waits for data to be pushed to it, at least
> > that is what all the current sources do. I don't know if it is a good
> idea
> > to have a source proactively pull stuff from a source. But Flume is
> designed
> > to make practically everything pluggable, so you can write your own
> source
> > that does this.
> >
> >
> > Hari
> >
> > --
> > Hari Shreedharan
> >
> > On Saturday, August 4, 2012 at 2:34 PM, Mohit Anchlia wrote:
> >
> > I am prototyping flume ng for real time sink to hbase and hdfs for
> > clickstream. I was wondering if Flume provides something that can read
> from
> > HBase too. That is API request comes to our web server and after we
> receive
> > the request we use flume to retrieve the result from HBase. Is flume ng
> > meant for this type of scenario?
> >
> >
>

Re: Real time reads

Posted by Patrick Wendell <pw...@gmail.com>.
Mohit,

This sounds like something where you would want your web tier to
directly access HBase through its existing client API.

Do you need functionality not offered there?

- Patrick

On Sat, Aug 4, 2012 at 2:49 PM, Hari Shreedharan
<hs...@cloudera.com> wrote:
> Mohit,
>
> Flume NG does not really have an HBase source. More importantly our sources
> do not actually pull data. It waits for data to be pushed to it, at least
> that is what all the current sources do. I don't know if it is a good idea
> to have a source proactively pull stuff from a source. But Flume is designed
> to make practically everything pluggable, so you can write your own source
> that does this.
>
>
> Hari
>
> --
> Hari Shreedharan
>
> On Saturday, August 4, 2012 at 2:34 PM, Mohit Anchlia wrote:
>
> I am prototyping flume ng for real time sink to hbase and hdfs for
> clickstream. I was wondering if Flume provides something that can read from
> HBase too. That is API request comes to our web server and after we receive
> the request we use flume to retrieve the result from HBase. Is flume ng
> meant for this type of scenario?
>
>

Re: Real time reads

Posted by Hari Shreedharan <hs...@cloudera.com>.
Mohit, 

Flume NG does not really have an HBase source. More importantly, our sources do not actually pull data. They wait for data to be pushed to them; at least, that is what all the current sources do. I don't know if it is a good idea to have a Flume source proactively pull data from an external system. But Flume is designed to make practically everything pluggable, so you can write your own source that does this.


Hari 

-- 
Hari Shreedharan


On Saturday, August 4, 2012 at 2:34 PM, Mohit Anchlia wrote:

> I am prototyping flume ng for real time sink to hbase and hdfs for clickstream. I was wondering if Flume provides something that can read from HBase too. That is API request comes to our web server and after we receive the request we use flume to retrieve the result from HBase. Is flume ng meant for this type of scenario? 
>