Posted to user@flume.apache.org by Shumin Wu <sh...@gmail.com> on 2012/01/04 03:37:49 UTC

Is Flume NG good for production?

 Hi,

I have been using Flume 0.9.4/5 for a while. Now I am thinking of migrating
to Flume NG, but there are a few things I would like to confirm before I
do that.

1. As my application needs Flume to connect to a legacy MySQL database, I
wrote my own JDBC channel plugin using the goodies from the Spring
framework. Flume NG's JdbcChannelProviderImpl uses DBCP. What is the best
practice if I want to use my Spring JDBC channel? Does the current
framework allow me to write a plugin, or should I just submit a patch?

2. How does NG's plugin framework differ from OG's? I still want to use the
flume-plugin-hbasesink. Maybe I have missed something, but the "Flume and
HBase Integration" page does not mention this. Neither does the Flume NG
documentation.

3. How good is Flume NG for a production environment in general? Has anyone
in the community already tried it out in production?

Thanks,

Shumin

flume + hive options?

Posted by Sushruth Puttaswamy <su...@nor1.com>.
Guys,

What's the best-known way to use Flume with Hive? Just curious to see what everyone is using. My requirements are standard:


 *   Currently writing logs to HDFS from different production servers.
 *   Need to preprocess the logs before writing them to Hive.
 *   Need a way to merge the files generated by Flume.

I see that there is a Flume + Hive sink plugin, but I did not find much usage data on it. I could write a custom sink or a custom decorator to do the preprocessing and then run hourly cron jobs to write data from HDFS to Hive.
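
As a rough illustration of the hourly "preprocess, then merge" step described above, here is a minimal Java sketch. It is an assumption-heavy stand-in: it works on a local directory through java.nio rather than on HDFS, and the class name HourlyMergeSketch, the method merge, and the preprocess callback are all hypothetical names invented for this example, not part of Flume or Hive.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: local-filesystem stand-in for an hourly merge job.
class HourlyMergeSketch {

    /**
     * Concatenates every regular file in inputDir (in sorted name order)
     * into mergedFile, applying preprocess to each line and dropping lines
     * that are empty afterwards. Returns the number of input files read.
     * mergedFile should live outside inputDir so a rerun does not pick it up.
     */
    static int merge(Path inputDir, Path mergedFile,
                     UnaryOperator<String> preprocess) throws IOException {
        List<Path> parts;
        try (Stream<Path> listing = Files.list(inputDir)) {
            parts = listing.filter(Files::isRegularFile)
                           .sorted()
                           .collect(Collectors.toList());
        }
        // newBufferedWriter creates the file or truncates an existing one.
        try (BufferedWriter out = Files.newBufferedWriter(mergedFile)) {
            for (Path part : parts) {
                for (String line : Files.readAllLines(part)) {
                    String cleaned = preprocess.apply(line);
                    if (!cleaned.isEmpty()) {
                        out.write(cleaned);
                        out.newLine();
                    }
                }
            }
        }
        return parts.size();
    }
}
```

A cron-driven job could call something like HourlyMergeSketch.merge(hourDir, mergedFile, String::trim) once per hour and then load the merged file into Hive; how that load happens is outside this sketch.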

Any suggestions?

Sushruth

Re: Is Flume NG good for production?

Posted by Shumin Wu <sh...@gmail.com>.
Hi Eric,

Thanks for answering my questions. Your help is appreciated.

1. The JDBC channel I wrote for Flume is a plugin package that includes a
Decorator and a Sink, which are used to prepare data and dump it into a
MySQL database. Flume OG allows clients to write their own plugins by
extending the EventSink.Base class and loads them at runtime when
FlumeMaster starts. The flume-plugin-helloworld is a demo of this feature.
I see that AbstractSink replaces the EventSink class in NG, but I don't see
a counterpart in NG to OG's SinkBuilder plugin design. If NG adds that
part, it would be a lot easier for users to work with their own
sources/sinks.

2. Will the community consider adding an HBase sink to the NG roadmap,
just as with the HDFS sink? If so, can I open a JIRA for this feature?

3. Based on what you are saying, I think NG is a bit premature for
production. But I would like to do some load testing and share the results
with the community. Are there any existing results posted somewhere that I
could use as a reference?

Thanks again for your information.


Shumin




On Tue, Jan 3, 2012 at 8:14 PM, Eric Sammer <es...@cloudera.com> wrote:

> On Tue, Jan 3, 2012 at 6:37 PM, Shumin Wu <sh...@gmail.com> wrote:
>
>>  Hi,
>>
>> I have been using Flume 0.9.4/5 for a while. Now I am thinking of
>> migrating to Flume NG, but there are a few things I would like to
>> confirm before I do that.
>>
>> 1. As my application needs Flume to connect to a legacy MySQL database,
>> I wrote my own JDBC channel plugin using the goodies from the Spring
>> framework. Flume NG's JdbcChannelProviderImpl uses DBCP. What is the best
>> practice if I want to use my Spring JDBC channel? Does the current
>> framework allow me to write a plugin, or should I just submit a patch?
>>
>
> In NG you wouldn't use a channel plugin. You'd write a sink that writes to
> MySQL where you can do anything you want. Flume's channel is just used for
> delivery from a source to a sink.
>
>
>> 2. How does NG's plugin framework differ from OG's? I still want to use
>> the flume-plugin-hbasesink. Maybe I have missed something, but the
>> "Flume and HBase Integration" page does not mention this. Neither does
>> the Flume NG documentation.
>>
>
> Not sure what you mean by plugin framework, but Flume NG is
> API-incompatible with OG sources and sinks. Specifically, no one has
> ported the HBase sink to the NG APIs yet.
>
>
>>
>> 3. How good is Flume NG for a production environment in general? Has
>> anyone in the community already tried it out in production?
>>
>
> It depends on what features you're looking for. Currently, NG is
> considered (at least by the active committers) alpha or beta quality. It's
> a developer preview release meant to let early adopters try it out and to
> give plugin developers an idea of what the APIs will look like. Some
> features are more mature than others, but I can't promise it's bug-free,
> and it has not seen exhaustive testing under very high load. I'm not sure
> if that answers your question.
>
> We (again, the committers, but also the community) are extremely
> interested in feedback. Maybe you could help test on low-value data (if
> there is such a thing) where the potential for data loss is less scary.
>
>
>> Thanks,
>>
>> Shumin
>
>
>
>
> --
> Eric Sammer
> twitter: esammer
> data: www.cloudera.com
>

Re: Is Flume NG good for production?

Posted by Eric Sammer <es...@cloudera.com>.
On Tue, Jan 3, 2012 at 6:37 PM, Shumin Wu <sh...@gmail.com> wrote:

>  Hi,
>
> I have been using Flume 0.9.4/5 for a while. Now I am thinking of
> migrating to Flume NG, but there are a few things I would like to
> confirm before I do that.
>
> 1. As my application needs Flume to connect to a legacy MySQL database,
> I wrote my own JDBC channel plugin using the goodies from the Spring
> framework. Flume NG's JdbcChannelProviderImpl uses DBCP. What is the best
> practice if I want to use my Spring JDBC channel? Does the current
> framework allow me to write a plugin, or should I just submit a patch?
>

In NG you wouldn't use a channel plugin. You'd write a sink that writes to
MySQL where you can do anything you want. Flume's channel is just used for
delivery from a source to a sink.
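
To make the division of labor concrete, here is a minimal, self-contained Java sketch of the idea that the channel only delivers events while the sink owns all destination-specific logic. Everything in it is an assumption made for illustration: InMemoryChannel, RowWriter, and StoreSink are hypothetical stand-ins written for this example, not Flume NG's actual classes or signatures. A real sink would be written against the NG sink API and would put its JDBC (or Spring) code where RowWriter is called.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-ins, NOT Flume's real classes: they only mimic the idea
// that a channel hands events to a sink, which does the destination work.
class SinkSketch {

    /** Minimal in-memory channel: take() removes an event, rollback() restores it. */
    static class InMemoryChannel {
        private final Deque<String> queue = new ArrayDeque<>();
        private final Deque<String> taken = new ArrayDeque<>();

        void put(String event) { queue.addLast(event); }

        String take() {
            String event = queue.pollFirst();
            if (event != null) taken.addLast(event);
            return event;
        }

        void commit() { taken.clear(); }

        void rollback() {
            // Put taken events back at the head, preserving original order.
            while (!taken.isEmpty()) queue.addFirst(taken.pollLast());
        }

        int size() { return queue.size(); }
    }

    /** Destination hook; a real sink might issue a JDBC insert here. */
    interface RowWriter {
        void write(String row) throws Exception;
    }

    /** The sink: all preparation and database logic lives here, not in the channel. */
    static class StoreSink {
        private final InMemoryChannel channel;
        private final RowWriter writer;

        StoreSink(InMemoryChannel channel, RowWriter writer) {
            this.channel = channel;
            this.writer = writer;
        }

        /** Delivers at most one event; returns true only if it was written. */
        boolean process() {
            String event = channel.take();
            if (event == null) {
                channel.commit();
                return false;
            }
            try {
                writer.write(event.trim()); // "prepare" step done in the sink
                channel.commit();
                return true;
            } catch (Exception e) {
                channel.rollback(); // event stays in the channel for a retry
                return false;
            }
        }
    }
}
```

In this sketch a failed write leaves the event in the channel, so the sink can retry later without the channel knowing anything about MySQL.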


> 2. How does NG's plugin framework differ from OG's? I still want to use
> the flume-plugin-hbasesink. Maybe I have missed something, but the
> "Flume and HBase Integration" page does not mention this. Neither does
> the Flume NG documentation.
>

Not sure what you mean by plugin framework, but Flume NG is
API-incompatible with OG sources and sinks. Specifically, no one has ported
the HBase sink to the NG APIs yet.


>
> 3. How good is Flume NG for a production environment in general? Has
> anyone in the community already tried it out in production?
>

It depends on what features you're looking for. Currently, NG is considered
(at least by the active committers) alpha or beta quality. It's a developer
preview release meant to let early adopters try it out and to give plugin
developers an idea of what the APIs will look like. Some features are more
mature than others, but I can't promise it's bug-free, and it has not seen
exhaustive testing under very high load. I'm not sure if that answers
your question.

We (again, the committers, but also the community) are extremely interested
in feedback. Maybe you could help test on low-value data (if there is such
a thing) where the potential for data loss is less scary.


> Thanks,
>
> Shumin




-- 
Eric Sammer
twitter: esammer
data: www.cloudera.com