Posted to user@flume.apache.org by Allan Feid <al...@gmail.com> on 2013/06/19 17:00:42 UTC

multi-threaded elasticsearch sink

I'm not that great at Java at the moment, but it appears that the
single-threaded elasticsearch sink has trouble keeping up with ~5k
events/second at a 2k batch size. It looks like the HDFS sink can run
multiple threads that write to HDFS. I can get some performance
increase by adding multiple ElasticSearch sinks to simulate parallelism,
but it would be great for the sink itself to support multiple threads.
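
For reference, the multiple-sink workaround looks roughly like the fragment below. This is only a sketch: the agent name (agent), channel name (ch), sink names (es1..es3), and host are placeholders, not values from a real setup. Each sink attached to the same channel is driven by its own runner thread, so three sinks drain the channel in parallel.

```properties
# Hypothetical agent, channel, and sink names; adjust to your own setup.
agent.channels = ch
agent.sinks = es1 es2 es3

# Three identical ElasticSearch sinks, all reading from the same channel.
agent.sinks.es1.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
agent.sinks.es1.channel = ch
agent.sinks.es1.hostNames = es-host.example.com:9300
agent.sinks.es1.batchSize = 2000

agent.sinks.es2.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
agent.sinks.es2.channel = ch
agent.sinks.es2.hostNames = es-host.example.com:9300
agent.sinks.es2.batchSize = 2000

agent.sinks.es3.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
agent.sinks.es3.channel = ch
agent.sinks.es3.hostNames = es-host.example.com:9300
agent.sinks.es3.batchSize = 2000
```

The sinks must not be placed in a sink group for this to work, since a sink group is driven by a single runner.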

Is there a sink example that should be used as a guide towards getting the
same features in the elasticsearch sink?

Thanks,
Allan

Re: multi-threaded elasticsearch sink

Posted by Hari Shreedharan <hs...@cloudera.com>.
Technically, even the HDFS sink uses only one thread to write to HDFS. The AsyncHBase sink writes using multiple threads, though they are hidden away from the sink itself: the threading happens in the underlying API.
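
To illustrate that pattern (one calling thread, with parallel writes hidden behind it), here is a minimal sketch in plain Java. It is not Flume or Elasticsearch code: ParallelBatchWriter, chunkSize, and bulkWriter are hypothetical names, and bulkWriter stands in for whatever call actually sends a bulk request. The point it shows is that the caller blocks until every chunk has been written, so a sink built this way could still commit its channel transaction only after the whole batch succeeded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical helper: splits one batch into chunks and writes them in parallel.
class ParallelBatchWriter {
    private final ExecutorService pool;
    private final int chunkSize;
    private final Consumer<List<String>> bulkWriter; // stand-in for a real bulk request

    ParallelBatchWriter(int threads, int chunkSize, Consumer<List<String>> bulkWriter) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.chunkSize = chunkSize;
        this.bulkWriter = bulkWriter;
    }

    // Returns only after every chunk has been written. If any chunk fails,
    // Future.get() throws, and the caller can roll back instead of committing.
    void write(List<String> batch) throws Exception {
        List<Future<?>> pending = new ArrayList<>();
        for (int i = 0; i < batch.size(); i += chunkSize) {
            List<String> chunk = batch.subList(i, Math.min(i + chunkSize, batch.size()));
            pending.add(pool.submit(() -> bulkWriter.accept(chunk)));
        }
        for (Future<?> f : pending) {
            f.get();
        }
    }

    // Stops accepting new work; already submitted chunks still finish.
    void close() {
        pool.shutdown();
    }
}
```

With a 2k batch and a chunk size of 500, a call to write() hands four chunks to the pool and then waits on all of them. Note that a failed chunk leaves the other chunks already written, so a rollback and retry would re-send them; the real sink would need writes that tolerate duplicates.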


Cheers,
Hari


On Wednesday, June 19, 2013 at 11:17 AM, Roshan Naik wrote:

> take a look at hdfs sink.
> -roshan


Re: multi-threaded elasticsearch sink

Posted by Roshan Naik <ro...@hortonworks.com>.
take a look at hdfs sink.
-roshan

