Posted to user@flume.apache.org by Matt Wise <ma...@nextdoor.com> on 2014/04/10 18:02:11 UTC

Flume-NG ElasticSearch Sink Backing up @ Midnight...

We use Flume 1.4 to pass logs into HDFS as well as ElasticSearch for
storage. The pipeline looks roughly like this:

Client to Server Flow...
(local_app -> local_host_flume_agent) ---- AVRO/SSL ---->
(remote_flume_agent)...

Agent Server Flow ...
(inbound avro -> FC1 -> ElasticSearch)
(inbound avro -> FC2 -> S3/HDFS)


In the last week we've made a few changes, and now we're seeing a problem.
We've seen 3 separate occurrences of a single Flume agent server node
backing up its FC1 channel indefinitely until we log in and restart Flume
entirely. The data just stops flowing -- we can't find any errors in the
logs on either the ES or Flume side. A simple restart of Flume fixes it.
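
A backlog like this can be spotted without logging in if the agent is started with Flume's HTTP JSON monitoring enabled (flume.monitoring.type=http), which serves per-component counters at /metrics. The following is only a rough sketch of checking that payload: the function name, the threshold, the sample values, and the second channel name are all made up for illustration, and the counter name ChannelFillPercentage is taken from Flume's JSON reporting as I understand it.

```python
import json


def backed_up_channels(metrics: dict, threshold_pct: float = 50.0) -> dict:
    """Return {channel_name: fill_pct} for channels at or above threshold_pct.

    `metrics` is the parsed JSON body of the agent's /metrics endpoint.
    Flume reports counter values as strings, so they are converted here.
    """
    result = {}
    for component, counters in metrics.items():
        if not component.startswith("CHANNEL."):
            continue  # skip SOURCE.* and SINK.* entries
        fill = float(counters.get("ChannelFillPercentage", 0.0))
        if fill >= threshold_pct:
            result[component[len("CHANNEL."):]] = fill
    return result


# Hypothetical payload; the numbers are invented, not measured.
sample = json.loads("""
{
  "CHANNEL.fc-unstructured-es":   {"ChannelFillPercentage": "97.5"},
  "CHANNEL.fc-unstructured-hdfs": {"ChannelFillPercentage": "0.2"},
  "SINK.elasticsearch":           {"EventDrainSuccessCount": "0"}
}
""")

print(backed_up_channels(sample))  # {'fc-unstructured-es': 97.5}
```

Polling this shortly after midnight would at least show which channel stops draining and when.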

Our sink config looks like this:

> agent.sinks.elasticsearch.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
> agent.sinks.elasticsearch.hostNames = xxx:9300
> agent.sinks.elasticsearch.indexName = flume
> agent.sinks.elasticsearch.clusterName = flume-elasticsearch-production-useast1
> agent.sinks.elasticsearch.batchSize = 1000
> agent.sinks.elasticsearch.ttl = 30
> agent.sinks.elasticsearch.serializer = org.apache.flume.sink.elasticsearch.ElasticSearchLogStashEventSerializer
> agent.sinks.elasticsearch.channel = fc-unstructured-es


This ONLY happens at midnight, and only on one Flume server. I'm wondering
whether it has to do with the time it takes our ES nodes to create a new
index ... could the first Flume agent that triggers "index creation" be
getting blocked or stuck?
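
For context on why midnight would matter: as I understand the sink, indexName is only a prefix and the event date is appended to it (flume -> flume-yyyy-MM-dd), so the first write after the day boundary targets an index that does not exist yet. Below is a small sketch of that naming, assuming the date is computed in UTC; the function names are hypothetical and not part of Flume.

```python
from datetime import datetime, timedelta, timezone


def daily_index_name(prefix: str, when: datetime) -> str:
    """Return '<prefix>-yyyy-MM-dd' for the UTC day containing `when`."""
    return f"{prefix}-{when.astimezone(timezone.utc):%Y-%m-%d}"


def crosses_index_boundary(prefix: str, start: datetime, end: datetime) -> bool:
    """True if events at `start` and `end` would land in different daily indexes."""
    return daily_index_name(prefix, start) != daily_index_name(prefix, end)


before = datetime(2014, 4, 9, 23, 59, 59, tzinfo=timezone.utc)
after = before + timedelta(seconds=2)

print(daily_index_name("flume", before))               # flume-2014-04-09
print(daily_index_name("flume", after))                # flume-2014-04-10
print(crosses_index_boundary("flume", before, after))  # True
```

If that is right, whichever agent sends the first batch after 00:00 is the one whose request has to wait for the new index to be created.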

Matt Wise
Sr. Systems Architect
Nextdoor.com

Re: Flume-NG ElasticSearch Sink Backing up @ Midnight...

Posted by Matt Wise <ma...@nextdoor.com>.

One additional thing: we actually have two ES sinks pointing to the same
cluster. The setup looks more like this:
(inbound avro -> FC1 -> ElasticSearch)
(inbound avro -> FC2 -> S3/HDFS)
(inbound avro_2 -> FC3 -> ElasticSearch)
(inbound avro_2 -> FC4 -> S3/HDFS)


Matt Wise
Sr. Systems Architect
Nextdoor.com

