Posted to user@flume.apache.org by Tejinder Aulakh <te...@sharethis.com> on 2012/06/27 01:06:27 UTC

Running Flume-NG in multiple threads

We are using Flume-NG to transfer logs from the log servers (40) to
collectors (4). However, now we are seeing a long delay before the logs
show up at the collectors.

I was wondering if there is any way of running Flume-NG using multiple
threads so that the collectors can process all the load they get from the
log servers. Is it configurable in flume-env.sh? How many threads does
Flume-NG use by default?

Thanks,
Tejinder

Re: Running Flume-NG in multiple threads

Posted by Mike Percy <mp...@cloudera.com>.
Can you paste in your config file, please? I assume you are using only one
tier of collection? Could you also quantify the long delay... events should
have very low latency.

Regards,
Mike

On Tuesday, June 26, 2012, Tejinder Aulakh wrote:

> We are using Flume-NG to transfer logs from the log servers (40)  to
> collectors (4). However, now we are seeing a long delay before the logs
> show up at the collectors.
>
> I was wondering if there is any way of running Flume-NG using Multiple
> threads so that collectors can process all the load they get form the log
> servers. Is it configurable in flume-env.sh? How many threads does Flume-NG
> use by default?
>
> Thanks,
> Tejinder
>
>

Re: Running Flume-NG in multiple threads

Posted by Juhani Connolly <ju...@cyberagent.co.jp>.
It seems to be more of a channel problem, as replacing it with a memory
channel sped things up.

But yes, splitting the work over multiple disks should certainly help.
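
To illustrate, a file channel spread over separate disks would be configured
roughly like the sketch below; the channel name and paths are only placeholders:

# hypothetical file channel: checkpoint on one disk, data dirs on two others
collector.channels.collectorFileChannel.type = file
collector.channels.collectorFileChannel.checkpointDir = /mnt/disk1/flume/checkpoint
collector.channels.collectorFileChannel.dataDirs = /mnt/disk2/flume/data,/mnt/disk3/flume/data
collector.channels.collectorFileChannel.capacity = 1000000
collector.channels.collectorFileChannel.transactionCapacity = 10000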

I only intended this as an example of how the delay may be caused by 
throughput not keeping up. Hopefully Tejinder could let us know what his 
config is like.

On 06/27/2012 11:01 AM, Mike Percy wrote:
> We are able to push > 8000 events/sec (2KB per event) through a single file channel if you put checkpoint on one disk and use 2 other disks for data dirs. Not sure what the limit is. This is using the latest trunk code. Other limitations may be you need to add additional sinks to your channel to drain it faster. This is because sinks are single threaded and sources are multithreaded.
>
> Mike
>
>
> On Tuesday, June 26, 2012 at 6:30 PM, Juhani Connolly wrote:
>
>> Depending on your setup, you may find that some channels just cannot
>> keep up with throughput. Is the timestamp on the logs gradually falling
>> further behind the time it appears on the collector? If so, some
>> component(likely the channels) are turning into a bottleneck.
>>
>> We have about 1000 events/sec running through to our collector and using
>> file channel it unfortunately could not keep up. If you're not already,
>> try running memory channels and see how that works.
>>
>> On 06/27/2012 08:06 AM, Tejinder Aulakh wrote:
>>> We are using Flume-NG to transfer logs from the log servers (40) to
>>> collectors (4). However, now we are seeing a long delay before the
>>> logs show up at the collectors.
>>>
>>> I was wondering if there is any way of running Flume-NG using Multiple
>>> threads so that collectors can process all the load they get form the
>>> log servers. Is it configurable in flume-env.sh? How many threads does
>>> Flume-NG use by default?
>>>
>>> Thanks,
>>> Tejinder
>
>
>


Re: Running Flume-NG in multiple threads

Posted by Mike Percy <mp...@cloudera.com>.
Great! 


On Thursday, June 28, 2012 at 11:23 AM, Tejinder Aulakh wrote:

> Thanks Mike. Adding more sinks seems to have helped from what I'm noticing right now. 
> 
> Tejinder
> 
> On Wed, Jun 27, 2012 at 12:39 AM, Mike Percy <mpercy@cloudera.com> wrote:
> > Maybe your parsing is taking some time?
> > 
> > Try adding multiple sinks or sink groups to the same channel on the app tier to get more throughput.
> > 
> > Regards,
> > Mike
> > 
> > 
> > On Tuesday, June 26, 2012 at 10:14 PM, Tejinder Aulakh wrote:
> > 
> > > Hi Mike,
> > > 
> > > Below is the config for agents and collectors.
> > > 
> > > CassandraSinkST2 - puts the events in Cassandra
> > > AvroSinkParserST2 - customized avro sink which parses the events before sending.
> > > 
> > > So you are saying if we simply add more sinks (CassandraSinkST2 on collectors) and connect them to the same channel, it would speed things up?
> > > 
> > > -Tejinder
> > > 
> > > Agent
> > > -------
> > > #ch source, channel, and sink to use
> > > agent.channels = myMemoryChannel
> > > agent.sources = myExecSource
> > > agent.sinks = collector1002sink collector1003sink stFileSink
> > > agent.sinkgroups = collectorGroup
> > > 
> > > # Define a memory channel called myMemoryChannel
> > > agent.channels.myMemoryChannel.type = memory
> > > agent.channels.myMemoryChannel.capacity = 1000000
> > > agent.channels.myMemoryChannel.transactionCapacity = 10000
> > > agent.channels.myMemoryChannel.keep-alive = 30
> > > 
> > > # Define an exec source called myExecChannel to tail log file
> > > agent.sources.myExecSource.channels = myMemoryChannel
> > > agent.sources.myExecSource.type = exec
> > > agent.sources.myExecSource.command = tail -F /mnt/nginx/r.log
> > > 
> > > # Define a custom avro sink called collector1003sink
> > > agent.sinks.collector1003sink.channel = myMemoryChannel
> > > agent.sinks.collector1003sink.type = com.sharethis.web.flume.AvroSinkParserST2
> > > agent.sinks.collector1003sink.hostname = {PRIMARY_IP}
> > > agent.sinks.collector1003sink.port = 45678
> > > agent.sinks.collector1003sink.batch-size = 100
> > > 
> > > # Define a custom avro sink called collector1002sink
> > > agent.sinks.collector1002sink.channel = myMemoryChannel
> > > agent.sinks.collector1002sink.type = com.sharethis.web.flume.AvroSinkParserST2
> > > agent.sinks.collector1002sink.hostname = {BACKUP_IP}
> > > agent.sinks.collector1002sink.port = 45678
> > > agent.sinks.collector1002sink.batch-size = 100
> > > 
> > > # logger sink called stFileSink
> > > agent.sinks.stFileSink.channel = myMemoryChannel
> > > agent.sinks.stFileSink.type = file_roll
> > > agent.sinks.stFileSink.sink.directory = /var/tmp/unprocessed_events/
> > > agent.sinks.stFileSink.type.sink.rollInterval = 1800
> > > 
> > > # configure sinkgroup agentGroup. sinks with higher priorities are run first
> > > agent.sinkgroups.collectorGroup.sinks = collector1003sink collector1002sink stFileSink
> > > agent.sinkgroups.collectorGroup.processor.type = failover
> > > agent.sinkgroups.collectorGroup.processor.priority.collector1003sink = 10
> > > agent.sinkgroups.collectorGroup.processor.priority.collector1002sink = 15
> > > agent.sinkgroups.collectorGroup.processor.priority.stFileSink = 5
> > > agent.sinkgroups.collectorGroup.processor.maxpenalty = 10000
> > > 
> > > 
> > > Collector
> > > -------------
> > > # Tell Flume which source, channel, and sink to use
> > > collector.channels = collectorMemoryChannel
> > > collector.sources = collectorAvroSource
> > > collector.sinks = collectorCassandraSink collectorFile
> > > collector.sinkgroups = collectorGroup
> > > 
> > > # Define a memory channel called collectorMemoryChannel
> > > collector.channels.collectorMemoryChannel.type = memory
> > > collector.channels.collectorMemoryChannel.capacity = 10000000
> > > collector.channels.collectorMemoryChannel.transactionCapacity = 100000
> > > collector.channels.collectorMemoryChannel.keep-alive = 30
> > > 
> > > # Define an exec source called collectorAvroSource
> > > collector.sources.collectorAvroSource.channels = collectorMemoryChannel
> > > collector.sources.collectorAvroSource.type = avro
> > > collector.sources.collectorAvroSource.bind = 0.0.0.0
> > > collector.sources.collectorAvroSource.port = 45678
> > > 
> > > # Define a custom sink called collectorCustomSink
> > > collector.sinks.collectorCassandraSink.channel = collectorMemoryChannel
> > > collector.sinks.collectorCassandraSink.type = com.sharethis.web.flume.CassandraSinkST2
> > > collector.sinks.collectorCassandraSink.cassandraHostname = {CASSANDRA_HOSTNAME}
> > > collector.sinks.collectorCassandraSink.cassandraPort = 9160
> > > 
> > > # logger sink called collectorFile
> > > collector.sinks.collectorFile.channel = collectorMemoryChannel
> > > collector.sinks.collectorFile.type = file_roll
> > > collector.sinks.collectorFile.sink.directory = /var/tmp/unprocessed_events/
> > > collector.sinks.collectorFile.type.sink.rollInterval = 1800
> > > 
> > > # configure sinkgroup collectorGroup. sinks with higher priorities are run first
> > > collector.sinkgroups.collectorGroup.sinks = collectorCassandraSink collectorFile
> > > collector.sinkgroups.collectorGroup.processor.type = failover
> > > collector.sinkgroups.collectorGroup.processor.priority.collectorCassandraSink = 15
> > > collector.sinkgroups.collectorGroup.processor.priority.collectorFile = 5
> > > collector.sinkgroups.collectorGroup.processor.maxpenalty = 10000
> > > 
> > > 
> > > On Tue, Jun 26, 2012 at 7:01 PM, Mike Percy <mpercy@cloudera.com> wrote:
> > > > We are able to push > 8000 events/sec (2KB per event) through a single file channel if you put checkpoint on one disk and use 2 other disks for data dirs. Not sure what the limit is. This is using the latest trunk code. Other limitations may be you need to add additional sinks to your channel to drain it faster. This is because sinks are single threaded and sources are multithreaded.
> > > > 
> > > > Mike
> > > > 
> > > > 
> > > > On Tuesday, June 26, 2012 at 6:30 PM, Juhani Connolly wrote:
> > > > 
> > > > > Depending on your setup, you may find that some channels just cannot
> > > > > keep up with throughput. Is the timestamp on the logs gradually falling
> > > > > further behind the time it appears on the collector? If so, some
> > > > > component(likely the channels) are turning into a bottleneck.
> > > > > 
> > > > > We have about 1000 events/sec running through to our collector and using
> > > > > file channel it unfortunately could not keep up. If you're not already,
> > > > > try running memory channels and see how that works.
> > > > > 
> > > > > On 06/27/2012 08:06 AM, Tejinder Aulakh wrote:
> > > > > > We are using Flume-NG to transfer logs from the log servers (40) to
> > > > > > collectors (4). However, now we are seeing a long delay before the
> > > > > > logs show up at the collectors.
> > > > > > 
> > > > > > I was wondering if there is any way of running Flume-NG using Multiple
> > > > > > threads so that collectors can process all the load they get form the
> > > > > > log servers. Is it configurable in flume-env.sh? How many threads does
> > > > > > Flume-NG use by default?
> > > > > > 
> > > > > > Thanks,
> > > > > > Tejinder
> > > > > 
> > > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > --
> > > Tejinder Aulakh
> > > Senior Software Engineer, ShareThis
> > > e: tejinder@sharethis.com
> > > m: 510.708.2499
> > >
> > > Learn More: SQI (Social Quality Index) - A Universal Measure of Social Quality (http://sharethis.com/sqi)
> > 
> 
> 
> 
> 
> -- 
> Tejinder Aulakh
> Senior Software Engineer, ShareThis
> e: tejinder@sharethis.com
> m: 510.708.2499
> 
> Learn More: SQI (Social Quality Index) - A Universal Measure of Social Quality (http://sharethis.com/sqi) 
> 
> 
> 
> 



Re: Running Flume-NG in multiple threads

Posted by Tejinder Aulakh <te...@sharethis.com>.
Thanks Mike. Adding more sinks seems to have helped from what I'm noticing
right now.

Tejinder

On Wed, Jun 27, 2012 at 12:39 AM, Mike Percy <mp...@cloudera.com> wrote:

> Maybe your parsing is taking some time?
>
> Try adding multiple sinks or sink groups to the same channel on the app
> tier to get more throughput.
>
> Regards,
> Mike
>
>
> On Tuesday, June 26, 2012 at 10:14 PM, Tejinder Aulakh wrote:
>
> > Hi Mike,
> >
> > Below is the config for agents and collectors.
> >
> > CassandraSinkST2 - puts the events in Cassandra
> > AvroSinkParserST2 - customized avro sink which parses the events before
> sending.
> >
> > So you are saying if we simply add more sinks (CassandraSinkST2 on
> collectors) and connect them to the same channel, it would speed things up?
> >
> > -Tejinder
> >
> > Agent
> > -------
> > #ch source, channel, and sink to use
> > agent.channels = myMemoryChannel
> > agent.sources = myExecSource
> > agent.sinks = collector1002sink collector1003sink stFileSink
> > agent.sinkgroups = collectorGroup
> >
> > # Define a memory channel called myMemoryChannel
> > agent.channels.myMemoryChannel.type = memory
> > agent.channels.myMemoryChannel.capacity = 1000000
> > agent.channels.myMemoryChannel.transactionCapacity = 10000
> > agent.channels.myMemoryChannel.keep-alive = 30
> >
> > # Define an exec source called myExecChannel to tail log file
> > agent.sources.myExecSource.channels = myMemoryChannel
> > agent.sources.myExecSource.type = exec
> > agent.sources.myExecSource.command = tail -F /mnt/nginx/r.log
> >
> > # Define a custom avro sink called collector1003sink
> > agent.sinks.collector1003sink.channel = myMemoryChannel
> > agent.sinks.collector1003sink.type = com.sharethis.web.flume.AvroSinkParserST2
> > agent.sinks.collector1003sink.hostname = {PRIMARY_IP}
> > agent.sinks.collector1003sink.port = 45678
> > agent.sinks.collector1003sink.batch-size = 100
> >
> > # Define a custom avro sink called collector1002sink
> > agent.sinks.collector1002sink.channel = myMemoryChannel
> > agent.sinks.collector1002sink.type = com.sharethis.web.flume.AvroSinkParserST2
> > agent.sinks.collector1002sink.hostname = {BACKUP_IP}
> > agent.sinks.collector1002sink.port = 45678
> > agent.sinks.collector1002sink.batch-size = 100
> >
> > # logger sink called stFileSink
> > agent.sinks.stFileSink.channel = myMemoryChannel
> > agent.sinks.stFileSink.type = file_roll
> > agent.sinks.stFileSink.sink.directory = /var/tmp/unprocessed_events/
> > agent.sinks.stFileSink.type.sink.rollInterval = 1800
> >
> > # configure sinkgroup agentGroup. sinks with higher priorities are run first
> > agent.sinkgroups.collectorGroup.sinks = collector1003sink collector1002sink stFileSink
> > agent.sinkgroups.collectorGroup.processor.type = failover
> > agent.sinkgroups.collectorGroup.processor.priority.collector1003sink = 10
> > agent.sinkgroups.collectorGroup.processor.priority.collector1002sink = 15
> > agent.sinkgroups.collectorGroup.processor.priority.stFileSink = 5
> > agent.sinkgroups.collectorGroup.processor.maxpenalty = 10000
> >
> >
> > Collector
> > -------------
> > # Tell Flume which source, channel, and sink to use
> > collector.channels = collectorMemoryChannel
> > collector.sources = collectorAvroSource
> > collector.sinks = collectorCassandraSink collectorFile
> > collector.sinkgroups = collectorGroup
> >
> > # Define a memory channel called collectorMemoryChannel
> > collector.channels.collectorMemoryChannel.type = memory
> > collector.channels.collectorMemoryChannel.capacity = 10000000
> > collector.channels.collectorMemoryChannel.transactionCapacity = 100000
> > collector.channels.collectorMemoryChannel.keep-alive = 30
> >
> > # Define an exec source called collectorAvroSource
> > collector.sources.collectorAvroSource.channels = collectorMemoryChannel
> > collector.sources.collectorAvroSource.type = avro
> > collector.sources.collectorAvroSource.bind = 0.0.0.0
> > collector.sources.collectorAvroSource.port = 45678
> >
> > # Define a custom sink called collectorCustomSink
> > collector.sinks.collectorCassandraSink.channel = collectorMemoryChannel
> > collector.sinks.collectorCassandraSink.type = com.sharethis.web.flume.CassandraSinkST2
> > collector.sinks.collectorCassandraSink.cassandraHostname = {CASSANDRA_HOSTNAME}
> > collector.sinks.collectorCassandraSink.cassandraPort = 9160
> >
> > # logger sink called collectorFile
> > collector.sinks.collectorFile.channel = collectorMemoryChannel
> > collector.sinks.collectorFile.type = file_roll
> > collector.sinks.collectorFile.sink.directory = /var/tmp/unprocessed_events/
> > collector.sinks.collectorFile.type.sink.rollInterval = 1800
> >
> > # configure sinkgroup collectorGroup. sinks with higher priorities are run first
> > collector.sinkgroups.collectorGroup.sinks = collectorCassandraSink collectorFile
> > collector.sinkgroups.collectorGroup.processor.type = failover
> > collector.sinkgroups.collectorGroup.processor.priority.collectorCassandraSink = 15
> > collector.sinkgroups.collectorGroup.processor.priority.collectorFile = 5
> > collector.sinkgroups.collectorGroup.processor.maxpenalty = 10000
> >
> >
> > On Tue, Jun 26, 2012 at 7:01 PM, Mike Percy <mpercy@cloudera.com> wrote:
> > > We are able to push > 8000 events/sec (2KB per event) through a single
> file channel if you put checkpoint on one disk and use 2 other disks for
> data dirs. Not sure what the limit is. This is using the latest trunk code.
> Other limitations may be you need to add additional sinks to your channel
> to drain it faster. This is because sinks are single threaded and sources
> are multithreaded.
> > >
> > > Mike
> > >
> > >
> > > On Tuesday, June 26, 2012 at 6:30 PM, Juhani Connolly wrote:
> > >
> > > > Depending on your setup, you may find that some channels just cannot
> > > > keep up with throughput. Is the timestamp on the logs gradually
> falling
> > > > further behind the time it appears on the collector? If so, some
> > > > component(likely the channels) are turning into a bottleneck.
> > > >
> > > > We have about 1000 events/sec running through to our collector and
> using
> > > > file channel it unfortunately could not keep up. If you're not
> already,
> > > > try running memory channels and see how that works.
> > > >
> > > > On 06/27/2012 08:06 AM, Tejinder Aulakh wrote:
> > > > > We are using Flume-NG to transfer logs from the log servers (40) to
> > > > > collectors (4). However, now we are seeing a long delay before the
> > > > > logs show up at the collectors.
> > > > >
> > > > > I was wondering if there is any way of running Flume-NG using
> Multiple
> > > > > threads so that collectors can process all the load they get form
> the
> > > > > log servers. Is it configurable in flume-env.sh? How many threads does
> > > > > Flume-NG use by default?
> > > > >
> > > > > Thanks,
> > > > > Tejinder
> > > >
> > >
> >
> >
> >
> >
> > --
> > Tejinder Aulakh
> > Senior Software Engineer, ShareThis
> > e: tejinder@sharethis.com
> > m: 510.708.2499
> >
> > Learn More: SQI (Social Quality Index) - A Universal Measure of Social Quality (http://sharethis.com/sqi)
> >
> >
> >
> >
>
>
>


-- 
Tejinder Aulakh
Senior Software Engineer, ShareThis
e: tejinder@sharethis.com
m: 510.708.2499

Learn More: SQI (Social Quality Index) - A Universal Measure of Social
Quality <http://sharethis.com/sqi>

Re: Running Flume-NG in multiple threads

Posted by Mike Percy <mp...@cloudera.com>.
Maybe your parsing is taking some time?

Try adding multiple sinks or sink groups to the same channel on the app tier to get more throughput.
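
For example, on the collector tier something like the sketch below gives the
channel an extra sink runner thread taking events in parallel (the second sink
name is made up; it just reuses your CassandraSinkST2 class and settings):

# additional Cassandra sink draining the same channel
collector.sinks = collectorCassandraSink collectorCassandraSink2 collectorFile

collector.sinks.collectorCassandraSink2.channel = collectorMemoryChannel
collector.sinks.collectorCassandraSink2.type = com.sharethis.web.flume.CassandraSinkST2
collector.sinks.collectorCassandraSink2.cassandraHostname = {CASSANDRA_HOSTNAME}
collector.sinks.collectorCassandraSink2.cassandraPort = 9160

Each sink outside a sink group gets its own runner thread, so two Cassandra
sinks can take from the same channel concurrently; if the new sink should also
take part in your failover group, the group's sink list and priorities would
need updating as well.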

Regards,
Mike


On Tuesday, June 26, 2012 at 10:14 PM, Tejinder Aulakh wrote:

> Hi Mike, 
> 
> Below is the config for agents and collectors. 
> 
> CassandraSinkST2 - puts the events in Cassandra
> AvroSinkParserST2 - customized avro sink which parses the events before sending. 
> 
> So you are saying if we simply add more sinks (CassandraSinkST2 on collectors) and connect them to the same channel, it would speed things up? 
> 
> -Tejinder
> 
> Agent
> -------
> #ch source, channel, and sink to use
> agent.channels = myMemoryChannel
> agent.sources = myExecSource
> agent.sinks = collector1002sink collector1003sink stFileSink
> agent.sinkgroups = collectorGroup
> 
> # Define a memory channel called myMemoryChannel
> agent.channels.myMemoryChannel.type = memory
> agent.channels.myMemoryChannel.capacity = 1000000
> agent.channels.myMemoryChannel.transactionCapacity = 10000
> agent.channels.myMemoryChannel.keep-alive = 30
> 
> # Define an exec source called myExecChannel to tail log file
> agent.sources.myExecSource.channels = myMemoryChannel
> agent.sources.myExecSource.type = exec
> agent.sources.myExecSource.command = tail -F /mnt/nginx/r.log
> 
> # Define a custom avro sink called collector1003sink
> agent.sinks.collector1003sink.channel = myMemoryChannel
> agent.sinks.collector1003sink.type = com.sharethis.web.flume.AvroSinkParserST2
> agent.sinks.collector1003sink.hostname = {PRIMARY_IP}
> agent.sinks.collector1003sink.port = 45678
> agent.sinks.collector1003sink.batch-size = 100
> 
> # Define a custom avro sink called collector1002sink
> agent.sinks.collector1002sink.channel = myMemoryChannel
> agent.sinks.collector1002sink.type = com.sharethis.web.flume.AvroSinkParserST2
> agent.sinks.collector1002sink.hostname = {BACKUP_IP}
> agent.sinks.collector1002sink.port = 45678
> agent.sinks.collector1002sink.batch-size = 100
> 
> # logger sink called stFileSink 
> agent.sinks.stFileSink.channel = myMemoryChannel
> agent.sinks.stFileSink.type = file_roll
> agent.sinks.stFileSink.sink.directory = /var/tmp/unprocessed_events/
> agent.sinks.stFileSink.type.sink.rollInterval = 1800
> 
> # configure sinkgroup agentGroup. sinks with higher priorities are run first
> agent.sinkgroups.collectorGroup.sinks = collector1003sink collector1002sink stFileSink
> agent.sinkgroups.collectorGroup.processor.type = failover
> agent.sinkgroups.collectorGroup.processor.priority.collector1003sink = 10
> agent.sinkgroups.collectorGroup.processor.priority.collector1002sink = 15
> agent.sinkgroups.collectorGroup.processor.priority.stFileSink = 5
> agent.sinkgroups.collectorGroup.processor.maxpenalty = 10000
> 
> 
> Collector
> -------------
> # Tell Flume which source, channel, and sink to use
> collector.channels = collectorMemoryChannel
> collector.sources = collectorAvroSource
> collector.sinks = collectorCassandraSink collectorFile
> collector.sinkgroups = collectorGroup
> 
> # Define a memory channel called collectorMemoryChannel 
> collector.channels.collectorMemoryChannel.type = memory
> collector.channels.collectorMemoryChannel.capacity = 10000000
> collector.channels.collectorMemoryChannel.transactionCapacity = 100000
> collector.channels.collectorMemoryChannel.keep-alive = 30
> 
> # Define an exec source called collectorAvroSource
> collector.sources.collectorAvroSource.channels = collectorMemoryChannel
> collector.sources.collectorAvroSource.type = avro
> collector.sources.collectorAvroSource.bind = 0.0.0.0
> collector.sources.collectorAvroSource.port = 45678
> 
> # Define a custom sink called collectorCustomSink 
> collector.sinks.collectorCassandraSink.channel = collectorMemoryChannel
> collector.sinks.collectorCassandraSink.type = com.sharethis.web.flume.CassandraSinkST2
> collector.sinks.collectorCassandraSink.cassandraHostname = {CASSANDRA_HOSTNAME}
> collector.sinks.collectorCassandraSink.cassandraPort = 9160
> 
> # logger sink called collectorFile
> collector.sinks.collectorFile.channel = collectorMemoryChannel
> collector.sinks.collectorFile.type = file_roll
> collector.sinks.collectorFile.sink.directory = /var/tmp/unprocessed_events/
> collector.sinks.collectorFile.type.sink.rollInterval = 1800
> 
> # configure sinkgroup collectorGroup. sinks with higher priorities are run first 
> collector.sinkgroups.collectorGroup.sinks = collectorCassandraSink collectorFile
> collector.sinkgroups.collectorGroup.processor.type = failover
> collector.sinkgroups.collectorGroup.processor.priority.collectorCassandraSink = 15
> collector.sinkgroups.collectorGroup.processor.priority.collectorFile = 5
> collector.sinkgroups.collectorGroup.processor.maxpenalty = 10000
> 
> 
> On Tue, Jun 26, 2012 at 7:01 PM, Mike Percy <mpercy@cloudera.com> wrote:
> > We are able to push > 8000 events/sec (2KB per event) through a single file channel if you put checkpoint on one disk and use 2 other disks for data dirs. Not sure what the limit is. This is using the latest trunk code. Other limitations may be you need to add additional sinks to your channel to drain it faster. This is because sinks are single threaded and sources are multithreaded.
> > 
> > Mike
> > 
> > 
> > On Tuesday, June 26, 2012 at 6:30 PM, Juhani Connolly wrote:
> > 
> > > Depending on your setup, you may find that some channels just cannot
> > > keep up with throughput. Is the timestamp on the logs gradually falling
> > > further behind the time it appears on the collector? If so, some
> > > component(likely the channels) are turning into a bottleneck.
> > > 
> > > We have about 1000 events/sec running through to our collector and using
> > > file channel it unfortunately could not keep up. If you're not already,
> > > try running memory channels and see how that works.
> > > 
> > > On 06/27/2012 08:06 AM, Tejinder Aulakh wrote:
> > > > We are using Flume-NG to transfer logs from the log servers (40) to
> > > > collectors (4). However, now we are seeing a long delay before the
> > > > logs show up at the collectors.
> > > > 
> > > > I was wondering if there is any way of running Flume-NG using Multiple
> > > > threads so that collectors can process all the load they get form the
> > > > log servers. Is it configurable in flume-env.sh? How many threads does
> > > > Flume-NG use by default?
> > > > 
> > > > Thanks,
> > > > Tejinder
> > > 
> > 
> 
> 
> 
> 
> -- 
> Tejinder Aulakh
> Senior Software Engineer, ShareThis
> e: tejinder@sharethis.com
> m: 510.708.2499
> 
> Learn More: SQI (Social Quality Index) - A Universal Measure of Social Quality (http://sharethis.com/sqi) 
> 
> 
> 
> 



Re: Running Flume-NG in multiple threads

Posted by Tejinder Aulakh <te...@sharethis.com>.
Hi Mike,

Below is the config for agents and collectors.

CassandraSinkST2 - puts the events in Cassandra
AvroSinkParserST2 - customized avro sink which parses the events before
sending.

So you are saying if we simply add more sinks (CassandraSinkST2 on
collectors) and connect them to the same channel, it would speed things up?

-Tejinder

Agent
-------
#ch source, channel, and sink to use
agent.channels = myMemoryChannel
agent.sources = myExecSource
agent.sinks = collector1002sink collector1003sink stFileSink
agent.sinkgroups = collectorGroup

# Define a memory channel called myMemoryChannel
agent.channels.myMemoryChannel.type = memory
agent.channels.myMemoryChannel.capacity = 1000000
agent.channels.myMemoryChannel.transactionCapacity = 10000
agent.channels.myMemoryChannel.keep-alive = 30

# Define an exec source called myExecChannel to tail log file
agent.sources.myExecSource.channels = myMemoryChannel
agent.sources.myExecSource.type = exec
agent.sources.myExecSource.command = tail -F /mnt/nginx/r.log

# Define a custom avro sink called collector1003sink
agent.sinks.collector1003sink.channel = myMemoryChannel
agent.sinks.collector1003sink.type = com.sharethis.web.flume.AvroSinkParserST2
agent.sinks.collector1003sink.hostname = {PRIMARY_IP}
agent.sinks.collector1003sink.port = 45678
agent.sinks.collector1003sink.batch-size = 100

# Define a custom avro sink called collector1002sink
agent.sinks.collector1002sink.channel = myMemoryChannel
agent.sinks.collector1002sink.type = com.sharethis.web.flume.AvroSinkParserST2
agent.sinks.collector1002sink.hostname = {BACKUP_IP}
agent.sinks.collector1002sink.port = 45678
agent.sinks.collector1002sink.batch-size = 100

# logger sink called stFileSink
agent.sinks.stFileSink.channel = myMemoryChannel
agent.sinks.stFileSink.type = file_roll
agent.sinks.stFileSink.sink.directory = /var/tmp/unprocessed_events/
agent.sinks.stFileSink.type.sink.rollInterval = 1800

# configure sinkgroup agentGroup. sinks with higher priorities are run first
agent.sinkgroups.collectorGroup.sinks = collector1003sink collector1002sink stFileSink
agent.sinkgroups.collectorGroup.processor.type = failover
agent.sinkgroups.collectorGroup.processor.priority.collector1003sink = 10
agent.sinkgroups.collectorGroup.processor.priority.collector1002sink = 15
agent.sinkgroups.collectorGroup.processor.priority.stFileSink = 5
agent.sinkgroups.collectorGroup.processor.maxpenalty = 10000


Collector
-------------
# Tell Flume which source, channel, and sink to use
collector.channels = collectorMemoryChannel
collector.sources = collectorAvroSource
collector.sinks = collectorCassandraSink collectorFile
collector.sinkgroups = collectorGroup

# Define a memory channel called collectorMemoryChannel
collector.channels.collectorMemoryChannel.type = memory
collector.channels.collectorMemoryChannel.capacity = 10000000
collector.channels.collectorMemoryChannel.transactionCapacity = 100000
collector.channels.collectorMemoryChannel.keep-alive = 30

# Define an exec source called collectorAvroSource
collector.sources.collectorAvroSource.channels = collectorMemoryChannel
collector.sources.collectorAvroSource.type = avro
collector.sources.collectorAvroSource.bind = 0.0.0.0
collector.sources.collectorAvroSource.port = 45678

# Define a custom sink called collectorCustomSink
collector.sinks.collectorCassandraSink.channel = collectorMemoryChannel
collector.sinks.collectorCassandraSink.type = com.sharethis.web.flume.CassandraSinkST2
collector.sinks.collectorCassandraSink.cassandraHostname = {CASSANDRA_HOSTNAME}
collector.sinks.collectorCassandraSink.cassandraPort = 9160

# logger sink called collectorFile
collector.sinks.collectorFile.channel = collectorMemoryChannel
collector.sinks.collectorFile.type = file_roll
collector.sinks.collectorFile.sink.directory = /var/tmp/unprocessed_events/
collector.sinks.collectorFile.type.sink.rollInterval = 1800

# configure sinkgroup collectorGroup. sinks with higher priorities are run first
collector.sinkgroups.collectorGroup.sinks = collectorCassandraSink collectorFile
collector.sinkgroups.collectorGroup.processor.type = failover
collector.sinkgroups.collectorGroup.processor.priority.collectorCassandraSink = 15
collector.sinkgroups.collectorGroup.processor.priority.collectorFile = 5
collector.sinkgroups.collectorGroup.processor.maxpenalty = 10000
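
A side note on the file_roll sinks above: as far as I can tell the roll
interval is configured as <sinkname>.sink.rollInterval, so the two
type.sink.rollInterval lines are probably being ignored and the default
roll interval used instead. If so, the intended lines would presumably be:

agent.sinks.stFileSink.sink.rollInterval = 1800
collector.sinks.collectorFile.sink.rollInterval = 1800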


On Tue, Jun 26, 2012 at 7:01 PM, Mike Percy <mp...@cloudera.com> wrote:

> We are able to push > 8000 events/sec (2KB per event) through a single
> file channel if you put checkpoint on one disk and use 2 other disks for
> data dirs. Not sure what the limit is. This is using the latest trunk code.
> Other limitations may be you need to add additional sinks to your channel
> to drain it faster. This is because sinks are single threaded and sources
> are multithreaded.
>
> Mike
>
>
> On Tuesday, June 26, 2012 at 6:30 PM, Juhani Connolly wrote:
>
> > Depending on your setup, you may find that some channels just cannot
> > keep up with throughput. Is the timestamp on the logs gradually falling
> > further behind the time it appears on the collector? If so, some
> > component(likely the channels) are turning into a bottleneck.
> >
> > We have about 1000 events/sec running through to our collector and using
> > file channel it unfortunately could not keep up. If you're not already,
> > try running memory channels and see how that works.
> >
> > On 06/27/2012 08:06 AM, Tejinder Aulakh wrote:
> > > We are using Flume-NG to transfer logs from the log servers (40) to
> > > collectors (4). However, now we are seeing a long delay before the
> > > logs show up at the collectors.
> > >
> > > I was wondering if there is any way of running Flume-NG using Multiple
> > > threads so that collectors can process all the load they get form the
> > > log servers. Is it configurable in flume-env.sh?
> How many threads does
> > > Flume-NG use by default?
> > >
> > > Thanks,
> > > Tejinder
> >
>
>
>
>


-- 
Tejinder Aulakh
Senior Software Engineer, ShareThis
e: tejinder@sharethis.com
m: 510.708.2499

Learn More: SQI (Social Quality Index) - A Universal Measure of Social
Quality <http://sharethis.com/sqi>

Re: File channel performance on a single disk is poor

Posted by Juhani Connolly <ju...@cyberagent.co.jp>.
Hi, thanks for clarifying.

On 07/10/2012 06:36 PM, Arvind Prabhakar wrote:
> Hi,
>
> On Sun, Jul 8, 2012 at 11:14 PM, Juhani Connolly 
> <juhani_connolly@cyberagent.co.jp 
> <ma...@cyberagent.co.jp>> wrote:
>
>     Another matter that I'm curious of is whether or not we actually
>     need separate files for the data and checkpoints...
>
>
> The data file and checkpoint files serve different purpose. Checkpoint 
> resides in memory and simulates the channel. The only difference is 
> that it does not store the data in the queue itself, but pointers to 
> data that resides in the log files. As a result the memory footprint 
> of the checkpoint is very small regardless of how big each event 
> payload is. This size only depends upon the capacity of the channel 
> and nothing else.
This is more or less what I expected. Am I correct in believing that 
each commit has to seek back and forth between two different files? 
This would make all access on a single disk non-sequential.

>     Can we not add a magic header before each type of entry to
>     differentiate, and thus guarantee significantly more sequential
>     access?
>
>
> In the general case access will be sequential. In the best case, the 
> channel will have moved the writes to new log files and continue to do 
> reads from old (rolled) files which reduce seek contention. From what 
> I know, I don't think it will be trivial to affect your suggested 
> change without significantly impacting the entire logic of the channel.

I'm not sure how it reduces the seek contention if the files are all on 
the same disk. I don't think the reads are that painful; a lot of it is 
hopefully taken care of by the OS cache...

Implementation would likely be difficult, yes. I've only had a high-level 
look at the code, and haven't tried to do it because of this. As you 
suggest, it might be better to have a separate implementation.
>
>     What is killing performance on a single disk right now is the
>     constant seeks. The problem with this though would be putting
>     together a file format that allows quick seeking through to the
>     correct position, and rolling would be a lot harder. I think this
>     is a lot more difficult and might be more of a long term target.
>
>
> Perhaps what you are describing is a different type of persistent 
> channel that is optimized for high latency IO systems. I would 
> encourage you to take your idea one step further and see if that can 
> be drafted as yet another channel that serves this particular use-case.
>

I'd like to do this, though it seems quite involved. Hopefully I can get 
some time to figure it out later down the road. Jarcec's spillable 
channel should also help on this front.
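
That spillable channel idea did later ship as Flume's spillable memory
channel. Assuming a build that includes it, the configuration is roughly
the sketch below (names, sizes, and paths are made up): it behaves as a
memory channel until the in-memory queue fills, then overflows to a
file-channel-style store on disk.

agent.channels.spillChannel.type = SPILLABLEMEMORY
agent.channels.spillChannel.memoryCapacity = 10000
agent.channels.spillChannel.overflowCapacity = 1000000
agent.channels.spillChannel.checkpointDir = /mnt/flume/spill/checkpoint
agent.channels.spillChannel.dataDirs = /mnt/flume/spill/data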

For the time being, I've resolved the issue for us with a workaround: 
limiting the number of commits (by making ExecSource commit multiple 
entries at a time).
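
For what it's worth, newer Flume releases expose this directly as a
batchSize property on the exec source, so assuming a version that has it,
the same workaround is just configuration (source and channel names here
match the agent config earlier in the thread):

agent.sources.myExecSource.channels = myMemoryChannel
agent.sources.myExecSource.type = exec
agent.sources.myExecSource.command = tail -F /mnt/nginx/r.log
# commit up to 100 lines per channel transaction instead of one at a time
agent.sources.myExecSource.batchSize = 100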

My concern is that FileChannel is represented by a number of people as 
having good performance, when at the moment that is only true in one of 
two cases: multiple disks, or batched transactions.

Thanks,
  Juhani Connolly

> Regards,
> Arvind Prabhakar
>
>
>
>     Juhani
>
>
>>     Regards,
>>     Arvind Prabhakar
>>
>>
>>     On Wed, Jul 4, 2012 at 3:33 AM, Juhani Connolly
>>     <juhani_connolly@cyberagent.co.jp
>>     <ma...@cyberagent.co.jp>> wrote:
>>
>>         It looks good to me as it provides a nice balance between
>>         reliability and throughput.
>>
>>         It's certainly one possible solution to the issue, though I
>>         do believe that the current one could be made more friendly
>>         towards single disk access(e.g. batching writes to the disk
>>         may well be doable and would be curious what someone with
>>         more familiarity with the implementation thinks).
>>
>>
>>         On 07/04/2012 06:36 PM, Jarek Jarcec Cecho wrote:
>>
>>             We had connected discussion about this "SpillableChannel"
>>             (working name) on FLUME-1045 and I believe that consensus
>>             is that we will create something like that. In fact, I'm
>>             planning to do it myself in near future - I just need to
>>             prioritize my todo list first.
>>
>>             Jarcec
>>
>>             On Wed, Jul 04, 2012 at 06:13:43PM +0900, Juhani Connolly
>>             wrote:
>>
>>                 Yes... I was actually poking around for that issue as
>>                 I remembered
>>                 seeing it before.  I had before also suggested a
>>                 compound channel
>>                 that would have worked like the buffer store in
>>                 scribe, but general
>>                 opinion was that it provided too many mixed
>>                 configurations that
>>                 could make testings and verifying correctness difficult.
>>
>>                 On 07/04/2012 04:33 PM, Jarek Jarcec Cecho wrote:
>>
>>                     Hi Juhally,
>>                     while ago I've filled jira FLUME-1227 where I've
>>                     suggested creating some sort of SpillableChannel
>>                     that would behave similarly as scribe. It would
>>                     be normally acting as memory channel and it would
>>                     start spilling data to disk in case that it would
>>                     get full (my primary goal here was to solve issue
>>                     when remote goes down, for example in case of
>>                     HDFS maintenance). Would it be helpful for your case?
>>
>>                     Jarcec
>>
>>                     On Wed, Jul 04, 2012 at 04:07:48PM +0900, Juhani
>>                     Connolly wrote:
>>
>>                         Evaluating flume on some of our servers, the
>>                         file channel seems very
>>                         slow, likely because like most typical web
>>                         servers ours have a
>>                         single raided disk available for writing to.
>>
>>                         Quoted below is a suggestion from a  previous
>>                         issue where our poor
>>                         throughput came up, where it turns out that
>>                         on multiple disks, file
>>                         channel performance is great.
>>
>>                         On 06/27/2012 11:01 AM, Mike Percy wrote:
>>
>>                             We are able to push > 8000 events/sec
>>                             (2KB per event) through a single file
>>                             channel if you put checkpoint on one disk
>>                             and use 2 other disks for data dirs. Not
>>                             sure what the limit is. This is using the
>>                             latest trunk code. Other limitations may
>>                             be you need to add additional sinks to
>>                             your channel to drain it faster. This is
>>                             because sinks are single threaded and
>>                             sources are multithreaded.
>>
>>                             Mike
>>
>>                         For the case where the disks happen to be
>>                         available on the server,
>>                         that's fantastic, but I suspect that most use
>>                         cases are going to be
>>                         similar to ours, where multiple disks are not
>>                         available. Our use
>>                         case isn't unusual as it's primarily
>>                         aggregating logs from various
>>                         services.
>>
>>                         We originally ran our log servers with a
>>                         exec(tail)->file->avro
>>                         setup where throughput was very bad(80mb in
>>                         an hour). We then
>>                         switched this to a memory channel which was
>>                         fine(the peak time 500mb
>>                         worth of hourly logs went through).
>>                         Afterwards we switched back to
>>                         the file channel, but with 5 identical avro
>>                         sinks. This did not
>>                         improve throughput(still 80mb).
>>                         RecoverableMemoryChannel showed very
>>                         similar characteristics.
>>
>>                         I presume this is due to the writes going to
>>                         two separate places,
>>                         and being further compounded by also writing
>>                         out and tailing the
>>                         normal web logs: checking top and iostat, we
>>                         could confirm we have
>>                         significant iowait time, far more than we
>>                         have during typical
>>                         operation.
>>
>>                         As it is, we seem to be more or less
>>                         guaranteeing no loss of logs
>>                         with the file channel. Perhaps we could look
>>                         into batching
>>                         puts/takes for those that do not need 100%
>>                         data retention but want
>>                         more reliability than with the MemoryChannel
>>                         which can potentially
>>                         lose the entire capacity on a restart?
>>                         Another possibility is
>>                         writing an implementation that writes
>>                         primarily sequentially. I've
>>                         been meaning to get a deeper look at the
>>                         implementation itself to
>>                         give a more informed commentary on the
>>                         contents but unfortunately
>>                         don't have the cycles right now, hopefully
>>                         someone with a better
>>                         understanding of the current
>>                         implementation(along with its
>>                         interaction with the OS file cache) can
>>                         comment on this.
>>
>>
>>
>>
>>
>
>
>



Re: File channel performance on a single disk is poor

Posted by Arvind Prabhakar <ar...@apache.org>.
Hi,

On Sun, Jul 8, 2012 at 11:14 PM, Juhani Connolly <
juhani_connolly@cyberagent.co.jp> wrote:
>
>  Another matter that I'm curious of is whether or not we actually need
> separate files for the data and checkpoints...
>

The data file and checkpoint files serve different purposes. The checkpoint
resides in memory and simulates the channel. The only difference is that it
does not store the data in the queue itself, but pointers to data that
resides in the log files. As a result the memory footprint of the
checkpoint is very small regardless of how big each event payload is. This
size only depends upon the capacity of the channel and nothing else.


> Can we not add a magic header before each type of entry to differentiate,
> and thus guarantee significantly more sequential access?
>

In the general case access will be sequential. In the best case, the
channel will have moved the writes to new log files and will continue to
do reads from old (rolled) files, which reduces seek contention. From what
I know, I don't think it will be trivial to effect your suggested change
without significantly impacting the entire logic of the channel.


> What is killing performance on a single disk right now is the constant
> seeks. The problem with this though would be putting together a file format
> that allows quick seeking through to the correct position, and rolling
> would be a lot harder. I think this is a lot more difficult and might be
> more of a long term target.
>

Perhaps what you are describing is a different type of persistent channel
that is optimized for high latency IO systems. I would encourage you to
take your idea one step further and see if that can be drafted as yet
another channel that serves this particular use-case.

Regards,
Arvind Prabhakar


>
>
> Juhani
>
>
>  Regards,
> Arvind Prabhakar
>
>
> On Wed, Jul 4, 2012 at 3:33 AM, Juhani Connolly <
> juhani_connolly@cyberagent.co.jp> wrote:
>
>> It looks good to me as it provides a nice balance between reliability and
>> throughput.
>>
>> It's certainly one possible solution to the issue, though I do believe
>> that the current one could be made more friendly towards single disk
>> access(e.g. batching writes to the disk may well be doable and would be
>> curious what someone with more familiarity with the implementation thinks).
>>
>>
>> On 07/04/2012 06:36 PM, Jarek Jarcec Cecho wrote:
>>
>>> We had connected discussion about this "SpillableChannel" (working name)
>>> on FLUME-1045 and I believe that consensus is that we will create something
>>> like that. In fact, I'm planning to do it myself in near future - I just
>>> need to prioritize my todo list first.
>>>
>>> Jarcec
>>>
>>> On Wed, Jul 04, 2012 at 06:13:43PM +0900, Juhani Connolly wrote:
>>>
>>>> Yes... I was actually poking around for that issue as I remembered
>>>> seeing it before.  I had before also suggested a compound channel
>>>> that would have worked like the buffer store in scribe, but general
>>>> opinion was that it provided too many mixed configurations that
>>>> could make testings and verifying correctness difficult.
>>>>
>>>> On 07/04/2012 04:33 PM, Jarek Jarcec Cecho wrote:
>>>>
>>>>> Hi Juhally,
>>>>> while ago I've filled jira FLUME-1227 where I've suggested creating
>>>>> some sort of SpillableChannel that would behave similarly as scribe. It
>>>>> would be normally acting as memory channel and it would start spilling data
>>>>> to disk in case that it would get full (my primary goal here was to solve
>>>>> issue when remote goes down, for example in case of HDFS maintenance).
>>>>> Would it be helpful for your case?
>>>>>
>>>>> Jarcec
>>>>>
>>>>> On Wed, Jul 04, 2012 at 04:07:48PM +0900, Juhani Connolly wrote:
>>>>>
>>>>>> Evaluating flume on some of our servers, the file channel seems very
>>>>>> slow, likely because like most typical web servers ours have a
>>>>>> single raided disk available for writing to.
>>>>>>
>>>>>> Quoted below is a suggestion from a  previous issue where our poor
>>>>>> throughput came up, where it turns out that on multiple disks, file
>>>>>> channel performance is great.
>>>>>>
>>>>>> On 06/27/2012 11:01 AM, Mike Percy wrote:
>>>>>>
>>>>>>> We are able to push > 8000 events/sec (2KB per event) through a
>>>>>>> single file channel if you put checkpoint on one disk and use 2 other disks
>>>>>>> for data dirs. Not sure what the limit is. This is using the latest trunk
>>>>>>> code. Other limitations may be you need to add additional sinks to your
>>>>>>> channel to drain it faster. This is because sinks are single threaded and
>>>>>>> sources are multithreaded.
>>>>>>>
>>>>>>> Mike
>>>>>>>
>>>>>> For the case where the disks happen to be available on the server,
>>>>>> that's fantastic, but I suspect that most use cases are going to be
>>>>>> similar to ours, where multiple disks are not available. Our use
>>>>>> case isn't unusual as it's primarily aggregating logs from various
>>>>>> services.
>>>>>>
>>>>>> We originally ran our log servers with a exec(tail)->file->avro
>>>>>> setup where throughput was very bad(80mb in an hour). We then
>>>>>> switched this to a memory channel which was fine(the peak time 500mb
>>>>>> worth of hourly logs went through). Afterwards we switched back to
>>>>>> the file channel, but with 5 identical avro sinks. This did not
>>>>>> improve throughput(still 80mb). RecoverableMemoryChannel showed very
>>>>>> similar characteristics.
>>>>>>
>>>>>> I presume this is due to the writes going to two separate places,
>>>>>> and being further compounded by also writing out and tailing the
>>>>>> normal web logs: checking top and iostat, we could confirm we have
>>>>>> significant iowait time, far more than we have during typical
>>>>>> operation.
>>>>>>
>>>>>> As it is, we seem to be more or less guaranteeing no loss of logs
>>>>>> with the file channel. Perhaps we could look into batching
>>>>>> puts/takes for those that do not need 100% data retention but want
>>>>>> more reliability than with the MemoryChannel which can potentially
>>>>>> lose the entire capacity on a restart? Another possibility is
>>>>>> writing an implementation that writes primarily sequentially. I've
>>>>>> been meaning to get a deeper look at the implementation itself to
>>>>>> give a more informed commentary on the contents but unfortunately
>>>>>> don't have the cycles right now, hopefully someone with a better
>>>>>> understanding of the current implementation(along with its
>>>>>> interaction with the OS file cache) can comment on this.
>>>>>>
>>>>>>
>>>>
>>
>>
>
>
>

Re: File channel performance on a single disk is poor

Posted by Juhani Connolly <ju...@cyberagent.co.jp>.
Hi, thanks for your input.

On 07/09/2012 02:42 PM, Arvind Prabhakar wrote:
> Hi,
>
> > It's certainly one possible solution to the issue, though I do
> > believe that the current one could be made more friendly
> > towards single disk access(e.g. batching writes to the disk
> > may well be doable and would be curious what someone
> > with more familiarity with the implementation thinks).
>
> The implementation of the file channel is that of a write ahead log, 
> in that it serializes all the actions as they happen. Using these 
> actions, it can reconstruct the state of the channel at anytime. There 
> are two mutually exclusive transaction types it supports - a 
> transaction consisting of puts, and one consisting of takes. It may be 
> possible to use the heap to batch the puts and takes and serialize 
> them to disk when the commit occurs.
>
> This approach will minimize the number of disk operations and will 
> have an impact on the performance characteristics of the channel. 
> Although it probably will improve performance, it is hard to tell for 
> sure unless we test it out under load in different scenarios.
>

This does sound a lot better to me. I'm not sure there is much demand 
for restoring the state of an uncommitted set of puts/takes to a file 
channel after restarting an agent. If the transaction wasn't completed, 
its current state is not really going to be important after a restart. 
I'm really not familiar with WAL implementations, but is it not enough 
to write the data to be committed before writing the commit marker and 
informing the caller of success? I don't think it is necessary to write 
each piece as it comes in, so long as it is done before reporting 
success or failure.

Another matter that I'm curious about is whether or not we actually need 
separate files for the data and checkpoints... Can we not add a magic 
header before each type of entry to differentiate them, and thus 
guarantee significantly more sequential access? What is killing 
performance on a single disk right now is the constant seeking. The 
problem with this, though, would be putting together a file format that 
allows quick seeking through to the correct position, and rolling would 
be a lot harder. I think this is a lot more difficult and might be more 
of a long-term target.

Juhani

> Regards,
> Arvind Prabhakar
>
>
> On Wed, Jul 4, 2012 at 3:33 AM, Juhani Connolly 
> <juhani_connolly@cyberagent.co.jp 
> <ma...@cyberagent.co.jp>> wrote:
>
>     It looks good to me as it provides a nice balance between
>     reliability and throughput.
>
>     It's certainly one possible solution to the issue, though I do
>     believe that the current one could be made more friendly towards
>     single disk access(e.g. batching writes to the disk may well be
>     doable and would be curious what someone with more familiarity
>     with the implementation thinks).
>
>
>     On 07/04/2012 06:36 PM, Jarek Jarcec Cecho wrote:
>
>         We had connected discussion about this "SpillableChannel"
>         (working name) on FLUME-1045 and I believe that consensus is
>         that we will create something like that. In fact, I'm planning
>         to do it myself in near future - I just need to prioritize my
>         todo list first.
>
>         Jarcec
>
>         On Wed, Jul 04, 2012 at 06:13:43PM +0900, Juhani Connolly wrote:
>
>             Yes... I was actually poking around for that issue as I
>             remembered
>             seeing it before.  I had before also suggested a compound
>             channel
>             that would have worked like the buffer store in scribe,
>             but general
>             opinion was that it provided too many mixed configurations
>             that
>             could make testings and verifying correctness difficult.
>
>             On 07/04/2012 04:33 PM, Jarek Jarcec Cecho wrote:
>
>                 Hi Juhally,
>                 while ago I've filled jira FLUME-1227 where I've
>                 suggested creating some sort of SpillableChannel that
>                 would behave similarly as scribe. It would be normally
>                 acting as memory channel and it would start spilling
>                 data to disk in case that it would get full (my
>                 primary goal here was to solve issue when remote goes
>                 down, for example in case of HDFS maintenance). Would
>                 it be helpful for your case?
>
>                 Jarcec
>
>                 On Wed, Jul 04, 2012 at 04:07:48PM +0900, Juhani
>                 Connolly wrote:
>
>                     Evaluating flume on some of our servers, the file
>                     channel seems very
>                     slow, likely because like most typical web servers
>                     ours have a
>                     single raided disk available for writing to.
>
>                     Quoted below is a suggestion from a  previous
>                     issue where our poor
>                     throughput came up, where it turns out that on
>                     multiple disks, file
>                     channel performance is great.
>
>                     On 06/27/2012 11:01 AM, Mike Percy wrote:
>
>                         We are able to push > 8000 events/sec (2KB per
>                         event) through a single file channel if you
>                         put checkpoint on one disk and use 2 other
>                         disks for data dirs. Not sure what the limit
>                         is. This is using the latest trunk code. Other
>                         limitations may be you need to add additional
>                         sinks to your channel to drain it faster. This
>                         is because sinks are single threaded and
>                         sources are multithreaded.
>
>                         Mike
>
>                     For the case where the disks happen to be
>                     available on the server,
>                     that's fantastic, but I suspect that most use
>                     cases are going to be
>                     similar to ours, where multiple disks are not
>                     available. Our use
>                     case isn't unusual as it's primarily aggregating
>                     logs from various
>                     services.
>
>                     We originally ran our log servers with a
>                     exec(tail)->file->avro
>                     setup where throughput was very bad(80mb in an
>                     hour). We then
>                     switched this to a memory channel which was
>                     fine(the peak time 500mb
>                     worth of hourly logs went through). Afterwards we
>                     switched back to
>                     the file channel, but with 5 identical avro sinks.
>                     This did not
>                     improve throughput(still 80mb).
>                     RecoverableMemoryChannel showed very
>                     similar characteristics.
>
>                     I presume this is due to the writes going to two
>                     separate places,
>                     and being further compounded by also writing out
>                     and tailing the
>                     normal web logs: checking top and iostat, we could
>                     confirm we have
>                     significant iowait time, far more than we have
>                     during typical
>                     operation.
>
>                     As it is, we seem to be more or less guaranteeing
>                     no loss of logs
>                     with the file channel. Perhaps we could look into
>                     batching
>                     puts/takes for those that do not need 100% data
>                     retention but want
>                     more reliability than with the MemoryChannel which
>                     can potentially
>                     lose the entire capacity on a restart? Another
>                     possibility is
>                     writing an implementation that writes primarily
>                     sequentially. I've
>                     been meaning to get a deeper look at the
>                     implementation itself to
>                     give a more informed commentary on the contents
>                     but unfortunately
>                     don't have the cycles right now, hopefully someone
>                     with a better
>                     understanding of the current implementation(along
>                     with its
>                     interaction with the OS file cache) can comment on
>                     this.
>
>
>
>
>



Re: File channel performance on a single disk is poor

Posted by Arvind Prabhakar <ar...@apache.org>.
Hi,

> It's certainly one possible solution to the issue, though I do
> believe that the current one could be made more friendly
> towards single disk access(e.g. batching writes to the disk
> may well be doable and would be curious what someone
> with more familiarity with the implementation thinks).

The implementation of the file channel is that of a write-ahead log, in
that it serializes all the actions as they happen. Using these actions, it
can reconstruct the state of the channel at any time. There are two mutually
exclusive transaction types it supports - a transaction consisting of puts,
and one consisting of takes. It may be possible to use the heap to batch
the puts and takes and serialize them to disk when the commit occurs.

This approach would minimize the number of disk operations and change the
performance characteristics of the channel. Although it would probably
improve performance, it is hard to tell for sure unless we test it out
under load in different scenarios.
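
To make the batching idea concrete, here is a minimal, hypothetical sketch
(plain Java, not the actual FileChannel code; class and method names are
made up) of buffering puts on the heap and writing them to the log in one
sequential burst at commit time:

import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only. Puts are held on the heap for the
// lifetime of a transaction and written to the log in one burst when
// commit() is called, instead of one disk write per event.
public class BatchedPutTransactionSketch {
    private final List<String> pendingPuts = new ArrayList<String>();
    private final OutputStream log;

    public BatchedPutTransactionSketch(String logFile) throws IOException {
        this.log = new BufferedOutputStream(new FileOutputStream(logFile, true));
    }

    public void put(String event) {
        pendingPuts.add(event);              // no disk I/O yet
    }

    public void commit() throws IOException {
        for (String event : pendingPuts) {   // single sequential write burst
            log.write((event + "\n").getBytes(StandardCharsets.UTF_8));
        }
        log.flush();                         // one flush point per transaction
        pendingPuts.clear();
    }

    public void rollback() {
        pendingPuts.clear();                 // nothing ever reached the disk
    }
}

A takes transaction could presumably batch its take records the same way
before appending them to the log.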

Regards,
Arvind Prabhakar


On Wed, Jul 4, 2012 at 3:33 AM, Juhani Connolly <
juhani_connolly@cyberagent.co.jp> wrote:

> It looks good to me as it provides a nice balance between reliability and
> throughput.
>
> It's certainly one possible solution to the issue, though I do believe
> that the current one could be made more friendly towards single disk
> access(e.g. batching writes to the disk may well be doable and would be
> curious what someone with more familiarity with the implementation thinks).
>
>
> On 07/04/2012 06:36 PM, Jarek Jarcec Cecho wrote:
>
>> We had connected discussion about this "SpillableChannel" (working name)
>> on FLUME-1045 and I believe that consensus is that we will create something
>> like that. In fact, I'm planning to do it myself in near future - I just
>> need to prioritize my todo list first.
>>
>> Jarcec
>>
>> On Wed, Jul 04, 2012 at 06:13:43PM +0900, Juhani Connolly wrote:
>>
>>> Yes... I was actually poking around for that issue as I remembered
>>> seeing it before.  I had before also suggested a compound channel
>>> that would have worked like the buffer store in scribe, but general
>>> opinion was that it provided too many mixed configurations that
>>> could make testings and verifying correctness difficult.
>>>
>>> On 07/04/2012 04:33 PM, Jarek Jarcec Cecho wrote:
>>>
>>>> Hi Juhally,
>>>> while ago I've filled jira FLUME-1227 where I've suggested creating
>>>> some sort of SpillableChannel that would behave similarly as scribe. It
>>>> would be normally acting as memory channel and it would start spilling data
>>>> to disk in case that it would get full (my primary goal here was to solve
>>>> issue when remote goes down, for example in case of HDFS maintenance).
>>>> Would it be helpful for your case?
>>>>
>>>> Jarcec
>>>>
>>>> On Wed, Jul 04, 2012 at 04:07:48PM +0900, Juhani Connolly wrote:
>>>>
>>>>> Evaluating flume on some of our servers, the file channel seems very
>>>>> slow, likely because like most typical web servers ours have a
>>>>> single raided disk available for writing to.
>>>>>
>>>>> Quoted below is a suggestion from a  previous issue where our poor
>>>>> throughput came up, where it turns out that on multiple disks, file
>>>>> channel performance is great.
>>>>>
>>>>> On 06/27/2012 11:01 AM, Mike Percy wrote:
>>>>>
>>>>>> We are able to push > 8000 events/sec (2KB per event) through a
>>>>>> single file channel if you put checkpoint on one disk and use 2 other disks
>>>>>> for data dirs. Not sure what the limit is. This is using the latest trunk
>>>>>> code. Other limitations may be you need to add additional sinks to your
>>>>>> channel to drain it faster. This is because sinks are single threaded and
>>>>>> sources are multithreaded.
>>>>>>
>>>>>> Mike
>>>>>>
>>>>> For the case where the disks happen to be available on the server,
>>>>> that's fantastic, but I suspect that most use cases are going to be
>>>>> similar to ours, where multiple disks are not available. Our use
>>>>> case isn't unusual as it's primarily aggregating logs from various
>>>>> services.
>>>>>
>>>>> We originally ran our log servers with a exec(tail)->file->avro
>>>>> setup where throughput was very bad(80mb in an hour). We then
>>>>> switched this to a memory channel which was fine(the peak time 500mb
>>>>> worth of hourly logs went through). Afterwards we switched back to
>>>>> the file channel, but with 5 identical avro sinks. This did not
>>>>> improve throughput(still 80mb). RecoverableMemoryChannel showed very
>>>>> similar characteristics.
>>>>>
>>>>> I presume this is due to the writes going to two separate places,
>>>>> and being further compounded by also writing out and tailing the
>>>>> normal web logs: checking top and iostat, we could confirm we have
>>>>> significant iowait time, far more than we have during typical
>>>>> operation.
>>>>>
>>>>> As it is, we seem to be more or less guaranteeing no loss of logs
>>>>> with the file channel. Perhaps we could look into batching
>>>>> puts/takes for those that do not need 100% data retention but want
>>>>> more reliability than with the MemoryChannel which can potentially
>>>>> lose the entire capacity on a restart? Another possibility is
>>>>> writing an implementation that writes primarily sequentially. I've
>>>>> been meaning to get a deeper look at the implementation itself to
>>>>> give a more informed commentary on the contents but unfortunately
>>>>> don't have the cycles right now, hopefully someone with a better
>>>>> understanding of the current implementation(along with its
>>>>> interaction with the OS file cache) can comment on this.
>>>>>
>>>>>
>>>
>
>

Re: File channel performance on a single disk is poor

Posted by Juhani Connolly <ju...@cyberagent.co.jp>.
It looks good to me as it provides a nice balance between reliability 
and throughput.

It's certainly one possible solution to the issue, though I do believe 
that the current one could be made more friendly towards single-disk 
access (e.g. batching writes to the disk may well be doable), and I would 
be curious what someone with more familiarity with the implementation thinks.

On 07/04/2012 06:36 PM, Jarek Jarcec Cecho wrote:
> We had connected discussion about this "SpillableChannel" (working name) on FLUME-1045 and I believe that consensus is that we will create something like that. In fact, I'm planning to do it myself in near future - I just need to prioritize my todo list first.
>
> Jarcec
>
> On Wed, Jul 04, 2012 at 06:13:43PM +0900, Juhani Connolly wrote:
>> Yes... I was actually poking around for that issue as I remembered
>> seeing it before.  I had before also suggested a compound channel
>> that would have worked like the buffer store in scribe, but general
>> opinion was that it provided too many mixed configurations that
>> could make testings and verifying correctness difficult.
>>
>> On 07/04/2012 04:33 PM, Jarek Jarcec Cecho wrote:
>>> Hi Juhally,
>>> while ago I've filled jira FLUME-1227 where I've suggested creating some sort of SpillableChannel that would behave similarly as scribe. It would be normally acting as memory channel and it would start spilling data to disk in case that it would get full (my primary goal here was to solve issue when remote goes down, for example in case of HDFS maintenance). Would it be helpful for your case?
>>>
>>> Jarcec
>>>
>>> On Wed, Jul 04, 2012 at 04:07:48PM +0900, Juhani Connolly wrote:
>>>> Evaluating flume on some of our servers, the file channel seems very
>>>> slow, likely because like most typical web servers ours have a
>>>> single raided disk available for writing to.
>>>>
>>>> Quoted below is a suggestion from a  previous issue where our poor
>>>> throughput came up, where it turns out that on multiple disks, file
>>>> channel performance is great.
>>>>
>>>> On 06/27/2012 11:01 AM, Mike Percy wrote:
>>>>> We are able to push > 8000 events/sec (2KB per event) through a single file channel if you put checkpoint on one disk and use 2 other disks for data dirs. Not sure what the limit is. This is using the latest trunk code. Other limitations may be you need to add additional sinks to your channel to drain it faster. This is because sinks are single threaded and sources are multithreaded.
>>>>>
>>>>> Mike
>>>> For the case where the disks happen to be available on the server,
>>>> that's fantastic, but I suspect that most use cases are going to be
>>>> similar to ours, where multiple disks are not available. Our use
>>>> case isn't unusual as it's primarily aggregating logs from various
>>>> services.
>>>>
>>>> We originally ran our log servers with a exec(tail)->file->avro
>>>> setup where throughput was very bad(80mb in an hour). We then
>>>> switched this to a memory channel which was fine(the peak time 500mb
>>>> worth of hourly logs went through). Afterwards we switched back to
>>>> the file channel, but with 5 identical avro sinks. This did not
>>>> improve throughput(still 80mb). RecoverableMemoryChannel showed very
>>>> similar characteristics.
>>>>
>>>> I presume this is due to the writes going to two separate places,
>>>> and being further compounded by also writing out and tailing the
>>>> normal web logs: checking top and iostat, we could confirm we have
>>>> significant iowait time, far more than we have during typical
>>>> operation.
>>>>
>>>> As it is, we seem to be more or less guaranteeing no loss of logs
>>>> with the file channel. Perhaps we could look into batching
>>>> puts/takes for those that do not need 100% data retention but want
>>>> more reliability than with the MemoryChannel which can potentially
>>>> lose the entire capacity on a restart? Another possibility is
>>>> writing an implementation that writes primarily sequentially. I've
>>>> been meaning to get a deeper look at the implementation itself to
>>>> give a more informed commentary on the contents but unfortunately
>>>> don't have the cycles right now, hopefully someone with a better
>>>> understanding of the current implementation(along with its
>>>> interaction with the OS file cache) can comment on this.
>>>>
>>



Re: File channel performance on a single disk is poor

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
We had a related discussion about this "SpillableChannel" (working name) on FLUME-1045 and I believe the consensus is that we will create something like that. In fact, I'm planning to do it myself in the near future - I just need to prioritize my todo list first.

Jarcec

On Wed, Jul 04, 2012 at 06:13:43PM +0900, Juhani Connolly wrote:
> Yes... I was actually poking around for that issue as I remembered
> seeing it before.  I had before also suggested a compound channel
> that would have worked like the buffer store in scribe, but general
> opinion was that it provided too many mixed configurations that
> could make testings and verifying correctness difficult.
> 
> On 07/04/2012 04:33 PM, Jarek Jarcec Cecho wrote:
> >Hi Juhally,
> >while ago I've filled jira FLUME-1227 where I've suggested creating some sort of SpillableChannel that would behave similarly as scribe. It would be normally acting as memory channel and it would start spilling data to disk in case that it would get full (my primary goal here was to solve issue when remote goes down, for example in case of HDFS maintenance). Would it be helpful for your case?
> >
> >Jarcec
> >
> >On Wed, Jul 04, 2012 at 04:07:48PM +0900, Juhani Connolly wrote:
> >>Evaluating flume on some of our servers, the file channel seems very
> >>slow, likely because like most typical web servers ours have a
> >>single raided disk available for writing to.
> >>
> >>Quoted below is a suggestion from a  previous issue where our poor
> >>throughput came up, where it turns out that on multiple disks, file
> >>channel performance is great.
> >>
> >>On 06/27/2012 11:01 AM, Mike Percy wrote:
> >>>We are able to push > 8000 events/sec (2KB per event) through a single file channel if you put checkpoint on one disk and use 2 other disks for data dirs. Not sure what the limit is. This is using the latest trunk code. Other limitations may be you need to add additional sinks to your channel to drain it faster. This is because sinks are single threaded and sources are multithreaded.
> >>>
> >>>Mike
> >>For the case where the disks happen to be available on the server,
> >>that's fantastic, but I suspect that most use cases are going to be
> >>similar to ours, where multiple disks are not available. Our use
> >>case isn't unusual as it's primarily aggregating logs from various
> >>services.
> >>
> >>We originally ran our log servers with a exec(tail)->file->avro
> >>setup where throughput was very bad(80mb in an hour). We then
> >>switched this to a memory channel which was fine(the peak time 500mb
> >>worth of hourly logs went through). Afterwards we switched back to
> >>the file channel, but with 5 identical avro sinks. This did not
> >>improve throughput(still 80mb). RecoverableMemoryChannel showed very
> >>similar characteristics.
> >>
> >>I presume this is due to the writes going to two separate places,
> >>and being further compounded by also writing out and tailing the
> >>normal web logs: checking top and iostat, we could confirm we have
> >>significant iowait time, far more than we have during typical
> >>operation.
> >>
> >>As it is, we seem to be more or less guaranteeing no loss of logs
> >>with the file channel. Perhaps we could look into batching
> >>puts/takes for those that do not need 100% data retention but want
> >>more reliability than with the MemoryChannel which can potentially
> >>lose the entire capacity on a restart? Another possibility is
> >>writing an implementation that writes primarily sequentially. I've
> >>been meaning to get a deeper look at the implementation itself to
> >>give a more informed commentary on the contents but unfortunately
> >>don't have the cycles right now, hopefully someone with a better
> >>understanding of the current implementation(along with its
> >>interaction with the OS file cache) can comment on this.
> >>
> 
> 

Re: File channel performance on a single disk is poor

Posted by Juhani Connolly <ju...@cyberagent.co.jp>.
Yes... I was actually poking around for that issue as I remembered 
seeing it before. I had also previously suggested a compound channel that 
would have worked like the buffer store in scribe, but the general opinion 
was that it allowed too many mixed configurations, which could make 
testing and verifying correctness difficult.

On 07/04/2012 04:33 PM, Jarek Jarcec Cecho wrote:
> Hi Juhally,
> while ago I've filled jira FLUME-1227 where I've suggested creating some sort of SpillableChannel that would behave similarly as scribe. It would be normally acting as memory channel and it would start spilling data to disk in case that it would get full (my primary goal here was to solve issue when remote goes down, for example in case of HDFS maintenance). Would it be helpful for your case?
>
> Jarcec
>
> On Wed, Jul 04, 2012 at 04:07:48PM +0900, Juhani Connolly wrote:
>> Evaluating flume on some of our servers, the file channel seems very
>> slow, likely because like most typical web servers ours have a
>> single raided disk available for writing to.
>>
>> Quoted below is a suggestion from a  previous issue where our poor
>> throughput came up, where it turns out that on multiple disks, file
>> channel performance is great.
>>
>> On 06/27/2012 11:01 AM, Mike Percy wrote:
>>> We are able to push > 8000 events/sec (2KB per event) through a single file channel if you put checkpoint on one disk and use 2 other disks for data dirs. Not sure what the limit is. This is using the latest trunk code. Other limitations may be you need to add additional sinks to your channel to drain it faster. This is because sinks are single threaded and sources are multithreaded.
>>>
>>> Mike
>> For the case where the disks happen to be available on the server,
>> that's fantastic, but I suspect that most use cases are going to be
>> similar to ours, where multiple disks are not available. Our use
>> case isn't unusual as it's primarily aggregating logs from various
>> services.
>>
>> We originally ran our log servers with a exec(tail)->file->avro
>> setup where throughput was very bad(80mb in an hour). We then
>> switched this to a memory channel which was fine(the peak time 500mb
>> worth of hourly logs went through). Afterwards we switched back to
>> the file channel, but with 5 identical avro sinks. This did not
>> improve throughput(still 80mb). RecoverableMemoryChannel showed very
>> similar characteristics.
>>
>> I presume this is due to the writes going to two separate places,
>> and being further compounded by also writing out and tailing the
>> normal web logs: checking top and iostat, we could confirm we have
>> significant iowait time, far more than we have during typical
>> operation.
>>
>> As it is, we seem to be more or less guaranteeing no loss of logs
>> with the file channel. Perhaps we could look into batching
>> puts/takes for those that do not need 100% data retention but want
>> more reliability than with the MemoryChannel which can potentially
>> lose the entire capacity on a restart? Another possibility is
>> writing an implementation that writes primarily sequentially. I've
>> been meaning to get a deeper look at the implementation itself to
>> give a more informed commentary on the contents but unfortunately
>> don't have the cycles right now, hopefully someone with a better
>> understanding of the current implementation(along with its
>> interaction with the OS file cache) can comment on this.
>>



Re: File channel performance on a single disk is poor

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Juhani,
a while ago I filed JIRA FLUME-1227, where I suggested creating some sort of SpillableChannel that would behave similarly to scribe. It would normally act as a memory channel and would start spilling data to disk if it got full (my primary goal here was to solve the issue of the remote going down, for example during HDFS maintenance). Would it be helpful for your case?
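
To sketch the behaviour I have in mind, here is a toy, hypothetical
illustration (plain Java, names made up; not a real channel
implementation). Events stay in a bounded in-memory queue and overflow to
a spill file once the queue is full:

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayDeque;
import java.util.Queue;

public class SpillableQueueSketch {
    private final int memoryCapacity;
    private final Queue<String> memory = new ArrayDeque<String>();
    private final Path spillFile;

    public SpillableQueueSketch(int memoryCapacity, Path spillFile) {
        this.memoryCapacity = memoryCapacity;
        this.spillFile = spillFile;
    }

    public synchronized void put(String event) throws IOException {
        if (memory.size() < memoryCapacity) {
            memory.add(event);   // fast path: plain memory channel behaviour
        } else {
            // slow path: append to disk instead of blocking or dropping
            Files.write(spillFile, (event + "\n").getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }

    public synchronized String take() {
        // Sketch drains memory only; replaying the spill file (and keeping
        // FIFO order across memory and disk) is deliberately omitted.
        return memory.poll();
    }

    public static void main(String[] args) throws IOException {
        SpillableQueueSketch q = new SpillableQueueSketch(2, Paths.get("/tmp/spill.log"));
        q.put("event-1");
        q.put("event-2");
        q.put("event-3");               // over capacity, goes to /tmp/spill.log
        System.out.println(q.take());   // prints event-1
    }
}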

Jarcec

On Wed, Jul 04, 2012 at 04:07:48PM +0900, Juhani Connolly wrote:
> Evaluating flume on some of our servers, the file channel seems very
> slow, likely because like most typical web servers ours have a
> single raided disk available for writing to.
> 
> Quoted below is a suggestion from a  previous issue where our poor
> throughput came up, where it turns out that on multiple disks, file
> channel performance is great.
> 
> On 06/27/2012 11:01 AM, Mike Percy wrote:
> >We are able to push > 8000 events/sec (2KB per event) through a single file channel if you put checkpoint on one disk and use 2 other disks for data dirs. Not sure what the limit is. This is using the latest trunk code. Other limitations may be you need to add additional sinks to your channel to drain it faster. This is because sinks are single threaded and sources are multithreaded.
> >
> >Mike
> 
> For the case where the disks happen to be available on the server,
> that's fantastic, but I suspect that most use cases are going to be
> similar to ours, where multiple disks are not available. Our use
> case isn't unusual as it's primarily aggregating logs from various
> services.
> 
> We originally ran our log servers with a exec(tail)->file->avro
> setup where throughput was very bad(80mb in an hour). We then
> switched this to a memory channel which was fine(the peak time 500mb
> worth of hourly logs went through). Afterwards we switched back to
> the file channel, but with 5 identical avro sinks. This did not
> improve throughput(still 80mb). RecoverableMemoryChannel showed very
> similar characteristics.
> 
> I presume this is due to the writes going to two separate places,
> and being further compounded by also writing out and tailing the
> normal web logs: checking top and iostat, we could confirm we have
> significant iowait time, far more than we have during typical
> operation.
> 
> As it is, we seem to be more or less guaranteeing no loss of logs
> with the file channel. Perhaps we could look into batching
> puts/takes for those that do not need 100% data retention but want
> more reliability than with the MemoryChannel which can potentially
> lose the entire capacity on a restart? Another possibility is
> writing an implementation that writes primarily sequentially. I've
> been meaning to get a deeper look at the implementation itself to
> give a more informed commentary on the contents but unfortunately
> don't have the cycles right now, hopefully someone with a better
> understanding of the current implementation(along with its
> interaction with the OS file cache) can comment on this.
> 

File channel performance on a single disk is poor

Posted by Juhani Connolly <ju...@cyberagent.co.jp>.
Evaluating flume on some of our servers, the file channel seems very 
slow, likely because, like most typical web servers, ours have only a 
single raided disk available to write to.

Quoted below is a suggestion from a previous issue where our poor 
throughput came up; it turns out that on multiple disks, file channel 
performance is great.

On 06/27/2012 11:01 AM, Mike Percy wrote:
> We are able to push > 8000 events/sec (2KB per event) through a single file channel if you put checkpoint on one disk and use 2 other disks for data dirs. Not sure what the limit is. This is using the latest trunk code. Other limitations may be you need to add additional sinks to your channel to drain it faster. This is because sinks are single threaded and sources are multithreaded.
>
> Mike

For the case where the disks happen to be available on the server, 
that's fantastic, but I suspect that most use cases are going to be 
similar to ours, where multiple disks are not available. Our use case 
isn't unusual as it's primarily aggregating logs from various services.

We originally ran our log servers with an exec(tail)->file->avro setup 
where throughput was very bad (80mb in an hour). We then switched this to 
a memory channel, which was fine (the peak-time 500mb worth of hourly logs 
went through). Afterwards we switched back to the file channel, but with 
5 identical avro sinks. This did not improve throughput (still 80mb). 
RecoverableMemoryChannel showed very similar characteristics.
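
For reference, the file-channel-plus-multiple-sinks setup looks roughly 
like the config below (agent and component names, paths, hosts and ports 
are placeholders rather than our real values):

agent.sources = tail
agent.channels = fc
agent.sinks = avro1 avro2 avro3 avro4 avro5

agent.sources.tail.type = exec
agent.sources.tail.command = tail -F /var/log/app/access.log
agent.sources.tail.channels = fc

# checkpoint and data dirs both end up on the same raided disk
agent.channels.fc.type = file
agent.channels.fc.checkpointDir = /var/flume/checkpoint
agent.channels.fc.dataDirs = /var/flume/data

# five identical sinks draining the one channel
agent.sinks.avro1.type = avro
agent.sinks.avro1.hostname = collector.example.com
agent.sinks.avro1.port = 4141
agent.sinks.avro1.channel = fc
# avro2..avro5 are configured the same way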

I presume this is due to the writes going to two separate places, further 
compounded by also writing out and tailing the normal web logs: checking 
top and iostat, we could confirm significant iowait time, far more than 
we see during typical operation.

As it is, we seem to be more or less guaranteeing no loss of logs with 
the file channel. Perhaps we could look into batching puts/takes for 
those that do not need 100% data retention but want more reliability 
than with the MemoryChannel, which can potentially lose its entire 
capacity on a restart? Another possibility is writing an implementation 
that writes primarily sequentially. I've been meaning to take a deeper 
look at the implementation itself to give a more informed commentary on 
the contents but unfortunately don't have the cycles right now; 
hopefully someone with a better understanding of the current 
implementation (along with its interaction with the OS file cache) can 
comment on this.


Re: Running Flume-NG in multiple threads

Posted by Mike Percy <mp...@cloudera.com>.
We are able to push > 8000 events/sec (2KB per event) through a single file channel if you put the checkpoint on one disk and use 2 other disks for data dirs. Not sure what the limit is. This is using the latest trunk code. Another limitation may be that you need to add additional sinks to your channel to drain it faster; this is because sinks are single-threaded and sources are multithreaded. 
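
As an illustration, the channel part of that layout might look like the 
following (mount points are placeholders):

agent.channels.fc.type = file
agent.channels.fc.checkpointDir = /disk1/flume/checkpoint
agent.channels.fc.dataDirs = /disk2/flume/data,/disk3/flume/data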

Mike


On Tuesday, June 26, 2012 at 6:30 PM, Juhani Connolly wrote:

> Depending on your setup, you may find that some channels just cannot 
> keep up with throughput. Is the timestamp on the logs gradually falling 
> further behind the time it appears on the collector? If so, some 
> component(likely the channels) are turning into a bottleneck.
> 
> We have about 1000 events/sec running through to our collector and using 
> file channel it unfortunately could not keep up. If you're not already, 
> try running memory channels and see how that works.
> 
> On 06/27/2012 08:06 AM, Tejinder Aulakh wrote:
> > We are using Flume-NG to transfer logs from the log servers (40) to 
> > collectors (4). However, now we are seeing a long delay before the 
> > logs show up at the collectors.
> > 
> > I was wondering if there is any way of running Flume-NG using Multiple 
> > threads so that collectors can process all the load they get form the 
> > log servers. Is it configurable in flume-env.sh (http://flume-env.sh)? How many threads does 
> > Flume-NG use by default?
> > 
> > Thanks,
> > Tejinder
> 




Re: Running Flume-NG in multiple threads

Posted by Juhani Connolly <ju...@cyberagent.co.jp>.
Depending on your setup, you may find that some channels just cannot 
keep up with the throughput. Is the timestamp on the logs gradually falling 
further behind the time they appear on the collector? If so, some 
component (likely the channels) is turning into a bottleneck.

We have about 1000 events/sec running through to our collector, and using 
the file channel it unfortunately could not keep up. If you're not already, 
try running memory channels and see how that works.
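
For example, something along these lines (channel name and capacities are 
placeholders; tune them to your load):

agent.channels.mc.type = memory
agent.channels.mc.capacity = 100000
agent.channels.mc.transactionCapacity = 1000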

On 06/27/2012 08:06 AM, Tejinder Aulakh wrote:
> We are using Flume-NG to transfer logs from the log servers (40)  to 
> collectors (4). However, now we are seeing a long delay before the 
> logs show up at the collectors.
>
> I was wondering if there is any way of running Flume-NG using Multiple 
> threads so that collectors can process all the load they get form the 
> log servers. Is it configurable in flume-env.sh? How many threads does 
> Flume-NG use by default?
>
> Thanks,
> Tejinder
>