Posted to user@flume.apache.org by Roshan Naik <ro...@hortonworks.com> on 2013/10/01 00:50:26 UTC

Re: Avro sink to source is too slow

Anat,
   Can you give details on the second Flume agent? For measuring, I
suggest you:
- switch to a mem channel on both agents
- make your target destination a separate disk (or a different host with a
fast network connection)

It seems like there may be too many components contending on the same disk
(spool source, file channel, and sink on the 2nd agent).
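
(For reference, the memory-channel switch on the first agent is a small
config change; a minimal sketch, with capacity values illustrative:)

agent.channels = memChannel
agent.channels.memChannel.type = memory
agent.channels.memChannel.capacity = 100000
agent.channels.memChannel.transactionCapacity = 1000
agent.sources.logsdir.channels = memChannel
agent.sinks.AvroSink1-1.channel = memChannel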

-roshan



On Mon, Sep 30, 2013 at 1:02 PM, Mike Keane <mk...@dotomi.com> wrote:

> As far as a fast disk goes: if you only have one, the drive head will be
> seeking constantly and performance will be awful; we were having problems
> at 10,000 log lines per second.  I've pushed over 270,000 lines per second
> compressed.
>
> I don't think it is Avro; I'm able to saturate a gigabit line easily, so
> ~100MB/second of compressed data.
>
> I don't see a sink group in your configuration, and I'm curious what the
> default behavior is when you tie multiple sinks to a file channel without a
> sink group.  That said, I found performance issues using a single file
> channel with compression.  To get maximum performance I put a header on my
> events called "channel".  Since our servers are all numbered, I was able to
> take (server# mod 6)+1 and make that the value of the "channel" header,
> thus getting a fairly even distribution of log data.  On my source I send
> data by the "channel" header to the appropriate channel.  This parallelized
> the compression across 6 file channels.  I then have 3 sinks per channel
> using a failover sink group.  Also, do you need compression level 9?  I've
> found the gains at higher compression levels are negligible compared to the
> performance expense (not with flume/deflate specifically, but in general).
> I found that even turning the compression level down to 1 caused my sink to
> run 6-7 times slower; my solution was to parallelize the compression, and
> by trial and error I found this to be the best setup.
>
> agentName.sources.collector_source.selector.type = multiplexing
> agentName.sources.collector_source.selector.header = channel
> agentName.sources.collector_source.selector.mapping.1 = channel_1
> agentName.sources.collector_source.selector.mapping.2 = channel_2
> agentName.sources.collector_source.selector.mapping.3 = channel_3
> agentName.sources.collector_source.selector.mapping.4 = channel_4
> agentName.sources.collector_source.selector.mapping.5 = channel_5
> agentName.sources.collector_source.selector.default = channel_6
>
>
> -Mike
>
>
>
> On 09/30/2013 02:30 PM, Anat Rozenzon wrote:
> AFAIK we have a fast disk.
> However, I think the problem is with Avro and not the channel: as you can
> see in the metrics below, the channel filled up quickly but is draining
> very slowly.
> After a few minutes of running, only 70-80 batches had been sent by each
> sink.
> {
>
> "SINK.AvroSink1-4":{"BatchCompleteCount":"74","ConnectionFailedCount":"0","EventDrainAttemptCount":"74000","ConnectionCreatedCount":"3","Type":"SINK","BatchEmptyCount":"1","ConnectionClosedCount":"2","EventDrainSuccessCount":"71000","StopTime":"0","StartTime":"1380568140738","BatchUnderflowCount":"0"},
>
> "SOURCE.logsdir":{"OpenConnectionCount":"0","Type":"SOURCE","AppendBatchAcceptedCount":"1330","AppendBatchReceivedCount":"1330","EventAcceptedCount":"1326298","AppendReceivedCount":"0","StopTime":"0","StartTime":"1380568140830","EventReceivedCount":"1326298","AppendAcceptedCount":"0"},
>
> "CHANNEL.fileChannel":{"EventPutSuccessCount":"1326298","ChannelFillPercentage":"51.314899999999994","Type":"CHANNEL","StopTime":"0","EventPutAttemptCount":"1326298","ChannelSize":"1026298","StartTime":"1380568140730","EventTakeSuccessCount":"300000","ChannelCapacity":"2000000","EventTakeAttemptCount":"310073"},
>
> "SINK.AvroSink1-2":{"BatchCompleteCount":"78","ConnectionFailedCount":"0","EventDrainAttemptCount":"78000","ConnectionCreatedCount":"3","Type":"SINK","BatchEmptyCount":"1","ConnectionClosedCount":"2","EventDrainSuccessCount":"75000","StopTime":"0","StartTime":"1380568140736","BatchUnderflowCount":"0"},
>
> "SINK.AvroSink1-3":{"BatchCompleteCount":"81","ConnectionFailedCount":"0","EventDrainAttemptCount":"81000","ConnectionCreatedCount":"3","Type":"SINK","BatchEmptyCount":"1","ConnectionClosedCount":"2","EventDrainSuccessCount":"79000","StopTime":"0","StartTime":"1380568140736","BatchUnderflowCount":"0"},
>
> "SINK.AvroSink1-1":{"BatchCompleteCount":"77","ConnectionFailedCount":"0","EventDrainAttemptCount":"77000","ConnectionCreatedCount":"2","Type":"SINK","BatchEmptyCount":"1","ConnectionClosedCount":"1","EventDrainSuccessCount":"75000","StopTime":"0","StartTime":"1380568140734","BatchUnderflowCount":"0"}}
>
>
> On Mon, Sep 30, 2013 at 7:21 PM, Mike Keane <mkeane@dotomi.com> wrote:
> What kind of disk configuration is on your file channel?  With a single-disk
> configuration (Dell blade server), performance was awful.  I believe what
> Flume needs at a minimum is a separate disk each for the checkpoint and data
> directories.  When I switched to an SSD or a 13-disk RAID setup, my problems
> went away with one exception: compression was still very slow.  I ended
> up distributing my flow over several file channels to get good throughput
> with compression.
>
> -Mike
>
>
> On 09/30/2013 11:11 AM, Anat Rozenzon wrote:
> Hi
>
> I'm trying to read 100MB of files using the directory spooler, a file
> channel, and 4 Avro sinks into an Avro source running in another Flume
> process. Both Flume processes are running on the same machine, just to
> eliminate network issues.
>
> However, it takes more than 5 minutes to read and pass the 100MB of data,
> which is too slow for our needs.
>
> After about 1 minute the files have been read into the file channel, and
> then there is quite a long period during which the file channel drains
> really slowly through the four sinks.
>
> Copying the same data using scp from a remote machine takes 7 seconds.
>
> Below is my config. Is there anything I can do to improve this?
>
> agent.sources = logsdir
> agent.sources.logsdir.type = spooldir
> agent.sources.logsdir.channels = fileChannel
> agent.sources.logsdir.spoolDir = %%WORK_DIR%%
> agent.sources.logsdir.fileHeader = true
> agent.sources.logsdir.batchSize=1000
> agent.sources.logsdir.deletePolicy=immediate
> agent.sources.logsdir.interceptors =  ihost iserver_type iserver_id
> agent.sources.logsdir.interceptors.ihost.type = host
> agent.sources.logsdir.interceptors.ihost.useIP = false
> agent.sources.logsdir.interceptors.ihost.hostHeader = server_hostname
>
> agent.sources.logsdir.interceptors.iserver_type.type = static
> agent.sources.logsdir.interceptors.iserver_type.key = server_type
> agent.sources.logsdir.interceptors.iserver_type.value = %%SERVER_TYPE%%
> agent.sources.logsdir.interceptors.iserver_id.type = static
> agent.sources.logsdir.interceptors.iserver_id.key = server_id
> agent.sources.logsdir.interceptors.iserver_id.value = %%SERVER_ID%%
>
> agent.sources.logsdir.deserializer.maxLineLength = 10240
>
>
> agent.channels = fileChannel
> agent.channels.fileChannel.type = file
>
> agent.channels.fileChannel.checkpointDir=%%WORK_DIR%%/flume/filechannel/checkpoint
> agent.channels.fileChannel.dataDirs=%%WORK_DIR%%/flume/filechannel/data
> agent.channels.fileChannel.capacity=2000000
> agent.channels.fileChannel.transactionCapacity=1000
> agent.channels.fileChannel.use-fast-replay=true
> agent.channels.fileChannel.useDualCheckpoints=true
>
> agent.channels.fileChannel.backupCheckpointDir=%%WORK_DIR%%/flume/filechannel/backupCheckpointDir
> agent.channels.fileChannel.minimumRequiredSpace=1073741824
> agent.channels.fileChannel.maxFileSize=524288000
>
> ## Send to multiple collectors for load balancing
> agent.sinks = AvroSink1-1 AvroSink1-2 AvroSink1-3 AvroSink1-4
>
> agent.sinks.AvroSink1-1.type = avro
> agent.sinks.AvroSink1-1.channel = fileChannel
> agent.sinks.AvroSink1-1.hostname = %%COLLECTOR1_SERVER%%
> agent.sinks.AvroSink1-1.port = 4545%%COLLECTOR1_SLOT%%
> agent.sinks.AvroSink1-1.connect-timeout = 60000
> agent.sinks.AvroSink1-1.request-timeout = 60000
> agent.sinks.AvroSink1-1.batch-size = 1000
> agent.sinks.AvroSink1-1.compression-type=deflate
> agent.sinks.AvroSink1-1.compression-level=9
>
> agent.sinks.AvroSink1-2.type = avro
> agent.sinks.AvroSink1-2.channel = fileChannel
> agent.sinks.AvroSink1-2.hostname = %%COLLECTOR1_SERVER%%
> agent.sinks.AvroSink1-2.port = 4545%%COLLECTOR1_SLOT%%
> agent.sinks.AvroSink1-2.connect-timeout = 60000
> agent.sinks.AvroSink1-2.request-timeout = 60000
> agent.sinks.AvroSink1-2.batch-size = 1000
> agent.sinks.AvroSink1-2.compression-type=deflate
> agent.sinks.AvroSink1-2.compression-level=9
>
> agent.sinks.AvroSink1-3.type = avro
> agent.sinks.AvroSink1-3.channel = fileChannel
> agent.sinks.AvroSink1-3.hostname = %%COLLECTOR1_SERVER%%
> agent.sinks.AvroSink1-3.port = 4545%%COLLECTOR1_SLOT%%
> agent.sinks.AvroSink1-3.connect-timeout = 60000
> agent.sinks.AvroSink1-3.request-timeout = 60000
> agent.sinks.AvroSink1-3.batch-size = 1000
> agent.sinks.AvroSink1-3.compression-type=deflate
> agent.sinks.AvroSink1-3.compression-level=9
>
> agent.sinks.AvroSink1-4.type = avro
> agent.sinks.AvroSink1-4.channel = fileChannel
> agent.sinks.AvroSink1-4.hostname = %%COLLECTOR1_SERVER%%
> agent.sinks.AvroSink1-4.port = 4545%%COLLECTOR1_SLOT%%
> agent.sinks.AvroSink1-4.connect-timeout = 60000
> agent.sinks.AvroSink1-4.request-timeout = 60000
> agent.sinks.AvroSink1-4.batch-size = 1000
> agent.sinks.AvroSink1-4.compression-type=deflate
> agent.sinks.AvroSink1-4.compression-level=9
>
> Thanks
> Anat
>


Re: Avro sink to source is too slow

Posted by Anat Rozenzon <an...@viber.com>.
Yes, all 3 channels were writing to the same disk. However, we are using
Amazon servers, and I'm not sure their 'separate' disks are really separate.
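
(For reference, what Hari is asking below is whether the file channel's
checkpointDir and dataDirs point at different devices; a minimal sketch,
with mount points illustrative:)

agent.channels.fileChannel.type = file
# checkpoint and data on physically separate disks
agent.channels.fileChannel.checkpointDir = /disk1/flume/filechannel/checkpoint
agent.channels.fileChannel.dataDirs = /disk2/flume/filechannel/data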


On Thu, Oct 3, 2013 at 6:28 PM, Hari Shreedharan <hs...@cloudera.com> wrote:

> Yes. Using multiple sinks with no sink groups would give each sink its own
> thread. Each time you add a channel to a source you will take some
> performance hit, because the channels are written to one after the
> other. Also, were these channels sharing disks? Were the checkpoint and
> data files for each of them on separate disks?

Re: Avro sink to source is too slow

Posted by Hari Shreedharan <hs...@cloudera.com>.
Yes. Using multiple sinks with no sink groups would give each sink its own
thread. Each time you add a channel to a source you will take some
performance hit, because the channels are written to one after the
other. Also, were these channels sharing disks? Were the checkpoint and data
files for each of them on separate disks?
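
(A minimal sketch of the pattern described above: several sinks draining
one channel with no sink group, so each gets its own sink-runner thread;
names and values are illustrative:)

agent.sinks = sink1 sink2
agent.sinks.sink1.type = avro
agent.sinks.sink1.channel = fileChannel
agent.sinks.sink1.hostname = collector.example.com
agent.sinks.sink1.port = 45451
agent.sinks.sink2.type = avro
agent.sinks.sink2.channel = fileChannel
agent.sinks.sink2.hostname = collector.example.com
agent.sinks.sink2.port = 45451
# No agent.sinkgroups definition: each sink polls the shared channel on its
# own thread. A failover group like Mike's would instead declare:
# agent.sinkgroups = g1
# agent.sinkgroups.g1.sinks = sink1 sink2
# agent.sinkgroups.g1.processor.type = failover
# agent.sinkgroups.g1.processor.priority.sink1 = 10
# agent.sinkgroups.g1.processor.priority.sink2 = 5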


Re: Avro sink to source is too slow

Posted by Anat Rozenzon <an...@viber.com>.
Just a quick update: I found two issues that slowed down Flume:
1. Using 3 replicating file channels on the Avro source slowed down the
acceptance of Flume events; it takes up to 5-10 times longer than writing to
one channel. So I'm now trying to change the collector's configuration to 1
file channel, followed by a spooldir source that reads out of the
collector's file system and into a memory channel for replication.
2. More disturbing is that I see many disconnections in the Avro sink-source
pair while the source Flume (i.e. the collector) is doing full GCs; the full
GCs were also quite long (~15 seconds). Switching Java to a GC with shorter
pauses (i.e. G1) solved this issue as well.
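
(For reference, the collector's JVM can be switched to G1 via JAVA_OPTS in
conf/flume-env.sh; a minimal sketch, with heap sizes illustrative:)

# conf/flume-env.sh
export JAVA_OPTS="-Xms2g -Xmx2g -XX:+UseG1GC -XX:MaxGCPauseMillis=200"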

BTW, regarding Mike's question above:
What is the correct way to run multiple threads that drain a channel
quickly?
I thought the correct way is simply to attach multiple sinks to the same
channel, without any sink groups. Is that correct?

Thanks
Anat



Re: Avro sink to source is too slow

Posted by Roshan Naik <ro...@hortonworks.com>.
My thoughts: you have 4 sinks draining the same channel, and each has a
batch size of 1000. Since they contend on the same channel and, *assuming*
events are evenly distributed among the sinks, there is potential for some
starvation in the sinks, as their batch sizes may not be reached until about
4 batches have been inserted by the source. I don't know if there is a good
rule of thumb here.

Try these:
- See if a sink batch size of 250 helps.
- Use a single Avro sink instead of 4, with a batch size of 1k.
- Replace the Avro sink with the null sink on the first agent and take a
measurement (see the sketch below); it would be good to ensure the spool
source is not the bottleneck.
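
(A minimal sketch of that last test, assuming the stock null sink, which
simply discards events:)

agent.sinks = nullSink
agent.sinks.nullSink.type = null
agent.sinks.nullSink.channel = fileChannel
# If the channel still drains slowly into this sink, the bottleneck is the
# spool source or the channel itself, not the Avro sink/source hop.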


Re: Avro sink to source is too slow

Posted by Anat Rozenzon <an...@viber.com>.
Thank you Mike & Roshan.

I've changed both Flumes (agent & collector) to run with memory channels; I
also removed the compression for now.
I sample the EventAcceptedCount metric on the collector's Avro source every
2 minutes, and it seems that during two minutes it received 1,027,000
records, which is ~77MB.

It is better throughput, but still not what I expect.
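
(For anyone reproducing these measurements: the counters come from Flume's
built-in JSON metrics, which can be enabled with the HTTP monitoring server
and polled over HTTP; the port is illustrative:)

flume-ng agent -n collector -f collector.conf \
  -Dflume.monitoring.type=http -Dflume.monitoring.port=34545

# then, every 2 minutes:
curl -s http://localhost:34545/metrics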

This is my current agent config:

agent.sources = logsdir
agent.sources.logsdir.type = spooldir
agent.sources.logsdir.channels = fileChannel
agent.sources.logsdir.spoolDir = /disk/old_logs
agent.sources.logsdir.fileHeader = true
agent.sources.logsdir.batchSize=1000
agent.sources.logsdir.deletePolicy=immediate
agent.sources.logsdir.interceptors =  ihost iserver_type iserver_id
#agent.sources.logsdir.interceptors.ihost.type = host
#agent.sources.logsdir.interceptors.ihost.useIP = false
#agent.sources.logsdir.interceptors.ihost.hostHeader = server_hostname
agent.sources.logsdir.interceptors.ihost.type = static
agent.sources.logsdir.interceptors.ihost.key = server_hostname
agent.sources.logsdir.interceptors.ihost.value = hs666
agent.sources.logsdir.interceptors.iserver_type.type = static
agent.sources.logsdir.interceptors.iserver_type.key = server_type
agent.sources.logsdir.interceptors.iserver_type.value = Push
agent.sources.logsdir.interceptors.iserver_id.type = static
agent.sources.logsdir.interceptors.iserver_id.key = server_id
agent.sources.logsdir.interceptors.iserver_id.value = 1

agent.sources.logsdir.deserializer.maxLineLength = 10240


agent.channels = fileChannel
agent.channels.fileChannel.type = memory
agent.channels.fileChannel.capacity = 100000
agent.channels.fileChannel.transactionCapacity = 1000

#agent.channels.fileChannel.type = file
#agent.channels.fileChannel.checkpointDir=/mnt/flume/filechannel/checkpoint
#agent.channels.fileChannel.dataDirs=/mnt/flume/filechannel/data
#agent.channels.fileChannel.capacity=2000000
#agent.channels.fileChannel.transactionCapacity=1000
#agent.channels.fileChannel.use-fast-replay=true
#agent.channels.fileChannel.useDualCheckpoints=true
#agent.channels.fileChannel.backupCheckpointDir=/mnt/flume/filechannel/backupCheckpointDir
#agent.channels.fileChannel.minimumRequiredSpace=1073741824
#agent.channels.fileChannel.maxFileSize=524288000

agent.sinks = AvroSink1-1 AvroSink1-2 AvroSink1-3 AvroSink1-4

agent.sinks.AvroSink1-1.type = avro
agent.sinks.AvroSink1-1.channel = fileChannel
agent.sinks.AvroSink1-1.hostname = X.X.X.X
agent.sinks.AvroSink1-1.port = 45451
agent.sinks.AvroSink1-1.connect-timeout = 60000
agent.sinks.AvroSink1-1.request-timeout = 60000
agent.sinks.AvroSink1-1.batch-size = 1000
#agent.sinks.AvroSink1-1.compression-type=deflate
#agent.sinks.AvroSink1-1.compression-level=9

agent.sinks.AvroSink1-2.type = avro
agent.sinks.AvroSink1-2.channel = fileChannel
agent.sinks.AvroSink1-2.hostname = X.X.X.X
agent.sinks.AvroSink1-2.port = 45451
agent.sinks.AvroSink1-2.connect-timeout = 60000
agent.sinks.AvroSink1-2.request-timeout = 60000
agent.sinks.AvroSink1-2.batch-size = 1000
#agent.sinks.AvroSink1-2.compression-type=deflate
#agent.sinks.AvroSink1-2.compression-level=9

agent.sinks.AvroSink1-3.type = avro
agent.sinks.AvroSink1-3.channel = fileChannel
agent.sinks.AvroSink1-3.hostname = X.X.X.X
agent.sinks.AvroSink1-3.port = 45451
agent.sinks.AvroSink1-3.connect-timeout = 60000
agent.sinks.AvroSink1-3.request-timeout = 60000
agent.sinks.AvroSink1-3.batch-size = 1000
#agent.sinks.AvroSink1-3.compression-type=deflate
#agent.sinks.AvroSink1-3.compression-level=9

agent.sinks.AvroSink1-4.type = avro
agent.sinks.AvroSink1-4.channel = fileChannel
agent.sinks.AvroSink1-4.hostname = X.X.X.X
agent.sinks.AvroSink1-4.port = 45451
agent.sinks.AvroSink1-4.connect-timeout = 60000
agent.sinks.AvroSink1-4.request-timeout = 60000
agent.sinks.AvroSink1-4.batch-size = 1000
#agent.sinks.AvroSink1-4.compression-type=deflate
#agent.sinks.AvroSink1-4.compression-level=9

This is the relevant part of the collector config (in general, one Avro
source writes to 3 channels):

collector.sources = ExternalAvroSource

collector.sources.ExternalAvroSource.type = avro
collector.sources.ExternalAvroSource.bind = 0.0.0.0
collector.sources.ExternalAvroSource.port = 45451
#collector.sources.ExternalAvroSource.compression-type=deflate
#collector.sources.ExternalAvroSource.threads = 64
## Source writes to 3 channels, one for each sink (Fan Out)
collector.sources.ExternalAvroSource.channels = filechannel-backup filechannel-s3raw filechannel-s3prep-internal
collector.sources.ExternalAvroSource.selector.type = replicating
collector.sources.ExternalAvroSource.interceptors = iviber itime_default
collector.sources.ExternalAvroSource.interceptors.itime_default.type = static
collector.sources.ExternalAvroSource.interceptors.itime_default.preserveExisting = true
collector.sources.ExternalAvroSource.interceptors.itime_default.key = timestamp
collector.sources.ExternalAvroSource.interceptors.itime_default.value = 1
collector.sources.ExternalAvroSource.interceptors.iviber.type = com.viber.bigdata.flume.ViberInterceptor$Builder
collector.sources.ExternalAvroSource.interceptors.iviber.file_types = FILE
collector.sources.ExternalAvroSource.interceptors.iviber.collector_id = 1

collector.channels = filechannel-backup filechannel-s3raw filechannel-s3prep-internal memorychannel-s3prep

collector.channels.filechannel-backup.type = memory
collector.channels.filechannel-backup.capacity = 1000000
collector.channels.filechannel-backup.transactionCapacity = 10000
#collector.channels.filechannel-backup.type = file
#collector.channels.filechannel-backup.checkpointDir=/disk3/flume_data/flume/collector1/channels/filechannel-backup/checkpoint
#collector.channels.filechannel-backup.dataDirs=/disk3/flume_data/flume/collector1/channels/filechannel-backup/data1,/disk3/flume_data/flume/collector1/channels/filechannel-backup/data2,/disk3/flume_data/flume/collector1/channels/filechannel-backup/data3,/disk3/flume_data/flume/collector1/channels/filechannel-backup/data4
#collector.channels.filechannel-backup.capacity=100000000
#collector.channels.filechannel-backup.maxFileSize = 375809638400
#collector.channels.filechannel-backup.transactionCapacity=10000
#collector.channels.filechannel-backup.use-fast-replay=true
#collector.channels.filechannel-backup.useDualCheckpoints=true
#collector.channels.filechannel-backup.backupCheckpointDir=/disk3/flume_data/flume/collector1/channels/filechannel-backup/backupCheckpointDir
#collector.channels.filechannel-backup.checkpointInterval=120000
#collector.channels.filechannel-backup.write-timeout = 60

collector.channels.filechannel-s3raw.type = memory
collector.channels.filechannel-s3raw.capacity = 1000000
collector.channels.filechannel-s3raw.transactionCapacity = 10000

#collector.channels.filechannel-s3raw.type = file
#collector.channels.filechannel-s3raw.checkpointDir=/disk3/flume_data/flume/collector1/channels/filechannel-s3raw/checkpoint
#collector.channels.filechannel-s3raw.dataDirs=/disk3/flume_data/flume/collector1/channels/filechannel-s3raw/data1,/disk3/flume_data/flume/collector1/channels/filechannel-s3raw/data2,/disk3/flume_data/flume/collector1/channels/filechannel-s3raw/data3,/disk3/flume_data/flume/collector1/channels/filechannel-s3raw/data4
#collector.channels.filechannel-s3raw.capacity=100000000
#collector.channels.filechannel-s3raw.maxFileSize = 375809638400
#collector.channels.filechannel-s3raw.transactionCapacity=10000
#collector.channels.filechannel-s3raw.use-fast-replay=true
#collector.channels.filechannel-s3raw.useDualCheckpoints=true
#collector.channels.filechannel-s3raw.backupCheckpointDir=/disk3/flume_data/flume/collector1/channels/filechannel-s3raw/backupCheckpointDir
#collector.channels.filechannel-s3raw.checkpointInterval=120000
#collector.channels.filechannel-s3raw.write-timeout = 60

collector.channels.filechannel-s3prep-internal.type = memory
collector.channels.filechannel-s3prep-internal.capacity = 1000000
collector.channels.filechannel-s3prep-internal.transactionCapacity = 10000
#collector.channels.filechannel-s3prep-internal.type = file
#collector.channels.filechannel-s3prep-internal.checkpointDir=/disk3/flume_data/flume/collector1/channels/filechannel-s3prep-internal/checkpoint
#collector.channels.filechannel-s3prep-internal.dataDirs=/disk3/flume_data/flume/collector1/channels/filechannel-s3prep-internal/data1,/disk3/flume_data/flume/collector1/channels/filechannel-s3prep-internal/data2,/disk3/flume_data/flume/collector1/channels/filechannel-s3prep-internal/data3,/disk3/flume_data/flume/collector1/channels/filechannel-s3prep-internal/data4
#collector.channels.filechannel-s3prep-internal.maxFileSize = 375809638400
#collector.channels.filechannel-s3prep-internal.capacity=100000000
#collector.channels.filechannel-s3prep-internal.transactionCapacity=10000
#collector.channels.filechannel-s3prep-internal.use-fast-replay=true
#collector.channels.filechannel-s3prep-internal.useDualCheckpoints=true
#collector.channels.filechannel-s3prep-internal.backupCheckpointDir=/disk3/flume_data/flume/collector1/channels/filechannel-s3prep-internal/backupCheckpointDir
#collector.channels.filechannel-s3prep-internal.checkpointInterval=120000
#collector.channels.filechannel-s3prep-internal.write-timeout = 60



