Posted to user@flume.apache.org by Jan Van Besien <ja...@ngdata.com> on 2013/11/07 16:10:11 UTC

increase load on tier2 flume agents

Hi,

I have a 2 tier flume setup. Tier 1 consists of agents that accept incoming
requests (http source) and put them on (large) file channels. Tier 2
does a lot of processing on these events (with custom interceptors) and
uses a custom sink to store the result in a custom data store. These tier 2
agents use a (small) memory channel.

The tier 2 interceptors and the data store are mostly I/O bound.

I am struggling to saturate the tier 2 agents. They are slower than
they should be, mostly for reasons unrelated to flume.

However, assume that I would like my tier 2 agents to process more 
events in parallel. What would be the appropriate way to do this?

Do I need multiple avro sinks on the tier 1 agents that map to the same
tier 2 avro source? I tried this, and it does seem to increase the number
of threads on the tier 2 agent that are actually processing events.

Is this the way to do it, or not?
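
To make the setup concrete, here is a minimal sketch of the tier 1 side with two avro sinks draining the same file channel and pointing at the same tier 2 avro source. The agent and component names (tier1, httpSrc, fileCh, avroSink1, avroSink2), the host name and the ports are placeholders for illustration, not my real configuration. As far as I understand, each sink that is not part of a sink group gets its own runner thread, so each one keeps its own connection to tier 2.

```properties
# Tier 1 agent (sketch; all names, hosts and ports are hypothetical)
tier1.sources = httpSrc
tier1.channels = fileCh
tier1.sinks = avroSink1 avroSink2

# http source accepting the incoming requests
tier1.sources.httpSrc.type = http
tier1.sources.httpSrc.port = 8080
tier1.sources.httpSrc.channels = fileCh

# (large) file channel shared by both sinks
tier1.channels.fileCh.type = file

# Two avro sinks reading from the same channel,
# both sending to the same tier 2 avro source
tier1.sinks.avroSink1.type = avro
tier1.sinks.avroSink1.channel = fileCh
tier1.sinks.avroSink1.hostname = tier2-host.example.com
tier1.sinks.avroSink1.port = 4545

tier1.sinks.avroSink2.type = avro
tier1.sinks.avroSink2.channel = fileCh
tier1.sinks.avroSink2.hostname = tier2-host.example.com
tier1.sinks.avroSink2.port = 4545
```

Each event is taken from the channel by only one of the two sinks, so this splits the load rather than duplicating it. On the tier 2 side, the avro source has a "threads" property that caps its worker threads, which may also matter here.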

thanks,
Jan

Re: increase load on tier2 flume agents

Posted by Jan Van Besien <ja...@ngdata.com>.

On 11/07/2013 04:10 PM, Jan Van Besien wrote:
> Do I need multiple avro sinks on the tier 1 agents that map to the same
> tier 2 avro source? I tried this, and this seems to increase the number
> of threads on the tier 2 agent that are actually processing events indeed.
>
> Is this the way to do it, or not?

Or should I build a fan-out flow as described in the docs? If so, why is
this better than what I described above?
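
For reference, this is roughly what I understand a fan-out flow on a tier 2 agent to look like: one avro source feeding two memory channels through a multiplexing channel selector, each channel drained by its own sink. The component names, the "shard" header, its values and the sink class com.example.CustomStoreSink are hypothetical placeholders. Something upstream would also have to set that header on every event.

```properties
# Tier 2 agent with fan-out (sketch; all names are hypothetical)
tier2.sources = avroSrc
tier2.channels = memCh1 memCh2
tier2.sinks = storeSink1 storeSink2

tier2.sources.avroSrc.type = avro
tier2.sources.avroSrc.bind = 0.0.0.0
tier2.sources.avroSrc.port = 4545
tier2.sources.avroSrc.channels = memCh1 memCh2

# Route each event to one channel based on the value of the "shard" header.
# A replicating selector (the default) would copy every event to both
# channels instead, which duplicates work rather than splitting it.
tier2.sources.avroSrc.selector.type = multiplexing
tier2.sources.avroSrc.selector.header = shard
tier2.sources.avroSrc.selector.mapping.0 = memCh1
tier2.sources.avroSrc.selector.mapping.1 = memCh2
tier2.sources.avroSrc.selector.default = memCh1

tier2.channels.memCh1.type = memory
tier2.channels.memCh2.type = memory

# One custom sink per channel, each running in its own thread
tier2.sinks.storeSink1.type = com.example.CustomStoreSink
tier2.sinks.storeSink1.channel = memCh1
tier2.sinks.storeSink2.type = com.example.CustomStoreSink
tier2.sinks.storeSink2.channel = memCh2
```

If I read the docs correctly, interceptors run on the source side before the selector picks a channel, so a layout like this would parallelize the sinks but not the interceptors.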

thanks
Jan