Posted to user@flume.apache.org by Haidang N <ha...@hotmail.com> on 2014/09/16 20:26:12 UTC

Reading Flume spoolDir in parallel

Since I'm not allowed to set up Flume on prod servers, I have to download the logs, put them in a Flume spoolDir and have a sink to consume from the channel and write to Cassandra. Everything is working fine.
However, as I have a lot of log files in the spoolDir, and the current setup is only processing 1 file at a time, it's taking a while. I want to be able to process many files concurrently. One way I thought of is to use the spoolDir but distribute the files into 5-10 different directories, and define multiple sources/channels/sinks, but this is a bit clumsy. Is there a better way to achieve this?
Thanks


Re: Reading Flume spoolDir in parallel

Posted by Hari Shreedharan <hs...@cloudera.com>.
Unfortunately, no. The spoolDir source was kept single-threaded so that
deserializer implementations can be kept simple. The approach with multiple
spoolDir sources is the correct one, but you do not need a separate channel
and sink for each source. All of the sources can write to the same
channel(s), so you only need a larger number of sources. You don't need more
sinks unless you want to pull data out of the channel faster.
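
As a rough illustration of that layout, here is a minimal sketch of an agent
configuration with three spoolDir sources feeding one shared channel. The
agent name (a1), the component names, the directory paths and the channel
sizes are all assumptions made up for this example, and the sink's type and
settings are left out because they depend on whatever Cassandra sink is
already in use.

```properties
# Hypothetical agent "a1": three spooling-directory sources, one channel, one sink.
a1.sources = spool1 spool2 spool3
a1.channels = ch1
a1.sinks = cassandraSink

# Each source watches its own directory (paths are placeholders).
a1.sources.spool1.type = spooldir
a1.sources.spool1.spoolDir = /data/flume/spool/part1
a1.sources.spool1.channels = ch1

a1.sources.spool2.type = spooldir
a1.sources.spool2.spoolDir = /data/flume/spool/part2
a1.sources.spool2.channels = ch1

a1.sources.spool3.type = spooldir
a1.sources.spool3.spoolDir = /data/flume/spool/part3
a1.sources.spool3.channels = ch1

# One shared channel; a memory channel with example sizes, tune as needed.
a1.channels.ch1.type = memory
a1.channels.ch1.capacity = 10000
a1.channels.ch1.transactionCapacity = 1000

# The existing sink keeps its current type and settings (not shown here);
# it only needs to read from the shared channel.
a1.sinks.cassandraSink.channel = ch1
```

Each source still reads one file at a time, so the degree of parallelism is
the number of directories the files are spread across. If the sink becomes
the bottleneck, additional sinks can be attached to the same channel.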
