Posted to user@flume.apache.org by Sverre Bakke <sv...@gmail.com> on 2015/05/19 14:35:21 UTC

Channel locks

Hi,

I am running a setup with a single source, channel and sink (Syslog
source, memory channel and KafkaSink from
https://github.com/thilinamb/flume-ng-kafka-sink), and I am
experiencing performance issues.

After looking at the source code of the sink, it seems that the sink
begins a transaction, does all of its work inside it (e.g. compress,
send over the network, wait for ack), and only then commits and closes
the transaction.

Is there any way to increase the performance of this setup?

I have seen people propose adding more sinks to get higher
throughput. But if every sink holds a lock on the channel until it is
done processing, I would assume that extra sinks only cause additional
performance issues. Is my understanding of how channel locking works
correct?

Regards,
Sverre

Re: Channel locks

Posted by Gwen Shapira <gs...@cloudera.com>.
distributed between the sinks.

(Is that VSRE enough? ;)
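
One way to picture "distributed" is a shared queue drained by several consumers, where removing an event makes it invisible to everyone else. The sketch below is only an illustration under that assumption: it uses a plain java.util.concurrent queue as a hypothetical stand-in for a channel, and the class and method names (DistributedTakeDemo, deliver) are made up for this example. It is not Flume code.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for one channel with several sinks; not Flume code.
class DistributedTakeDemo {

    // Returns, for each event id, how many times it was delivered across all sinks.
    static Map<Integer, Integer> deliver(int sinks, int events) throws InterruptedException {
        BlockingQueue<Integer> channel = new LinkedBlockingQueue<>();
        for (int i = 0; i < events; i++) {
            channel.add(i);
        }
        Map<Integer, Integer> deliveries = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(sinks);
        for (int s = 0; s < sinks; s++) {
            pool.submit(() -> {
                Integer event;
                // poll() removes the head of the queue, so no other sink can take the same event.
                while ((event = channel.poll()) != null) {
                    deliveries.merge(event, 1, Integer::sum);
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("sinks did not finish in time");
        }
        return deliveries;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Integer, Integer> deliveries = deliver(3, 1000);
        boolean exactlyOnce = deliveries.values().stream().allMatch(n -> n == 1);
        System.out.println("events delivered: " + deliveries.size()
                + ", each exactly once: " + exactlyOnce);
    }
}
```

In this model every event ends up at exactly one of the three consumers, which is the "distributed" case from the question rather than the "each sink reads each event" case.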

RE: Channel locks

Posted by "Needham, Guy" <Gu...@virginmedia.co.uk>.
With multiple sinks reading from one channel, will each sink read each event, or will the events be distributed between the sinks?

Regards,
Guy Needham | Data Discovery
Virgin Media   | Technology and Transformation | Data
Bartley Wood Business Park, Hook, Hampshire RG27 9UP
D 01256 75 3362
I welcome VSRE emails. Learn more at http://vsre.info/

Re: Channel locks

Posted by Sverre Bakke <sv...@gmail.com>.
Hi,

Thanks for the response. Does that mean that transaction.begin() does not
acquire a lock that is kept until transaction.commit()?

If it does not, how can a slow sink slow down the source? I would expect the
channel to fill up and throw ChannelFullException rather than slow down
the source.

Regards,
Sverre
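
The two behaviours contrasted in this question, failing immediately when the channel is full versus making the producer wait, can be sketched with a bounded queue. This is a generic illustration using java.util.concurrent, with hypothetical names (FullChannelDemo, putOrFail, putOrWait); it is not Flume code, and the thread does not settle which behaviour applies. The Flume user guide does document a keep-alive timeout on the memory channel for adding or removing an event, which would be consistent with the second variant, but treat that reading as an assumption.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a bounded channel; not Flume code.
class FullChannelDemo {

    // Fail fast: returns false immediately if the channel has no free slot.
    static boolean putOrFail(BlockingQueue<String> channel, String event) {
        return channel.offer(event);
    }

    // Timed wait: blocks the caller for up to waitMillis, then gives up and returns false.
    static boolean putOrWait(BlockingQueue<String> channel, String event, long waitMillis)
            throws InterruptedException {
        return channel.offer(event, waitMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> channel = new ArrayBlockingQueue<>(1);
        channel.add("event-0"); // pretend a slow sink has not drained this yet

        long start = System.nanoTime();
        boolean failFast = putOrFail(channel, "event-1");
        long failFastMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

        start = System.nanoTime();
        boolean waited = putOrWait(channel, "event-1", 200);
        long waitedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

        System.out.println("fail fast: accepted=" + failFast + " after ~" + failFastMs + " ms");
        System.out.println("timed wait: accepted=" + waited + " after ~" + waitedMs + " ms");
    }
}
```

In the timed-wait variant the producer still ends up with a failure when nothing drains the queue, but only after spending the whole timeout blocked, so a slow consumer shows up as a slow producer before it shows up as an error.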

Re: Channel locks

Posted by Hari Shreedharan <hs...@cloudera.com>.
No sink holds a lock on the channel. Channels can handle multiple sinks
reading from them, and in fact that is encouraged.

-- 

Thanks,
Hari
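
The claim in this reply, that a sink does not lock the channel for the whole duration of its work, can be sketched with a small model. It assumes the channel behaves like a thread-safe queue where only the take itself is synchronized; the names (ConcurrentSinksDemo, sinksInFlightTogether) are hypothetical and none of this is Flume code. Each simulated sink takes one event and then waits at a barrier that only opens once every sink is holding an event. If taking an event locked the channel until "commit", the barrier could never open and the run would fail with a timeout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for several sinks sharing one channel; not Flume code.
class ConcurrentSinksDemo {

    // Returns how many sinks finished after all of them were busy at the same moment.
    static int sinksInFlightTogether(int sinks) throws Exception {
        BlockingQueue<String> channel = new LinkedBlockingQueue<>();
        for (int i = 0; i < sinks; i++) {
            channel.add("event-" + i);
        }
        // The barrier opens only when every sink holds an event of its own.
        CyclicBarrier allBusy = new CyclicBarrier(sinks);
        ExecutorService pool = Executors.newFixedThreadPool(sinks);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int s = 0; s < sinks; s++) {
                results.add(pool.submit(() -> {
                    String event = channel.take();       // "begin" and take one event
                    allBusy.await(5, TimeUnit.SECONDS);  // stands in for compress, send, wait for ack
                    return event != null;                // "commit"
                }));
            }
            int finished = 0;
            for (Future<Boolean> result : results) {
                if (result.get(10, TimeUnit.SECONDS)) {
                    finished++;
                }
            }
            return finished;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("sinks busy at the same time: " + sinksInFlightTogether(4));
    }
}
```

Under this model the slow part of each sink (the network send and the wait for an ack) overlaps with the slow part of the others, which is why adding sinks can raise throughput instead of adding contention.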