Posted to user@flume.apache.org by Hemanth Abbina <He...@eiqnetworks.com> on 2015/11/17 18:16:50 UTC

Possibility of persisting the connection

Hi,

Though it's against the basic design principle of Flume, I have one question.

Is it possible to persist the connection between source & sink and reuse it?

We are using an HTTP source, a File channel and a Kafka sink. With that configuration we are not getting the expected throughput, because the source and the sink reconnect for every event.

So, would it be possible to reuse the same HTTP and Kafka connections for multiple transactions (even if that requires a custom source & sink)?

Thanks,
Hemanth

Re: Possibility of persisting the connection

Posted by Hemanth Abbina <He...@eiqnetworks.com>.
Hi Hari,

Thanks for the response. I agree with you on the HTTP source case.

I will check the Kafka sink again to see what causes the reconnections.




Re: Possibility of persisting the connection

Posted by Hari Shreedharan <hs...@cloudera.com>.
Actually, in both cases the connections should be persistent. In the HTTP Source case, the client decides when to close the connection; the HTTP Source is the server and does not close any connections.
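
As a rough illustration of the client side of this, here is a minimal Python sketch that sends several batches of events over a single HTTP connection instead of opening a new one per event. Everything in it is an assumption for illustration: StandInSource is a hypothetical local server used only so the sketch runs without Flume, post_batches and run_demo are made-up names, and the payload shape (a JSON array of objects with "headers" and "body") is assumed to match what the HTTP Source's default JSON handler accepts.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class StandInSource(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a Flume HTTP Source (illustration only)."""

    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections alive by default
    seen_client_ports = []

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        # The client's TCP port identifies the connection: same port, same connection.
        StandInSource.seen_client_ports.append(self.client_address[1])
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def post_batches(host, port, batches):
    """Send every batch over one persistent connection; return the status codes."""
    conn = http.client.HTTPConnection(host, port)
    statuses = []
    try:
        for batch in batches:
            body = json.dumps(batch).encode("utf-8")
            conn.request("POST", "/", body=body,
                         headers={"Content-Type": "application/json"})
            response = conn.getresponse()
            response.read()  # drain the response so the connection can be reused
            statuses.append(response.status)
    finally:
        conn.close()  # the client, not the server, decides when to close
    return statuses


def run_demo():
    """Start the stand-in server, send three batches, report statuses and ports."""
    StandInSource.seen_client_ports.clear()
    server = HTTPServer(("127.0.0.1", 0), StandInSource)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    batches = [
        [{"headers": {}, "body": f"event {i}-{j}"} for j in range(2)]
        for i in range(3)
    ]
    try:
        statuses = post_batches("127.0.0.1", server.server_address[1], batches)
    finally:
        server.shutdown()
        server.server_close()
    return statuses, list(StandInSource.seen_client_ports)
```

If the client reuses the connection, every request recorded by the stand-in server comes from the same client port. Sending several events per request, as the batches here do, also reduces the number of round trips.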

Kafka Sink uses the Kafka Producer API to talk to Kafka. If the connections are being reopened, it could be because of a bug in the Kafka API, or because of the way your events are partitioned across brokers (which depends on the event key you set).
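
To make the point about keys concrete, the Python sketch below shows how the choice of event key can change how many partitions (and therefore how many brokers) a stream of events touches. It is only an illustration under stated assumptions: pick_partition uses a CRC32 hash modulo the partition count as a stand-in, which is NOT Kafka's actual partitioner, and both function names are hypothetical.

```python
import zlib
from collections import Counter


def pick_partition(key, num_partitions):
    """Illustrative stand-in partitioner: hash the key, then take it modulo
    the partition count. This is not the partitioner Kafka itself uses."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_spread(keys, num_partitions):
    """Count how many events land on each partition for the given keys."""
    return Counter(pick_partition(key, num_partitions) for key in keys)


# A constant key sends every event to a single partition.
same_key = partition_spread(["sensor-a"] * 100, num_partitions=4)

# Varying keys spread events over several partitions, so a producer would
# typically end up talking to more than one broker.
varied_keys = partition_spread([f"host-{i}" for i in range(100)], num_partitions=4)
```

Comparing len(same_key) with len(varied_keys) shows the difference in spread. Whether this explains the reconnections seen here would still need to be checked against the producer's logs.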

Thanks,
Hari Shreedharan






Re: Possibility of persisting the connection

Posted by Hemanth Abbina <He...@eiqnetworks.com>.
Hi Gonzalo,

Thanks for your response.

No, the Kafka sink connection does not stay the same all the time. I have observed the connections closing and reconnecting.



Re: Possibility of persisting the connection

Posted by Gonzalo Herreros <gh...@gmail.com>.
For the sink, I would be surprised if the connection to Kafka is not the
same all the time.
For the HTTP source, you could create a custom source that keeps a
long-lived HTTP connection and has some way of detecting where a batch of
events ends (e.g. a newline character).
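
The newline idea can be sketched roughly as follows in Python. This is only an outline of the framing logic, not a Flume source: LineFramer and feed are hypothetical names, and the sketch assumes each newline-terminated line is one unit to hand to the channel.

```python
class LineFramer:
    """Split a byte stream from a long-lived connection into complete lines."""

    def __init__(self):
        self._buffer = b""  # bytes received after the last newline seen so far

    def feed(self, chunk):
        """Add bytes read from the connection; return the lines completed by them."""
        self._buffer += chunk
        # Everything before the last newline is complete; the rest stays buffered.
        *complete, self._buffer = self._buffer.split(b"\n")
        return [line.decode("utf-8") for line in complete if line]


framer = LineFramer()
first = framer.feed(b"event one\nevent t")   # second event is still incomplete
second = framer.feed(b"wo\nevent three\n")   # completes it and adds a third
```

Decoding only complete lines means a multi-byte character split across two reads is never decoded halfway.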

Regards,
Gonzalo
