Posted to users@kafka.apache.org by Vaibhav Puranik <vp...@gmail.com> on 2012/06/26 03:01:25 UTC

Getting timeouts with elastic load balancer in AWS

Hi all,

We are sending our ad impressions to Kafka 0.7.0. I am using async
producers in our web app.
I am pooling Kafka producers with Commons Pool. Pool size - 10. batch.size
is 100.
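
For reference, this is roughly how such a pool is wired up - a minimal
sketch assuming Commons Pool 2 (the original setup likely used Commons
Pool 1.x) and the Kafka 0.7 Java producer API; the broker list host is a
placeholder:

import java.util.Properties;

import org.apache.commons.pool2.BasePooledObjectFactory;
import org.apache.commons.pool2.PooledObject;
import org.apache.commons.pool2.impl.DefaultPooledObject;
import org.apache.commons.pool2.impl.GenericObjectPool;
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;

import kafka.javaapi.producer.Producer;
import kafka.producer.ProducerConfig;

public class ProducerPool {

    // Factory that creates one async Kafka 0.7 producer per pooled object.
    static class Factory extends BasePooledObjectFactory<Producer<String, String>> {
        @Override
        public Producer<String, String> create() {
            Properties props = new Properties();
            props.put("broker.list", "0:kafka-elb.example.com:9092"); // placeholder host
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            props.put("producer.type", "async");
            props.put("batch.size", "100");
            return new Producer<String, String>(new ProducerConfig(props));
        }

        @Override
        public PooledObject<Producer<String, String>> wrap(Producer<String, String> p) {
            return new DefaultPooledObject<Producer<String, String>>(p);
        }
    }

    public static GenericObjectPool<Producer<String, String>> newPool() {
        GenericObjectPoolConfig config = new GenericObjectPoolConfig();
        config.setMaxTotal(10); // pool size 10, as above
        return new GenericObjectPool<Producer<String, String>>(new Factory(), config);
    }
}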

We have 3 c1.xlarge instances with Kafka brokers installed behind an elastic
load balancer in AWS.
Every minute we lose some events because of the following exception:

- Disconnecting from dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
- Error in handling batch of 64 events
java.io.IOException: Connection timed out
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
    at kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:51)
    at kafka.network.Send$class.writeCompletely(Transmission.scala:76)
    at kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
    at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:88)
    at kafka.producer.SyncProducer.send(SyncProducer.scala:87)
    at kafka.producer.SyncProducer.multiSend(SyncProducer.scala:128)
    at kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:52)
    at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:46)
    at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:119)
    at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:98)
    at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:74)
    at scala.collection.immutable.Stream.foreach(Stream.scala:254)
    at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:73)
    at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:43)
- Connected to dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092 for producing

Has anybody faced this kind of timeout before? Does it indicate any
resource misconfiguration? The CPU usage on the brokers is pretty low.
Also, in spite of setting batch.size to 100, the failing batches usually
only have 50 to 60 events. Is there another limit I am hitting?

Any help is appreciated.


Regards,
Vaibhav
GumGum

Re: Hadoop Consumer

Posted by Min <mi...@gmail.com>.
ConsumerConfig is in Kafka's main trunk.

As I used the same package namespace, kafka.consumer (granted, I don't
think it's a good approach), I didn't have to import it explicitly.

The kafka jar is not in the Maven repository, so you might have to
install it into your local Maven repository:

> mvn install:install-file -Dfile=kafka-0.7.0.jar -DgroupId=kafka -DartifactId=kafka -Dversion=0.7.0 -Dpackaging=jar

Thanks
Min

2012/7/13 Murtaza Doctor <mu...@richrelevance.com>:
> Hello Min,
>
> In your github project source code are you missing the ConsumerConfig
> class? I was trying to download and play with the source code.
>
> Thanks,
> murtaza
>
> On 7/3/12 6:29 PM, "Min" <mi...@gmail.com> wrote:
>
>>I've created another hadoop consumer which uses zookeeper.
>>
>>https://github.com/miniway/kafka-hadoop-consumer
>>
>>With a hadoop OutputFormatter, I could add new files to the existing
>>target directory.
>>Hope this would help.
>>
>>Thanks
>>Min
>>
>>2012/7/4 Murtaza Doctor <mu...@richrelevance.com>:
>>> +1 This surely sounds interesting.
>>>
>>> On 7/3/12 10:05 AM, "Felix GV" <fe...@mate1inc.com> wrote:
>>>
>>>>Hmm that's surprising. I didn't know about that...!
>>>>
>>>>I wonder if it's a new feature... Judging from your email, I assume
>>>>you're
>>>>using CDH? What version?
>>>>
>>>>Interesting :) ...
>>>>
>>>>--
>>>>Felix
>>>>
>>>>
>>>>
>>>>On Tue, Jul 3, 2012 at 12:34 PM, Sybrandy, Casey <
>>>>Casey.Sybrandy@six3systems.com> wrote:
>>>>
>>>>> >> - Is there a version of consumer which appends to an existing file
>>>>>on
>>>>> HDFS
>>>>> >> until it reaches a specific size?
>>>>> >>
>>>>> >
>>>>> >No there isn't, as far as I know. Potential solutions to this would
>>>>>be:
>>>>> >
>>>>> >   1. Leave the data in the broker long enough for it to reach the
>>>>>size
>>>>> you
>>>>> >   want. Running the SimpleKafkaETLJob at those intervals would give
>>>>>you
>>>>> the
>>>>> >   file size you want. This is the simplest thing to do, but the
>>>>>drawback
>>>>> is
>>>>> >   that your data in HDFS will be less real-time.
>>>>> >   2. Run the SimpleKafkaETLJob as frequently as you want, and then
>>>>>roll
>>>>> up
>>>>> >   / compact your small files into one bigger file. You would need to
>>>>> come up
>>>>> >   with the hadoop job that does the roll up, or find one somewhere.
>>>>> >   3. Don't use the SimpleKafkaETLJob at all and write a new job that
>>>>> makes
>>>>> >   use of hadoop append instead...
>>>>> >
>>>>> >Also, you may be interested to take a look at these
>>>>> >scripts<
>>>>>
>>>>>http://felixgv.com/post/88/kafka-distributed-incremental-hadoop-consumer/
>>>>> >I
>>>>> >posted a while ago. If you follow the links in this post, you can get
>>>>> >more details about how the scripts work and why it was necessary to
>>>>>do
>>>>>the
>>>>> >things it does... or you can just use them without reading. They
>>>>>should
>>>>> >work pretty much out of the box...
>>>>>
>>>>> Where I work, we discovered that you can keep a file in HDFS open and
>>>>> still run MapReduce jobs against the data in that file.  What you do
>>>>>is
>>>>>you
>>>>> flush the data periodically (every record for us), but you don't close
>>>>>the
>>>>> file right away.  This allows us to have data files that contain 24
>>>>>hours
>>>>> worth of data, but not have to close the file to run the jobs or to
>>>>> schedule the jobs for after the file is closed.  You can also check
>>>>>the
>>>>> file size periodically and rotate the files based on size.  We use
>>>>>Avro
>>>>> files, but sequence files should work too according to Cloudera.
>>>>>
>>>>> It's a great compromise for when you want the latest and greatest
>>>>>data,
>>>>> but don't want to have to wait until all of the files are closed to
>>>>>get
>>>>>it.
>>>>>
>>>>> Casey
>>>
>

Re: Hadoop Consumer

Posted by Murtaza Doctor <mu...@richrelevance.com>.
Hello Min,

In your GitHub project source code, are you missing the ConsumerConfig
class? I was trying to download and play with the source code.

Thanks,
murtaza

On 7/3/12 6:29 PM, "Min" <mi...@gmail.com> wrote:

>I've created another hadoop consumer which uses zookeeper.
>
>https://github.com/miniway/kafka-hadoop-consumer
>
>With a hadoop OutputFormatter, I could add new files to the existing
>target directory.
>Hope this would help.
>
>Thanks
>Min
>
>2012/7/4 Murtaza Doctor <mu...@richrelevance.com>:
>> +1 This surely sounds interesting.
>>
>> On 7/3/12 10:05 AM, "Felix GV" <fe...@mate1inc.com> wrote:
>>
>>>Hmm that's surprising. I didn't know about that...!
>>>
>>>I wonder if it's a new feature... Judging from your email, I assume
>>>you're
>>>using CDH? What version?
>>>
>>>Interesting :) ...
>>>
>>>--
>>>Felix
>>>
>>>
>>>
>>>On Tue, Jul 3, 2012 at 12:34 PM, Sybrandy, Casey <
>>>Casey.Sybrandy@six3systems.com> wrote:
>>>
>>>> >> - Is there a version of consumer which appends to an existing file
>>>>on
>>>> HDFS
>>>> >> until it reaches a specific size?
>>>> >>
>>>> >
>>>> >No there isn't, as far as I know. Potential solutions to this would
>>>>be:
>>>> >
>>>> >   1. Leave the data in the broker long enough for it to reach the
>>>>size
>>>> you
>>>> >   want. Running the SimpleKafkaETLJob at those intervals would give
>>>>you
>>>> the
>>>> >   file size you want. This is the simplest thing to do, but the
>>>>drawback
>>>> is
>>>> >   that your data in HDFS will be less real-time.
>>>> >   2. Run the SimpleKafkaETLJob as frequently as you want, and then
>>>>roll
>>>> up
>>>> >   / compact your small files into one bigger file. You would need to
>>>> come up
>>>> >   with the hadoop job that does the roll up, or find one somewhere.
>>>> >   3. Don't use the SimpleKafkaETLJob at all and write a new job that
>>>> makes
>>>> >   use of hadoop append instead...
>>>> >
>>>> >Also, you may be interested to take a look at these
>>>> >scripts<
>>>>
>>>>http://felixgv.com/post/88/kafka-distributed-incremental-hadoop-consumer/
>>>> >I
>>>> >posted a while ago. If you follow the links in this post, you can get
>>>> >more details about how the scripts work and why it was necessary to
>>>>do
>>>>the
>>>> >things it does... or you can just use them without reading. They
>>>>should
>>>> >work pretty much out of the box...
>>>>
>>>> Where I work, we discovered that you can keep a file in HDFS open and
>>>> still run MapReduce jobs against the data in that file.  What you do
>>>>is
>>>>you
>>>> flush the data periodically (every record for us), but you don't close
>>>>the
>>>> file right away.  This allows us to have data files that contain 24
>>>>hours
>>>> worth of data, but not have to close the file to run the jobs or to
>>>> schedule the jobs for after the file is closed.  You can also check
>>>>the
>>>> file size periodically and rotate the files based on size.  We use
>>>>Avro
>>>> files, but sequence files should work too according to Cloudera.
>>>>
>>>> It's a great compromise for when you want the latest and greatest
>>>>data,
>>>> but don't want to have to wait until all of the files are closed to
>>>>get
>>>>it.
>>>>
>>>> Casey
>>


Re: Hadoop Consumer

Posted by Min <mi...@gmail.com>.
I've created another hadoop consumer which uses zookeeper.

https://github.com/miniway/kafka-hadoop-consumer

With a custom Hadoop OutputFormat, I could add new files to the existing
target directory.
Hope this helps.

Thanks
Min

2012/7/4 Murtaza Doctor <mu...@richrelevance.com>:
> +1 This surely sounds interesting.
>
> On 7/3/12 10:05 AM, "Felix GV" <fe...@mate1inc.com> wrote:
>
>>Hmm that's surprising. I didn't know about that...!
>>
>>I wonder if it's a new feature... Judging from your email, I assume you're
>>using CDH? What version?
>>
>>Interesting :) ...
>>
>>--
>>Felix
>>
>>
>>
>>On Tue, Jul 3, 2012 at 12:34 PM, Sybrandy, Casey <
>>Casey.Sybrandy@six3systems.com> wrote:
>>
>>> >> - Is there a version of consumer which appends to an existing file on
>>> HDFS
>>> >> until it reaches a specific size?
>>> >>
>>> >
>>> >No there isn't, as far as I know. Potential solutions to this would be:
>>> >
>>> >   1. Leave the data in the broker long enough for it to reach the size
>>> you
>>> >   want. Running the SimpleKafkaETLJob at those intervals would give
>>>you
>>> the
>>> >   file size you want. This is the simplest thing to do, but the
>>>drawback
>>> is
>>> >   that your data in HDFS will be less real-time.
>>> >   2. Run the SimpleKafkaETLJob as frequently as you want, and then
>>>roll
>>> up
>>> >   / compact your small files into one bigger file. You would need to
>>> come up
>>> >   with the hadoop job that does the roll up, or find one somewhere.
>>> >   3. Don't use the SimpleKafkaETLJob at all and write a new job that
>>> makes
>>> >   use of hadoop append instead...
>>> >
>>> >Also, you may be interested to take a look at these
>>> >scripts<
>>>
>>>http://felixgv.com/post/88/kafka-distributed-incremental-hadoop-consumer/
>>> >I
>>> >posted a while ago. If you follow the links in this post, you can get
>>> >more details about how the scripts work and why it was necessary to do
>>>the
>>> >things it does... or you can just use them without reading. They should
>>> >work pretty much out of the box...
>>>
>>> Where I work, we discovered that you can keep a file in HDFS open and
>>> still run MapReduce jobs against the data in that file.  What you do is
>>>you
>>> flush the data periodically (every record for us), but you don't close
>>>the
>>> file right away.  This allows us to have data files that contain 24
>>>hours
>>> worth of data, but not have to close the file to run the jobs or to
>>> schedule the jobs for after the file is closed.  You can also check the
>>> file size periodically and rotate the files based on size.  We use Avro
>>> files, but sequence files should work too according to Cloudera.
>>>
>>> It's a great compromise for when you want the latest and greatest data,
>>> but don't want to have to wait until all of the files are closed to get
>>>it.
>>>
>>> Casey
>

RE: Hadoop Consumer

Posted by Grégoire Seux <g....@criteo.com>.
Thanks a lot Min, this is indeed very useful. 

-- 
Greg

-----Original Message-----
From: Felix GV [mailto:felix@mate1inc.com] 
Sent: mercredi 4 juillet 2012 18:19
To: kafka-users@incubator.apache.org
Subject: Re: Hadoop Consumer

Thanks for the info, that's interesting :) ...

And thanks for the link Min :) Having a hadoop consumer that manages the offsets with ZK is cool :) ...

--
Felix



On Wed, Jul 4, 2012 at 9:04 AM, Sybrandy, Casey < Casey.Sybrandy@six3systems.com> wrote:

> We're using CDH3 update 2 or 3.  I don't know how much the version 
> matters, so it may work on plain-old Hadoop.
> _____________________
> From: Murtaza Doctor [murtaza@richrelevance.com]
> Sent: Tuesday, July 03, 2012 1:56 PM
> To: kafka-users@incubator.apache.org
> Subject: Re: Hadoop Consumer
>
> +1 This surely sounds interesting.
>
> On 7/3/12 10:05 AM, "Felix GV" <fe...@mate1inc.com> wrote:
>
> >Hmm that's surprising. I didn't know about that...!
> >
> >I wonder if it's a new feature... Judging from your email, I assume 
> >you're using CDH? What version?
> >
> >Interesting :) ...
> >
> >--
> >Felix
> >
> >
> >
> >On Tue, Jul 3, 2012 at 12:34 PM, Sybrandy, Casey < 
> >Casey.Sybrandy@six3systems.com> wrote:
> >
> >> >> - Is there a version of consumer which appends to an existing 
> >> >> file on
> >> HDFS
> >> >> until it reaches a specific size?
> >> >>
> >> >
> >> >No there isn't, as far as I know. Potential solutions to this would be:
> >> >
> >> >   1. Leave the data in the broker long enough for it to reach the 
> >> > size
> >> you
> >> >   want. Running the SimpleKafkaETLJob at those intervals would 
> >> > give
> >>you
> >> the
> >> >   file size you want. This is the simplest thing to do, but the
> >>drawback
> >> is
> >> >   that your data in HDFS will be less real-time.
> >> >   2. Run the SimpleKafkaETLJob as frequently as you want, and 
> >> > then
> >>roll
> >> up
> >> >   / compact your small files into one bigger file. You would need 
> >> > to
> >> come up
> >> >   with the hadoop job that does the roll up, or find one somewhere.
> >> >   3. Don't use the SimpleKafkaETLJob at all and write a new job 
> >> > that
> >> makes
> >> >   use of hadoop append instead...
> >> >
> >> >Also, you may be interested to take a look at these scripts<
> >>
> >>
> http://felixgv.com/post/88/kafka-distributed-incremental-hadoop-consumer/
> >> >I
> >> >posted a while ago. If you follow the links in this post, you can 
> >> >get more details about how the scripts work and why it was 
> >> >necessary to do
> >>the
> >> >things it does... or you can just use them without reading. They 
> >> >should work pretty much out of the box...
> >>
> >> Where I work, we discovered that you can keep a file in HDFS open 
> >>and  still run MapReduce jobs against the data in that file.  What 
> >>you do is you  flush the data periodically (every record for us), 
> >>but you don't close the  file right away.  This allows us to have 
> >>data files that contain 24 hours  worth of data, but not have to 
> >>close the file to run the jobs or to  schedule the jobs for after 
> >>the file is closed.  You can also check the  file size periodically 
> >>and rotate the files based on size.  We use Avro  files, but 
> >>sequence files should work too according to Cloudera.
> >>
> >> It's a great compromise for when you want the latest and greatest 
> >>data,  but don't want to have to wait until all of the files are 
> >>closed to get it.
> >>
> >> Casey
>
>

Re: Hadoop Consumer

Posted by Felix GV <fe...@mate1inc.com>.
Thanks for the info, that's interesting :) ...

And thanks for the link Min :) Having a hadoop consumer that manages the
offsets with ZK is cool :) ...

--
Felix



On Wed, Jul 4, 2012 at 9:04 AM, Sybrandy, Casey <
Casey.Sybrandy@six3systems.com> wrote:

> We're using CDH3 update 2 or 3.  I don't know how much the version
> matters, so it may work on plain-old Hadoop.
> _____________________
> From: Murtaza Doctor [murtaza@richrelevance.com]
> Sent: Tuesday, July 03, 2012 1:56 PM
> To: kafka-users@incubator.apache.org
> Subject: Re: Hadoop Consumer
>
> +1 This surely sounds interesting.
>
> On 7/3/12 10:05 AM, "Felix GV" <fe...@mate1inc.com> wrote:
>
> >Hmm that's surprising. I didn't know about that...!
> >
> >I wonder if it's a new feature... Judging from your email, I assume you're
> >using CDH? What version?
> >
> >Interesting :) ...
> >
> >--
> >Felix
> >
> >
> >
> >On Tue, Jul 3, 2012 at 12:34 PM, Sybrandy, Casey <
> >Casey.Sybrandy@six3systems.com> wrote:
> >
> >> >> - Is there a version of consumer which appends to an existing file on
> >> HDFS
> >> >> until it reaches a specific size?
> >> >>
> >> >
> >> >No there isn't, as far as I know. Potential solutions to this would be:
> >> >
> >> >   1. Leave the data in the broker long enough for it to reach the size
> >> you
> >> >   want. Running the SimpleKafkaETLJob at those intervals would give
> >>you
> >> the
> >> >   file size you want. This is the simplest thing to do, but the
> >>drawback
> >> is
> >> >   that your data in HDFS will be less real-time.
> >> >   2. Run the SimpleKafkaETLJob as frequently as you want, and then
> >>roll
> >> up
> >> >   / compact your small files into one bigger file. You would need to
> >> come up
> >> >   with the hadoop job that does the roll up, or find one somewhere.
> >> >   3. Don't use the SimpleKafkaETLJob at all and write a new job that
> >> makes
> >> >   use of hadoop append instead...
> >> >
> >> >Also, you may be interested to take a look at these
> >> >scripts<
> >>
> >>
> http://felixgv.com/post/88/kafka-distributed-incremental-hadoop-consumer/
> >> >I
> >> >posted a while ago. If you follow the links in this post, you can get
> >> >more details about how the scripts work and why it was necessary to do
> >>the
> >> >things it does... or you can just use them without reading. They should
> >> >work pretty much out of the box...
> >>
> >> Where I work, we discovered that you can keep a file in HDFS open and
> >> still run MapReduce jobs against the data in that file.  What you do is
> >>you
> >> flush the data periodically (every record for us), but you don't close
> >>the
> >> file right away.  This allows us to have data files that contain 24
> >>hours
> >> worth of data, but not have to close the file to run the jobs or to
> >> schedule the jobs for after the file is closed.  You can also check the
> >> file size periodically and rotate the files based on size.  We use Avro
> >> files, but sequence files should work too according to Cloudera.
> >>
> >> It's a great compromise for when you want the latest and greatest data,
> >> but don't want to have to wait until all of the files are closed to get
> >>it.
> >>
> >> Casey
>
>

RE: Hadoop Consumer

Posted by "Sybrandy, Casey" <Ca...@Six3Systems.com>.
We're using CDH3 update 2 or 3.  I don't know how much the version matters, so it may work on plain-old Hadoop.
_____________________
From: Murtaza Doctor [murtaza@richrelevance.com]
Sent: Tuesday, July 03, 2012 1:56 PM
To: kafka-users@incubator.apache.org
Subject: Re: Hadoop Consumer

+1 This surely sounds interesting.

On 7/3/12 10:05 AM, "Felix GV" <fe...@mate1inc.com> wrote:

>Hmm that's surprising. I didn't know about that...!
>
>I wonder if it's a new feature... Judging from your email, I assume you're
>using CDH? What version?
>
>Interesting :) ...
>
>--
>Felix
>
>
>
>On Tue, Jul 3, 2012 at 12:34 PM, Sybrandy, Casey <
>Casey.Sybrandy@six3systems.com> wrote:
>
>> >> - Is there a version of consumer which appends to an existing file on
>> HDFS
>> >> until it reaches a specific size?
>> >>
>> >
>> >No there isn't, as far as I know. Potential solutions to this would be:
>> >
>> >   1. Leave the data in the broker long enough for it to reach the size
>> you
>> >   want. Running the SimpleKafkaETLJob at those intervals would give
>>you
>> the
>> >   file size you want. This is the simplest thing to do, but the
>>drawback
>> is
>> >   that your data in HDFS will be less real-time.
>> >   2. Run the SimpleKafkaETLJob as frequently as you want, and then
>>roll
>> up
>> >   / compact your small files into one bigger file. You would need to
>> come up
>> >   with the hadoop job that does the roll up, or find one somewhere.
>> >   3. Don't use the SimpleKafkaETLJob at all and write a new job that
>> makes
>> >   use of hadoop append instead...
>> >
>> >Also, you may be interested to take a look at these
>> >scripts<
>>
>>http://felixgv.com/post/88/kafka-distributed-incremental-hadoop-consumer/
>> >I
>> >posted a while ago. If you follow the links in this post, you can get
>> >more details about how the scripts work and why it was necessary to do
>>the
>> >things it does... or you can just use them without reading. They should
>> >work pretty much out of the box...
>>
>> Where I work, we discovered that you can keep a file in HDFS open and
>> still run MapReduce jobs against the data in that file.  What you do is
>>you
>> flush the data periodically (every record for us), but you don't close
>>the
>> file right away.  This allows us to have data files that contain 24
>>hours
>> worth of data, but not have to close the file to run the jobs or to
>> schedule the jobs for after the file is closed.  You can also check the
>> file size periodically and rotate the files based on size.  We use Avro
>> files, but sequence files should work too according to Cloudera.
>>
>> It's a great compromise for when you want the latest and greatest data,
>> but don't want to have to wait until all of the files are closed to get
>>it.
>>
>> Casey


Re: Hadoop Consumer

Posted by Murtaza Doctor <mu...@richrelevance.com>.
+1 This surely sounds interesting.

On 7/3/12 10:05 AM, "Felix GV" <fe...@mate1inc.com> wrote:

>Hmm that's surprising. I didn't know about that...!
>
>I wonder if it's a new feature... Judging from your email, I assume you're
>using CDH? What version?
>
>Interesting :) ...
>
>--
>Felix
>
>
>
>On Tue, Jul 3, 2012 at 12:34 PM, Sybrandy, Casey <
>Casey.Sybrandy@six3systems.com> wrote:
>
>> >> - Is there a version of consumer which appends to an existing file on
>> HDFS
>> >> until it reaches a specific size?
>> >>
>> >
>> >No there isn't, as far as I know. Potential solutions to this would be:
>> >
>> >   1. Leave the data in the broker long enough for it to reach the size
>> you
>> >   want. Running the SimpleKafkaETLJob at those intervals would give
>>you
>> the
>> >   file size you want. This is the simplest thing to do, but the
>>drawback
>> is
>> >   that your data in HDFS will be less real-time.
>> >   2. Run the SimpleKafkaETLJob as frequently as you want, and then
>>roll
>> up
>> >   / compact your small files into one bigger file. You would need to
>> come up
>> >   with the hadoop job that does the roll up, or find one somewhere.
>> >   3. Don't use the SimpleKafkaETLJob at all and write a new job that
>> makes
>> >   use of hadoop append instead...
>> >
>> >Also, you may be interested to take a look at these
>> >scripts<
>> 
>>http://felixgv.com/post/88/kafka-distributed-incremental-hadoop-consumer/
>> >I
>> >posted a while ago. If you follow the links in this post, you can get
>> >more details about how the scripts work and why it was necessary to do
>>the
>> >things it does... or you can just use them without reading. They should
>> >work pretty much out of the box...
>>
>> Where I work, we discovered that you can keep a file in HDFS open and
>> still run MapReduce jobs against the data in that file.  What you do is
>>you
>> flush the data periodically (every record for us), but you don't close
>>the
>> file right away.  This allows us to have data files that contain 24
>>hours
>> worth of data, but not have to close the file to run the jobs or to
>> schedule the jobs for after the file is closed.  You can also check the
>> file size periodically and rotate the files based on size.  We use Avro
>> files, but sequence files should work too according to Cloudera.
>>
>> It's a great compromise for when you want the latest and greatest data,
>> but don't want to have to wait until all of the files are closed to get
>>it.
>>
>> Casey


Re: Hadoop Consumer

Posted by Felix GV <fe...@mate1inc.com>.
Hmm that's surprising. I didn't know about that...!

I wonder if it's a new feature... Judging from your email, I assume you're
using CDH? What version?

Interesting :) ...

--
Felix



On Tue, Jul 3, 2012 at 12:34 PM, Sybrandy, Casey <
Casey.Sybrandy@six3systems.com> wrote:

> >> - Is there a version of consumer which appends to an existing file on
> HDFS
> >> until it reaches a specific size?
> >>
> >
> >No there isn't, as far as I know. Potential solutions to this would be:
> >
> >   1. Leave the data in the broker long enough for it to reach the size
> you
> >   want. Running the SimpleKafkaETLJob at those intervals would give you
> the
> >   file size you want. This is the simplest thing to do, but the drawback
> is
> >   that your data in HDFS will be less real-time.
> >   2. Run the SimpleKafkaETLJob as frequently as you want, and then roll
> up
> >   / compact your small files into one bigger file. You would need to
> come up
> >   with the hadoop job that does the roll up, or find one somewhere.
> >   3. Don't use the SimpleKafkaETLJob at all and write a new job that
> makes
> >   use of hadoop append instead...
> >
> >Also, you may be interested to take a look at these
> >scripts<
> http://felixgv.com/post/88/kafka-distributed-incremental-hadoop-consumer/
> >I
> >posted a while ago. If you follow the links in this post, you can get
> >more details about how the scripts work and why it was necessary to do the
> >things it does... or you can just use them without reading. They should
> >work pretty much out of the box...
>
> Where I work, we discovered that you can keep a file in HDFS open and
> still run MapReduce jobs against the data in that file.  What you do is you
> flush the data periodically (every record for us), but you don't close the
> file right away.  This allows us to have data files that contain 24 hours
> worth of data, but not have to close the file to run the jobs or to
> schedule the jobs for after the file is closed.  You can also check the
> file size periodically and rotate the files based on size.  We use Avro
> files, but sequence files should work too according to Cloudera.
>
> It's a great compromise for when you want the latest and greatest data,
> but don't want to have to wait until all of the files are closed to get it.
>
> Casey

RE: Hadoop Consumer

Posted by "Sybrandy, Casey" <Ca...@Six3Systems.com>.
>> - Is there a version of consumer which appends to an existing file on HDFS
>> until it reaches a specific size?
>>
>
>No there isn't, as far as I know. Potential solutions to this would be:
>
>   1. Leave the data in the broker long enough for it to reach the size you
>   want. Running the SimpleKafkaETLJob at those intervals would give you the
>   file size you want. This is the simplest thing to do, but the drawback is
>   that your data in HDFS will be less real-time.
>   2. Run the SimpleKafkaETLJob as frequently as you want, and then roll up
>   / compact your small files into one bigger file. You would need to come up
>   with the hadoop job that does the roll up, or find one somewhere.
>   3. Don't use the SimpleKafkaETLJob at all and write a new job that makes
>   use of hadoop append instead...
>
>Also, you may be interested to take a look at these
>scripts<http://felixgv.com/post/88/kafka-distributed-incremental-hadoop-consumer/> I
>posted a while ago. If you follow the links in this post, you can get
>more details about how the scripts work and why it was necessary to do the
>things it does... or you can just use them without reading. They should
>work pretty much out of the box...

Where I work, we discovered that you can keep a file in HDFS open and still run MapReduce jobs against the data in that file.  What you do is you flush the data periodically (every record for us), but you don't close the file right away.  This allows us to have data files that contain 24 hours worth of data, but not have to close the file to run the jobs or to schedule the jobs for after the file is closed.  You can also check the file size periodically and rotate the files based on size.  We use Avro files, but sequence files should work too according to Cloudera.
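
A minimal sketch of that pattern - assuming a Hadoop 2.x client where
FSDataOutputStream exposes hflush() (on CDH3-era clients the equivalent
call was sync()); the path and size threshold below are placeholders:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RollingHdfsWriter {
    private static final long ROLL_BYTES = 128L * 1024 * 1024; // placeholder threshold

    private final FileSystem fs;
    private FSDataOutputStream out;
    private int fileIndex = 0;

    public RollingHdfsWriter(Configuration conf) throws IOException {
        this.fs = FileSystem.get(conf);
        roll();
    }

    public synchronized void write(byte[] record) throws IOException {
        out.write(record);
        out.hflush(); // make bytes visible to readers without closing the file
        if (out.getPos() >= ROLL_BYTES) {
            roll(); // rotate based on size, as described above
        }
    }

    private void roll() throws IOException {
        if (out != null) {
            out.close();
        }
        out = fs.create(new Path("/data/events/part-" + fileIndex++)); // placeholder path
    }
}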

It's a great compromise for when you want the latest and greatest data, but don't want to have to wait until all of the files are closed to get it.

Casey

Re: Hadoop Consumer

Posted by Murtaza Doctor <mu...@richrelevance.com>.
>>
>>- We have event data under the topic "foo" written to the kafka
>> Server/Broker in avro format and want to write those events to HDFS.
>>Does
>> the Hadoop consumer expect the data written to HDFS already?
>
>
>No it doesn't expect the data to be written into HDFS already... There
>wouldn't be much point to it, otherwise, no ;) ?
>

Sorry, my note was unclear. I meant that the SimpleKafkaETLJob requires a
sequence file with an offset written to HDFS, and then uses that as a
bookmark to pull the data from the broker?
This file has a checksum, and I was trying to modify the topic in it, which
then of course messes up the checksum. I already have events generated on
my Kafka server, and all I wanted to do was run the SimpleKafkaETLJob to pull
out the data and write it to HDFS. I was trying to fulfill the sequence file
prerequisite, and that does not seem to work for me.
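
(For the record, rather than editing an existing offset file, which
invalidates the checksum, one can write a fresh SequenceFile. A hedged
sketch of the SequenceFile mechanics only - the real SimpleKafkaETLJob
seed file uses its own key/value classes, so Text/Text and the path here
are placeholders, not the actual format:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class OffsetSeed {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path offsetFile = new Path("/kafka-etl/input/offsets-0"); // placeholder path
        SequenceFile.Writer writer = SequenceFile.createWriter(
                fs, conf, offsetFile, Text.class, Text.class);
        writer.append(new Text("foo"), new Text("0")); // illustrative: topic "foo", offset 0
        writer.close();
    }
})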

>
>> Based on the
>> doc looks like the DataGenerator is pulling events from the broker and
>> writing to HDFS. In our case we only wanted to utilize the
>> SimpleKafkaETLJob to write to HDFS.
>
>
>That's what it does. It spawns a (map only) Map Reduce job that pulls in
>parallel from the broker(s) and writes that data into HDFS.
>
>
>> I am surely missing something here?
>>
>
>Maybe...? I don't know. Do tell if anything is not clear still...!

Thanks for confirming, just wanted to make sure I got it right.

>
>
>> - Is there a version of consumer which appends to an existing file on
>>HDFS
>> until it reaches a specific size?
>>
>
>No there isn't, as far as I know. Potential solutions to this would be:
>
>   1. Leave the data in the broker long enough for it to reach the size
>you
>   want. Running the SimpleKafkaETLJob at those intervals would give you
>the
>   file size you want. This is the simplest thing to do, but the drawback
>is
>   that your data in HDFS will be less real-time.
>   2. Run the SimpleKafkaETLJob as frequently as you want, and then roll
>up
>   / compact your small files into one bigger file. You would need to
>come up
>   with the hadoop job that does the roll up, or find one somewhere.
>   3. Don't use the SimpleKafkaETLJob at all and write a new job that
>makes
>   use of hadoop append instead...

These options are very useful. I like option 3 the most :)

>
>Also, you may be interested to take a look at these
>scripts<http://felixgv.com/post/88/kafka-distributed-incremental-hadoop-co
>nsumer/>I
>posted a while ago. If you follow the links in this post, you can get
>more details about how the scripts work and why it was necessary to do the
>things it does... or you can just use them without reading. They should
>work pretty much out of the box...

Will surely give them a spin. Thanks!
>
>>
>> Thanks,
>> murtaza
>>
>>


Re: Hadoop Consumer

Posted by Felix GV <fe...@mate1inc.com>.
Answer inlined...

--
Felix



On Fri, Jun 29, 2012 at 9:24 PM, Murtaza Doctor
<mu...@richrelevance.com>wrote:

> Had a few questions around the Hadoop Consumer.
>
> - We have event data under the topic "foo" written to the kafka
> Server/Broker in avro format and want to write those events to HDFS. Does
> the Hadoop consumer expect the data written to HDFS already?


No it doesn't expect the data to be written into HDFS already... There
wouldn't be much point to it, otherwise, no ;) ?


> Based on the
> doc looks like the DataGenerator is pulling events from the broker and
> writing to HDFS. In our case we only wanted to utilize the
> SimpleKafkaETLJob to write to HDFS.


That's what it does. It spawns a (map only) Map Reduce job that pulls in
parallel from the broker(s) and writes that data into HDFS.


> I am surely missing something here?
>

Maybe...? I don't know. Do tell if anything is not clear still...!


> - Is there a version of consumer which appends to an existing file on HDFS
> until it reaches a specific size?
>

No there isn't, as far as I know. Potential solutions to this would be:

   1. Leave the data in the broker long enough for it to reach the size you
   want. Running the SimpleKafkaETLJob at those intervals would give you the
   file size you want. This is the simplest thing to do, but the drawback is
   that your data in HDFS will be less real-time.
   2. Run the SimpleKafkaETLJob as frequently as you want, and then roll up
   / compact your small files into one bigger file. You would need to come up
   with the hadoop job that does the roll up, or find one somewhere.
   3. Don't use the SimpleKafkaETLJob at all and write a new job that makes
   use of hadoop append instead...

Also, you may be interested to take a look at these
scripts<http://felixgv.com/post/88/kafka-distributed-incremental-hadoop-consumer/> I
posted a while ago. If you follow the links in this post, you can get
more details about how the scripts work and why it was necessary to do the
things it does... or you can just use them without reading. They should
work pretty much out of the box...

>
> Thanks,
> murtaza
>
>

Hadoop Consumer

Posted by Murtaza Doctor <mu...@richrelevance.com>.
Had a few questions around the Hadoop Consumer.

- We have event data under the topic "foo" written to the Kafka
server/broker in Avro format, and we want to write those events to HDFS. Does
the Hadoop consumer expect the data to be written to HDFS already? Based on
the doc, it looks like the DataGenerator is pulling events from the broker and
writing to HDFS. In our case we only wanted to utilize the
SimpleKafkaETLJob to write to HDFS. I am surely missing something here?
- Is there a version of consumer which appends to an existing file on HDFS
until it reaches a specific size?

Thanks,
murtaza


Re: Kafka - Avro Encoder

Posted by Neha Narkhede <ne...@gmail.com>.
Hi Murtaza,

>> - Is there any sample code around this since this is probably a common use-case. I meant is there a CustomAvroEncoder which we can use out of the box or any chance this can also be open-sourced?

The encoding/decoding using Avro is pretty simple. We just use the
BinaryEncoder with the Specific/Generic DatumWriter to write
IndexedRecord objects.
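
A minimal sketch along those lines - assuming the 0.7
kafka.serializer.Encoder interface (toMessage returning
kafka.message.Message) and Avro's EncoderFactory API - as a starting
point rather than an exact implementation:

import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.IndexedRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

import kafka.message.Message;
import kafka.serializer.Encoder;

public class AvroEncoder implements Encoder<IndexedRecord> {
    @Override
    public Message toMessage(IndexedRecord record) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        GenericDatumWriter<IndexedRecord> writer =
                new GenericDatumWriter<IndexedRecord>(record.getSchema());
        try {
            writer.write(record, encoder);
            encoder.flush(); // BinaryEncoder may buffer; flush before grabbing the bytes
        } catch (IOException e) {
            throw new RuntimeException("Avro serialization failed", e);
        }
        return new Message(out.toByteArray());
    }
}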

>> - In terms of internals - are we converting avro into byte stream and creating a Message Object and then writing to the queue, does this incur any overhead in your opinion?

The overhead of serialization I've seen in production is ~0.05 ms per record.

Thanks,
Neha


On Tue, Jun 26, 2012 at 10:07 PM, Murtaza Doctor
<mu...@richrelevance.com> wrote:
> Hello Folks,
>
> We are currently evaluating Kafka and had a few questions around the
> Encoder functionality.
> Our data is in avro format and we wish to send the data to the broker in
> this format as well eventually write to HDFS. As documented, we do realize
> that we need a Custom Encoder to achieve creation of the Message object.
>
> Questions we had:
> - Is there any sample code around this since this is probably a common
> use-case. I meant is there a CustomAvroEncoder which we can use out of the
> box or any chance this can also be open-sourced?
> - In terms of internals - are we converting avro into byte stream and
> creating a Message Object and then writing to the queue, does this incur
> any overhead in your opinion?
> - Any best practices around this or how others would approach this problem?
>
> If there is any value we would definitely like to see this added to the
> FAQs or even part of some sample code.
>
> Thanks,
> murtaza
>
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Jun Rao <ju...@gmail.com>.
From the log, it seems that while message 6 was being sent, it hit an
exception (potentially due to the broker being down) and caused a resend. And
it seems that message 6 reached the broker. When message 5 was sent, it didn't
hit any exception, so there was no resend. The reason that message 5 didn't
reach the broker could be that the broker was shut down before the producer
socket buffer was flushed.

Thanks,

Jun

On Thu, Jun 28, 2012 at 1:36 PM, Vaibhav Puranik <vp...@gmail.com> wrote:

> Jun,
>
> Here is the log with SynProducer and DefaultEventHandler trace enabled.
>
> http://pastebin.com/dTm5RSJ9
>
> Here are my producer settings:
>
> properties.put("serializer.class", "kafka.serializer.StringEncoder")
> properties.put("broker.list", "0:localhost:9092")
> properties.put("producer.type", "async");
> properties.put("num.retries", "3");
> properties.put("batch.size", "5");
>
> (This batch size does't work because I think the some flush time  is small
> - 5 seconds - It sends every message as it comes). I am sleeping for 15
> seconds between each messages.
>
> Here is my broker output:
> _____0_____ _____1_____ _____2_____ _____3_____ _____4_____ _____6_____ _____7_____ _____8_____ _____9_____ (raw message bytes between the markers omitted)
>
>
> Notice number 5 is missing. I restarted broker between 4 and 5. You can see
> that the message  5 is missing. On producer for some reason the error
> appears between 6 and 7. Don't know why.
>
> Regards,
> Vaibhav
>
>
> On Thu, Jun 28, 2012 at 11:15 AM, Jun Rao <ju...@gmail.com> wrote:
>
> > Could you enable trace logging in DefaultEventHandler to see if the
> > following message shows up after the warning?
> >          trace("kafka producer sent messages for topics %s to broker
> %s:%d
> > (on attempt %d)"
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, Jun 28, 2012 at 10:44 AM, Vaibhav Puranik <vpuranik@gmail.com
> > >wrote:
> >
> > > Hi all,
> > >
> > > I don't think the num.retries (0.7.1) is working. Here is how I tested
> > it.
> > >
> > > I wrote a simple producer that sends messages with the following
> strings
> > -
> > > "____1_____", "_____2_____"..... . As you can see all the messages are
> > > sequential.
> > > I tailed the topic log on broker. After sending every message, I have
> > added
> > > Thread.sleep for 15 seconds.
> > >
> > > Everytime I send the message, it immediately appears in the broker log.
> > But
> > > if I restart the broker to simulate producer connection drop (in the 15
> > > seconds producer sleep period), it prints the following message in the
> > > logs:
> > >
> > > [2012-06-28 10:31:17,127] INFO Disconnecting from localhost:9092
> > > (kafka.producer.SyncProducer)
> > > [2012-06-28 10:31:17,132] WARN Error sending messages, 2 attempts
> > remaining
> > > (kafka.producer.async.DefaultEventHandler)
> > > [2012-06-28 10:31:17,132] INFO Connected to localhost:9092 for
> producing
> > > (kafka.producer.SyncProducer)
> > >
> > > But the message that was sent right after the broker restart never
> > reaches
> > > the broker. The message after that (2nd message after restart) gets to
> > > broker fine and the sequence continues. Thus if I restart the broker in
> > the
> > > sleep period between message 4 and 5. I don't get the message 5. I get
> > > message 1,2,3,4,6,7,.....
> > >
> > > I tried setting num.retries to 1 and 2 thinking that in the first retry
> > it
> > > might reconnect and the second retry is where it's resending the
> message.
> > > But that doesn't work. Number of retries doesn't improve the situation.
> > >
> > > Can you see any flaw in my testing? What can I do to better test this
> > > scenario? How can I ensure that no messages are dropped? I don't think
> I
> > am
> > > loosing the message because it's in broker memory. Please correct me
> if I
> > > am wrong.
> > >
> > > Regards,
> > > Vaibhav
> > > GumGum <http://gumgum.com>
> > >
> > >
> > >
> > > On Wed, Jun 27, 2012 at 3:42 PM, Joel Koshy <jj...@gmail.com>
> wrote:
> > >
> > > > 0.7.1 has this: reconnect.time.interval.ms
> > > >
> > > > Thanks,
> > > >
> > > > Joel
> > > >
> > > > On Wed, Jun 27, 2012 at 3:31 PM, Vaibhav Puranik <vpuranik@gmail.com
> >
> > > > wrote:
> > > >
> > > > > That will be awesome. It will definitely address AWS ELB problem.
> > > > >
> > > > > +1 for "reconnect.interval".
> > > > >
> > > > > Regards,
> > > > > Vaibhav
> > > > > GumGum
> > > > >
> > > > >
> > > > > On Wed, Jun 27, 2012 at 3:24 PM, Niek Sanders <
> > niek.sanders@gmail.com
> > > > > >wrote:
> > > > >
> > > > > > Do producers currently leave the sockets to the brokers open
> > > > > indefinitely?
> > > > > >
> > > > > > It might make sense to add a second producer config param similar
> > to
> > > > > > "reconnect.interval" which limits on time instead of message
> count.
> > > > > > (And then reconnect based on whichever criteria is hit first).
>  For
> > > > > > folks going through ELBs on AWS, they'd set the
> > > reconnect.interval.sec
> > > > > > to something like 50 sec as a workaround for low-volume
> producers.
> > > > > >
> > > > > > - Niek
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <ju...@gmail.com>
> wrote:
> > > > > > > Set num.retries in producer config property file. It defaults
> to
> > 0.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Vaibhav Puranik <vp...@gmail.com>.
Just to remove all the variables regarding me restarting the broker, I did
a test with Amazon ELB (0.7.1 producer and 0.7.0 broker).
Thus, no broker restarts. The connection was getting broken because Amazon
ELB was closing all the connections.

I found the exact same result. In spite of specifying num.retries and
reconnect.time.interval.ms = 50000, we lose one batch. I understand that
num.retries does not guarantee that all the messages will be sent.
But I feel it should in this case. Please let me know if
my expectation is unjust.

Regards,
Vaibhav


On Thu, Jun 28, 2012 at 2:37 PM, Joel Koshy <jj...@gmail.com> wrote:

> Just to clarify: num.retries > 0 does not guarantee that all messages will
> be received at the broker. It guarantees retry on exceptions - so it cannot
> handle the corner case when the broker goes down after the message is
> written to the socket buffer but before the buffer is flushed (in which
> case no exceptions are thrown). This is addressed in 0.8 with producer
> acks.
>
> That said, you have a fairly large interval between messages so it's rather
> surprising. It might help to correlate this with broker-side logs to see if
> the "Message sent" for message 5 was actually received on the broker.
>
> Thanks,
>
> Joel
>
> On Thu, Jun 28, 2012 at 1:36 PM, Vaibhav Puranik <vp...@gmail.com>
> wrote:
>
> > Jun,
> >
> > Here is the log with SynProducer and DefaultEventHandler trace enabled.
> >
> > http://pastebin.com/dTm5RSJ9
> >
> > Here are my producer settings:
> >
> > properties.put("serializer.class", "kafka.serializer.StringEncoder")
> > properties.put("broker.list", "0:localhost:9092")
> > properties.put("producer.type", "async");
> > properties.put("num.retries", "3");
> > properties.put("batch.size", "5");
> >
> > (This batch size does't work because I think the some flush time  is
> small
> > - 5 seconds - It sends every message as it comes). I am sleeping for 15
> > seconds between each messages.
> >
> > Here is my broker output:
> > _____0_____ _____1_____ _____2_____ _____3_____ _____4_____ _____6_____ _____7_____ _____8_____ _____9_____ (raw message bytes between the markers omitted)
> >
> >
> > Notice number 5 is missing. I restarted broker between 4 and 5. You can
> see
> > that the message  5 is missing. On producer for some reason the error
> > appears between 6 and 7. Don't know why.
> >
> > Regards,
> > Vaibhav
> >
> >
> > On Thu, Jun 28, 2012 at 11:15 AM, Jun Rao <ju...@gmail.com> wrote:
> >
> > > Could you enable trace logging in DefaultEventHandler to see if the
> > > following message shows up after the warning?
> > >          trace("kafka producer sent messages for topics %s to broker
> > %s:%d
> > > (on attempt %d)"
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Thu, Jun 28, 2012 at 10:44 AM, Vaibhav Puranik <vpuranik@gmail.com
> > > >wrote:
> > >
> > > > Hi all,
> > > >
> > > > I don't think the num.retries (0.7.1) is working. Here is how I
> tested
> > > it.
> > > >
> > > > I wrote a simple producer that sends messages with the following
> > strings
> > > -
> > > > "____1_____", "_____2_____"..... . As you can see all the messages
> are
> > > > sequential.
> > > > I tailed the topic log on broker. After sending every message, I have
> > > added
> > > > Thread.sleep for 15 seconds.
> > > >
> > > > Everytime I send the message, it immediately appears in the broker
> log.
> > > But
> > > > if I restart the broker to simulate producer connection drop (in the
> 15
> > > > seconds producer sleep period), it prints the following message in
> the
> > > > logs:
> > > >
> > > > [2012-06-28 10:31:17,127] INFO Disconnecting from localhost:9092
> > > > (kafka.producer.SyncProducer)
> > > > [2012-06-28 10:31:17,132] WARN Error sending messages, 2 attempts
> > > remaining
> > > > (kafka.producer.async.DefaultEventHandler)
> > > > [2012-06-28 10:31:17,132] INFO Connected to localhost:9092 for
> > producing
> > > > (kafka.producer.SyncProducer)
> > > >
> > > > But the message that was sent right after the broker restart never
> > > reaches
> > > > the broker. The message after that (2nd message after restart) gets
> to
> > > > broker fine and the sequence continues. Thus if I restart the broker
> in
> > > the
> > > > sleep period between message 4 and 5. I don't get the message 5. I
> get
> > > > message 1,2,3,4,6,7,.....
> > > >
> > > > I tried setting num.retries to 1 and 2 thinking that in the first
> retry
> > > it
> > > > might reconnect and the second retry is where it's resending the
> > message.
> > > > But that doesn't work. Number of retries doesn't improve the
> situation.
> > > >
> > > > Can you see any flaw in my testing? What can I do to better test this
> > > > scenario? How can I ensure that no messages are dropped? I don't
> think
> > I
> > > am
> > > > loosing the message because it's in broker memory. Please correct me
> > if I
> > > > am wrong.
> > > >
> > > > Regards,
> > > > Vaibhav
> > > > GumGum <http://gumgum.com>
> > > >
> > > >
> > > >
> > > > On Wed, Jun 27, 2012 at 3:42 PM, Joel Koshy <jj...@gmail.com>
> > wrote:
> > > >
> > > > > 0.7.1 has this: reconnect.time.interval.ms
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Joel
> > > > >
> > > > > On Wed, Jun 27, 2012 at 3:31 PM, Vaibhav Puranik <
> vpuranik@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > That will be awesome. It will definitely address AWS ELB problem.
> > > > > >
> > > > > > +1 for "reconnect.interval".
> > > > > >
> > > > > > Regards,
> > > > > > Vaibhav
> > > > > > GumGum
> > > > > >
> > > > > >
> > > > > > On Wed, Jun 27, 2012 at 3:24 PM, Niek Sanders <
> > > niek.sanders@gmail.com
> > > > > > >wrote:
> > > > > >
> > > > > > > Do producers currently leave the sockets to the brokers open
> > > > > > indefinitely?
> > > > > > >
> > > > > > > It might make sense to add a second producer config param
> similar
> > > to
> > > > > > > "reconnect.interval" which limits on time instead of message
> > count.
> > > > > > > (And then reconnect based on whichever criteria is hit first).
> >  For
> > > > > > > folks going through ELBs on AWS, they'd set the
> > > > reconnect.interval.sec
> > > > > > > to something like 50 sec as a workaround for low-volume
> > producers.
> > > > > > >
> > > > > > > - Niek
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <ju...@gmail.com>
> > wrote:
> > > > > > > > Set num.retries in producer config property file. It defaults
> > to
> > > 0.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Joel Koshy <jj...@gmail.com>.
Just to clarify: num.retries > 0 does not guarantee that all messages will
be received at the broker. It guarantees retry on exceptions - so it cannot
handle the corner case when the broker goes down after the message is
written to the socket buffer but before the buffer is flushed (in which
case no exceptions are thrown). This is addressed in 0.8 with producer acks.

That said, you have a fairly large interval between messages so it's rather
surprising. It might help to correlate this with broker-side logs to see if
the "Message sent" for message 5 was actually received on the broker.

Thanks,

Joel

On Thu, Jun 28, 2012 at 1:36 PM, Vaibhav Puranik <vp...@gmail.com> wrote:

> Jun,
>
> Here is the log with SynProducer and DefaultEventHandler trace enabled.
>
> http://pastebin.com/dTm5RSJ9
>
> Here are my producer settings:
>
> properties.put("serializer.class", "kafka.serializer.StringEncoder")
> properties.put("broker.list", "0:localhost:9092")
> properties.put("producer.type", "async");
> properties.put("num.retries", "3");
> properties.put("batch.size", "5");
>
> (This batch size does't work because I think the some flush time  is small
> - 5 seconds - It sends every message as it comes). I am sleeping for 15
> seconds between each messages.
>
> Here is my broker output:
> _____0_____ _____1_____ _____2_____ _____3_____ _____4_____ _____6_____ _____7_____ _____8_____ _____9_____ (raw message bytes between the markers omitted)
>
>
> Notice number 5 is missing. I restarted broker between 4 and 5. You can see
> that the message  5 is missing. On producer for some reason the error
> appears between 6 and 7. Don't know why.
>
> Regards,
> Vaibhav
>
>
> On Thu, Jun 28, 2012 at 11:15 AM, Jun Rao <ju...@gmail.com> wrote:
>
> > Could you enable trace logging in DefaultEventHandler to see if the
> > following message shows up after the warning?
> >          trace("kafka producer sent messages for topics %s to broker
> %s:%d
> > (on attempt %d)"
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, Jun 28, 2012 at 10:44 AM, Vaibhav Puranik <vpuranik@gmail.com
> > >wrote:
> >
> > > Hi all,
> > >
> > > I don't think the num.retries (0.7.1) is working. Here is how I tested
> > it.
> > >
> > > I wrote a simple producer that sends messages with the following
> strings
> > -
> > > "____1_____", "_____2_____"..... . As you can see all the messages are
> > > sequential.
> > > I tailed the topic log on broker. After sending every message, I have
> > added
> > > Thread.sleep for 15 seconds.
> > >
> > > Everytime I send the message, it immediately appears in the broker log.
> > But
> > > if I restart the broker to simulate producer connection drop (in the 15
> > > seconds producer sleep period), it prints the following message in the
> > > logs:
> > >
> > > [2012-06-28 10:31:17,127] INFO Disconnecting from localhost:9092
> > > (kafka.producer.SyncProducer)
> > > [2012-06-28 10:31:17,132] WARN Error sending messages, 2 attempts
> > remaining
> > > (kafka.producer.async.DefaultEventHandler)
> > > [2012-06-28 10:31:17,132] INFO Connected to localhost:9092 for
> producing
> > > (kafka.producer.SyncProducer)
> > >
> > > But the message that was sent right after the broker restart never
> > reaches
> > > the broker. The message after that (2nd message after restart) gets to
> > > broker fine and the sequence continues. Thus if I restart the broker in
> > the
> > > sleep period between message 4 and 5. I don't get the message 5. I get
> > > message 1,2,3,4,6,7,.....
> > >
> > > I tried setting num.retries to 1 and 2 thinking that in the first retry
> > it
> > > might reconnect and the second retry is where it's resending the
> message.
> > > But that doesn't work. Number of retries doesn't improve the situation.
> > >
> > > Can you see any flaw in my testing? What can I do to better test this
> > > scenario? How can I ensure that no messages are dropped? I don't think
> I
> > am
> > > loosing the message because it's in broker memory. Please correct me
> if I
> > > am wrong.
> > >
> > > Regards,
> > > Vaibhav
> > > GumGum <http://gumgum.com>
> > >
> > >
> > >
> > > On Wed, Jun 27, 2012 at 3:42 PM, Joel Koshy <jj...@gmail.com>
> wrote:
> > >
> > > > 0.7.1 has this: reconnect.time.interval.ms
> > > >
> > > > Thanks,
> > > >
> > > > Joel
> > > >
> > > > On Wed, Jun 27, 2012 at 3:31 PM, Vaibhav Puranik <vpuranik@gmail.com
> >
> > > > wrote:
> > > >
> > > > > That will be awesome. It will definitely address AWS ELB problem.
> > > > >
> > > > > +1 for "reconnect.interval".
> > > > >
> > > > > Regards,
> > > > > Vaibhav
> > > > > GumGum
> > > > >
> > > > >
> > > > > On Wed, Jun 27, 2012 at 3:24 PM, Niek Sanders <
> > niek.sanders@gmail.com
> > > > > >wrote:
> > > > >
> > > > > > Do producers currently leave the sockets to the brokers open
> > > > > indefinitely?
> > > > > >
> > > > > > It might make sense to add a second producer config param similar
> > to
> > > > > > "reconnect.interval" which limits on time instead of message
> count.
> > > > > > (And then reconnect based on whichever criteria is hit first).
>  For
> > > > > > folks going through ELBs on AWS, they'd set the
> > > reconnect.interval.sec
> > > > > > to something like 50 sec as a workaround for low-volume
> producers.
> > > > > >
> > > > > > - Niek
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <ju...@gmail.com>
> wrote:
> > > > > > > Set num.retries in producer config property file. It defaults
> to
> > 0.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Vaibhav Puranik <vp...@gmail.com>.
Jun,

Here is the log with SyncProducer and DefaultEventHandler trace enabled.

http://pastebin.com/dTm5RSJ9

Here are my producer settings:

properties.put("serializer.class", "kafka.serializer.StringEncoder")
properties.put("broker.list", "0:localhost:9092")
properties.put("producer.type", "async");
properties.put("num.retries", "3");
properties.put("batch.size", "5");

(This batch size doesn't take effect, I think, because the queue flush time
is small - 5 seconds - so it sends every message as it comes.) I am sleeping
for 15 seconds between messages.
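
For completeness, the test producer boils down to this - a hedged sketch
against the 0.7.1 Java API, with the settings above; the topic name is a
placeholder:

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.javaapi.producer.ProducerData;
import kafka.producer.ProducerConfig;

public class SequentialTestProducer {
    public static void main(String[] args) throws InterruptedException {
        Properties properties = new Properties();
        properties.put("serializer.class", "kafka.serializer.StringEncoder");
        properties.put("broker.list", "0:localhost:9092");
        properties.put("producer.type", "async");
        properties.put("num.retries", "3");
        properties.put("batch.size", "5");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(properties));
        for (int i = 0; i < 10; i++) {
            producer.send(new ProducerData<String, String>("test", "_____" + i + "_____")); // placeholder topic
            Thread.sleep(15000); // 15-second gap; restart the broker during one of these
        }
        producer.close();
    }
}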

Here is my broker output:
_____0_____ _____1_____ _____2_____ _____3_____ _____4_____ _____6_____ _____7_____ _____8_____ _____9_____ (raw message bytes between the markers omitted)


Notice that number 5 is missing. I restarted the broker between 4 and 5. You
can see that message 5 is missing. On the producer, for some reason, the
error appears between 6 and 7. I don't know why.

Regards,
Vaibhav


On Thu, Jun 28, 2012 at 11:15 AM, Jun Rao <ju...@gmail.com> wrote:

> Could you enable trace logging in DefaultEventHandler to see if the
> following message shows up after the warning?
>          trace("kafka producer sent messages for topics %s to broker %s:%d
> (on attempt %d)"
>
> Thanks,
>
> Jun
>
> On Thu, Jun 28, 2012 at 10:44 AM, Vaibhav Puranik <vpuranik@gmail.com
> >wrote:
>
> > Hi all,
> >
> > I don't think the num.retries (0.7.1) is working. Here is how I tested
> it.
> >
> > I wrote a simple producer that sends messages with the following strings
> -
> > "____1_____", "_____2_____"..... . As you can see all the messages are
> > sequential.
> > I tailed the topic log on broker. After sending every message, I have
> added
> > Thread.sleep for 15 seconds.
> >
> > Everytime I send the message, it immediately appears in the broker log.
> But
> > if I restart the broker to simulate producer connection drop (in the 15
> > seconds producer sleep period), it prints the following message in the
> > logs:
> >
> > [2012-06-28 10:31:17,127] INFO Disconnecting from localhost:9092
> > (kafka.producer.SyncProducer)
> > [2012-06-28 10:31:17,132] WARN Error sending messages, 2 attempts
> remaining
> > (kafka.producer.async.DefaultEventHandler)
> > [2012-06-28 10:31:17,132] INFO Connected to localhost:9092 for producing
> > (kafka.producer.SyncProducer)
> >
> > But the message that was sent right after the broker restart never
> reaches
> > the broker. The message after that (2nd message after restart) gets to
> > broker fine and the sequence continues. Thus if I restart the broker in
> the
> > sleep period between message 4 and 5. I don't get the message 5. I get
> > message 1,2,3,4,6,7,.....
> >
> > I tried setting num.retries to 1 and 2 thinking that in the first retry
> it
> > might reconnect and the second retry is where it's resending the message.
> > But that doesn't work. Number of retries doesn't improve the situation.
> >
> > Can you see any flaw in my testing? What can I do to better test this
> > scenario? How can I ensure that no messages are dropped? I don't think I
> am
> > loosing the message because it's in broker memory. Please correct me if I
> > am wrong.
> >
> > Regards,
> > Vaibhav
> > GumGum <http://gumgum.com>
> >
> >
> >
> > On Wed, Jun 27, 2012 at 3:42 PM, Joel Koshy <jj...@gmail.com> wrote:
> >
> > > 0.7.1 has this: reconnect.time.interval.ms
> > >
> > > Thanks,
> > >
> > > Joel
> > >
> > > On Wed, Jun 27, 2012 at 3:31 PM, Vaibhav Puranik <vp...@gmail.com>
> > > wrote:
> > >
> > > > That will be awesome. It will definitely address AWS ELB problem.
> > > >
> > > > +1 for "reconnect.interval".
> > > >
> > > > Regards,
> > > > Vaibhav
> > > > GumGum
> > > >
> > > >
> > > > On Wed, Jun 27, 2012 at 3:24 PM, Niek Sanders <
> niek.sanders@gmail.com
> > > > >wrote:
> > > >
> > > > > Do producers currently leave the sockets to the brokers open
> > > > indefinitely?
> > > > >
> > > > > It might make sense to add a second producer config param similar
> to
> > > > > "reconnect.interval" which limits on time instead of message count.
> > > > > (And then reconnect based on whichever criteria is hit first).  For
> > > > > folks going through ELBs on AWS, they'd set the
> > reconnect.interval.sec
> > > > > to something like 50 sec as a workaround for low-volume producers.
> > > > >
> > > > > - Niek
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <ju...@gmail.com> wrote:
> > > > > > Set num.retries in producer config property file. It defaults to
> 0.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Jun Rao <ju...@gmail.com>.
Could you enable trace logging in DefaultEventHandler to see if the
following message shows up after the warning?
          trace("kafka producer sent messages for topics %s to broker %s:%d
(on attempt %d)"

Thanks,

Jun

On Thu, Jun 28, 2012 at 10:44 AM, Vaibhav Puranik <vp...@gmail.com>wrote:

> Hi all,
>
> I don't think the num.retries (0.7.1) is working. Here is how I tested it.
>
> I wrote a simple producer that sends messages with the following strings -
> "____1_____", "_____2_____"..... . As you can see all the messages are
> sequential.
> I tailed the topic log on broker. After sending every message, I have added
> Thread.sleep for 15 seconds.
>
> Everytime I send the message, it immediately appears in the broker log. But
> if I restart the broker to simulate producer connection drop (in the 15
> seconds producer sleep period), it prints the following message in the
> logs:
>
> [2012-06-28 10:31:17,127] INFO Disconnecting from localhost:9092
> (kafka.producer.SyncProducer)
> [2012-06-28 10:31:17,132] WARN Error sending messages, 2 attempts remaining
> (kafka.producer.async.DefaultEventHandler)
> [2012-06-28 10:31:17,132] INFO Connected to localhost:9092 for producing
> (kafka.producer.SyncProducer)
>
> But the message that was sent right after the broker restart never reaches
> the broker. The message after that (2nd message after restart) gets to
> broker fine and the sequence continues. Thus if I restart the broker in the
> sleep period between message 4 and 5. I don't get the message 5. I get
> message 1,2,3,4,6,7,.....
>
> I tried setting num.retries to 1 and 2 thinking that in the first retry it
> might reconnect and the second retry is where it's resending the message.
> But that doesn't work. Number of retries doesn't improve the situation.
>
> Can you see any flaw in my testing? What can I do to better test this
> scenario? How can I ensure that no messages are dropped? I don't think I am
> loosing the message because it's in broker memory. Please correct me if I
> am wrong.
>
> Regards,
> Vaibhav
> GumGum <http://gumgum.com>
>
>
>
> On Wed, Jun 27, 2012 at 3:42 PM, Joel Koshy <jj...@gmail.com> wrote:
>
> > 0.7.1 has this: reconnect.time.interval.ms
> >
> > Thanks,
> >
> > Joel
> >
> > On Wed, Jun 27, 2012 at 3:31 PM, Vaibhav Puranik <vp...@gmail.com>
> > wrote:
> >
> > > That will be awesome. It will definitely address AWS ELB problem.
> > >
> > > +1 for "reconnect.interval".
> > >
> > > Regards,
> > > Vaibhav
> > > GumGum
> > >
> > >
> > > On Wed, Jun 27, 2012 at 3:24 PM, Niek Sanders <niek.sanders@gmail.com
> > > >wrote:
> > >
> > > > Do producers currently leave the sockets to the brokers open
> > > indefinitely?
> > > >
> > > > It might make sense to add a second producer config param similar to
> > > > "reconnect.interval" which limits on time instead of message count.
> > > > (And then reconnect based on whichever criteria is hit first).  For
> > > > folks going through ELBs on AWS, they'd set the
> reconnect.interval.sec
> > > > to something like 50 sec as a workaround for low-volume producers.
> > > >
> > > > - Niek
> > > >
> > > >
> > > >
> > > > On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <ju...@gmail.com> wrote:
> > > > > Set num.retries in producer config property file. It defaults to 0.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > >
> > >
> >
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Vaibhav Puranik <vp...@gmail.com>.
Hi all,

I don't think the num.retries (0.7.1) is working. Here is how I tested it.

I wrote a simple producer that sends messages containing the strings
"_____1_____", "_____2_____", and so on - all sequential. I tailed the
topic log on the broker, and after sending each message the producer
sleeps (Thread.sleep) for 15 seconds.
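
For reference, a minimal sketch of what the test producer looks like.
This assumes the 0.7 javaapi classes (Producer, ProducerConfig,
ProducerData) and a hypothetical topic name; treat it as an
illustration rather than the exact code:

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.javaapi.producer.ProducerData;
import kafka.producer.ProducerConfig;

public class SequentialTestProducer {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("broker.list", "0:localhost:9092"); // brokerId:host:port
        props.put("producer.type", "async");
        props.put("num.retries", "3");

        Producer<String, String> producer =
            new Producer<String, String>(new ProducerConfig(props));
        for (int i = 1; i <= 10; i++) {
            // sequential markers make any gap visible when tailing the broker log
            producer.send(new ProducerData<String, String>("test", "_____" + i + "_____"));
            // restart the broker during this window to simulate a dropped connection
            Thread.sleep(15000);
        }
        producer.close();
    }
}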

Every time I send a message, it immediately appears in the broker log.
But if I restart the broker during the 15-second sleep to simulate a
dropped producer connection, the producer prints the following in its
logs:

[2012-06-28 10:31:17,127] INFO Disconnecting from localhost:9092
(kafka.producer.SyncProducer)
[2012-06-28 10:31:17,132] WARN Error sending messages, 2 attempts remaining
(kafka.producer.async.DefaultEventHandler)
[2012-06-28 10:31:17,132] INFO Connected to localhost:9092 for producing
(kafka.producer.SyncProducer)

But the message sent right after the broker restart never reaches the
broker. The message after that (the 2nd message after the restart) gets
to the broker fine and the sequence continues. Thus if I restart the
broker in the sleep period between messages 4 and 5, I don't get
message 5; I get messages 1, 2, 3, 4, 6, 7, ...

I tried setting num.retries to 1 and 2, thinking that the first retry
might reconnect and the second might resend the message. But that
doesn't work - the number of retries doesn't improve the situation.

Can you see any flaw in my testing? What can I do to better test this
scenario? How can I ensure that no messages are dropped? I don't think I
am losing the message because it's in broker memory. Please correct me
if I am wrong.

Regards,
Vaibhav
GumGum <http://gumgum.com>



On Wed, Jun 27, 2012 at 3:42 PM, Joel Koshy <jj...@gmail.com> wrote:

> 0.7.1 has this: reconnect.time.interval.ms
>
> Thanks,
>
> Joel
>
> On Wed, Jun 27, 2012 at 3:31 PM, Vaibhav Puranik <vp...@gmail.com>
> wrote:
>
> > That will be awesome. It will definitely address AWS ELB problem.
> >
> > +1 for "reconnect.interval".
> >
> > Regards,
> > Vaibhav
> > GumGum
> >
> >
> > On Wed, Jun 27, 2012 at 3:24 PM, Niek Sanders <niek.sanders@gmail.com
> > >wrote:
> >
> > > Do producers currently leave the sockets to the brokers open
> > indefinitely?
> > >
> > > It might make sense to add a second producer config param similar to
> > > "reconnect.interval" which limits on time instead of message count.
> > > (And then reconnect based on whichever criteria is hit first).  For
> > > folks going through ELBs on AWS, they'd set the reconnect.interval.sec
> > > to something like 50 sec as a workaround for low-volume producers.
> > >
> > > - Niek
> > >
> > >
> > >
> > > On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <ju...@gmail.com> wrote:
> > > > Set num.retries in producer config property file. It defaults to 0.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > >
> >
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Joel Koshy <jj...@gmail.com>.
0.7.1 has this: reconnect.time.interval.ms
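
So for an ELB that idles out connections at 60 sec, something like this
on the producer side should force a reconnect first (I am assuming the
value is in milliseconds, as the name suggests):

properties.put("reconnect.time.interval.ms", "50000"); // assumed ms; reconnect inside the ELB's 60 sec idle timeout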

Thanks,

Joel

On Wed, Jun 27, 2012 at 3:31 PM, Vaibhav Puranik <vp...@gmail.com> wrote:

> That will be awesome. It will definitely address AWS ELB problem.
>
> +1 for "reconnect.interval".
>
> Regards,
> Vaibhav
> GumGum
>
>
> On Wed, Jun 27, 2012 at 3:24 PM, Niek Sanders <niek.sanders@gmail.com
> >wrote:
>
> > Do producers currently leave the sockets to the brokers open
> indefinitely?
> >
> > It might make sense to add a second producer config param similar to
> > "reconnect.interval" which limits on time instead of message count.
> > (And then reconnect based on whichever criteria is hit first).  For
> > folks going through ELBs on AWS, they'd set the reconnect.interval.sec
> > to something like 50 sec as a workaround for low-volume producers.
> >
> > - Niek
> >
> >
> >
> > On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <ju...@gmail.com> wrote:
> > > Set num.retries in producer config property file. It defaults to 0.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> >
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Vaibhav Puranik <vp...@gmail.com>.
That will be awesome. It will definitely address the AWS ELB problem.

+1 for "reconnect.interval".

Regards,
Vaibhav
GumGum


On Wed, Jun 27, 2012 at 3:24 PM, Niek Sanders <ni...@gmail.com>wrote:

> Do producers currently leave the sockets to the brokers open indefinitely?
>
> It might make sense to add a second producer config param similar to
> "reconnect.interval" which limits on time instead of message count.
> (And then reconnect based on whichever criteria is hit first).  For
> folks going through ELBs on AWS, they'd set the reconnect.interval.sec
> to something like 50 sec as a workaround for low-volume producers.
>
> - Niek
>
>
>
> On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <ju...@gmail.com> wrote:
> > Set num.retries in producer config property file. It defaults to 0.
> >
> > Thanks,
> >
> > Jun
> >
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Niek Sanders <ni...@gmail.com>.
Do producers currently leave the sockets to the brokers open indefinitely?

It might make sense to add a second producer config param, similar to
"reconnect.interval", which limits on time instead of message count
(and then reconnect based on whichever criterion is hit first). Folks
going through ELBs on AWS would set the reconnect.interval.sec to
something like 50 sec as a workaround for low-volume producers.
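
Conceptually, the send path would then gate on both criteria, something
like this sketch (illustrative Java, not Kafka's actual code - every
name here is hypothetical):

class ReconnectPolicy {
    private final int maxMessages;      // today's "reconnect.interval" (message count)
    private final long maxIntervalMs;   // the proposed time-based limit
    private int sentSinceReconnect = 0;
    private long lastReconnectMs = System.currentTimeMillis();

    ReconnectPolicy(int maxMessages, long maxIntervalMs) {
        this.maxMessages = maxMessages;
        this.maxIntervalMs = maxIntervalMs;
    }

    // called once per send; true means tear down and re-open the broker socket
    boolean shouldReconnect() {
        sentSinceReconnect++;
        long now = System.currentTimeMillis();
        if (sentSinceReconnect >= maxMessages || now - lastReconnectMs >= maxIntervalMs) {
            sentSinceReconnect = 0;
            lastReconnectMs = now;
            return true;
        }
        return false;
    }
}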

- Niek



On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <ju...@gmail.com> wrote:
> Set num.retries in producer config property file. It defaults to 0.
>
> Thanks,
>
> Jun
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Vaibhav Puranik <vp...@gmail.com>.
Thanks Neha.

I will try num.retries again with this version and post my feedback here.

Regards,
Vaibhav

On Wed, Jun 27, 2012 at 3:13 PM, Neha Narkhede <ne...@gmail.com>wrote:

> You can download it from here -
>
> https://www.apache.org/dyn/closer.cgi/incubator/kafka/kafka-0.7.1-incubating/
>
> Thanks,
> Neha
>
> On Wed, Jun 27, 2012 at 3:03 PM, Vaibhav Puranik <vp...@gmail.com>
> wrote:
> > Thanks Jun. How do I download 0.7.1?
> >
> > I checked SVN tags but the last tag seems to be
> > kafka-0.7.1-incubating-candidate-3/<
> http://svn.apache.org/repos/asf/incubator/kafka/tags/kafka-0.7.1-incubating-candidate-3/
> >
> >
> > Regards,
> > Vaibhav
> >
> > On Wed, Jun 27, 2012 at 2:56 PM, Jun Rao <ju...@gmail.com> wrote:
> >
> >> num.retries is added in 0.7.1, which is just out.
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >> On Wed, Jun 27, 2012 at 2:43 PM, Vaibhav Puranik <vp...@gmail.com>
> >> wrote:
> >>
> >> > Jun,
> >> >
> >> > I wrote a test producer to test if num.retries working or not. But I
> >> found
> >> > that it's not working. No matter how many retries I set,  whenever a
> >> > message send fails, it always never gets to the broker.
> >> > I am using Kafka 0.7.0
> >> >
> >> > Is this a known  problem? Do I need to file a JIRA issue?
> >> >
> >> > Because we are using Async producer we have no way to catch the
> exception
> >> > ourselves and act on it. Is that right? Any ideas how we can ensure
> that
> >> > every single message is sent with retries?
> >> >
> >> > Regards,
> >> > Vaibhav
> >> >
> >> > On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <ju...@gmail.com> wrote:
> >> >
> >> > > Set num.retries in producer config property file. It defaults to 0.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Jun
> >> > >
> >> > > On Tue, Jun 26, 2012 at 4:46 PM, Vaibhav Puranik <
> vpuranik@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > I reduced the batch size and reduced the pooled connections.
> Number
> >> of
> >> > > > errors have gone down significantly. But they are not eliminated
> yet.
> >> > > >
> >> > > > We definitely don't want to loose any events.
> >> > > >
> >> > > > Jun, how do I configure the client resend you mentioned below? I
> >> > couldn't
> >> > > > find any configuration.
> >> > > >
> >> > > > Regards,
> >> > > > Vaibhav
> >> > > >
> >> > > > On Tue, Jun 26, 2012 at 9:27 AM, Vaibhav Puranik <
> vpuranik@gmail.com
> >> >
> >> > > > wrote:
> >> > > >
> >> > > > > These are great pointers.
> >> > > > > I found some more discussion here:
> >> > > > > https://forums.aws.amazon.com/thread.jspa?threadID=33427
> >> > > > >
> >> > > > > I can do the following to keep using the elastic load balancer:
> >> > > > >
> >> > > > > 1) Reduce the producer pool size to 1 or 2 because looks like
> >> > > connections
> >> > > > > are sitting idle. My volume does not desire that big pool.
> >> > > > > 2) Reduce the batch size so that the webapp frequently dumps the
> >> data
> >> > > to
> >> > > > > brokers. It's better for us anyways.
> >> > > > >
> >> > > > > I will try both of these options and report back.
> >> > > > >
> >> > > > > Thank you very much Jun and Niek.
> >> > > > >
> >> > > > > Regards,
> >> > > > > Vaibhav
> >> > > > >
> >> > > > >
> >> > > > > On Tue, Jun 26, 2012 at 8:52 AM, Niek Sanders <
> >> > niek.sanders@gmail.com
> >> > > > >wrote:
> >> > > > >
> >> > > > >> ELBs will close connections that have no data going across them
> >> > over a
> >> > > > >> 60 sec period.  A reference to this behavior can be found at
> the
> >> > > > >> bottom of this page:
> >> > > > >>
> >> > > > >> http://aws.amazon.com/articles/1636185810492479
> >> > > > >>
> >> > > > >> There is currently no way for customers to increase this
> timeout.
> >> >  If
> >> > > > >> this timeout is in fact the problem, then the alternative is to
> >> use
> >> > HA
> >> > > > >> proxy for load balancing instead.
> >> > > > >>
> >> > > > >> - Niek
> >> > > > >>
> >> > > > >>
> >> > > > >>
> >> > > > >>
> >> > > > >> On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <ju...@gmail.com>
> >> wrote:
> >> > > > >> > Vaibhav,
> >> > > > >> >
> >> > > > >> > Does elastic load balancer have any timeouts or quotas that
> kill
> >> > > > >> existing
> >> > > > >> > socket connections? Does client resend succeed (you can
> >> configure
> >> > > > >> resend in
> >> > > > >> > DefaultEventHandler)?
> >> > > > >> >
> >> > > > >> > Thanks,
> >> > > > >> >
> >> > > > >> > Jun
> >> > > > >> >
> >> > > > >> > On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <
> >> > > vpuranik@gmail.com>
> >> > > > >> wrote:
> >> > > > >> >
> >> > > > >> >> Hi all,
> >> > > > >> >>
> >> > > > >> >> We are sending our ad impressions to Kafka 0.7.0. I am using
> >> > async
> >> > > > >> >> prouducers in our web app.
> >> > > > >> >> I am pooling kafak producers with commons pool. Pool size -
> 10.
> >> > > > >> batch.size
> >> > > > >> >> is 100.
> >> > > > >> >>
> >> > > > >> >> We have 3 c1.xlarge instances with Kafka brokers installed
> >> > behind a
> >> > > > >> elastic
> >> > > > >> >> load balancer in AWS.
> >> > > > >> >> Every minute we loose some events because of the following
> >> > > exception
> >> > > > >> >>
> >> > > > >> >> - Disconnecting from
> >> > > > >> dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
> >> > > > >> >> - Error in handling batch of 64 events
> >> > > > >> >> java.io.IOException: Connection timed out
> >> > > > >> >>    at sun.nio.ch.FileDispatcher.write0(Native Method)
> >> > > > >> >>    at
> >> sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
> >> > > > >> >>    at
> sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
> >> > > > >> >>    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
> >> > > > >> >>    at
> >> > > sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
> >> > > > >> >>    at
> >> > > > >> >>
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
> kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:51)
> >> > > > >> >>    at
> >> > > kafka.network.Send$class.writeCompletely(Transmission.scala:76)
> >> > > > >> >>    at
> >> > > > >> >>
> >> > > > >> >>
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
> kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
> >> > > > >> >>    at
> >> > > > kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:88)
> >> > > > >> >>    at
> kafka.producer.SyncProducer.send(SyncProducer.scala:87)
> >> > > > >> >>    at
> >> > kafka.producer.SyncProducer.multiSend(SyncProducer.scala:128)
> >> > > > >> >>    at
> >> > > > >> >>
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
> kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:52)
> >> > > > >> >>    at
> >> > > > >> >>
> >> > > > >> >>
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
> kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:46)
> >> > > > >> >>    at
> >> > > > >> >>
> >> > > > >> >>
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
> kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:119)
> >> > > > >> >>    at
> >> > > > >> >>
> >> > > > >> >>
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:98)
> >> > > > >> >>    at
> >> > > > >> >>
> >> > > > >> >>
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:74)
> >> > > > >> >>    at
> >> scala.collection.immutable.Stream.foreach(Stream.scala:254)
> >> > > > >> >>    at
> >> > > > >> >>
> >> > > > >> >>
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
> kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:73)
> >> > > > >> >>    at
> >> > > > >> >>
> >> > > > >>
> >> > >
> >> kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:43)
> >> > > > >> >> - Connected to
> >> > > dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
> >> > > > for
> >> > > > >> >> producing
> >> > > > >> >>
> >> > > > >> >> Has anybody faced this kind of timeouts before? Do they
> >> indicate
> >> > > any
> >> > > > >> >> resource misconfiguration? The CPU usage on broker is pretty
> >> > small.
> >> > > > >> >> Also, in spite of setting batch size to 100, the failing
> batch
> >> > > > usually
> >> > > > >> only
> >> > > > >> >> have 50 to 60 events. Is there any other limit I am hitting?
> >> > > > >> >>
> >> > > > >> >> Any help is appreciated.
> >> > > > >> >>
> >> > > > >> >>
> >> > > > >> >> Regards,
> >> > > > >> >> Vaibhav
> >> > > > >> >> GumGum
> >> > > > >> >>
> >> > > > >>
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Neha Narkhede <ne...@gmail.com>.
You can download it from here -
https://www.apache.org/dyn/closer.cgi/incubator/kafka/kafka-0.7.1-incubating/

Thanks,
Neha

On Wed, Jun 27, 2012 at 3:03 PM, Vaibhav Puranik <vp...@gmail.com> wrote:
> Thanks Jun. How do I download 0.7.1?
>
> I checked SVN tags but the last tag seems to be
> kafka-0.7.1-incubating-candidate-3/<http://svn.apache.org/repos/asf/incubator/kafka/tags/kafka-0.7.1-incubating-candidate-3/>
>
> Regards,
> Vaibhav
>
> On Wed, Jun 27, 2012 at 2:56 PM, Jun Rao <ju...@gmail.com> wrote:
>
>> num.retries is added in 0.7.1, which is just out.
>>
>> Thanks,
>>
>> Jun
>>
>> On Wed, Jun 27, 2012 at 2:43 PM, Vaibhav Puranik <vp...@gmail.com>
>> wrote:
>>
>> > Jun,
>> >
>> > I wrote a test producer to test if num.retries working or not. But I
>> found
>> > that it's not working. No matter how many retries I set,  whenever a
>> > message send fails, it always never gets to the broker.
>> > I am using Kafka 0.7.0
>> >
>> > Is this a known  problem? Do I need to file a JIRA issue?
>> >
>> > Because we are using Async producer we have no way to catch the exception
>> > ourselves and act on it. Is that right? Any ideas how we can ensure that
>> > every single message is sent with retries?
>> >
>> > Regards,
>> > Vaibhav
>> >
>> > On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <ju...@gmail.com> wrote:
>> >
>> > > Set num.retries in producer config property file. It defaults to 0.
>> > >
>> > > Thanks,
>> > >
>> > > Jun
>> > >
>> > > On Tue, Jun 26, 2012 at 4:46 PM, Vaibhav Puranik <vp...@gmail.com>
>> > > wrote:
>> > >
>> > > > I reduced the batch size and reduced the pooled connections. Number
>> of
>> > > > errors have gone down significantly. But they are not eliminated yet.
>> > > >
>> > > > We definitely don't want to loose any events.
>> > > >
>> > > > Jun, how do I configure the client resend you mentioned below? I
>> > couldn't
>> > > > find any configuration.
>> > > >
>> > > > Regards,
>> > > > Vaibhav
>> > > >
>> > > > On Tue, Jun 26, 2012 at 9:27 AM, Vaibhav Puranik <vpuranik@gmail.com
>> >
>> > > > wrote:
>> > > >
>> > > > > These are great pointers.
>> > > > > I found some more discussion here:
>> > > > > https://forums.aws.amazon.com/thread.jspa?threadID=33427
>> > > > >
>> > > > > I can do the following to keep using the elastic load balancer:
>> > > > >
>> > > > > 1) Reduce the producer pool size to 1 or 2 because looks like
>> > > connections
>> > > > > are sitting idle. My volume does not desire that big pool.
>> > > > > 2) Reduce the batch size so that the webapp frequently dumps the
>> data
>> > > to
>> > > > > brokers. It's better for us anyways.
>> > > > >
>> > > > > I will try both of these options and report back.
>> > > > >
>> > > > > Thank you very much Jun and Niek.
>> > > > >
>> > > > > Regards,
>> > > > > Vaibhav
>> > > > >
>> > > > >
>> > > > > On Tue, Jun 26, 2012 at 8:52 AM, Niek Sanders <
>> > niek.sanders@gmail.com
>> > > > >wrote:
>> > > > >
>> > > > >> ELBs will close connections that have no data going across them
>> > over a
>> > > > >> 60 sec period.  A reference to this behavior can be found at the
>> > > > >> bottom of this page:
>> > > > >>
>> > > > >> http://aws.amazon.com/articles/1636185810492479
>> > > > >>
>> > > > >> There is currently no way for customers to increase this timeout.
>> >  If
>> > > > >> this timeout is in fact the problem, then the alternative is to
>> use
>> > HA
>> > > > >> proxy for load balancing instead.
>> > > > >>
>> > > > >> - Niek
>> > > > >>
>> > > > >>
>> > > > >>
>> > > > >>
>> > > > >> On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <ju...@gmail.com>
>> wrote:
>> > > > >> > Vaibhav,
>> > > > >> >
>> > > > >> > Does elastic load balancer have any timeouts or quotas that kill
>> > > > >> existing
>> > > > >> > socket connections? Does client resend succeed (you can
>> configure
>> > > > >> resend in
>> > > > >> > DefaultEventHandler)?
>> > > > >> >
>> > > > >> > Thanks,
>> > > > >> >
>> > > > >> > Jun
>> > > > >> >
>> > > > >> > On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <
>> > > vpuranik@gmail.com>
>> > > > >> wrote:
>> > > > >> >
>> > > > >> >> Hi all,
>> > > > >> >>
>> > > > >> >> We are sending our ad impressions to Kafka 0.7.0. I am using
>> > async
>> > > > >> >> prouducers in our web app.
>> > > > >> >> I am pooling kafak producers with commons pool. Pool size - 10.
>> > > > >> batch.size
>> > > > >> >> is 100.
>> > > > >> >>
>> > > > >> >> We have 3 c1.xlarge instances with Kafka brokers installed
>> > behind a
>> > > > >> elastic
>> > > > >> >> load balancer in AWS.
>> > > > >> >> Every minute we loose some events because of the following
>> > > exception
>> > > > >> >>
>> > > > >> >> - Disconnecting from
>> > > > >> dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
>> > > > >> >> - Error in handling batch of 64 events
>> > > > >> >> java.io.IOException: Connection timed out
>> > > > >> >>    at sun.nio.ch.FileDispatcher.write0(Native Method)
>> > > > >> >>    at
>> sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
>> > > > >> >>    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
>> > > > >> >>    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
>> > > > >> >>    at
>> > > sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
>> > > > >> >>    at
>> > > > >> >>
>> > > > >>
>> > > >
>> > >
>> >
>> kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:51)
>> > > > >> >>    at
>> > > kafka.network.Send$class.writeCompletely(Transmission.scala:76)
>> > > > >> >>    at
>> > > > >> >>
>> > > > >> >>
>> > > > >>
>> > > >
>> > >
>> >
>> kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
>> > > > >> >>    at
>> > > > kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:88)
>> > > > >> >>    at kafka.producer.SyncProducer.send(SyncProducer.scala:87)
>> > > > >> >>    at
>> > kafka.producer.SyncProducer.multiSend(SyncProducer.scala:128)
>> > > > >> >>    at
>> > > > >> >>
>> > > > >>
>> > > >
>> > >
>> >
>> kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:52)
>> > > > >> >>    at
>> > > > >> >>
>> > > > >> >>
>> > > > >>
>> > > >
>> > >
>> >
>> kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:46)
>> > > > >> >>    at
>> > > > >> >>
>> > > > >> >>
>> > > > >>
>> > > >
>> > >
>> >
>> kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:119)
>> > > > >> >>    at
>> > > > >> >>
>> > > > >> >>
>> > > > >>
>> > > >
>> > >
>> >
>> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:98)
>> > > > >> >>    at
>> > > > >> >>
>> > > > >> >>
>> > > > >>
>> > > >
>> > >
>> >
>> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:74)
>> > > > >> >>    at
>> scala.collection.immutable.Stream.foreach(Stream.scala:254)
>> > > > >> >>    at
>> > > > >> >>
>> > > > >> >>
>> > > > >>
>> > > >
>> > >
>> >
>> kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:73)
>> > > > >> >>    at
>> > > > >> >>
>> > > > >>
>> > >
>> kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:43)
>> > > > >> >> - Connected to
>> > > dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
>> > > > for
>> > > > >> >> producing
>> > > > >> >>
>> > > > >> >> Has anybody faced this kind of timeouts before? Do they
>> indicate
>> > > any
>> > > > >> >> resource misconfiguration? The CPU usage on broker is pretty
>> > small.
>> > > > >> >> Also, in spite of setting batch size to 100, the failing batch
>> > > > usually
>> > > > >> only
>> > > > >> >> have 50 to 60 events. Is there any other limit I am hitting?
>> > > > >> >>
>> > > > >> >> Any help is appreciated.
>> > > > >> >>
>> > > > >> >>
>> > > > >> >> Regards,
>> > > > >> >> Vaibhav
>> > > > >> >> GumGum
>> > > > >> >>
>> > > > >>
>> > > > >
>> > > > >
>> > > >
>> > >
>> >
>>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Vaibhav Puranik <vp...@gmail.com>.
Thanks Jun. How do I download 0.7.1?

I checked SVN tags but the last tag seems to be
kafka-0.7.1-incubating-candidate-3/<http://svn.apache.org/repos/asf/incubator/kafka/tags/kafka-0.7.1-incubating-candidate-3/>

Regards,
Vaibhav

On Wed, Jun 27, 2012 at 2:56 PM, Jun Rao <ju...@gmail.com> wrote:

> num.retries is added in 0.7.1, which is just out.
>
> Thanks,
>
> Jun
>
> On Wed, Jun 27, 2012 at 2:43 PM, Vaibhav Puranik <vp...@gmail.com>
> wrote:
>
> > Jun,
> >
> > I wrote a test producer to test if num.retries working or not. But I
> found
> > that it's not working. No matter how many retries I set,  whenever a
> > message send fails, it always never gets to the broker.
> > I am using Kafka 0.7.0
> >
> > Is this a known  problem? Do I need to file a JIRA issue?
> >
> > Because we are using Async producer we have no way to catch the exception
> > ourselves and act on it. Is that right? Any ideas how we can ensure that
> > every single message is sent with retries?
> >
> > Regards,
> > Vaibhav
> >
> > On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <ju...@gmail.com> wrote:
> >
> > > Set num.retries in producer config property file. It defaults to 0.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Jun 26, 2012 at 4:46 PM, Vaibhav Puranik <vp...@gmail.com>
> > > wrote:
> > >
> > > > I reduced the batch size and reduced the pooled connections. Number
> of
> > > > errors have gone down significantly. But they are not eliminated yet.
> > > >
> > > > We definitely don't want to loose any events.
> > > >
> > > > Jun, how do I configure the client resend you mentioned below? I
> > couldn't
> > > > find any configuration.
> > > >
> > > > Regards,
> > > > Vaibhav
> > > >
> > > > On Tue, Jun 26, 2012 at 9:27 AM, Vaibhav Puranik <vpuranik@gmail.com
> >
> > > > wrote:
> > > >
> > > > > These are great pointers.
> > > > > I found some more discussion here:
> > > > > https://forums.aws.amazon.com/thread.jspa?threadID=33427
> > > > >
> > > > > I can do the following to keep using the elastic load balancer:
> > > > >
> > > > > 1) Reduce the producer pool size to 1 or 2 because looks like
> > > connections
> > > > > are sitting idle. My volume does not desire that big pool.
> > > > > 2) Reduce the batch size so that the webapp frequently dumps the
> data
> > > to
> > > > > brokers. It's better for us anyways.
> > > > >
> > > > > I will try both of these options and report back.
> > > > >
> > > > > Thank you very much Jun and Niek.
> > > > >
> > > > > Regards,
> > > > > Vaibhav
> > > > >
> > > > >
> > > > > On Tue, Jun 26, 2012 at 8:52 AM, Niek Sanders <
> > niek.sanders@gmail.com
> > > > >wrote:
> > > > >
> > > > >> ELBs will close connections that have no data going across them
> > over a
> > > > >> 60 sec period.  A reference to this behavior can be found at the
> > > > >> bottom of this page:
> > > > >>
> > > > >> http://aws.amazon.com/articles/1636185810492479
> > > > >>
> > > > >> There is currently no way for customers to increase this timeout.
> >  If
> > > > >> this timeout is in fact the problem, then the alternative is to
> use
> > HA
> > > > >> proxy for load balancing instead.
> > > > >>
> > > > >> - Niek
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <ju...@gmail.com>
> wrote:
> > > > >> > Vaibhav,
> > > > >> >
> > > > >> > Does elastic load balancer have any timeouts or quotas that kill
> > > > >> existing
> > > > >> > socket connections? Does client resend succeed (you can
> configure
> > > > >> resend in
> > > > >> > DefaultEventHandler)?
> > > > >> >
> > > > >> > Thanks,
> > > > >> >
> > > > >> > Jun
> > > > >> >
> > > > >> > On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <
> > > vpuranik@gmail.com>
> > > > >> wrote:
> > > > >> >
> > > > >> >> Hi all,
> > > > >> >>
> > > > >> >> We are sending our ad impressions to Kafka 0.7.0. I am using
> > async
> > > > >> >> prouducers in our web app.
> > > > >> >> I am pooling kafak producers with commons pool. Pool size - 10.
> > > > >> batch.size
> > > > >> >> is 100.
> > > > >> >>
> > > > >> >> We have 3 c1.xlarge instances with Kafka brokers installed
> > behind a
> > > > >> elastic
> > > > >> >> load balancer in AWS.
> > > > >> >> Every minute we loose some events because of the following
> > > exception
> > > > >> >>
> > > > >> >> - Disconnecting from
> > > > >> dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
> > > > >> >> - Error in handling batch of 64 events
> > > > >> >> java.io.IOException: Connection timed out
> > > > >> >>    at sun.nio.ch.FileDispatcher.write0(Native Method)
> > > > >> >>    at
> sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
> > > > >> >>    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
> > > > >> >>    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
> > > > >> >>    at
> > > sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
> > > > >> >>    at
> > > > >> >>
> > > > >>
> > > >
> > >
> >
> kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:51)
> > > > >> >>    at
> > > kafka.network.Send$class.writeCompletely(Transmission.scala:76)
> > > > >> >>    at
> > > > >> >>
> > > > >> >>
> > > > >>
> > > >
> > >
> >
> kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
> > > > >> >>    at
> > > > kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:88)
> > > > >> >>    at kafka.producer.SyncProducer.send(SyncProducer.scala:87)
> > > > >> >>    at
> > kafka.producer.SyncProducer.multiSend(SyncProducer.scala:128)
> > > > >> >>    at
> > > > >> >>
> > > > >>
> > > >
> > >
> >
> kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:52)
> > > > >> >>    at
> > > > >> >>
> > > > >> >>
> > > > >>
> > > >
> > >
> >
> kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:46)
> > > > >> >>    at
> > > > >> >>
> > > > >> >>
> > > > >>
> > > >
> > >
> >
> kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:119)
> > > > >> >>    at
> > > > >> >>
> > > > >> >>
> > > > >>
> > > >
> > >
> >
> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:98)
> > > > >> >>    at
> > > > >> >>
> > > > >> >>
> > > > >>
> > > >
> > >
> >
> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:74)
> > > > >> >>    at
> scala.collection.immutable.Stream.foreach(Stream.scala:254)
> > > > >> >>    at
> > > > >> >>
> > > > >> >>
> > > > >>
> > > >
> > >
> >
> kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:73)
> > > > >> >>    at
> > > > >> >>
> > > > >>
> > >
> kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:43)
> > > > >> >> - Connected to
> > > dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
> > > > for
> > > > >> >> producing
> > > > >> >>
> > > > >> >> Has anybody faced this kind of timeouts before? Do they
> indicate
> > > any
> > > > >> >> resource misconfiguration? The CPU usage on broker is pretty
> > small.
> > > > >> >> Also, in spite of setting batch size to 100, the failing batch
> > > > usually
> > > > >> only
> > > > >> >> have 50 to 60 events. Is there any other limit I am hitting?
> > > > >> >>
> > > > >> >> Any help is appreciated.
> > > > >> >>
> > > > >> >>
> > > > >> >> Regards,
> > > > >> >> Vaibhav
> > > > >> >> GumGum
> > > > >> >>
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Jun Rao <ju...@gmail.com>.
num.retries is added in 0.7.1, which is just out.

Thanks,

Jun

On Wed, Jun 27, 2012 at 2:43 PM, Vaibhav Puranik <vp...@gmail.com> wrote:

> Jun,
>
> I wrote a test producer to test if num.retries working or not. But I found
> that it's not working. No matter how many retries I set,  whenever a
> message send fails, it always never gets to the broker.
> I am using Kafka 0.7.0
>
> Is this a known  problem? Do I need to file a JIRA issue?
>
> Because we are using Async producer we have no way to catch the exception
> ourselves and act on it. Is that right? Any ideas how we can ensure that
> every single message is sent with retries?
>
> Regards,
> Vaibhav
>
> On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <ju...@gmail.com> wrote:
>
> > Set num.retries in producer config property file. It defaults to 0.
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Jun 26, 2012 at 4:46 PM, Vaibhav Puranik <vp...@gmail.com>
> > wrote:
> >
> > > I reduced the batch size and reduced the pooled connections. Number of
> > > errors have gone down significantly. But they are not eliminated yet.
> > >
> > > We definitely don't want to loose any events.
> > >
> > > Jun, how do I configure the client resend you mentioned below? I
> couldn't
> > > find any configuration.
> > >
> > > Regards,
> > > Vaibhav
> > >
> > > On Tue, Jun 26, 2012 at 9:27 AM, Vaibhav Puranik <vp...@gmail.com>
> > > wrote:
> > >
> > > > These are great pointers.
> > > > I found some more discussion here:
> > > > https://forums.aws.amazon.com/thread.jspa?threadID=33427
> > > >
> > > > I can do the following to keep using the elastic load balancer:
> > > >
> > > > 1) Reduce the producer pool size to 1 or 2 because looks like
> > connections
> > > > are sitting idle. My volume does not desire that big pool.
> > > > 2) Reduce the batch size so that the webapp frequently dumps the data
> > to
> > > > brokers. It's better for us anyways.
> > > >
> > > > I will try both of these options and report back.
> > > >
> > > > Thank you very much Jun and Niek.
> > > >
> > > > Regards,
> > > > Vaibhav
> > > >
> > > >
> > > > On Tue, Jun 26, 2012 at 8:52 AM, Niek Sanders <
> niek.sanders@gmail.com
> > > >wrote:
> > > >
> > > >> ELBs will close connections that have no data going across them
> over a
> > > >> 60 sec period.  A reference to this behavior can be found at the
> > > >> bottom of this page:
> > > >>
> > > >> http://aws.amazon.com/articles/1636185810492479
> > > >>
> > > >> There is currently no way for customers to increase this timeout.
>  If
> > > >> this timeout is in fact the problem, then the alternative is to use
> HA
> > > >> proxy for load balancing instead.
> > > >>
> > > >> - Niek
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <ju...@gmail.com> wrote:
> > > >> > Vaibhav,
> > > >> >
> > > >> > Does elastic load balancer have any timeouts or quotas that kill
> > > >> existing
> > > >> > socket connections? Does client resend succeed (you can configure
> > > >> resend in
> > > >> > DefaultEventHandler)?
> > > >> >
> > > >> > Thanks,
> > > >> >
> > > >> > Jun
> > > >> >
> > > >> > On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <
> > vpuranik@gmail.com>
> > > >> wrote:
> > > >> >
> > > >> >> Hi all,
> > > >> >>
> > > >> >> We are sending our ad impressions to Kafka 0.7.0. I am using
> async
> > > >> >> prouducers in our web app.
> > > >> >> I am pooling kafak producers with commons pool. Pool size - 10.
> > > >> batch.size
> > > >> >> is 100.
> > > >> >>
> > > >> >> We have 3 c1.xlarge instances with Kafka brokers installed
> behind a
> > > >> elastic
> > > >> >> load balancer in AWS.
> > > >> >> Every minute we loose some events because of the following
> > exception
> > > >> >>
> > > >> >> - Disconnecting from
> > > >> dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
> > > >> >> - Error in handling batch of 64 events
> > > >> >> java.io.IOException: Connection timed out
> > > >> >>    at sun.nio.ch.FileDispatcher.write0(Native Method)
> > > >> >>    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
> > > >> >>    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
> > > >> >>    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
> > > >> >>    at
> > sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
> > > >> >>    at
> > > >> >>
> > > >>
> > >
> >
> kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:51)
> > > >> >>    at
> > kafka.network.Send$class.writeCompletely(Transmission.scala:76)
> > > >> >>    at
> > > >> >>
> > > >> >>
> > > >>
> > >
> >
> kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
> > > >> >>    at
> > > kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:88)
> > > >> >>    at kafka.producer.SyncProducer.send(SyncProducer.scala:87)
> > > >> >>    at
> kafka.producer.SyncProducer.multiSend(SyncProducer.scala:128)
> > > >> >>    at
> > > >> >>
> > > >>
> > >
> >
> kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:52)
> > > >> >>    at
> > > >> >>
> > > >> >>
> > > >>
> > >
> >
> kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:46)
> > > >> >>    at
> > > >> >>
> > > >> >>
> > > >>
> > >
> >
> kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:119)
> > > >> >>    at
> > > >> >>
> > > >> >>
> > > >>
> > >
> >
> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:98)
> > > >> >>    at
> > > >> >>
> > > >> >>
> > > >>
> > >
> >
> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:74)
> > > >> >>    at scala.collection.immutable.Stream.foreach(Stream.scala:254)
> > > >> >>    at
> > > >> >>
> > > >> >>
> > > >>
> > >
> >
> kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:73)
> > > >> >>    at
> > > >> >>
> > > >>
> > kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:43)
> > > >> >> - Connected to
> > dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
> > > for
> > > >> >> producing
> > > >> >>
> > > >> >> Has anybody faced this kind of timeouts before? Do they indicate
> > any
> > > >> >> resource misconfiguration? The CPU usage on broker is pretty
> small.
> > > >> >> Also, in spite of setting batch size to 100, the failing batch
> > > usually
> > > >> only
> > > >> >> have 50 to 60 events. Is there any other limit I am hitting?
> > > >> >>
> > > >> >> Any help is appreciated.
> > > >> >>
> > > >> >>
> > > >> >> Regards,
> > > >> >> Vaibhav
> > > >> >> GumGum
> > > >> >>
> > > >>
> > > >
> > > >
> > >
> >
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Neha Narkhede <ne...@gmail.com>.
Vaibhav,

>> No matter how many retries I set,  whenever a
message send fails, it always never gets to the broker.

Can you please send across the error message that you see on the
producer side?

Thanks,
Neha

On Wed, Jun 27, 2012 at 2:43 PM, Vaibhav Puranik <vp...@gmail.com> wrote:
> Jun,
>
> I wrote a test producer to test if num.retries working or not. But I found
> that it's not working. No matter how many retries I set,  whenever a
> message send fails, it always never gets to the broker.
> I am using Kafka 0.7.0
>
> Is this a known  problem? Do I need to file a JIRA issue?
>
> Because we are using Async producer we have no way to catch the exception
> ourselves and act on it. Is that right? Any ideas how we can ensure that
> every single message is sent with retries?
>
> Regards,
> Vaibhav
>
> On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <ju...@gmail.com> wrote:
>
>> Set num.retries in producer config property file. It defaults to 0.
>>
>> Thanks,
>>
>> Jun
>>
>> On Tue, Jun 26, 2012 at 4:46 PM, Vaibhav Puranik <vp...@gmail.com>
>> wrote:
>>
>> > I reduced the batch size and reduced the pooled connections. Number of
>> > errors have gone down significantly. But they are not eliminated yet.
>> >
>> > We definitely don't want to loose any events.
>> >
>> > Jun, how do I configure the client resend you mentioned below? I couldn't
>> > find any configuration.
>> >
>> > Regards,
>> > Vaibhav
>> >
>> > On Tue, Jun 26, 2012 at 9:27 AM, Vaibhav Puranik <vp...@gmail.com>
>> > wrote:
>> >
>> > > These are great pointers.
>> > > I found some more discussion here:
>> > > https://forums.aws.amazon.com/thread.jspa?threadID=33427
>> > >
>> > > I can do the following to keep using the elastic load balancer:
>> > >
>> > > 1) Reduce the producer pool size to 1 or 2 because looks like
>> connections
>> > > are sitting idle. My volume does not desire that big pool.
>> > > 2) Reduce the batch size so that the webapp frequently dumps the data
>> to
>> > > brokers. It's better for us anyways.
>> > >
>> > > I will try both of these options and report back.
>> > >
>> > > Thank you very much Jun and Niek.
>> > >
>> > > Regards,
>> > > Vaibhav
>> > >
>> > >
>> > > On Tue, Jun 26, 2012 at 8:52 AM, Niek Sanders <niek.sanders@gmail.com
>> > >wrote:
>> > >
>> > >> ELBs will close connections that have no data going across them over a
>> > >> 60 sec period.  A reference to this behavior can be found at the
>> > >> bottom of this page:
>> > >>
>> > >> http://aws.amazon.com/articles/1636185810492479
>> > >>
>> > >> There is currently no way for customers to increase this timeout.  If
>> > >> this timeout is in fact the problem, then the alternative is to use HA
>> > >> proxy for load balancing instead.
>> > >>
>> > >> - Niek
>> > >>
>> > >>
>> > >>
>> > >>
>> > >> On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <ju...@gmail.com> wrote:
>> > >> > Vaibhav,
>> > >> >
>> > >> > Does elastic load balancer have any timeouts or quotas that kill
>> > >> existing
>> > >> > socket connections? Does client resend succeed (you can configure
>> > >> resend in
>> > >> > DefaultEventHandler)?
>> > >> >
>> > >> > Thanks,
>> > >> >
>> > >> > Jun
>> > >> >
>> > >> > On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <
>> vpuranik@gmail.com>
>> > >> wrote:
>> > >> >
>> > >> >> Hi all,
>> > >> >>
>> > >> >> We are sending our ad impressions to Kafka 0.7.0. I am using async
>> > >> >> prouducers in our web app.
>> > >> >> I am pooling kafak producers with commons pool. Pool size - 10.
>> > >> batch.size
>> > >> >> is 100.
>> > >> >>
>> > >> >> We have 3 c1.xlarge instances with Kafka brokers installed behind a
>> > >> elastic
>> > >> >> load balancer in AWS.
>> > >> >> Every minute we loose some events because of the following
>> exception
>> > >> >>
>> > >> >> - Disconnecting from
>> > >> dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
>> > >> >> - Error in handling batch of 64 events
>> > >> >> java.io.IOException: Connection timed out
>> > >> >>    at sun.nio.ch.FileDispatcher.write0(Native Method)
>> > >> >>    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
>> > >> >>    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
>> > >> >>    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
>> > >> >>    at
>> sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
>> > >> >>    at
>> > >> >>
>> > >>
>> >
>> kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:51)
>> > >> >>    at
>> kafka.network.Send$class.writeCompletely(Transmission.scala:76)
>> > >> >>    at
>> > >> >>
>> > >> >>
>> > >>
>> >
>> kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
>> > >> >>    at
>> > kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:88)
>> > >> >>    at kafka.producer.SyncProducer.send(SyncProducer.scala:87)
>> > >> >>    at kafka.producer.SyncProducer.multiSend(SyncProducer.scala:128)
>> > >> >>    at
>> > >> >>
>> > >>
>> >
>> kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:52)
>> > >> >>    at
>> > >> >>
>> > >> >>
>> > >>
>> >
>> kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:46)
>> > >> >>    at
>> > >> >>
>> > >> >>
>> > >>
>> >
>> kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:119)
>> > >> >>    at
>> > >> >>
>> > >> >>
>> > >>
>> >
>> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:98)
>> > >> >>    at
>> > >> >>
>> > >> >>
>> > >>
>> >
>> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:74)
>> > >> >>    at scala.collection.immutable.Stream.foreach(Stream.scala:254)
>> > >> >>    at
>> > >> >>
>> > >> >>
>> > >>
>> >
>> kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:73)
>> > >> >>    at
>> > >> >>
>> > >>
>> kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:43)
>> > >> >> - Connected to
>> dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
>> > for
>> > >> >> producing
>> > >> >>
>> > >> >> Has anybody faced this kind of timeouts before? Do they indicate
>> any
>> > >> >> resource misconfiguration? The CPU usage on broker is pretty small.
>> > >> >> Also, in spite of setting batch size to 100, the failing batch
>> > usually
>> > >> only
>> > >> >> have 50 to 60 events. Is there any other limit I am hitting?
>> > >> >>
>> > >> >> Any help is appreciated.
>> > >> >>
>> > >> >>
>> > >> >> Regards,
>> > >> >> Vaibhav
>> > >> >> GumGum
>> > >> >>
>> > >>
>> > >
>> > >
>> >
>>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Chris Burroughs <ch...@gmail.com>.
On 06/27/2012 05:43 PM, Vaibhav Puranik wrote:
> Is this a known  problem? Do I need to file a JIRA issue?

Thanks!  Please do.

Re: Getting timeouts with elastic load balancer in AWS

Posted by Vaibhav Puranik <vp...@gmail.com>.
Jun,

I wrote a test producer to check whether num.retries is working or not,
and I found that it's not. No matter how many retries I set, whenever a
message send fails, it never gets to the broker.
I am using Kafka 0.7.0.

Is this a known problem? Do I need to file a JIRA issue?

Because we are using the async producer, we have no way to catch the
exception ourselves and act on it. Is that right? Any ideas on how we
can ensure that every single message is sent, with retries?
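
One workaround sketch in the meantime: run the producer in sync mode
and retry explicitly, spilling to a local log if all attempts fail.
This assumes the 0.7 javaapi Producer/ProducerData with
producer.type=sync; the method and backoff are illustrative, not
verified against 0.7.0:

// assumes imports of kafka.javaapi.producer.Producer and ProducerData
void sendWithRetries(Producer<String, String> producer, String topic,
                     String message, int maxAttempts) throws Exception {
    for (int attempt = 1; ; attempt++) {
        try {
            producer.send(new ProducerData<String, String>(topic, message));
            return; // success
        } catch (Exception e) {
            if (attempt >= maxAttempts) {
                throw e; // caller can log the message or spill it to disk
            }
            Thread.sleep(1000L * attempt); // simple linear backoff before retrying
        }
    }
}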

Regards,
Vaibhav

On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <ju...@gmail.com> wrote:

> Set num.retries in producer config property file. It defaults to 0.
>
> Thanks,
>
> Jun
>
> On Tue, Jun 26, 2012 at 4:46 PM, Vaibhav Puranik <vp...@gmail.com>
> wrote:
>
> > I reduced the batch size and reduced the pooled connections. Number of
> > errors have gone down significantly. But they are not eliminated yet.
> >
> > We definitely don't want to loose any events.
> >
> > Jun, how do I configure the client resend you mentioned below? I couldn't
> > find any configuration.
> >
> > Regards,
> > Vaibhav
> >
> > On Tue, Jun 26, 2012 at 9:27 AM, Vaibhav Puranik <vp...@gmail.com>
> > wrote:
> >
> > > These are great pointers.
> > > I found some more discussion here:
> > > https://forums.aws.amazon.com/thread.jspa?threadID=33427
> > >
> > > I can do the following to keep using the elastic load balancer:
> > >
> > > 1) Reduce the producer pool size to 1 or 2 because looks like
> connections
> > > are sitting idle. My volume does not desire that big pool.
> > > 2) Reduce the batch size so that the webapp frequently dumps the data
> to
> > > brokers. It's better for us anyways.
> > >
> > > I will try both of these options and report back.
> > >
> > > Thank you very much Jun and Niek.
> > >
> > > Regards,
> > > Vaibhav
> > >
> > >
> > > On Tue, Jun 26, 2012 at 8:52 AM, Niek Sanders <niek.sanders@gmail.com
> > >wrote:
> > >
> > >> ELBs will close connections that have no data going across them over a
> > >> 60 sec period.  A reference to this behavior can be found at the
> > >> bottom of this page:
> > >>
> > >> http://aws.amazon.com/articles/1636185810492479
> > >>
> > >> There is currently no way for customers to increase this timeout.  If
> > >> this timeout is in fact the problem, then the alternative is to use HA
> > >> proxy for load balancing instead.
> > >>
> > >> - Niek
> > >>
> > >>
> > >>
> > >>
> > >> On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <ju...@gmail.com> wrote:
> > >> > Vaibhav,
> > >> >
> > >> > Does elastic load balancer have any timeouts or quotas that kill
> > >> existing
> > >> > socket connections? Does client resend succeed (you can configure
> > >> resend in
> > >> > DefaultEventHandler)?
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Jun
> > >> >
> > >> > On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <
> vpuranik@gmail.com>
> > >> wrote:
> > >> >
> > >> >> Hi all,
> > >> >>
> > >> >> We are sending our ad impressions to Kafka 0.7.0. I am using async
> > >> >> prouducers in our web app.
> > >> >> I am pooling kafak producers with commons pool. Pool size - 10.
> > >> batch.size
> > >> >> is 100.
> > >> >>
> > >> >> We have 3 c1.xlarge instances with Kafka brokers installed behind a
> > >> elastic
> > >> >> load balancer in AWS.
> > >> >> Every minute we loose some events because of the following
> exception
> > >> >>
> > >> >> - Disconnecting from
> > >> dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
> > >> >> - Error in handling batch of 64 events
> > >> >> java.io.IOException: Connection timed out
> > >> >>    at sun.nio.ch.FileDispatcher.write0(Native Method)
> > >> >>    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
> > >> >>    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
> > >> >>    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
> > >> >>    at
> sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
> > >> >>    at
> > >> >>
> > >>
> >
> kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:51)
> > >> >>    at
> kafka.network.Send$class.writeCompletely(Transmission.scala:76)
> > >> >>    at
> > >> >>
> > >> >>
> > >>
> >
> kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
> > >> >>    at
> > kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:88)
> > >> >>    at kafka.producer.SyncProducer.send(SyncProducer.scala:87)
> > >> >>    at kafka.producer.SyncProducer.multiSend(SyncProducer.scala:128)
> > >> >>    at
> > >> >>
> > >>
> >
> kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:52)
> > >> >>    at
> > >> >>
> > >> >>
> > >>
> >
> kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:46)
> > >> >>    at
> > >> >>
> > >> >>
> > >>
> >
> kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:119)
> > >> >>    at
> > >> >>
> > >> >>
> > >>
> >
> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:98)
> > >> >>    at
> > >> >>
> > >> >>
> > >>
> >
> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:74)
> > >> >>    at scala.collection.immutable.Stream.foreach(Stream.scala:254)
> > >> >>    at
> > >> >>
> > >> >>
> > >>
> >
> kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:73)
> > >> >>    at
> > >> >>
> > >>
> kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:43)
> > >> >> - Connected to
> dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
> > for
> > >> >> producing
> > >> >>
> > >> >> Has anybody faced this kind of timeouts before? Do they indicate
> any
> > >> >> resource misconfiguration? The CPU usage on broker is pretty small.
> > >> >> Also, in spite of setting batch size to 100, the failing batch
> > usually
> > >> only
> > >> >> have 50 to 60 events. Is there any other limit I am hitting?
> > >> >>
> > >> >> Any help is appreciated.
> > >> >>
> > >> >>
> > >> >> Regards,
> > >> >> Vaibhav
> > >> >> GumGum
> > >> >>
> > >>
> > >
> > >
> >
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Jun Rao <ju...@gmail.com>.
Set num.retries in producer config property file. It defaults to 0.
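
In the producer properties file that would be, e.g.:

num.retries=3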

Thanks,

Jun

On Tue, Jun 26, 2012 at 4:46 PM, Vaibhav Puranik <vp...@gmail.com> wrote:

> I reduced the batch size and reduced the pooled connections. Number of
> errors have gone down significantly. But they are not eliminated yet.
>
> We definitely don't want to loose any events.
>
> Jun, how do I configure the client resend you mentioned below? I couldn't
> find any configuration.
>
> Regards,
> Vaibhav
>
> On Tue, Jun 26, 2012 at 9:27 AM, Vaibhav Puranik <vp...@gmail.com>
> wrote:
>
> > These are great pointers.
> > I found some more discussion here:
> > https://forums.aws.amazon.com/thread.jspa?threadID=33427
> >
> > I can do the following to keep using the elastic load balancer:
> >
> > 1) Reduce the producer pool size to 1 or 2 because looks like connections
> > are sitting idle. My volume does not desire that big pool.
> > 2) Reduce the batch size so that the webapp frequently dumps the data to
> > brokers. It's better for us anyways.
> >
> > I will try both of these options and report back.
> >
> > Thank you very much Jun and Niek.
> >
> > Regards,
> > Vaibhav
> >
> >
> > On Tue, Jun 26, 2012 at 8:52 AM, Niek Sanders <niek.sanders@gmail.com
> >wrote:
> >
> >> ELBs will close connections that have no data going across them over a
> >> 60 sec period.  A reference to this behavior can be found at the
> >> bottom of this page:
> >>
> >> http://aws.amazon.com/articles/1636185810492479
> >>
> >> There is currently no way for customers to increase this timeout.  If
> >> this timeout is in fact the problem, then the alternative is to use
> >> HAProxy for load balancing instead.
> >>
> >> - Niek
> >>
> >>
> >>
> >>
> >> On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <ju...@gmail.com> wrote:
> >> > Vaibhav,
> >> >
> >> > Does elastic load balancer have any timeouts or quotas that kill
> >> > existing socket connections? Does client resend succeed (you can
> >> > configure resend in DefaultEventHandler)?
> >> >
> >> > Thanks,
> >> >
> >> > Jun
> >> >
> >> > On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <vp...@gmail.com>
> >> > wrote:
> >> >
> >> >> Hi all,
> >> >>
> >> >> We are sending our ad impressions to Kafka 0.7.0. I am using async
> >> >> producers in our web app.
> >> >> I am pooling kafka producers with commons pool. Pool size - 10.
> >> >> batch.size is 100.
> >> >>
> >> >> We have 3 c1.xlarge instances with Kafka brokers installed behind an
> >> >> elastic load balancer in AWS.
> >> >> Every minute we lose some events because of the following exception
> >> >>
> >> >> - Disconnecting from dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
> >> >> - Error in handling batch of 64 events
> >> >> java.io.IOException: Connection timed out
> >> >> [stack trace snipped; same as in the original post]
> >> >> - Connected to dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092 for
> >> >> producing
> >> >>
> >> >> Has anybody faced this kind of timeout before? Does it indicate any
> >> >> resource misconfiguration? The CPU usage on the broker is pretty small.
> >> >> Also, in spite of setting batch size to 100, the failing batch usually
> >> >> only has 50 to 60 events. Is there any other limit I am hitting?
> >> >>
> >> >> Any help is appreciated.
> >> >>
> >> >>
> >> >> Regards,
> >> >> Vaibhav
> >> >> GumGum
> >> >>
> >>
> >
> >
>

Kafka - Avro Encoder

Posted by Murtaza Doctor <mu...@richrelevance.com>.
Hello Folks,

We are currently evaluating Kafka and had a few questions around the
Encoder functionality.
Our data is in Avro format, and we wish to send it to the broker in this
format and eventually write it to HDFS. As documented, we realize that we
need a custom Encoder to create the Message object.

Questions we had:
- Is there any sample code around this, since this is probably a common
use case? That is, is there a custom Avro encoder we can use out of the
box, or any chance one can be open-sourced?
- In terms of internals: we would convert the Avro record into a byte
stream, create a Message object, and then write to the queue (see the
sketch below). Does this incur any significant overhead in your opinion?
- Any best practices around this, or how would others approach this problem?

If others see value here, we would definitely like to see this added to the
FAQs or even included as sample code.
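
For concreteness, here is a rough sketch of the kind of encoder we have in
mind (our own hypothetical class, not an official Kafka one; it assumes
Kafka 0.7's kafka.serializer.Encoder interface, Avro 1.5+, and generic
records):

  import java.io.ByteArrayOutputStream;
  import java.io.IOException;
  import kafka.message.Message;
  import kafka.serializer.Encoder;
  import org.apache.avro.generic.GenericDatumWriter;
  import org.apache.avro.generic.GenericRecord;
  import org.apache.avro.io.BinaryEncoder;
  import org.apache.avro.io.EncoderFactory;

  // Serializes an Avro GenericRecord to bytes and wraps it in a Kafka Message.
  public class AvroMessageEncoder implements Encoder<GenericRecord> {
      public Message toMessage(GenericRecord record) {
          try {
              ByteArrayOutputStream out = new ByteArrayOutputStream();
              GenericDatumWriter<GenericRecord> writer =
                  new GenericDatumWriter<GenericRecord>(record.getSchema());
              BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
              writer.write(record, encoder);
              encoder.flush();
              return new Message(out.toByteArray());
          } catch (IOException e) {
              throw new RuntimeException("Avro serialization failed", e);
          }
      }
  }

The producer would then point serializer.class at this class, and the
consumer side would reverse the process with a GenericDatumReader.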

Thanks,
murtaza



Re: Getting timeouts with elastic load balancer in AWS

Posted by Vaibhav Puranik <vp...@gmail.com>.
I reduced the batch size and reduced the pooled connections. The number of
errors has gone down significantly, but they are not eliminated yet.

We definitely don't want to lose any events.

Jun, how do I configure the client resend you mentioned below? I couldn't
find any configuration.

Regards,
Vaibhav

On Tue, Jun 26, 2012 at 9:27 AM, Vaibhav Puranik <vp...@gmail.com> wrote:

> These are great pointers.
> I found some more discussion here:
> https://forums.aws.amazon.com/thread.jspa?threadID=33427
>
> I can do the following to keep using the elastic load balancer:
>
> 1) Reduce the producer pool size to 1 or 2, because it looks like
> connections are sitting idle. My volume does not require such a big pool.
> 2) Reduce the batch size so that the webapp frequently dumps the data to
> brokers. It's better for us anyway.
>
> I will try both of these options and report back.
>
> Thank you very much Jun and Niek.
>
> Regards,
> Vaibhav
>
>
> On Tue, Jun 26, 2012 at 8:52 AM, Niek Sanders <ni...@gmail.com> wrote:
>
>> ELBs will close connections that have no data going across them over a
>> 60 sec period.  A reference to this behavior can be found at the
>> bottom of this page:
>>
>> http://aws.amazon.com/articles/1636185810492479
>>
>> There is currently no way for customers to increase this timeout.  If
>> this timeout is in fact the problem, then the alternative is to use
>> HAProxy for load balancing instead.
>>
>> - Niek
>>
>>
>>
>>
>> On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <ju...@gmail.com> wrote:
>> > Vaibhav,
>> >
>> > Does elastic load balancer have any timeouts or quotas that kill
>> > existing socket connections? Does client resend succeed (you can
>> > configure resend in DefaultEventHandler)?
>> >
>> > Thanks,
>> >
>> > Jun
>> >
>> > On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <vp...@gmail.com>
>> > wrote:
>> >
>> >> Hi all,
>> >>
>> >> We are sending our ad impressions to Kafka 0.7.0. I am using async
>> >> producers in our web app.
>> >> I am pooling kafka producers with commons pool. Pool size - 10.
>> >> batch.size is 100.
>> >>
>> >> We have 3 c1.xlarge instances with Kafka brokers installed behind an
>> >> elastic load balancer in AWS.
>> >> Every minute we lose some events because of the following exception
>> >>
>> >> - Disconnecting from dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
>> >> - Error in handling batch of 64 events
>> >> java.io.IOException: Connection timed out
>> >> [stack trace snipped; same as in the original post]
>> >> - Connected to dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092 for
>> >> producing
>> >>
>> >> Has anybody faced this kind of timeout before? Does it indicate any
>> >> resource misconfiguration? The CPU usage on the broker is pretty small.
>> >> Also, in spite of setting batch size to 100, the failing batch usually
>> >> only has 50 to 60 events. Is there any other limit I am hitting?
>> >>
>> >> Any help is appreciated.
>> >>
>> >>
>> >> Regards,
>> >> Vaibhav
>> >> GumGum
>> >>
>>
>
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Vaibhav Puranik <vp...@gmail.com>.
These are great pointers.
I found some more discussion here:
https://forums.aws.amazon.com/thread.jspa?threadID=33427

I can do the following to keep using the elastic load balancer:

1) Reduce the producer pool size to 1 or 2, because it looks like
connections are sitting idle. My volume does not require such a big pool.
2) Reduce the batch size so that the webapp frequently dumps the data to
brokers. It's better for us anyway.

I will try both of these options and report back.
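
Roughly, the changes I have in mind (illustrative values; ProducerFactory
is a stand-in for our commons-pool factory, which I haven't shown):

  # producer.properties: smaller, more frequent batches
  producer.type=async
  batch.size=20

  // pool setup: shrink the pool so producer connections don't sit
  // idle long enough to hit the ELB's 60-second timeout
  GenericObjectPool pool = new GenericObjectPool(new ProducerFactory());
  pool.setMaxActive(2);  // down from 10
  pool.setMaxIdle(2);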

Thank you very much Jun and Niek.

Regards,
Vaibhav

On Tue, Jun 26, 2012 at 8:52 AM, Niek Sanders <ni...@gmail.com> wrote:

> ELBs will close connections that have no data going across them over a
> 60 sec period.  A reference to this behavior can be found at the
> bottom of this page:
>
> http://aws.amazon.com/articles/1636185810492479
>
> There is currently no way for customers to increase this timeout.  If
> this timeout is in fact the problem, then the alternative is to use
> HAProxy for load balancing instead.
>
> - Niek
>
>
>
>
> On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <ju...@gmail.com> wrote:
> > Vaibhav,
> >
> > Does elastic load balancer have any timeouts or quotas that kill existing
> > socket connections? Does client resend succeed (you can configure
> > resend in DefaultEventHandler)?
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <vp...@gmail.com>
> > wrote:
> >
> >> Hi all,
> >>
> >> We are sending our ad impressions to Kafka 0.7.0. I am using async
> >> producers in our web app.
> >> I am pooling kafka producers with commons pool. Pool size - 10.
> >> batch.size is 100.
> >>
> >> We have 3 c1.xlarge instances with Kafka brokers installed behind an
> >> elastic load balancer in AWS.
> >> Every minute we lose some events because of the following exception
> >>
> >> - Disconnecting from dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
> >> - Error in handling batch of 64 events
> >> java.io.IOException: Connection timed out
> >> [stack trace snipped; same as in the original post]
> >> - Connected to dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092 for
> >> producing
> >>
> >> Has anybody faced this kind of timeout before? Does it indicate any
> >> resource misconfiguration? The CPU usage on the broker is pretty small.
> >> Also, in spite of setting batch size to 100, the failing batch usually
> >> only has 50 to 60 events. Is there any other limit I am hitting?
> >>
> >> Any help is appreciated.
> >>
> >>
> >> Regards,
> >> Vaibhav
> >> GumGum
> >>
>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Niek Sanders <ni...@gmail.com>.
ELBs will close connections that have no data going across them over a
60 sec period.  A reference to this behavior can be found at the
bottom of this page:

http://aws.amazon.com/articles/1636185810492479

There is currently no way for customers to increase this timeout. If this
timeout is in fact the problem, then the alternative is to use HAProxy for
load balancing instead.
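
A minimal sketch of the idea, with HAProxy in TCP mode in front of the
three brokers (hostnames, ports, and timeout values are illustrative):

  listen kafka
      bind *:9092
      mode tcp
      balance roundrobin
      # keep idle producer connections open much longer than the ELB's 60s
      timeout client 300s
      timeout server 300s
      server broker1 10.0.0.1:9092 check
      server broker2 10.0.0.2:9092 check
      server broker3 10.0.0.3:9092 check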

- Niek




On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <ju...@gmail.com> wrote:
> Vaibhav,
>
> Does elastic load balancer have any timeouts or quotas that kill existing
> socket connections? Does client resend succeed (you can configure resend in
> DefaultEventHandler)?
>
> Thanks,
>
> Jun
>
> On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <vp...@gmail.com> wrote:
>
>> Hi all,
>>
>> We are sending our ad impressions to Kafka 0.7.0. I am using async
>> producers in our web app.
>> I am pooling kafka producers with commons pool. Pool size - 10.
>> batch.size is 100.
>>
>> We have 3 c1.xlarge instances with Kafka brokers installed behind an
>> elastic load balancer in AWS.
>> Every minute we lose some events because of the following exception
>>
>> - Disconnecting from dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
>> - Error in handling batch of 64 events
>> java.io.IOException: Connection timed out
>> [stack trace snipped; same as in the original post]
>> - Connected to dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092 for
>> producing
>>
>> Has anybody faced this kind of timeout before? Does it indicate any
>> resource misconfiguration? The CPU usage on the broker is pretty small.
>> Also, in spite of setting batch size to 100, the failing batch usually
>> only has 50 to 60 events. Is there any other limit I am hitting?
>>
>> Any help is appreciated.
>>
>>
>> Regards,
>> Vaibhav
>> GumGum
>>

Re: Getting timeouts with elastic load balancer in AWS

Posted by Jun Rao <ju...@gmail.com>.
Vaibhav,

Does the elastic load balancer have any timeouts or quotas that kill
existing socket connections? Does the client resend succeed (you can
configure resends in DefaultEventHandler)?

Thanks,

Jun

On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <vp...@gmail.com> wrote:

> Hi all,
>
> We are sending our ad impressions to Kafka 0.7.0. I am using async
> producers in our web app.
> I am pooling kafka producers with commons pool. Pool size - 10. batch.size
> is 100.
>
> We have 3 c1.xlarge instances with Kafka brokers installed behind an
> elastic load balancer in AWS.
> Every minute we lose some events because of the following exception
>
> - Disconnecting from dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
> - Error in handling batch of 64 events
> java.io.IOException: Connection timed out
>    at sun.nio.ch.FileDispatcher.write0(Native Method)
>    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
>    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
>    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
>    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
>    at kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:51)
>    at kafka.network.Send$class.writeCompletely(Transmission.scala:76)
>    at kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
>    at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:88)
>    at kafka.producer.SyncProducer.send(SyncProducer.scala:87)
>    at kafka.producer.SyncProducer.multiSend(SyncProducer.scala:128)
>    at kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:52)
>    at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:46)
>    at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:119)
>    at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:98)
>    at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:74)
>    at scala.collection.immutable.Stream.foreach(Stream.scala:254)
>    at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:73)
>    at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:43)
> - Connected to dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092 for
> producing
>
> Has anybody faced this kind of timeout before? Does it indicate any
> resource misconfiguration? The CPU usage on the broker is pretty small.
> Also, in spite of setting batch size to 100, the failing batch usually only
> has 50 to 60 events. Is there any other limit I am hitting?
>
> Any help is appreciated.
>
>
> Regards,
> Vaibhav
> GumGum
>