You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@kafka.apache.org by Shayne S <sh...@gmail.com> on 2015/06/30 19:14:28 UTC

Producer repeatedly locking up

This problem is intermittent, not sure what is causing it. Some days
everything runs non-stop with no issues, some days I get the following.

Setup:
- Single broker
- Running 0.8.2.1

I'm running a single broker. When the problem is presenting, anywhere from
5,000 to 30,000 messages may be processed in a few seconds, then everything
stops for 15-16 minutes before I see this:

2015-06-30 16:48:03,737 [async-thread-macro-1] INFO
kafka.producer.SyncProducer - Disconnecting from
ip-10-0-9-105.ec2.internal:9092
2015-06-30 16:48:03,744 [async-thread-macro-1] WARN
kafka.producer.async.DefaultEventHandler - Failed to send producer request
with correlation id 42474 to broker 0 with data for partitions
[refrigerators_archive_features,0]
java.io.IOException: Connection timed out
        at sun.nio.ch.FileDispatcherImpl.writev0(Native Method)
        at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
        at sun.nio.ch.IOUtil.write(IOUtil.java:148)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:524)
        at java.nio.channels.SocketChannel.write(SocketChannel.java:493)
        at
kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:56)
        at kafka.network.Send$class.writeCompletely(Transmission.scala:75)
        at
kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:26)
        at kafka.network.BlockingChannel.send(BlockingChannel.scala:92)
        at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:72)
        at
kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:71)
        at
kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:102)
        at
kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
        at
kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
        at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
        at
kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:101)
        at
kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
        at
kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
        at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
        at kafka.producer.SyncProducer.send(SyncProducer.scala:100)
        at
kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$send(DefaultEventHandler.scala:255)
        at
kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:106)
        at
kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:100)
........

And the cycle repeats. Another few thousand messages and then it stops once
more. A thread dump shows this:

"async-thread-macro-3" daemon prio=10 tid=0x00007f90383a5800 nid=0x6a8
runnable [0x00007f9025872000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.FileDispatcherImpl.writev0(Native Method)
        at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
        at sun.nio.ch.IOUtil.write(IOUtil.java:148)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:524)
        - locked <0x000000008dd16c70> (a java.lang.Object)
        at java.nio.channels.SocketChannel.write(SocketChannel.java:493)
        at
kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:56)
        at kafka.network.Send$class.writeCompletely(Transmission.scala:75)
        at
kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:26)
        at kafka.network.BlockingChannel.send(BlockingChannel.scala:92)
        at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:72)
        at
kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:71)
        - locked <0x000000008dd16b30> (a java.lang.Object)
        at
kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:102)
        at
kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
        at
kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
        at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
        at
kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:101)
        at
kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
        at
kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
        at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
        at kafka.producer.SyncProducer.send(SyncProducer.scala:100)
        at
kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$send(DefaultEventHandler.scala:255)
        at
kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:106)
        at
kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:100)
.........

I found an issue that sounds similar that is fixed in 0.8.3, but it doesn't
sound like my producer is getting stuck in the same place:
https://issues.apache.org/jira/browse/KAFKA-1733

Thanks in advance!
Shayne

Re: Producer repeatedly locking up

Posted by Shayne S <sh...@gmail.com>.
The problem is gone, but I'm unsure of the root cause. The client library I
use recently added support for the new producer. Switching to that seems to
have sidestepped the problem.

On Tue, Jun 30, 2015 at 12:53 PM, Shayne S <sh...@gmail.com> wrote:

> Thanks for responding Gwen.
>
> There is sometimes an error in server.log, but it isn't 1 to 1 with the
> connection resets on the producer (although this is a socket from that
> box). :
>
> [2015-06-30 15:26:09,544] ERROR Closing socket for /10.0.6.244 because of
> error (kafka.network.Processor)
> java.io.IOException: Connection reset by peer
>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>         at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>         at sun.nio.ch.IOUtil.read(IOUtil.java:197)
>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
>         at kafka.utils.Utils$.read(Utils.scala:380)
>         at
> kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
>         at kafka.network.Processor.read(SocketServer.scala:444)
>         at kafka.network.Processor.run(SocketServer.scala:340)
>         at java.lang.Thread.run(Thread.java:745)
>
> Maybe interesting, I also see the following in the server.log with the
> mentioned 15-16 minute quiet window:
>
> [2015-06-30 14:41:55,978] INFO Closing socket connection to /10.0.6.244.
> (kafka.network.Processor)
> [2015-06-30 14:41:55,979] INFO Closing socket connection to /10.0.6.244.
> (kafka.network.Processor)
> [2015-06-30 14:41:55,982] INFO Closing socket connection to /10.0.6.244.
> (kafka.network.Processor)
> [2015-06-30 14:57:27,308] INFO Closing socket connection to /10.0.6.244.
> (kafka.network.Processor)
> [2015-06-30 14:57:27,309] INFO Closing socket connection to /10.0.6.244.
> (kafka.network.Processor)
> [2015-06-30 14:57:27,312] INFO Closing socket connection to /10.0.6.244.
> (kafka.network.Processor)
>
> On Tue, Jun 30, 2015 at 12:44 PM, Gwen Shapira <gs...@cloudera.com>
> wrote:
>
>> Do you see any errors on the brokers when this happens?
>>
>> On Tue, Jun 30, 2015 at 10:14 AM, Shayne S <sh...@gmail.com> wrote:
>> > This problem is intermittent, not sure what is causing it. Some days
>> > everything runs non-stop with no issues, some days I get the following.
>> >
>> > Setup:
>> > - Single broker
>> > - Running 0.8.2.1
>> >
>> > I'm running a single broker. When the problem is presenting, anywhere
>> from
>> > 5,000 to 30,000 messages may be processed in a few seconds, then
>> everything
>> > stops for 15-16 minutes before I see this:
>> >
>> > 2015-06-30 16:48:03,737 [async-thread-macro-1] INFO
>> > kafka.producer.SyncProducer - Disconnecting from
>> > ip-10-0-9-105.ec2.internal:9092
>> > 2015-06-30 16:48:03,744 [async-thread-macro-1] WARN
>> > kafka.producer.async.DefaultEventHandler - Failed to send producer
>> request
>> > with correlation id 42474 to broker 0 with data for partitions
>> > [refrigerators_archive_features,0]
>> > java.io.IOException: Connection timed out
>> >         at sun.nio.ch.FileDispatcherImpl.writev0(Native Method)
>> >         at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
>> >         at sun.nio.ch.IOUtil.write(IOUtil.java:148)
>> >         at
>> sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:524)
>> >         at java.nio.channels.SocketChannel.write(SocketChannel.java:493)
>> >         at
>> >
>> kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:56)
>> >         at
>> kafka.network.Send$class.writeCompletely(Transmission.scala:75)
>> >         at
>> >
>> kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:26)
>> >         at kafka.network.BlockingChannel.send(BlockingChannel.scala:92)
>> >         at
>> kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:72)
>> >         at
>> >
>> kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:71)
>> >         at
>> >
>> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:102)
>> >         at
>> >
>> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
>> >         at
>> >
>> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
>> >         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>> >         at
>> >
>> kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:101)
>> >         at
>> >
>> kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
>> >         at
>> >
>> kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
>> >         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>> >         at kafka.producer.SyncProducer.send(SyncProducer.scala:100)
>> >         at
>> >
>> kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$send(DefaultEventHandler.scala:255)
>> >         at
>> >
>> kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:106)
>> >         at
>> >
>> kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:100)
>> > ........
>> >
>> > And the cycle repeats. Another few thousand messages and then it stops
>> once
>> > more. A thread dump shows this:
>> >
>> > "async-thread-macro-3" daemon prio=10 tid=0x00007f90383a5800 nid=0x6a8
>> > runnable [0x00007f9025872000]
>> >    java.lang.Thread.State: RUNNABLE
>> >         at sun.nio.ch.FileDispatcherImpl.writev0(Native Method)
>> >         at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
>> >         at sun.nio.ch.IOUtil.write(IOUtil.java:148)
>> >         at
>> sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:524)
>> >         - locked <0x000000008dd16c70> (a java.lang.Object)
>> >         at java.nio.channels.SocketChannel.write(SocketChannel.java:493)
>> >         at
>> >
>> kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:56)
>> >         at
>> kafka.network.Send$class.writeCompletely(Transmission.scala:75)
>> >         at
>> >
>> kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:26)
>> >         at kafka.network.BlockingChannel.send(BlockingChannel.scala:92)
>> >         at
>> kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:72)
>> >         at
>> >
>> kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:71)
>> >         - locked <0x000000008dd16b30> (a java.lang.Object)
>> >         at
>> >
>> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:102)
>> >         at
>> >
>> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
>> >         at
>> >
>> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
>> >         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>> >         at
>> >
>> kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:101)
>> >         at
>> >
>> kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
>> >         at
>> >
>> kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
>> >         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>> >         at kafka.producer.SyncProducer.send(SyncProducer.scala:100)
>> >         at
>> >
>> kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$send(DefaultEventHandler.scala:255)
>> >         at
>> >
>> kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:106)
>> >         at
>> >
>> kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:100)
>> > .........
>> >
>> > I found an issue that sounds similar that is fixed in 0.8.3, but it
>> doesn't
>> > sound like my producer is getting stuck in the same place:
>> > https://issues.apache.org/jira/browse/KAFKA-1733
>> >
>> > Thanks in advance!
>> > Shayne
>>
>
>

Re: Producer repeatedly locking up

Posted by Shayne S <sh...@gmail.com>.
Thanks for responding Gwen.

There is sometimes an error in server.log, but it isn't 1 to 1 with the
connection resets on the producer (although this is a socket from that
box). :

[2015-06-30 15:26:09,544] ERROR Closing socket for /10.0.6.244 because of
error (kafka.network.Processor)
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
        at kafka.utils.Utils$.read(Utils.scala:380)
        at
kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
        at kafka.network.Processor.read(SocketServer.scala:444)
        at kafka.network.Processor.run(SocketServer.scala:340)
        at java.lang.Thread.run(Thread.java:745)

Maybe interesting, I also see the following in the server.log with the
mentioned 15-16 minute quiet window:

[2015-06-30 14:41:55,978] INFO Closing socket connection to /10.0.6.244.
(kafka.network.Processor)
[2015-06-30 14:41:55,979] INFO Closing socket connection to /10.0.6.244.
(kafka.network.Processor)
[2015-06-30 14:41:55,982] INFO Closing socket connection to /10.0.6.244.
(kafka.network.Processor)
[2015-06-30 14:57:27,308] INFO Closing socket connection to /10.0.6.244.
(kafka.network.Processor)
[2015-06-30 14:57:27,309] INFO Closing socket connection to /10.0.6.244.
(kafka.network.Processor)
[2015-06-30 14:57:27,312] INFO Closing socket connection to /10.0.6.244.
(kafka.network.Processor)

On Tue, Jun 30, 2015 at 12:44 PM, Gwen Shapira <gs...@cloudera.com>
wrote:

> Do you see any errors on the brokers when this happens?
>
> On Tue, Jun 30, 2015 at 10:14 AM, Shayne S <sh...@gmail.com> wrote:
> > This problem is intermittent, not sure what is causing it. Some days
> > everything runs non-stop with no issues, some days I get the following.
> >
> > Setup:
> > - Single broker
> > - Running 0.8.2.1
> >
> > I'm running a single broker. When the problem is presenting, anywhere
> from
> > 5,000 to 30,000 messages may be processed in a few seconds, then
> everything
> > stops for 15-16 minutes before I see this:
> >
> > 2015-06-30 16:48:03,737 [async-thread-macro-1] INFO
> > kafka.producer.SyncProducer - Disconnecting from
> > ip-10-0-9-105.ec2.internal:9092
> > 2015-06-30 16:48:03,744 [async-thread-macro-1] WARN
> > kafka.producer.async.DefaultEventHandler - Failed to send producer
> request
> > with correlation id 42474 to broker 0 with data for partitions
> > [refrigerators_archive_features,0]
> > java.io.IOException: Connection timed out
> >         at sun.nio.ch.FileDispatcherImpl.writev0(Native Method)
> >         at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
> >         at sun.nio.ch.IOUtil.write(IOUtil.java:148)
> >         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:524)
> >         at java.nio.channels.SocketChannel.write(SocketChannel.java:493)
> >         at
> >
> kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:56)
> >         at
> kafka.network.Send$class.writeCompletely(Transmission.scala:75)
> >         at
> >
> kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:26)
> >         at kafka.network.BlockingChannel.send(BlockingChannel.scala:92)
> >         at
> kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:72)
> >         at
> >
> kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:71)
> >         at
> >
> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:102)
> >         at
> >
> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
> >         at
> >
> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
> >         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
> >         at
> >
> kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:101)
> >         at
> > kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
> >         at
> > kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
> >         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
> >         at kafka.producer.SyncProducer.send(SyncProducer.scala:100)
> >         at
> >
> kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$send(DefaultEventHandler.scala:255)
> >         at
> >
> kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:106)
> >         at
> >
> kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:100)
> > ........
> >
> > And the cycle repeats. Another few thousand messages and then it stops
> once
> > more. A thread dump shows this:
> >
> > "async-thread-macro-3" daemon prio=10 tid=0x00007f90383a5800 nid=0x6a8
> > runnable [0x00007f9025872000]
> >    java.lang.Thread.State: RUNNABLE
> >         at sun.nio.ch.FileDispatcherImpl.writev0(Native Method)
> >         at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
> >         at sun.nio.ch.IOUtil.write(IOUtil.java:148)
> >         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:524)
> >         - locked <0x000000008dd16c70> (a java.lang.Object)
> >         at java.nio.channels.SocketChannel.write(SocketChannel.java:493)
> >         at
> >
> kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:56)
> >         at
> kafka.network.Send$class.writeCompletely(Transmission.scala:75)
> >         at
> >
> kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:26)
> >         at kafka.network.BlockingChannel.send(BlockingChannel.scala:92)
> >         at
> kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:72)
> >         at
> >
> kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:71)
> >         - locked <0x000000008dd16b30> (a java.lang.Object)
> >         at
> >
> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:102)
> >         at
> >
> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
> >         at
> >
> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
> >         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
> >         at
> >
> kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:101)
> >         at
> > kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
> >         at
> > kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
> >         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
> >         at kafka.producer.SyncProducer.send(SyncProducer.scala:100)
> >         at
> >
> kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$send(DefaultEventHandler.scala:255)
> >         at
> >
> kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:106)
> >         at
> >
> kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:100)
> > .........
> >
> > I found an issue that sounds similar that is fixed in 0.8.3, but it
> doesn't
> > sound like my producer is getting stuck in the same place:
> > https://issues.apache.org/jira/browse/KAFKA-1733
> >
> > Thanks in advance!
> > Shayne
>

Re: Producer repeatedly locking up

Posted by Gwen Shapira <gs...@cloudera.com>.
Do you see any errors on the brokers when this happens?

On Tue, Jun 30, 2015 at 10:14 AM, Shayne S <sh...@gmail.com> wrote:
> This problem is intermittent, not sure what is causing it. Some days
> everything runs non-stop with no issues, some days I get the following.
>
> Setup:
> - Single broker
> - Running 0.8.2.1
>
> I'm running a single broker. When the problem is presenting, anywhere from
> 5,000 to 30,000 messages may be processed in a few seconds, then everything
> stops for 15-16 minutes before I see this:
>
> 2015-06-30 16:48:03,737 [async-thread-macro-1] INFO
> kafka.producer.SyncProducer - Disconnecting from
> ip-10-0-9-105.ec2.internal:9092
> 2015-06-30 16:48:03,744 [async-thread-macro-1] WARN
> kafka.producer.async.DefaultEventHandler - Failed to send producer request
> with correlation id 42474 to broker 0 with data for partitions
> [refrigerators_archive_features,0]
> java.io.IOException: Connection timed out
>         at sun.nio.ch.FileDispatcherImpl.writev0(Native Method)
>         at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
>         at sun.nio.ch.IOUtil.write(IOUtil.java:148)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:524)
>         at java.nio.channels.SocketChannel.write(SocketChannel.java:493)
>         at
> kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:56)
>         at kafka.network.Send$class.writeCompletely(Transmission.scala:75)
>         at
> kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:26)
>         at kafka.network.BlockingChannel.send(BlockingChannel.scala:92)
>         at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:72)
>         at
> kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:71)
>         at
> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:102)
>         at
> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
>         at
> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
>         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>         at
> kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:101)
>         at
> kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
>         at
> kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
>         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>         at kafka.producer.SyncProducer.send(SyncProducer.scala:100)
>         at
> kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$send(DefaultEventHandler.scala:255)
>         at
> kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:106)
>         at
> kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:100)
> ........
>
> And the cycle repeats. Another few thousand messages and then it stops once
> more. A thread dump shows this:
>
> "async-thread-macro-3" daemon prio=10 tid=0x00007f90383a5800 nid=0x6a8
> runnable [0x00007f9025872000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.FileDispatcherImpl.writev0(Native Method)
>         at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
>         at sun.nio.ch.IOUtil.write(IOUtil.java:148)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:524)
>         - locked <0x000000008dd16c70> (a java.lang.Object)
>         at java.nio.channels.SocketChannel.write(SocketChannel.java:493)
>         at
> kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:56)
>         at kafka.network.Send$class.writeCompletely(Transmission.scala:75)
>         at
> kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:26)
>         at kafka.network.BlockingChannel.send(BlockingChannel.scala:92)
>         at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:72)
>         at
> kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:71)
>         - locked <0x000000008dd16b30> (a java.lang.Object)
>         at
> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:102)
>         at
> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
>         at
> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:102)
>         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>         at
> kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:101)
>         at
> kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
>         at
> kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:101)
>         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>         at kafka.producer.SyncProducer.send(SyncProducer.scala:100)
>         at
> kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$send(DefaultEventHandler.scala:255)
>         at
> kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:106)
>         at
> kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:100)
> .........
>
> I found an issue that sounds similar that is fixed in 0.8.3, but it doesn't
> sound like my producer is getting stuck in the same place:
> https://issues.apache.org/jira/browse/KAFKA-1733
>
> Thanks in advance!
> Shayne