Posted to dev@kafka.apache.org by "Jason Gustafson (JIRA)" <ji...@apache.org> on 2017/09/21 23:39:00 UTC

[jira] [Created] (KAFKA-5957) Producer IllegalStateException due to second deallocate after aborting a batch

Jason Gustafson created KAFKA-5957:
--------------------------------------

             Summary: Producer IllegalStateException due to second deallocate after aborting a batch
                 Key: KAFKA-5957
                 URL: https://issues.apache.org/jira/browse/KAFKA-5957
             Project: Kafka
          Issue Type: Bug
            Reporter: Jason Gustafson
            Assignee: Jason Gustafson
             Fix For: 1.0.0


Saw this recently in a system test failure:

{code}
[2017-09-21 05:04:52,033] ERROR [Producer clientId=producer-1, transactionalId=my-second-transactional-id] Aborting producer batches due to fatal error (org.apache.kafka.clients.producer.internals.Sender)
org.apache.kafka.common.KafkaException: The client hasn't received acknowledgment for some previously sent messages and can no longer retry them. It isn't safe to continue.
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:211)
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:164)
        at java.lang.Thread.run(Thread.java:745)
[2017-09-21 05:04:52,033] TRACE Aborting batch for partition output-topic-2 (org.apache.kafka.clients.producer.internals.ProducerBatch)
org.apache.kafka.common.KafkaException: The client hasn't received acknowledgment for some previously sent messages and can no longer retry them. It isn't safe to continue.
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:211)
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:164)
        at java.lang.Thread.run(Thread.java:745)
[2017-09-21 05:04:52,134] TRACE [Producer clientId=producer-1, transactionalId=my-second-transactional-id] Not sending transactional request (type=EndTxnRequest, transactionalId=my-second-transactional-id, producerId=1000, producerEpoch=0, result=COMMIT) because we are in an error state (org.apache.kafka.clients.producer.internals.TransactionManager)
[2017-09-21 05:04:52,134] INFO [Producer clientId=producer-1, transactionalId=my-second-transactional-id] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms. (org.apache.kafka.clients.producer.KafkaProducer)
[2017-09-21 05:04:52,134] DEBUG [Producer clientId=producer-1, transactionalId=my-second-transactional-id] Beginning shutdown of Kafka producer I/O thread, sending remaining records. (org.apache.kafka.clients.producer.internals.Sender)
[2017-09-21 05:04:52,360] TRACE [Producer clientId=producer-1, transactionalId=my-second-transactional-id] Received produce response from node 1 with correlation id 245 (org.apache.kafka.clients.producer.internals.Sender)
[2017-09-21 05:04:52,360] DEBUG [Producer clientId=producer-1, transactionalId=my-second-transactional-id] ProducerId: 1000; Set last ack'd sequence number for topic-partition output-topic-2 to 136 (org.apache.kafka.clients.producer.internals.Sender)
[2017-09-21 05:04:52,360] TRACE Successfully produced messages to output-topic-2 with base offset 387. (org.apache.kafka.clients.producer.internals.ProducerBatch)
[2017-09-21 05:04:52,360] DEBUG ProduceResponse returned for output-topic-2 after batch had already been aborted. (org.apache.kafka.clients.producer.internals.ProducerBatch)
[2017-09-21 05:04:52,360] ERROR [Producer clientId=producer-1, transactionalId=my-second-transactional-id] Uncaught error in request completion: (org.apache.kafka.clients.NetworkClient)
java.lang.IllegalStateException: Remove from the incomplete set failed. This should be impossible.
        at org.apache.kafka.clients.producer.internals.IncompleteBatches.remove(IncompleteBatches.java:44)
        at org.apache.kafka.clients.producer.internals.RecordAccumulator.deallocate(RecordAccumulator.java:612)
        at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:585)
        at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:561)
        at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:475)
        at org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:74)
        at org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:685)
        at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:101)
        at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:481)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:473)
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:225)
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:177)
        at java.lang.Thread.run(Thread.java:745)
{code}
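Although we allow a batch to be aborted before its produce response returns, we do not guard against a second call to {{deallocate()}} when that response arrives later. The second call tries to remove the batch from the incomplete set again, which fails and raises the {{IllegalStateException}} above.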
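To illustrate the failure mode, here is a minimal sketch of one way a double deallocate could be guarded against. It is not the actual Kafka code or the fix that went in: {{DoubleDeallocateSketch}}, {{Batch}}, {{Accumulator}}, {{markReleased()}}, {{strictDeallocate()}} and {{guardedDeallocate()}} are all hypothetical names, and the per-batch flag is just an assumption about how the abort path and the response path might coordinate.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative only: a simplified model, not Kafka's RecordAccumulator or ProducerBatch.
class DoubleDeallocateSketch {

    // Hypothetical stand-in for a producer batch.
    static class Batch {
        final String partition;
        // Flipped exactly once, by whichever path (abort or response) releases the batch first.
        private final AtomicBoolean released = new AtomicBoolean(false);

        Batch(String partition) {
            this.partition = partition;
        }

        // Returns true only for the first caller.
        boolean markReleased() {
            return released.compareAndSet(false, true);
        }
    }

    // Hypothetical stand-in for the accumulator's tracking of incomplete batches.
    static class Accumulator {
        private final Set<Batch> incomplete = new HashSet<>();

        void track(Batch batch) {
            incomplete.add(batch);
        }

        // Models the strict behaviour seen in the stack trace: a second removal is an error.
        void strictDeallocate(Batch batch) {
            if (!incomplete.remove(batch))
                throw new IllegalStateException("Remove from the incomplete set failed. This should be impossible.");
        }

        // Guarded variant: a repeated call is a no-op and returns false.
        boolean guardedDeallocate(Batch batch) {
            if (!batch.markReleased())
                return false;
            strictDeallocate(batch);
            return true;
        }

        int incompleteCount() {
            return incomplete.size();
        }
    }

    public static void main(String[] args) {
        Accumulator accumulator = new Accumulator();
        Batch batch = new Batch("output-topic-2");
        accumulator.track(batch);

        // Abort path releases the batch first.
        boolean releasedOnAbort = accumulator.guardedDeallocate(batch);
        // A late produce response then attempts the same release.
        boolean releasedOnResponse = accumulator.guardedDeallocate(batch);

        System.out.println("released on abort: " + releasedOnAbort);
        System.out.println("released on late response: " + releasedOnResponse);
        System.out.println("incomplete batches left: " + accumulator.incompleteCount());
    }
}
```

In this sketch the compare-and-set on the batch decides which path owns the release, so the late response handler can still run its other completion logic without touching the incomplete set a second time.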


