You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@kafka.apache.org by Aarti Gupta <aa...@gmail.com> on 2017/11/29 23:07:26 UTC

Re: java.lang.IllegalStateException: Correlation id for response () does not match request ()

https://issues.apache.org/jira/browse/KAFKA-4669?focusedCommentId=16271727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16271727
Can we reopen this issue ?

We saw this on the consumer in production today. We are on 0.11.01

ERROR c.v.v.h.m.k.KafkaEventConsumerDelegate- Error fetching next record
java.lang.Exception: Error fetching next new record from kafka queue
at
com.vmware.vchs.hybridity.messaging.kafka.KafkaEventConsumerDelegate.getNextEventWithTimeout(KafkaEventConsumerDelegate.java:121)
at
com.vmware.vchs.hybridity.messaging.kafka.KafkaEventConsumerDelegate.getNextEvent(KafkaEventConsumerDelegate.java:64)
at
com.vmware.vchs.hybridity.messaging.adapter.EventListenerAdapter.getNextEvent(EventListenerAdapter.java:76)
at
com.vmware.vchs.hybridity.messaging.adapter.EventListenerAdapter.getNextEventAndAck(EventListenerAdapter.java:94)
at
com.vmware.vchs.hybridity.messaging.adapter.EventListenerAdapter$1.run(EventListenerAdapter.java:125)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Correlation id for response
(386681) does not match request (386680), request header:
{api_key=9,api_version=3,correlation_id=386680,client_id=consumer-36}

at org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:752)
at
org.apache.kafka.clients.NetworkClient.parseStructMaybeUpdateThrottleTimeMetrics(NetworkClient.java:561)
at
org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:657)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:442)
at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:232)
at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:208)
at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:168)
at
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.fetchCommittedOffsets(ConsumerCoordinator.java:477)
at
org.apache.kafka.clients.consumer.KafkaConsumer.committed(KafkaConsumer.java:1346)
at
com.vmware.vchs.hybridity.messaging.kafka.KafkaEventConsumerDelegate.getNextEventWithTimeout(KafkaEventConsumerDelegate.java:86)
... 5 common frames omitted

On Mon, Mar 6, 2017 at 9:44 AM, Ismael Juma <is...@juma.me.uk> wrote:

> Hi Mickael,
>
> This looks to be the same as KAFKA-4669. In theory, this should never
> happen and it's unclear when/how it can happen. Not sure if someone has
> investigated it in more detail.
>
> Ismael
>
> On Mon, Mar 6, 2017 at 5:15 PM, Mickael Maison <mi...@gmail.com>
> wrote:
>
> > Hi,
> >
> > In one of our clusters, some of our clients occasionally see this
> > exception:
> > java.lang.IllegalStateException: Correlation id for response (4564)
> > does not match request (4562)
> > at org.apache.kafka.clients.NetworkClient.correlate(
> > NetworkClient.java:486)
> > at org.apache.kafka.clients.NetworkClient.parseResponse(
> > NetworkClient.java:381)
> > at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(
> > NetworkClient.java:449)
> > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
> > at org.apache.kafka.clients.producer.internals.Sender.run(
> Sender.java:229)
> > at org.apache.kafka.clients.producer.internals.Sender.run(
> Sender.java:134)
> > at java.lang.Thread.run(Unknown Source)
> >
> > We've also seen it from consumer poll() and commit()
> >
> > Usually the response's correlation id is off by just 1 or 2 (like
> > above) but we've also seen it off by a few hundreds:
> > java.lang.IllegalStateException: Correlation id for response (742)
> > does not match request (174)
> >     at org.apache.kafka.clients.NetworkClient.correlate(
> > NetworkClient.java:486)
> >     at org.apache.kafka.clients.NetworkClient.parseResponse(
> > NetworkClient.java:381)
> >     at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(
> > NetworkClient.java:449)
> >     at org.apache.kafka.clients.NetworkClient.poll(
> NetworkClient.java:269)
> >     at org.apache.kafka.clients.consumer.internals.
> ConsumerNetworkClient.
> > clientPoll(ConsumerNetworkClient.java:360)
> >     at org.apache.kafka.clients.consumer.internals.
> > ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
> >     at org.apache.kafka.clients.consumer.internals.
> > ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
> >     at org.apache.kafka.clients.consumer.internals.
> > ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
> >     at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.
> > commitOffsetsSync(ConsumerCoordinator.java:426)
> >     at org.apache.kafka.clients.consumer.KafkaConsumer.
> > commitSync(KafkaConsumer.java:1059)
> >     at org.apache.kafka.clients.consumer.KafkaConsumer.
> > commitSync(KafkaConsumer.java:1027)
> >
> > When this happens, all subsequent responses are also shifted:
> > java.lang.IllegalStateException: Correlation id for response (743)
> > does not match request (742)
> > java.lang.IllegalStateException: Correlation id for response (744)
> > does not match request (743)
> > java.lang.IllegalStateException: Correlation id for response (745)
> > does not match request (744)
> > java.lang.IllegalStateException: Correlation id for response (746)
> > does not match request (745)
> >  ...
> > It's easy to discard and recreate the consumer instance to recover
> > however we can't do that with the producer as it occurs in the Sender
> > thread.
> >
> > Our cluster and our clients are running Kafka 0.10.0.1.
> > Under which circumstances would such an error happen ?
> > Even with logging set to TRACE, we can't spot anything suspicious
> > shortly before the issue. Is there any data we should try to capture
> > when this happens ?
> >
> > Thanks!
> >
>