You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@kafka.apache.org by Mickael Maison <mi...@gmail.com> on 2017/03/06 17:15:39 UTC

java.lang.IllegalStateException: Correlation id for response () does not match request ()

Hi,

In one of our clusters, some of our clients occasionally see this exception:
java.lang.IllegalStateException: Correlation id for response (4564)
does not match request (4562)
at org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:486)
at org.apache.kafka.clients.NetworkClient.parseResponse(NetworkClient.java:381)
at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:449)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:229)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:134)
at java.lang.Thread.run(Unknown Source)

We've also seen it from consumer poll() and commit()

Usually the response's correlation id is off by just 1 or 2 (like
above) but we've also seen it off by a few hundreds:
java.lang.IllegalStateException: Correlation id for response (742)
does not match request (174)
    at org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:486)
    at org.apache.kafka.clients.NetworkClient.parseResponse(NetworkClient.java:381)
    at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:449)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:426)
    at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1059)
    at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1027)

When this happens, all subsequent responses are also shifted:
java.lang.IllegalStateException: Correlation id for response (743)
does not match request (742)
java.lang.IllegalStateException: Correlation id for response (744)
does not match request (743)
java.lang.IllegalStateException: Correlation id for response (745)
does not match request (744)
java.lang.IllegalStateException: Correlation id for response (746)
does not match request (745)
 ...
It's easy to discard and recreate the consumer instance to recover
however we can't do that with the producer as it occurs in the Sender
thread.

Our cluster and our clients are running Kafka 0.10.0.1.
Under which circumstances would such an error happen ?
Even with logging set to TRACE, we can't spot anything suspicious
shortly before the issue. Is there any data we should try to capture
when this happens ?

Thanks!

Re: java.lang.IllegalStateException: Correlation id for response () does not match request ()

Posted by Aarti Gupta <aa...@gmail.com>.
https://issues.apache.org/jira/browse/KAFKA-4669?focusedCommentId=16271727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16271727
Can we reopen this issue ?

We saw this on the consumer in production today. We are on 0.11.01

ERROR c.v.v.h.m.k.KafkaEventConsumerDelegate- Error fetching next record
java.lang.Exception: Error fetching next new record from kafka queue
at
com.vmware.vchs.hybridity.messaging.kafka.KafkaEventConsumerDelegate.getNextEventWithTimeout(KafkaEventConsumerDelegate.java:121)
at
com.vmware.vchs.hybridity.messaging.kafka.KafkaEventConsumerDelegate.getNextEvent(KafkaEventConsumerDelegate.java:64)
at
com.vmware.vchs.hybridity.messaging.adapter.EventListenerAdapter.getNextEvent(EventListenerAdapter.java:76)
at
com.vmware.vchs.hybridity.messaging.adapter.EventListenerAdapter.getNextEventAndAck(EventListenerAdapter.java:94)
at
com.vmware.vchs.hybridity.messaging.adapter.EventListenerAdapter$1.run(EventListenerAdapter.java:125)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Correlation id for response
(386681) does not match request (386680), request header:
{api_key=9,api_version=3,correlation_id=386680,client_id=consumer-36}

at org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:752)
at
org.apache.kafka.clients.NetworkClient.parseStructMaybeUpdateThrottleTimeMetrics(NetworkClient.java:561)
at
org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:657)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:442)
at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:232)
at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:208)
at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:168)
at
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.fetchCommittedOffsets(ConsumerCoordinator.java:477)
at
org.apache.kafka.clients.consumer.KafkaConsumer.committed(KafkaConsumer.java:1346)
at
com.vmware.vchs.hybridity.messaging.kafka.KafkaEventConsumerDelegate.getNextEventWithTimeout(KafkaEventConsumerDelegate.java:86)
... 5 common frames omitted

On Mon, Mar 6, 2017 at 9:44 AM, Ismael Juma <is...@juma.me.uk> wrote:

> Hi Mickael,
>
> This looks to be the same as KAFKA-4669. In theory, this should never
> happen and it's unclear when/how it can happen. Not sure if someone has
> investigated it in more detail.
>
> Ismael
>
> On Mon, Mar 6, 2017 at 5:15 PM, Mickael Maison <mi...@gmail.com>
> wrote:
>
> > Hi,
> >
> > In one of our clusters, some of our clients occasionally see this
> > exception:
> > java.lang.IllegalStateException: Correlation id for response (4564)
> > does not match request (4562)
> > at org.apache.kafka.clients.NetworkClient.correlate(
> > NetworkClient.java:486)
> > at org.apache.kafka.clients.NetworkClient.parseResponse(
> > NetworkClient.java:381)
> > at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(
> > NetworkClient.java:449)
> > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
> > at org.apache.kafka.clients.producer.internals.Sender.run(
> Sender.java:229)
> > at org.apache.kafka.clients.producer.internals.Sender.run(
> Sender.java:134)
> > at java.lang.Thread.run(Unknown Source)
> >
> > We've also seen it from consumer poll() and commit()
> >
> > Usually the response's correlation id is off by just 1 or 2 (like
> > above) but we've also seen it off by a few hundreds:
> > java.lang.IllegalStateException: Correlation id for response (742)
> > does not match request (174)
> >     at org.apache.kafka.clients.NetworkClient.correlate(
> > NetworkClient.java:486)
> >     at org.apache.kafka.clients.NetworkClient.parseResponse(
> > NetworkClient.java:381)
> >     at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(
> > NetworkClient.java:449)
> >     at org.apache.kafka.clients.NetworkClient.poll(
> NetworkClient.java:269)
> >     at org.apache.kafka.clients.consumer.internals.
> ConsumerNetworkClient.
> > clientPoll(ConsumerNetworkClient.java:360)
> >     at org.apache.kafka.clients.consumer.internals.
> > ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
> >     at org.apache.kafka.clients.consumer.internals.
> > ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
> >     at org.apache.kafka.clients.consumer.internals.
> > ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
> >     at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.
> > commitOffsetsSync(ConsumerCoordinator.java:426)
> >     at org.apache.kafka.clients.consumer.KafkaConsumer.
> > commitSync(KafkaConsumer.java:1059)
> >     at org.apache.kafka.clients.consumer.KafkaConsumer.
> > commitSync(KafkaConsumer.java:1027)
> >
> > When this happens, all subsequent responses are also shifted:
> > java.lang.IllegalStateException: Correlation id for response (743)
> > does not match request (742)
> > java.lang.IllegalStateException: Correlation id for response (744)
> > does not match request (743)
> > java.lang.IllegalStateException: Correlation id for response (745)
> > does not match request (744)
> > java.lang.IllegalStateException: Correlation id for response (746)
> > does not match request (745)
> >  ...
> > It's easy to discard and recreate the consumer instance to recover
> > however we can't do that with the producer as it occurs in the Sender
> > thread.
> >
> > Our cluster and our clients are running Kafka 0.10.0.1.
> > Under which circumstances would such an error happen ?
> > Even with logging set to TRACE, we can't spot anything suspicious
> > shortly before the issue. Is there any data we should try to capture
> > when this happens ?
> >
> > Thanks!
> >
>

Re: java.lang.IllegalStateException: Correlation id for response () does not match request ()

Posted by Aarti Gupta <aa...@gmail.com>.
https://issues.apache.org/jira/browse/KAFKA-4669?focusedCommentId=16271727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16271727
Can we reopen this issue ?

We saw this on the consumer in production today. We are on 0.11.01

ERROR c.v.v.h.m.k.KafkaEventConsumerDelegate- Error fetching next record
java.lang.Exception: Error fetching next new record from kafka queue
at
com.vmware.vchs.hybridity.messaging.kafka.KafkaEventConsumerDelegate.getNextEventWithTimeout(KafkaEventConsumerDelegate.java:121)
at
com.vmware.vchs.hybridity.messaging.kafka.KafkaEventConsumerDelegate.getNextEvent(KafkaEventConsumerDelegate.java:64)
at
com.vmware.vchs.hybridity.messaging.adapter.EventListenerAdapter.getNextEvent(EventListenerAdapter.java:76)
at
com.vmware.vchs.hybridity.messaging.adapter.EventListenerAdapter.getNextEventAndAck(EventListenerAdapter.java:94)
at
com.vmware.vchs.hybridity.messaging.adapter.EventListenerAdapter$1.run(EventListenerAdapter.java:125)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Correlation id for response
(386681) does not match request (386680), request header:
{api_key=9,api_version=3,correlation_id=386680,client_id=consumer-36}

at org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:752)
at
org.apache.kafka.clients.NetworkClient.parseStructMaybeUpdateThrottleTimeMetrics(NetworkClient.java:561)
at
org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:657)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:442)
at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:232)
at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:208)
at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:168)
at
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.fetchCommittedOffsets(ConsumerCoordinator.java:477)
at
org.apache.kafka.clients.consumer.KafkaConsumer.committed(KafkaConsumer.java:1346)
at
com.vmware.vchs.hybridity.messaging.kafka.KafkaEventConsumerDelegate.getNextEventWithTimeout(KafkaEventConsumerDelegate.java:86)
... 5 common frames omitted

On Mon, Mar 6, 2017 at 9:44 AM, Ismael Juma <is...@juma.me.uk> wrote:

> Hi Mickael,
>
> This looks to be the same as KAFKA-4669. In theory, this should never
> happen and it's unclear when/how it can happen. Not sure if someone has
> investigated it in more detail.
>
> Ismael
>
> On Mon, Mar 6, 2017 at 5:15 PM, Mickael Maison <mi...@gmail.com>
> wrote:
>
> > Hi,
> >
> > In one of our clusters, some of our clients occasionally see this
> > exception:
> > java.lang.IllegalStateException: Correlation id for response (4564)
> > does not match request (4562)
> > at org.apache.kafka.clients.NetworkClient.correlate(
> > NetworkClient.java:486)
> > at org.apache.kafka.clients.NetworkClient.parseResponse(
> > NetworkClient.java:381)
> > at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(
> > NetworkClient.java:449)
> > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
> > at org.apache.kafka.clients.producer.internals.Sender.run(
> Sender.java:229)
> > at org.apache.kafka.clients.producer.internals.Sender.run(
> Sender.java:134)
> > at java.lang.Thread.run(Unknown Source)
> >
> > We've also seen it from consumer poll() and commit()
> >
> > Usually the response's correlation id is off by just 1 or 2 (like
> > above) but we've also seen it off by a few hundreds:
> > java.lang.IllegalStateException: Correlation id for response (742)
> > does not match request (174)
> >     at org.apache.kafka.clients.NetworkClient.correlate(
> > NetworkClient.java:486)
> >     at org.apache.kafka.clients.NetworkClient.parseResponse(
> > NetworkClient.java:381)
> >     at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(
> > NetworkClient.java:449)
> >     at org.apache.kafka.clients.NetworkClient.poll(
> NetworkClient.java:269)
> >     at org.apache.kafka.clients.consumer.internals.
> ConsumerNetworkClient.
> > clientPoll(ConsumerNetworkClient.java:360)
> >     at org.apache.kafka.clients.consumer.internals.
> > ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
> >     at org.apache.kafka.clients.consumer.internals.
> > ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
> >     at org.apache.kafka.clients.consumer.internals.
> > ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
> >     at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.
> > commitOffsetsSync(ConsumerCoordinator.java:426)
> >     at org.apache.kafka.clients.consumer.KafkaConsumer.
> > commitSync(KafkaConsumer.java:1059)
> >     at org.apache.kafka.clients.consumer.KafkaConsumer.
> > commitSync(KafkaConsumer.java:1027)
> >
> > When this happens, all subsequent responses are also shifted:
> > java.lang.IllegalStateException: Correlation id for response (743)
> > does not match request (742)
> > java.lang.IllegalStateException: Correlation id for response (744)
> > does not match request (743)
> > java.lang.IllegalStateException: Correlation id for response (745)
> > does not match request (744)
> > java.lang.IllegalStateException: Correlation id for response (746)
> > does not match request (745)
> >  ...
> > It's easy to discard and recreate the consumer instance to recover
> > however we can't do that with the producer as it occurs in the Sender
> > thread.
> >
> > Our cluster and our clients are running Kafka 0.10.0.1.
> > Under which circumstances would such an error happen ?
> > Even with logging set to TRACE, we can't spot anything suspicious
> > shortly before the issue. Is there any data we should try to capture
> > when this happens ?
> >
> > Thanks!
> >
>

Re: java.lang.IllegalStateException: Correlation id for response () does not match request ()

Posted by Ismael Juma <is...@juma.me.uk>.
Hi Mickael,

This looks to be the same as KAFKA-4669. In theory, this should never
happen and it's unclear when/how it can happen. Not sure if someone has
investigated it in more detail.

Ismael

On Mon, Mar 6, 2017 at 5:15 PM, Mickael Maison <mi...@gmail.com>
wrote:

> Hi,
>
> In one of our clusters, some of our clients occasionally see this
> exception:
> java.lang.IllegalStateException: Correlation id for response (4564)
> does not match request (4562)
> at org.apache.kafka.clients.NetworkClient.correlate(
> NetworkClient.java:486)
> at org.apache.kafka.clients.NetworkClient.parseResponse(
> NetworkClient.java:381)
> at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(
> NetworkClient.java:449)
> at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
> at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:229)
> at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:134)
> at java.lang.Thread.run(Unknown Source)
>
> We've also seen it from consumer poll() and commit()
>
> Usually the response's correlation id is off by just 1 or 2 (like
> above) but we've also seen it off by a few hundreds:
> java.lang.IllegalStateException: Correlation id for response (742)
> does not match request (174)
>     at org.apache.kafka.clients.NetworkClient.correlate(
> NetworkClient.java:486)
>     at org.apache.kafka.clients.NetworkClient.parseResponse(
> NetworkClient.java:381)
>     at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(
> NetworkClient.java:449)
>     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
>     at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.
> clientPoll(ConsumerNetworkClient.java:360)
>     at org.apache.kafka.clients.consumer.internals.
> ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
>     at org.apache.kafka.clients.consumer.internals.
> ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
>     at org.apache.kafka.clients.consumer.internals.
> ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
>     at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.
> commitOffsetsSync(ConsumerCoordinator.java:426)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.
> commitSync(KafkaConsumer.java:1059)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.
> commitSync(KafkaConsumer.java:1027)
>
> When this happens, all subsequent responses are also shifted:
> java.lang.IllegalStateException: Correlation id for response (743)
> does not match request (742)
> java.lang.IllegalStateException: Correlation id for response (744)
> does not match request (743)
> java.lang.IllegalStateException: Correlation id for response (745)
> does not match request (744)
> java.lang.IllegalStateException: Correlation id for response (746)
> does not match request (745)
>  ...
> It's easy to discard and recreate the consumer instance to recover
> however we can't do that with the producer as it occurs in the Sender
> thread.
>
> Our cluster and our clients are running Kafka 0.10.0.1.
> Under which circumstances would such an error happen ?
> Even with logging set to TRACE, we can't spot anything suspicious
> shortly before the issue. Is there any data we should try to capture
> when this happens ?
>
> Thanks!
>

Re: java.lang.IllegalStateException: Correlation id for response () does not match request ()

Posted by Ismael Juma <is...@juma.me.uk>.
Hi Mickael,

This looks to be the same as KAFKA-4669. In theory, this should never
happen and it's unclear when/how it can happen. Not sure if someone has
investigated it in more detail.

Ismael

On Mon, Mar 6, 2017 at 5:15 PM, Mickael Maison <mi...@gmail.com>
wrote:

> Hi,
>
> In one of our clusters, some of our clients occasionally see this
> exception:
> java.lang.IllegalStateException: Correlation id for response (4564)
> does not match request (4562)
> at org.apache.kafka.clients.NetworkClient.correlate(
> NetworkClient.java:486)
> at org.apache.kafka.clients.NetworkClient.parseResponse(
> NetworkClient.java:381)
> at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(
> NetworkClient.java:449)
> at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
> at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:229)
> at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:134)
> at java.lang.Thread.run(Unknown Source)
>
> We've also seen it from consumer poll() and commit()
>
> Usually the response's correlation id is off by just 1 or 2 (like
> above) but we've also seen it off by a few hundreds:
> java.lang.IllegalStateException: Correlation id for response (742)
> does not match request (174)
>     at org.apache.kafka.clients.NetworkClient.correlate(
> NetworkClient.java:486)
>     at org.apache.kafka.clients.NetworkClient.parseResponse(
> NetworkClient.java:381)
>     at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(
> NetworkClient.java:449)
>     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
>     at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.
> clientPoll(ConsumerNetworkClient.java:360)
>     at org.apache.kafka.clients.consumer.internals.
> ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
>     at org.apache.kafka.clients.consumer.internals.
> ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
>     at org.apache.kafka.clients.consumer.internals.
> ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
>     at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.
> commitOffsetsSync(ConsumerCoordinator.java:426)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.
> commitSync(KafkaConsumer.java:1059)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.
> commitSync(KafkaConsumer.java:1027)
>
> When this happens, all subsequent responses are also shifted:
> java.lang.IllegalStateException: Correlation id for response (743)
> does not match request (742)
> java.lang.IllegalStateException: Correlation id for response (744)
> does not match request (743)
> java.lang.IllegalStateException: Correlation id for response (745)
> does not match request (744)
> java.lang.IllegalStateException: Correlation id for response (746)
> does not match request (745)
>  ...
> It's easy to discard and recreate the consumer instance to recover
> however we can't do that with the producer as it occurs in the Sender
> thread.
>
> Our cluster and our clients are running Kafka 0.10.0.1.
> Under which circumstances would such an error happen ?
> Even with logging set to TRACE, we can't spot anything suspicious
> shortly before the issue. Is there any data we should try to capture
> when this happens ?
>
> Thanks!
>