Posted to issues@spark.apache.org by "Kiran Shivappa Japannavar (JIRA)" <ji...@apache.org> on 2018/01/08 15:22:00 UTC

[jira] [Commented] (SPARK-22991) High read latency with spark streaming 2.2.1 and kafka 0.10.0.1

    [ https://issues.apache.org/jira/browse/SPARK-22991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316468#comment-16316468 ] 

Kiran Shivappa Japannavar commented on SPARK-22991:
---------------------------------------------------

[~apachespark] Please look into this.

> High read latency with spark streaming 2.2.1 and kafka 0.10.0.1
> ---------------------------------------------------------------
>
>                 Key: SPARK-22991
>                 URL: https://issues.apache.org/jira/browse/SPARK-22991
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, Structured Streaming
>    Affects Versions: 2.2.1
>            Reporter: Kiran Shivappa Japannavar
>            Priority: Critical
>
> Spark 2.2.1 + Kafka 0.10 + Spark streaming.
> Batch duration is 1 s, max rate per partition is 500, poll interval is 120 seconds, max poll records is 500, the number of partitions in Kafka is 500, and the cached consumer is enabled (a minimal sketch of this configuration appears after the quoted description below).
> While reading data from Kafka we intermittently observe very high read latencies. The high latencies result in Kafka consumer session expiration, so the Kafka brokers remove the consumer from the group. The consumer keeps retrying and finally fails with:
> [org.apache.kafka.clients.NetworkClient] - Disconnecting from node 12 due to request timeout
> [org.apache.kafka.clients.NetworkClient] - Cancelled request ClientRequest
> [org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient] - Cancelled FETCH request ClientRequest.
> Because of this, a lot of batches remain in the queued state.
> The high read latencies occur whenever multiple clients read data from the same Kafka cluster in parallel. The Kafka cluster has a large number of brokers and can support high network bandwidth.
> When running Spark 1.5 with the Kafka 0.8 consumer client against the same Kafka cluster, we do not see any high read latencies.
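For reference, here is a minimal Scala sketch of the setup described in the report: a Spark Streaming direct stream against Kafka 0.10 with a 1-second batch, maxRatePerPartition=500, a 120-second poll timeout, max.poll.records=500, and the cached consumer enabled. The broker addresses, group id, and topic name are placeholders, not values taken from the report.

import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object KafkaReadLatencyRepro {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("kafka-read-latency-repro")
      // Limit each Kafka partition to 500 records per second (the reported max rate).
      .set("spark.streaming.kafka.maxRatePerPartition", "500")
      // Poll timeout of 120 seconds, as in the report.
      .set("spark.streaming.kafka.consumer.poll.ms", "120000")
      // Cached consumers enabled (this is the default; shown here for clarity).
      .set("spark.streaming.kafka.consumer.cache.enabled", "true")

    // 1-second batch duration, as in the report.
    val ssc = new StreamingContext(conf, Seconds(1))

    val kafkaParams = Map[String, Object](
      ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG -> "broker-1:9092,broker-2:9092", // placeholder brokers
      ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG -> classOf[StringDeserializer],
      ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG -> classOf[StringDeserializer],
      ConsumerConfig.GROUP_ID_CONFIG -> "latency-repro-group", // placeholder group id
      ConsumerConfig.MAX_POLL_RECORDS_CONFIG -> "500",
      ConsumerConfig.AUTO_OFFSET_RESET_CONFIG -> "latest",
      ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG -> (false: java.lang.Boolean)
    )

    // "events" stands in for the reporter's 500-partition topic.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      PreferConsistent,
      Subscribe[String, String](Seq("events"), kafkaParams)
    )

    stream.foreachRDD(rdd => println(s"records in batch: ${rdd.count()}"))

    ssc.start()
    ssc.awaitTermination()
  }
}

With this setup, the poll timeout and session/request timeouts interact: if a cached consumer's poll takes longer than the broker-side timeouts, the broker drops the consumer from the group, which matches the NetworkClient disconnect and cancelled FETCH messages quoted above.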



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org