Posted to users@kafka.apache.org by Jiecxy <25...@qq.com> on 2016/12/09 02:15:09 UTC

The connection between kafka and zookeeper is often closed by zookeeper, leading to NotLeaderForPartitionException: This server is not the leader for that topic-partition.

Hi guys,

Situation:
  3 nodes, each with 32 GB memory, 24 CPU cores, and a 1 TB hard disk.
  3 brokers and 3 ZooKeeper servers, one of each per node, with all properties left at their defaults; we start the ZooKeeper cluster and then the Kafka cluster.
  Create a topic (replication factor 3, 6 partitions), like below:
    bin/kafka-topics.sh --create --zookeeper hw168:2181 --replication-factor 3 --partitions 6 --topic test
  Then run the ProducerPerformance tool shipped with Kafka on two of the nodes at the same time, i.e. two concurrent producers, with a command like below:
    bin/kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic test --num-records 100000000 --record-size 100 --throughput -1 --producer-props bootstrap.servers=hw168:9092 buffer.memory=67108864 batch.size=65536 acks=1

Problem:
  The producers print a lot of NotLeaderForPartitionException errors:
    org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
    org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
    org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
    …

Tracking the process (with DEBUG logging enabled):
  There is an INFO message:
    INFO Client session timed out, have not heard from server in 11647ms for sessionid 0x258de4a26a40000, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)

  We found that the connection between the ZooKeeper client (held by Kafka) and the ZooKeeper server is closed by the server because the session timed out. Details:
    [2016-12-08 20:24:00,547] DEBUG Partition [test,5] on broker 1: Skipping update high watermark since Old hw 15986847 [8012779 : 1068525112] is larger than new hw 15986847 [8012779 : 1068525112] for partition [test,5]. All leo's are 16566175 [16025299 : 72477384],15986847 [8012779 : 1068525112],16103549 [16025299 : 10485500] (kafka.cluster.Partition)
    [2016-12-08 20:24:00,547] DEBUG Adding index entry 16566161 => 72475508 to 00000000000016025299.index. (kafka.log.OffsetIndex)
    [2016-12-08 20:24:11,368] DEBUG [Replica Manager on Broker 1]: Request key test-2 unblocked 0 fetch requests. (kafka.server.ReplicaManager)
    [2016-12-08 20:24:11,368] DEBUG Partition [test,2] on broker 1: Skipping update high watermark since Old hw 16064424 [16025299 : 5242750] is larger than new hw 16064424 [16025299 : 5242750] for partition [test,2]. All leo's are 16566175 [16025299 : 72477384],16205274 [16025299 : 24116650],16064424 [16025299 : 5242750] (kafka.cluster.Partition)
    [2016-12-08 20:24:11,369] DEBUG [Replica Manager on Broker 1]: Produce to local log in 10821 ms (kafka.server.ReplicaManager)
    [2016-12-08 20:24:11,369] INFO Client session timed out, have not heard from server in 11647ms for sessionid 0x258de4a26a40000, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)

  Note the timestamps: there is no DEBUG output between 20:24:00,547 and 20:24:11,368, a gap that already exceeds the session timeout (6000 ms), and that is what causes the disconnection. We kept digging:
  We found that it got stuck in this call:
      selector.select(waitTimeOut);     // in the method doTransport(...) of class org.apache.zookeeper.ClientCnxnSocketNIO
  and that is exactly the period in which we got no DEBUG output.
  In more detail, the call path in the ZooKeeper client is:
      org.apache.zookeeper.ClientCnxn -> run() -> doTransport(...)
  On each iteration, run() first checks whether the session has timed out; if not, it calls doTransport(). Here a single doTransport() call took about 10 s, so the next iteration detected the timeout, as sketched below.
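  To make the failure mode concrete, below is a simplified sketch of that loop (my paraphrase under the 6000 ms timeout described above, NOT the actual ZooKeeper source): the timeout is only checked between doTransport() calls, so a single long select() stall is reported one full iteration later.

      // Simplified sketch, not the real org.apache.zookeeper.ClientCnxn code.
      public class SendLoopSketch {
          static final int SESSION_TIMEOUT_MS = 6000;

          public static void main(String[] args) throws InterruptedException {
              long lastHeard = System.currentTimeMillis();
              while (true) {
                  long idle = System.currentTimeMillis() - lastHeard;
                  long waitTimeOut = SESSION_TIMEOUT_MS - idle;
                  if (waitTimeOut <= 0) {
                      // The real client logs "Client session timed out ..."
                      // here, closes the socket and reconnects.
                      System.out.println("session timed out after " + idle + "ms");
                      return;
                  }
                  // Blocks in selector.select(waitTimeOut); lastHeard is only
                  // refreshed when the server responds, and during the stall
                  // nothing arrives, so it stays stale.
                  doTransport(waitTimeOut);
              }
          }

          // Stand-in for ClientCnxnSocketNIO.doTransport(): sleeps past the
          // deadline to mimic the ~10 s stall seen in the logs.
          static void doTransport(long waitTimeOut) throws InterruptedException {
              Thread.sleep(waitTimeOut + 5000);
          }
      }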

  Going further, I suspected a deadlock at that moment, so I kept dumping jstack output for both the Kafka and ZooKeeper processes, using a shell loop like below:
      while true; do
        echo -e "\n\n"`date "+%Y-%m-%d %H:%M:%S,%N"`"\n"`jstack 12165` >> zkjstack
        echo -e "\n\n"`date "+%Y-%m-%d %H:%M:%S,%N"`"\n"`jstack 12425` >> kafkajstack
      done
  Then I checked the stack logs for the period in which we got NO DEBUG output, and to my surprise there was NO jstack output for that window either! Why?
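  One way to test whether the whole JVM stalled during that window (a diagnostic idea, not something we had running at the time) is a tiny in-process pause detector: a thread that sleeps briefly and reports whenever it wakes up far later than scheduled, which is what a long GC or OS-level stall looks like from inside the process. If neither DEBUG logging nor an external jstack produced output for ten seconds, a detector like this inside the broker JVM would show whether the process itself was paused rather than merely idle.

      // Minimal pause-detector sketch: prints whenever the JVM stops running
      // this thread for noticeably longer than the intended sleep interval.
      public class PauseDetector {
          public static void main(String[] args) throws InterruptedException {
              long last = System.nanoTime();
              while (true) {
                  Thread.sleep(100);
                  long now = System.nanoTime();
                  long overshootMs = (now - last) / 1_000_000 - 100;
                  if (overshootMs > 1000) {  // woke up more than 1 s late
                      System.out.println("process stalled for ~" + overshootMs + " ms");
                  }
                  last = now;
              }
          }
      }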

So I am confused: why did it get stuck in that call? And why is there no DEBUG output, and no jstack output, during that weird period? Please help me. Thank you all.
  

Xiaoyuan


Re: The connection between kafka and zookeeper is often closed by zookeeper, leading to NotLeaderForPartitionException: This server is not the leader for that topic-partition.

Posted by Guozhang Wang <wa...@gmail.com>.
Xiaoyuan,

I am not an expert in ZK, so here is what I can tell:

1. "NotLeaderForPartitionException" is not usually thrown when ZK
connection timed out, it is thrown when the produce requests has arrived
the broker but the brokers think themselves as not the leader (any more)
for the requested partitions. So it usually indicates that the partition
leaders has migrated.

2. It is generally recommended not to co-locate Kafka brokers and
ZooKeeper servers on the same machines, since both are memory- and
IO-heavy applications; running them together can lead to long GC
pauses, which in turn can stall either ZK or the Kafka brokers. See
this for more details:

https://kafka.apache.org/documentation/#zk
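
A related knob (again my illustration, not a quote from this thread):
the 6000 ms session timeout seen in the logs can be raised so that
short stalls no longer kill the session, e.g. via
zookeeper.session.timeout.ms in the broker's server.properties, or as
the second constructor argument when creating a ZooKeeper handle
directly. Note that this only masks stalls, it does not remove their
cause.

    import org.apache.zookeeper.ZooKeeper;

    public class LongerSession {
        public static void main(String[] args) throws Exception {
            // 30 s session timeout instead of the 6 s used in the thread;
            // the server clamps the value to its configured min/max
            // session timeout bounds.
            ZooKeeper zk = new ZooKeeper("hw168:2181", 30000,
                    event -> System.out.println("event: " + event));
            Thread.sleep(5000);  // placeholder for real work
            zk.close();
        }
    }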

3. Without the version information of the brokers and clients, I cannot
tell further what the issue could be.


Guozhang


On Thu, Dec 15, 2016 at 5:17 PM, Xiaoyuan Chen <25...@qq.com> wrote:

> Any solution?



-- 
-- Guozhang

Re: The connection between kafka and zookeeper is often closed by zookeeper, leading to NotLeaderForPartitionException: This server is not the leader for that topic-partition.

Posted by Xiaoyuan Chen <25...@qq.com>.
Any solution?




------------------ Original Message ------------------
From: "Xiaoyuan Chen" <25...@qq.com>
Date: Friday, December 9, 2016, 10:15 AM
To: "users" <us...@kafka.apache.org>
Subject: The connection between kafka and zookeeper is often closed by zookeeper, leading to NotLeaderForPartitionException: This server is not the leader for that topic-partition.


