Posted to dev@kafka.apache.org by "Rafael Telles (JIRA)" <ji...@apache.org> on 2017/04/01 17:30:42 UTC

[jira] [Commented] (KAFKA-4975) Kafka process is running, but not listening to 9092 port

    [ https://issues.apache.org/jira/browse/KAFKA-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952312#comment-15952312 ] 

Rafael Telles commented on KAFKA-4975:
--------------------------------------

Hello [~Aegeaner],

The property unclean.leader.election.enable is set to false.
If I enabled unclean leader election, I would lose every message that had not been replicated before the failure, right? If so, that wouldn't solve the actual problem.
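
To put a rough number on that, the FATAL line quoted in the issue below can be parsed for the two offsets it reports. This is only a sketch: the regular expression and the function name `truncation_gap` are illustrative and not part of Kafka, and the gap counts offsets rather than live messages (for a compacted topic such as __consumer_offsets the two can differ).

```python
import re

# Illustrative pattern for the FATAL message quoted in this issue (not a Kafka API).
TRUNCATION_RE = re.compile(
    r"log truncation is not allowed for partition (?P<partition>[^,\s]+), "
    r"Current leader (?P<leader>\d+)'s latest offset (?P<leader_offset>\d+) "
    r"is less than replica (?P<replica>\d+)'s latest offset (?P<replica_offset>\d+)"
)


def truncation_gap(line):
    """Return the offsets reported in a 'log truncation is not allowed' line.

    Returns None if the line does not match the assumed format.
    """
    m = TRUNCATION_RE.search(line)
    if m is None:
        return None
    leader_offset = int(m.group("leader_offset"))
    replica_offset = int(m.group("replica_offset"))
    return {
        "partition": m.group("partition"),
        "leader": int(m.group("leader")),
        "replica": int(m.group("replica")),
        # Offsets the replica holds beyond the leader's log end.
        "gap": replica_offset - leader_offset,
    }


line = (
    "[2017-03-29 15:13:30,114] FATAL [ReplicaFetcherThread-0-7], Exiting because "
    "log truncation is not allowed for partition __consumer_offsets-27, "
    "Current leader 7's latest offset 0 is less than replica 15's latest offset "
    "1734972 (kafka.server.ReplicaFetcherThread)"
)
result = truncation_gap(line)
print(result["partition"], result["gap"])
```

For the line from broker 15, the gap is the replica's full 1734972 offsets, since the leader reports offset 0.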

And why would it make the server unbind port 9092? It looks like a bug.
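
The symptom (process alive, nothing accepting connections on 9092) can also be checked from another host with a plain TCP connect, independently of netstat. A minimal sketch, assuming nothing beyond the Python standard library; the helper name `is_listening` is made up for illustration, and the demo uses a throwaway local socket so that it runs anywhere.

```python
import socket


def is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name-resolution failures.
        return False


# Demo against a temporary local listener on an ephemeral port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
demo_port = server.getsockname()[1]

open_before_close = is_listening("127.0.0.1", demo_port)
server.close()
open_after_close = is_listening("127.0.0.1", demo_port)
print(open_before_close, open_after_close)
```

Against a real broker the call would be something like is_listening("dc3-kafka-02", 9092), using the hostname from the shell prompt in the issue; a False result while the JVM is still in the process list matches the state described here.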

> Kafka process is running, but not listening to 9092 port
> --------------------------------------------------------
>
>                 Key: KAFKA-4975
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4975
>             Project: Kafka
>          Issue Type: Bug
>          Components: network
>    Affects Versions: 0.10.1.1
>         Environment: A cluster of 15 Kafka brokers connected to a cluster of 3 Zookeeper servers, all in the same data center.
> uname -a: Linux dc3-kafka-02 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> Kafka brokers hardware specs:
> H/W path       Device     Class      Description
> ================================================
>                           system     SR ((^_^))
> /0                        bus        SR
> /0/0                      memory     128KiB BIOS
> /0/4                      processor  Intel(R) Atom(TM) CPU  C2750  @ 2.40GHz
> /0/4/5                    memory     448KiB L1 cache
> /0/4/6                    memory     4MiB L2 cache
> /0/15                     memory     16GiB System Memory
> /0/15/0                   memory     8GiB DIMM DDR3 Synchronous 1600 MHz (0.6 ns)
> /0/15/1                   memory     DIMM DDR3 Synchronous [empty]
> /0/15/2                   memory     8GiB DIMM DDR3 Synchronous 1600 MHz (0.6 ns)
> /0/15/3                   memory     DIMM DDR3 Synchronous [empty]
> /0/100                    bridge     Atom processor C2000 SoC Transaction Router
> /0/100/f                  generic    Atom processor C2000 RCEC
> /0/100/13                 generic    Atom processor C2000 SMBus 2.0
> /0/100/14      enp0s20f0  network    Ethernet Connection I354 2.5 GbE Backplane
> /0/100/14.1    enp0s20f1  network    Ethernet Connection I354 2.5 GbE Backplane
> /0/100/16                 bus        Atom processor C2000 USB Enhanced Host Controller
> /0/100/16/1    usb1       bus        EHCI Host Controller
> /0/100/16/1/1             bus        USB hub
> /0/100/18                 storage    Atom processor C2000 AHCI SATA3 Controller
> /0/100/1f                 bridge     Atom processor C2000 PCU
> /0/100/1f.3               bus        Atom processor C2000 PCU SMBus
> /0/101                    bridge     Atom processor C2000 RAS
> /0/1           scsi0      storage    
> /0/1/0.0.0     /dev/sda   disk       256GB SAMSUNG MZ7LN256
> /0/1/0.0.0/1   /dev/sda1  volume     190MiB EXT4 volume
> /0/1/0.0.0/2   /dev/sda2  volume     237GiB EXT4 volume
> /0/1/0.0.0/3   /dev/sda3  volume     976MiB Linux swap volume
> /1                        power      CRB Battery 0
> /2                        power      OEM Define 5
>            Reporter: Rafael Telles
>            Priority: Critical
>
> I have two clusters of Kafka brokers. One of them (15 brokers + 3 Zookeeper servers) became sick: many under-replicated partitions and many NotEnoughReplicasExceptions. I logged in to some of the brokers that the others couldn't connect to and found that they were all running their Kafka process but were not listening on the default TCP port (9092) as expected:
> root@dc3-kafka-02:/home/kafka/kafka_2.11-0.10.1.1# ps aux | grep kafka
> root     14055 21.6 33.6 23001236 5513176 ?    Sl   Mar23 1866:20 /usr/lib/jvm/java-8-oracle/bin/java -Xms2G -Xmx6G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC -Djava.awt.headless=true -Xloggc:/home/kafka/kafka_2.11-0.10.1.1/bin/../logs/kafkaServer-gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=17264 -Dkafka.logs.dir=/home/kafka/kafka_2.11-0.10.1.1/bin/../logs -Dlog4j.configuration=file:/home/kafka/kafka_2.11-0.10.1.1/bin/../config/log4j.properties -cp :/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/aopalliance-repackaged-2.4.0-b34.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/argparse4j-0.5.0.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/connect-api-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/connect-file-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/connect-json-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/connect-runtime-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/guava-18.0.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/hk2-api-2.4.0-b34.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/hk2-locator-2.4.0-b34.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/hk2-utils-2.4.0-b34.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jackson-annotations-2.6.0.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jackson-core-2.6.3.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jackson-databind-2.6.3.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jackson-jaxrs-base-2.6.3.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jackson-jaxrs-json-provider-2.6.3.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jackson-module-jaxb-annotations-2.6.3.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/javassist-3.18.2-GA.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/javax.annotation-api-1.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/javax.inject-1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/javax.inject-2.4.0-b34.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/javax.servlet-api-3.1.0.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/javax.ws.rs-api-2.0.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jersey-client-2.22.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jersey-common-2.22.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jersey-container-servlet-2.22.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jersey-container-servlet-core-2.22.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jersey-guava-2.22.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jersey-media-jaxb-2.22.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jersey-server-2.22.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-continuation-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-http-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-io-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-security-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-server-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-servlet-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-servlets-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-util-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jopt-simple-4.9.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka_2.11-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka_2.11-0.10.1.1-sources.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka_2.11-0.10.1.1-test-sources.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka-clients-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka-log4j-appender-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka-streams-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka-streams-examples-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka-tools-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/log4j-1.2.17.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/lz4-1.3.0.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/metrics-core-2.2.0.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/osgi-resource-locator-1.0.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/raven-7.8.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/raven-log4j-7.8.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/reflections-0.9.10.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/rocksdbjni-4.9.0.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/scala-library-2.11.8.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/scala-parser-combinators_2.11-1.0.4.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/slf4j-api-1.7.21.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/slf4j-log4j12-1.7.21.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/snappy-java-1.1.2.6.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/validation-api-1.1.0.Final.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/zkclient-0.9.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/zookeeper-3.4.8.jar kafka.Kafka /home/kafka/kafka_2.11-0.10.1.1/config/server.properties
> root     28615  0.0  0.0  14180  1024 pts/0    S+   13:35   0:00 grep --color=auto kafka
> root@dc3-kafka-02:/home/kafka/kafka_2.11-0.10.1.1# netstat -tulpn | grep 9092
> ...returns empty
> If I restart Kafka on these brokers, they start listening on 9092 again.
> Update: I found this in the logs (I restarted the broker, it started listening on 9092, then it stopped):
> [2017-03-29 15:11:38,181] INFO Awaiting socket connections on xxx:9092. (kafka.network.Acceptor)
> [2017-03-29 15:11:38,195] INFO [Socket Server on Broker 15], Started 1 acceptor threads (kafka.network.SocketServer)
> [2017-03-29 15:15:15,254] INFO [Socket Server on Broker 15], Shutting down (kafka.network.SocketServer)
> [2017-03-29 15:15:15,357] INFO [Socket Server on Broker 15], Shutdown completed (kafka.network.SocketServer)
> And there are these FATAL errors too:
> [2017-03-29 15:13:30,114] FATAL [ReplicaFetcherThread-0-7], Exiting because log truncation is not allowed for partition __consumer_offsets-27, Current leader 7's latest offset 0 is less than replica 15's latest offset 1734972 (kafka.server.ReplicaFetcherThread)
> [2017-03-29 15:13:30,114] FATAL [ReplicaFetcherThread-0-7], Exiting because log truncation is not allowed for partition __consumer_offsets-27, Current leader 7's latest offset 0 is less than replica 15's latest offset 1734972 (kafka.server.ReplicaFetcherThread)
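
The timestamps in the quoted log excerpt can be lined up to see the order of events on broker 15. A small sketch, assuming the bracketed log4j timestamp format exactly as it appears in the excerpt; the helper name `parse_ts` is illustrative. The ordering is only suggestive: it shows that the FATAL error precedes the socket server shutdown, not that it caused it.

```python
from datetime import datetime

# Lines copied from the log excerpt in the issue description.
FATAL_LINE = (
    "[2017-03-29 15:13:30,114] FATAL [ReplicaFetcherThread-0-7], Exiting because "
    "log truncation is not allowed for partition __consumer_offsets-27, "
    "Current leader 7's latest offset 0 is less than replica 15's latest offset "
    "1734972 (kafka.server.ReplicaFetcherThread)"
)
SHUTDOWN_LINE = (
    "[2017-03-29 15:15:15,254] INFO [Socket Server on Broker 15], "
    "Shutting down (kafka.network.SocketServer)"
)


def parse_ts(line):
    """Parse the leading '[YYYY-MM-DD HH:MM:SS,mmm]' timestamp of a log line."""
    # Characters 1..23 hold the timestamp between the square brackets.
    return datetime.strptime(line[1:24], "%Y-%m-%d %H:%M:%S,%f")


fatal_ts = parse_ts(FATAL_LINE)
shutdown_ts = parse_ts(SHUTDOWN_LINE)
delay_seconds = (shutdown_ts - fatal_ts).total_seconds()
print(f"socket server shutdown began {delay_seconds:.3f} s after the FATAL error")
```

In this excerpt the socket server starts shutting down about 105 seconds after the replica fetcher thread logs the FATAL error.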
