Posted to dev@kafka.apache.org by "Francesco vigotti (JIRA)" <ji...@apache.org> on 2017/10/26 10:34:00 UTC

[jira] [Created] (KAFKA-6129) kafka issue when exposing through nodeport in kubernetes

Francesco vigotti created KAFKA-6129:
----------------------------------------

             Summary: kafka issue when exposing through nodeport in kubernetes
                 Key: KAFKA-6129
                 URL: https://issues.apache.org/jira/browse/KAFKA-6129
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.10.2.1
         Environment: kubernetes
            Reporter: Francesco vigotti
            Priority: Critical


I started writing about this in https://issues.apache.org/jira/browse/KAFKA-2729,
but I am opening this new issue because I have probably found the cause in my Kubernetes setup. In my opinion Kubernetes is doing nothing wrong in this setup (all other applications, e.g. ZooKeeper, work using the same NodePort redirection).
Kafka brokers fail silently (randomly, in multi-broker setups) and the producer reports a misleading error, so I think Kafka should be improved with more robust pre-startup checks that identify and report this issue.

After further investigation following my reply in https://issues.apache.org/jira/browse/KAFKA-2729, using a minimum-size cluster (1 ZooKeeper + 1 Kafka broker), I have found the problem.
The problem is with Kubernetes (I don't know why this issue has only appeared for me now: whether something changed in recent kube-proxy versions, in Kafka 0.10+, or elsewhere).
In any case, my old Kafka cluster started being under-replicated and returning various errors.

The problem happens when Kubernetes pods are created and traffic is redirected to them through a NodePort service (over a static IP in my case) to expose the Kafka brokers from the host. When using hostNetwork (so no redirection) everything works. What is strange is that ZooKeeper works fine with NodePort (which creates a redirection rule in iptables -> nat -> PREROUTING); Kafka is the only application I have found that has problems with this Kubernetes configuration.
What is weird is that Kafka starts correctly without errors, but on multi-broker clusters there are random issues, while on a single-broker cluster the console producer fails with an infinite loop of:

```
[2017-10-26 09:38:23,281] WARN Error while fetching metadata with correlation id 5 : {test6=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient)
[2017-10-26 09:38:23,383] WARN Error while fetching metadata with correlation id 6 : {test6=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient)
[2017-10-26 09:38:23,485] WARN Error while fetching metadata with correlation id 7 : {test6=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient)
```
Still, no errors are reported by the broker or by ZooKeeper.
I also came across this discussion:
             https://stackoverflow.com/questions/35788697/leader-not-available-kafka-in-console-producer
but the proposed solution for the host pod (allowing the advertised hostname to resolve to the pod itself) did not work:
``` 
hostAliases:
      - ip: "127.0.0.1"
        hostnames:
        - "---myhosthostname---"
```
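
A minimal sketch of the kind of pre-startup check suggested above, assuming the broker's advertised listener is known as a string such as `PLAINTEXT://host:port`. The function names `parse_listener` and `can_reach` are hypothetical and not part of Kafka; the check only tests whether a TCP connection to the advertised address succeeds from wherever it is run, which is a necessary but not a sufficient condition for the broker to work behind a NodePort redirection.

```python
import socket


def parse_listener(listener):
    """Split a listener string like 'PLAINTEXT://host:port' into (host, port)."""
    # Drop the protocol prefix, then split host and port on the last colon.
    _, _, address = listener.rpartition("://")
    host, _, port = address.rpartition(":")
    return host, int(port)


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run from inside the broker pod against the advertised hostname and the NodePort, `can_reach(*parse_listener(...))` would show whether the broker can connect back to its own advertised address through the redirection. Whether a failure there explains the behaviour reported in this issue is not confirmed.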






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)