Posted to dev@kafka.apache.org by "Seweryn Habdank-Wojewodzki (JIRA)" <ji...@apache.org> on 2018/05/08 06:38:00 UTC

[jira] [Created] (KAFKA-6882) Wrong producer settings may lead to DoS on Kafka Server

Seweryn Habdank-Wojewodzki created KAFKA-6882:
-------------------------------------------------

             Summary: Wrong producer settings may lead to DoS on Kafka Server
                 Key: KAFKA-6882
                 URL: https://issues.apache.org/jira/browse/KAFKA-6882
             Project: Kafka
          Issue Type: Bug
          Components: core, producer 
    Affects Versions: 1.0.1, 1.1.0
            Reporter: Seweryn Habdank-Wojewodzki


The documentation of the “linger.ms” and “batch.size” parameters is a bit confusing. In fact, setting those parameters wrongly on the producer side can completely destroy BROKER throughput.

I see that smart developers read the documentation of those parameters.
They then want both super performance and super safety, so they set something like this:

{code}
kafkaProps.put(ProducerConfig.LINGER_MS_CONFIG, 1);
kafkaProps.put(ProducerConfig.BATCH_SIZE_CONFIG, 0);
{code}

This leads to a situation in which each and every message is sent separately. TCP/IP is really busy in that case, and although they needed high throughput, they get much less, because every message travels on its own and the network and TCP/IP overhead becomes significant.
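
To put a rough number on that overhead, here is a minimal sketch of a back-of-the-envelope estimate. It is a simplification, not how the producer actually accounts for batches: it assumes a single partition, ignores record and batch header overhead and compression, and assumes linger.ms is long enough for batches to fill. The class and method names are made up for illustration.

```java
public class BatchEstimate {

    /**
     * Rough estimate of produce batches per second.
     * Assumption: a batch holds floor(batchSizeBytes / messageBytes) messages,
     * and at least one (with batch.size = 0 every message travels alone).
     */
    public static long requestsPerSecond(long messagesPerSecond, int messageBytes, int batchSizeBytes) {
        long perBatch = Math.max(1, batchSizeBytes / messageBytes);
        // Ceiling division: a partially filled last batch still costs one request.
        return (messagesPerSecond + perBatch - 1) / perBatch;
    }

    public static void main(String[] args) {
        // 10,000 messages/s of 200 bytes each, batch.size = 0
        System.out.println(requestsPerSecond(10_000, 200, 0));      // 10000
        // Same load with batch.size = 16384 (the documented default)
        System.out.println(requestsPerSecond(10_000, 200, 16_384)); // 124
    }
}
```

Under these assumptions the same load costs about 80 times more requests when batching is disabled.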

Those settings are good only if someone sends critical messages once in a while (e.g. one message per minute), not when throughput matters and thousands of messages are sent per second.
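
For contrast, a throughput-oriented setup might look like the sketch below. The values are illustrative assumptions, not recommendations from this report: batch.size is left at its documented default of 16384 bytes and linger.ms is set to a few milliseconds so batches have time to fill. Plain string keys are used so the snippet compiles without the Kafka client on the classpath; the helper class name is hypothetical.

```java
import java.util.Properties;

public class ThroughputProducerProps {

    /** Builds producer properties that allow batching (illustrative values only). */
    public static Properties build() {
        Properties props = new Properties();
        // Wait up to 5 ms for more records before sending a batch (assumed value).
        props.setProperty("linger.ms", "5");
        // Allow up to 16 KiB per partition batch (the documented default).
        props.setProperty("batch.size", "16384");
        return props;
    }

    public static void main(String[] args) {
        build().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

These properties would then be merged with the usual bootstrap.servers and serializer settings before creating the producer.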

The situation is even worse when smart developers read that, for safety, they need acknowledgements from all cluster nodes. So they add:

{code}
kafkaProps.put(ProducerConfig.ACKS_CONFIG, "all");
{code}

And this is the end of Kafka performance! 

Even worse, it is not a problem for the Kafka producer. The problem stays on the server (cluster, broker) side. The server is so busy acknowledging *each and every* message from all nodes that other work is NOT performed, so the end-to-end performance is almost zero.

I would like to ask you to improve the documentation of these parameters.
Please also consider corner cases by providing detailed information on how extreme values of the parameters, namely the lowest and the highest, may influence the work of the cluster.
That was the documentation issue.

On the other hand, it is a security/safety matter.

Technically, the problem is that the __commit_offsets topic is loaded with an enormous number of messages. This leads to a situation in which the Kafka broker is exposed to *DoS* due to the producer settings. Three lines of code plus a bit of load, and the Kafka cluster is dead.
I suppose there are ways to prevent such a situation on the cluster side, but it requires some logic to be implemented to detect such a simple but efficient DoS.
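
To illustrate what such detection logic could look like in principle, here is a minimal sketch of a fixed-window request counter per client id. It is purely hypothetical: the class, its names and its thresholds are assumptions made for this example, and it is not an existing Kafka broker API or the way the broker is implemented.

```java
import java.util.HashMap;
import java.util.Map;

public class RequestRateGuard {
    private final long maxRequestsPerWindow;
    private final long windowMillis;
    // clientId -> {window start time in ms, requests seen in that window}
    private final Map<String, long[]> state = new HashMap<>();

    public RequestRateGuard(long maxRequestsPerWindow, long windowMillis) {
        this.maxRequestsPerWindow = maxRequestsPerWindow;
        this.windowMillis = windowMillis;
    }

    /** Returns true while the client stays within its limit for the current window. */
    public synchronized boolean allow(String clientId, long nowMillis) {
        long[] s = state.computeIfAbsent(clientId, k -> new long[] {nowMillis, 0});
        if (nowMillis - s[0] >= windowMillis) {
            // The previous window has expired: start a new one.
            s[0] = nowMillis;
            s[1] = 0;
        }
        s[1]++;
        return s[1] <= maxRequestsPerWindow;
    }
}
```

A caller would invoke allow(clientId, System.currentTimeMillis()) for every incoming request and throttle or flag the client once it returns false.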

BTW, do the Kafka admin tools provide any way to "kill" a connection when one producer or another causes problems?



