Posted to users@kafka.apache.org by Hu...@hardis.fr on 2017/09/18 16:02:27 UTC

implementing kafka transactions : performance issue

Hi,
I am testing an app that uses transactions on the producer side of Kafka
(0.11.0.1). I defined the producer config (see below) and added the
necessary calls in the app (#initTransactions, #beginTransaction and
#commitTransaction) around the existing #send.
The problem I am facing is that each transaction takes up to 150 ms to
complete, which doesn't make sense, even on a laptop!
I have tried several batch size settings without any success (messages are
around 100 bytes).
I have certainly made a mistake in the setup, but I can't figure out which
one or how to investigate. When I remove the transaction calls, the app
works fine (in my case, less than 200 ms for 100 "send"s to Kafka).
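
One way to investigate where the time goes is to time each producer call separately. The sketch below is a hypothetical helper (StepTimer is not part of Kafka) that accumulates wall-clock time per named step. The empty lambdas and the 20 ms pause are stand-ins for the real producer.beginTransaction(), producer.send(...) and producer.commitTransaction() calls. Because send() is asynchronous and commitTransaction() flushes unsent records before committing, most of the cost would be expected to show up under the commit step.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Kafka): accumulates wall-clock time per named step.
class StepTimer {
    private final Map<String, Long> totalNanos = new LinkedHashMap<>();

    // Runs the step and adds its elapsed time to the running total for that name.
    void time(String name, Runnable step) {
        long start = System.nanoTime();
        try {
            step.run();
        } finally {
            totalNanos.merge(name, System.nanoTime() - start, Long::sum);
        }
    }

    // Total elapsed milliseconds recorded for a step, or 0 if it never ran.
    double totalMillis(String name) {
        return totalNanos.getOrDefault(name, 0L) / 1_000_000.0;
    }

    // Stand-in for a blocking call; restores the interrupt flag if interrupted.
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        StepTimer timer = new StepTimer();
        // Assumption: in the real app these lambdas would wrap
        // producer.beginTransaction(), producer.send(record) and
        // producer.commitTransaction().
        timer.time("begin", () -> { });
        timer.time("send", () -> { });
        timer.time("commit", () -> pause(20));
        for (String step : new String[] {"begin", "send", "commit"}) {
            System.out.printf("%s: %.3f ms%n", step, timer.totalMillis(step));
        }
    }
}
```

Calling time(...) with the same step name on every loop iteration gives a per-step total over many transactions, which can then be compared with the 100-send baseline measured without transactions.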

My setup: 3 VMs on my laptop for the Kafka cluster. My main topic has
3 partitions with 3 replicas, and min.insync.replicas is set to 2.


The producer is defined as follows (remaining configs are left at their defaults):
    final Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrap_Servers);
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
            org.apache.kafka.common.serialization.StringSerializer.class);
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
            io.confluent.kafka.serializers.KafkaAvroSerializer.class);
    props.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, schema_Registry_URL);

    // Wait for all in-sync replicas to acknowledge, and retry transient failures.
    props.put(ProducerConfig.ACKS_CONFIG, "all");
    props.put(ProducerConfig.RETRIES_CONFIG, 5);

    // Transactional producer settings.
    props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
    props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, transactionnalId);
    props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);

    confluentProducer = new KafkaProducer<>(props);

Any idea what could be wrong? Have I forgotten something?
Thanks
Hugues DESLANDES






Re: implementing kafka transactions : performance issue

Posted by Hu...@hardis.fr.
Hi Apurva,

My transactions are pretty small: only one producer.send to Kafka in this
particular case (although I have tested with up to 100).
The producer code is embedded in an app that is connected to a database
through JDBC.
I tested kafka-producer-perf-test.sh, but I am not sure I clearly understand
the results. Latency is higher than expected but consistent with what I see
in my app
(each time my app enters producer.commitTransaction(), it takes 100-200 ms
to return).

To put it differently, in the time it takes to push 1 message to Kafka with
a transaction, I am able to push 40 messages without transactions. What is
wrong? The way I use transactions (too small?)? My app?
Any thoughts?
Thanks
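
A rough way to reason about this question is to treat the commit as a fixed cost shared by every record in the transaction. The sketch below uses figures reported in this thread as assumptions (about 2 ms per send, from 100 sends in under 200 ms, and about 150 ms per commit). The class and method names are illustrative only, and the model ignores batching and network effects.

```java
// Illustrative model only: the commit is treated as a fixed overhead per transaction.
class TxAmortization {
    // Average cost per record, in milliseconds, when recordsPerTx records share one commit.
    static double perRecordMs(int recordsPerTx, double sendMs, double commitMs) {
        if (recordsPerTx <= 0) {
            throw new IllegalArgumentException("recordsPerTx must be positive");
        }
        return sendMs + commitMs / recordsPerTx;
    }

    public static void main(String[] args) {
        double sendMs = 2.0;     // assumption: 100 sends in under 200 ms
        double commitMs = 150.0; // assumption: upper bound reported for one transaction
        for (int n : new int[] {1, 10, 100}) {
            System.out.printf("%d records/tx: %.1f ms per record%n",
                    n, perRecordMs(n, sendMs, commitMs));
        }
    }
}
```

With these assumed figures, one record per transaction costs about 152 ms, while 100 records per transaction brings the average down to about 3.5 ms per record. Under this model, the gap with the non-transactional case mostly reflects how few records share each commit.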


Results from kafka-producer-perf-test.sh (on my laptop, not a production
cluster!):
usr/local/kafka/bin/kafka-producer-perf-test.sh --topic my_topic \
    --num-records 6000 --throughput 300 \
    --producer-props \
        bootstrap.servers=tpg59:9092,tpg59:9093,tpg59:9094 \
        key.serializer=org.apache.kafka.common.serialization.StringSerializer \
        value.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer \
    --record-size 80 --print-metrics --transactional-id test \
    --transaction-duration-ms 5
1481 records sent, 294,1 records/sec (0,02 MB/sec), 54,8 ms avg latency, 116,0 max latency.
1511 records sent, 300,5 records/sec (0,02 MB/sec), 53,3 ms avg latency, 153,0 max latency.
1514 records sent, 300,0 records/sec (0,02 MB/sec), 45,5 ms avg latency, 101,0 max latency.
6000 records sent, 298,834545 records/sec (0,02 MB/sec), 50,41 ms avg latency, 153,00 ms max latency, 45 ms 50th, 91 ms 95th, 114 ms 99th, 153 ms 99.9th.

Metric Name : Value
kafka-metrics-count:count:{client-id=producer-1} : 76,000
producer-metrics:batch-size-avg:{client-id=producer-1} : 646,526
producer-metrics:batch-size-max:{client-id=producer-1} : 1574,000
producer-metrics:batch-split-rate:{client-id=producer-1} : 0,000
producer-metrics:buffer-available-bytes:{client-id=producer-1} : 33554432,000
producer-metrics:buffer-exhausted-rate:{client-id=producer-1} : 0,000
producer-metrics:buffer-total-bytes:{client-id=producer-1} : 33554432,000
producer-metrics:bufferpool-wait-ratio:{client-id=producer-1} : 0,000
producer-metrics:compression-rate-avg:{client-id=producer-1} : 1,000
producer-metrics:connection-close-rate:{client-id=producer-1} : 0,000
producer-metrics:connection-count:{client-id=producer-1} : 4,000
producer-metrics:connection-creation-rate:{client-id=producer-1} : 0,079
producer-metrics:incoming-byte-rate:{client-id=producer-1} : 1555,399
producer-metrics:io-ratio:{client-id=producer-1} : 0,004
producer-metrics:io-time-ns-avg:{client-id=producer-1} : 38931,988
producer-metrics:io-wait-ratio:{client-id=producer-1} : 0,258
producer-metrics:io-wait-time-ns-avg:{client-id=producer-1} : 2866393,133
producer-metrics:metadata-age:{client-id=producer-1} : 20,071
producer-metrics:network-io-rate:{client-id=producer-1} : 76,761
producer-metrics:outgoing-byte-rate:{client-id=producer-1} : 14062,380
producer-metrics:produce-throttle-time-avg:{client-id=producer-1} : 0,000
producer-metrics:produce-throttle-time-max:{client-id=producer-1} : 0,000
producer-metrics:record-error-rate:{client-id=producer-1} : 0,000
producer-metrics:record-queue-time-avg:{client-id=producer-1} : 36,874
producer-metrics:record-queue-time-max:{client-id=producer-1} : 131,000
producer-metrics:record-retry-rate:{client-id=producer-1} : 0,000
producer-metrics:record-send-rate:{client-id=producer-1} : 119,950
producer-metrics:record-size-avg:{client-id=producer-1} : 166,000
producer-metrics:record-size-max:{client-id=producer-1} : 166,000
producer-metrics:records-per-request-avg:{client-id=producer-1} : 6,579
producer-metrics:request-latency-avg:{client-id=producer-1} : 12,739
producer-metrics:request-latency-max:{client-id=producer-1} : 86,000
producer-metrics:request-rate:{client-id=producer-1} : 38,381
producer-metrics:request-size-avg:{client-id=producer-1} : 366,386
producer-metrics:request-size-max:{client-id=producer-1} : 1636,000
producer-metrics:requests-in-flight:{client-id=producer-1} : 0,000
producer-metrics:response-rate:{client-id=producer-1} : 38,383
producer-metrics:select-rate:{client-id=producer-1} : 90,160
producer-metrics:waiting-threads:{client-id=producer-1} : 0,000
producer-node-metrics:incoming-byte-rate:{client-id=producer-1, node-id=node--1} : 10,390
producer-node-metrics:incoming-byte-rate:{client-id=producer-1, node-id=node-1} : 959,564
producer-node-metrics:incoming-byte-rate:{client-id=producer-1, node-id=node-2} : 296,117
producer-node-metrics:incoming-byte-rate:{client-id=producer-1, node-id=node-3} : 295,731
producer-node-metrics:outgoing-byte-rate:{client-id=producer-1, node-id=node--1} : 1,867
producer-node-metrics:outgoing-byte-rate:{client-id=producer-1, node-id=node-1} : 5508,490
producer-node-metrics:outgoing-byte-rate:{client-id=producer-1, node-id=node-2} : 4307,028
producer-node-metrics:outgoing-byte-rate:{client-id=producer-1, node-id=node-3} : 4312,847
producer-node-metrics:request-latency-avg:{client-id=producer-1, node-id=node--1} : 0,000
producer-node-metrics:request-latency-avg:{client-id=producer-1, node-id=node-1} : 13,098
producer-node-metrics:request-latency-avg:{client-id=producer-1, node-id=node-2} : 12,441
producer-node-metrics:request-latency-avg:{client-id=producer-1, node-id=node-3} : 12,677
producer-node-metrics:request-latency-max:{client-id=producer-1, node-id=node--1} : -Infinity
producer-node-metrics:request-latency-max:{client-id=producer-1, node-id=node-1} : 86,000
producer-node-metrics:request-latency-max:{client-id=producer-1, node-id=node-2} : 84,000
producer-node-metrics:request-latency-max:{client-id=producer-1, node-id=node-3} : 84,000
producer-node-metrics:request-rate:{client-id=producer-1, node-id=node--1} : 0,060
producer-node-metrics:request-rate:{client-id=producer-1, node-id=node-1} : 26,245
producer-node-metrics:request-rate:{client-id=producer-1, node-id=node-2} : 6,098
producer-node-metrics:request-rate:{client-id=producer-1, node-id=node-3} : 6,090
producer-node-metrics:request-size-avg:{client-id=producer-1, node-id=node--1} : 31,333
producer-node-metrics:request-size-avg:{client-id=producer-1, node-id=node-1} : 209,890
producer-node-metrics:request-size-avg:{client-id=producer-1, node-id=node-2} : 706,282
producer-node-metrics:request-size-avg:{client-id=producer-1, node-id=node-3} : 708,201
producer-node-metrics:request-size-max:{client-id=producer-1, node-id=node--1} : 39,000
producer-node-metrics:request-size-max:{client-id=producer-1, node-id=node-1} : 1636,000
producer-node-metrics:request-size-max:{client-id=producer-1, node-id=node-2} : 1636,000
producer-node-metrics:request-size-max:{client-id=producer-1, node-id=node-3} : 1636,000
producer-node-metrics:response-rate:{client-id=producer-1, node-id=node--1} : 0,060
producer-node-metrics:response-rate:{client-id=producer-1, node-id=node-1} : 26,281
producer-node-metrics:response-rate:{client-id=producer-1, node-id=node-2} : 6,098
producer-node-metrics:response-rate:{client-id=producer-1, node-id=node-3} : 6,090
producer-topic-metrics:byte-rate:{client-id=producer-1, topic=my_topic} : 11787,925
producer-topic-metrics:compression-rate:{client-id=producer-1, topic=my_topic} : 1,000
producer-topic-metrics:record-error-rate:{client-id=producer-1, topic=my_topic} : 0,000
producer-topic-metrics:record-retry-rate:{client-id=producer-1, topic=my_topic} : 0,000
producer-topic-metrics:record-send-rate:{client-id=producer-1, topic=my_topic} : 119,952
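
The perf-test settings used in this run also imply very small transactions. As a back-of-the-envelope check, assuming --transaction-duration-ms is the maximum age of a transaction before the tool commits it, and that records arrive at the throttled rate, the expected number of records per transaction is roughly throughput multiplied by duration. The class below is an illustrative calculation, not part of the Kafka tooling, and it ignores the time spent in the commit itself.

```java
// Illustrative arithmetic only; not part of kafka-producer-perf-test.sh.
class PerfTestTxSize {
    // Expected records per transaction if records arrive at a steady rate
    // and each transaction is committed once it reaches txDurationMs.
    static double recordsPerTransaction(double recordsPerSec, double txDurationMs) {
        return recordsPerSec * txDurationMs / 1000.0;
    }

    public static void main(String[] args) {
        // Values taken from the command line in this message:
        // --throughput 300 and --transaction-duration-ms 5.
        double estimate = recordsPerTransaction(300.0, 5.0);
        System.out.println("Estimated records per transaction: " + estimate);
    }
}
```

For --throughput 300 and --transaction-duration-ms 5 this gives about 1.5 records per transaction, so the benchmark may be exercising the same small-transaction pattern as the app. Raising --transaction-duration-ms would be one way to check whether latency per record drops as more records share a commit.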



"Apurva Mehta" <ap...@confluent.io> wrote on 18/09/2017 19:59:21:

> From: "Apurva Mehta" <ap...@confluent.io>
> To: "Users" <us...@kafka.apache.org>
> Date: 18/09/2017 19:59
> Subject: Re: implementing kafka transactions : performance issue
> 
> Hi Hugues.
> 
> How 'big' are your transactions? In particular, how many produce records
> are in a single transaction? Can you share your actual producer code?
> 
> Also, did you try the `kafka-producer-perf-test.sh` tool with a
> transactional id and see what the latency is for transactions with that
> tool?
> 
> Thanks,
> Apurva
> 

Re: implementing kafka transactions : performance issue

Posted by Apurva Mehta <ap...@confluent.io>.
Hi Hugues.

How 'big' are your transactions? In particular, how many produce records
are in a single transaction? Can you share your actual producer code?

Also, did you try the `kafka-producer-perf-test.sh` tool with a
transactional id and see what the latency is for transactions with that
tool?

Thanks,
Apurva
