Posted to users@kafka.apache.org by Luke Forehand <lu...@networkedinsights.com> on 2014/06/20 23:15:42 UTC

uneven message distribution

We are upgrading to Kafka 0.8.1.1 from 0.8-beta.

My first task was to start a stream of messages into a topic, using a
4-node cluster.  The topic has 10 partitions and 3 replicas.

I ran the following to produce messages to the topic:

socat - TCP-LISTEN:10000 | ./kafka-console-producer.sh --batch-size 10000
--broker-list kafka10:9092,kafka11:9092,kafka12:9092,kafka13:9092
--compress --request-required-acks 1 --topic luke1 &

And then ran a loop to produce messages via telnet:

while true; do echo "blah blah blah"; done | telnet 127.0.0.1 10000

I verified via the console consumer that messages were being received.
After some time I quit the program and checked the partition offsets,
which looked strange:

./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list
kafka10:9092,kafka11:9092,kafka12:9092,kafka13:9092 --topic luke1 --time
-1

luke1:0:1853139
luke1:1:9
luke1:2:1
luke1:3:3
luke1:4:50
luke1:5:266603
luke1:6:80035
luke1:7:3455509
luke1:8:3756164
luke1:9:5


They are very uneven. Any ideas what is going on here?

Thanks,
Luke Forehand |  Networked Insights  |  Software Engineer
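
The skew in the offsets listed above can be quantified with a short
script. This is only a minimal sketch: it assumes the
topic:partition:offset line format shown in the GetOffsetShell output,
and it assumes each partition's log started at offset 0 with nothing
deleted, so that the latest offset approximates the message count. The
function name partition_shares is hypothetical.

```python
def partition_shares(lines):
    """Parse `topic:partition:offset` lines.

    Returns ({partition: latest_offset}, total_of_all_offsets).
    """
    offsets = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split from the right so only the last two fields are treated as numbers.
        _topic, partition, offset = line.rsplit(":", 2)
        offsets[int(partition)] = int(offset)
    return offsets, sum(offsets.values())


# Offsets copied from the GetOffsetShell output in the message above.
OUTPUT = """\
luke1:0:1853139
luke1:1:9
luke1:2:1
luke1:3:3
luke1:4:50
luke1:5:266603
luke1:6:80035
luke1:7:3455509
luke1:8:3756164
luke1:9:5
"""

offsets, total = partition_shares(OUTPUT.splitlines())

# Print partitions from busiest to quietest with their share of the total.
for partition, offset in sorted(offsets.items(), key=lambda kv: kv[1], reverse=True):
    print(f"partition {partition}: {offset:>8} ({offset / total:.1%})")
```

Under those assumptions, partitions 0, 7 and 8 together hold about 96%
of the 9,411,518 messages.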



Re: uneven message distribution

Posted by Neha Narkhede <ne...@gmail.com>.
I think this is most likely due to
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyisdatanotevenlydistributedamongpartitionswhenapartitioningkeyisnotspecified?.
Try reducing topic.metadata.refresh.interval.ms to something low like 2000.

Thanks,
Neha
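
The FAQ entry linked above describes the cause: when no partitioning key
is given, the 0.8 producer picks a random partition and keeps sending to
it until the next topic metadata refresh, which by default happens every
10 minutes. The sketch below is a toy simulation of that behavior, not
Kafka's actual code. The function simulate, the message rate, and the
run length are all illustrative assumptions.

```python
import random


def simulate(num_partitions, duration_s, refresh_interval_s, msgs_per_s, seed=0):
    """Toy model: one random partition per metadata-refresh window.

    Returns a list with the number of messages sent to each partition.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    counts = [0] * num_partitions
    elapsed = 0
    while elapsed < duration_s:
        # The last window may be shorter than the refresh interval.
        window = min(refresh_interval_s, duration_s - elapsed)
        # All messages in this window go to a single randomly chosen partition.
        partition = rng.randrange(num_partitions)
        counts[partition] += msgs_per_s * window
        elapsed += window
    return counts


# Assumed workload: 10 partitions, a 30-minute run, 5000 messages per second.
default_refresh = simulate(10, 1800, 600, 5000)  # 600 s, the 0.8 default interval
short_refresh = simulate(10, 1800, 2, 5000)      # 2 s, as suggested in the reply

print("refresh every 600 s:", default_refresh)
print("refresh every   2 s:", short_refresh)
```

With a 600-second interval, a 30-minute run can touch at most three
partitions. With a 2-second interval the same run goes through 900
windows and reaches every partition, although the counts are still only
approximately even because each window's choice is random.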

