Posted to users@kafka.apache.org by Luke Forehand <lu...@networkedinsights.com> on 2014/06/20 23:15:42 UTC
uneven message distribution
We are upgrading to Kafka 0.8.1.1 from 0.8-beta.
My first task was to start a stream of messages into a topic, using a
4-node cluster. The topic has 10 partitions and 3 replicas.
I ran the following to produce messages to the topic:
socat - TCP-LISTEN:10000 | ./kafka-console-producer.sh --batch-size 10000 \
  --broker-list kafka10:9092,kafka11:9092,kafka12:9092,kafka13:9092 \
  --compress --request-required-acks 1 --topic luke1 &
And then ran a loop to produce messages via telnet:
while true; do echo "blah blah blah"; done | telnet 127.0.0.1 10000
I verified via the console consumer that messages were being received.
After some time I quit the program and checked the partition offsets,
and the result was strange:
./kafka-run-class.sh kafka.tools.GetOffsetShell \
  --broker-list kafka10:9092,kafka11:9092,kafka12:9092,kafka13:9092 \
  --topic luke1 --time -1
luke1:0:1853139
luke1:1:9
luke1:2:1
luke1:3:3
luke1:4:50
luke1:5:266603
luke1:6:80035
luke1:7:3455509
luke1:8:3756164
luke1:9:5
The offsets are very uneven. Any ideas what is going on here?
Thanks,
Luke Forehand | Networked Insights | Software Engineer
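To put numbers on the skew, the offsets reported above can be turned into a share per partition. This is only a small sketch: it assumes the topic started empty, so that each latest offset equals the number of messages written to that partition, and the helper name partition_shares is made up for illustration.

```python
# GetOffsetShell output as posted above, one "topic:partition:offset" per line.
raw = """\
luke1:0:1853139
luke1:1:9
luke1:2:1
luke1:3:3
luke1:4:50
luke1:5:266603
luke1:6:80035
luke1:7:3455509
luke1:8:3756164
luke1:9:5
"""


def partition_shares(text):
    """Return (total, {partition: fraction of total}) for the given output."""
    offsets = {}
    for line in text.split():
        # Split from the right so a topic name containing ':' would not break parsing.
        _topic, partition, offset = line.rsplit(":", 2)
        offsets[int(partition)] = int(offset)
    total = sum(offsets.values())
    return total, {p: n / total for p, n in offsets.items()}


total, shares = partition_shares(raw)
for p, share in sorted(shares.items()):
    print(f"partition {p}: {share:8.4%}")
```

Under that assumption, almost all of the traffic went to partitions 0, 7 and 8, with partitions 5 and 6 taking most of the remainder.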
Re: uneven message distribution
Posted by Neha Narkhede <ne...@gmail.com>.
I think this is most likely due to the behavior described in this FAQ entry:
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyisdatanotevenlydistributedamongpartitionswhenapartitioningkeyisnotspecified?
Try reducing topic.metadata.refresh.interval.ms to something low, like 2000.
Thanks,
Neha
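For readers who want to see why the refresh interval matters: as the FAQ entry describes, when no partitioning key is given the 0.8 producer picks one random partition and keeps sending to it until the next topic metadata refresh (10 minutes by default). The sketch below is a toy model of that behavior, not Kafka code; the function simulate and its parameters are hypothetical, and the refresh is expressed in messages rather than milliseconds to keep the model simple.

```python
import random
from collections import Counter


def simulate(num_partitions, num_messages, messages_per_refresh, seed=0):
    """Count messages per partition when the producer sticks to one randomly
    chosen partition and only re-picks at each (modeled) metadata refresh."""
    rng = random.Random(seed)
    counts = Counter({p: 0 for p in range(num_partitions)})
    current = rng.randrange(num_partitions)
    for i in range(num_messages):
        # A refresh happens every `messages_per_refresh` messages.
        if i > 0 and i % messages_per_refresh == 0:
            current = rng.randrange(num_partitions)
        counts[current] += 1
    return counts


# Few refreshes during the run: at most 4 partitions ever receive data.
sticky = simulate(10, 1_000_000, 250_000)
# Frequent refreshes: the load spreads across all partitions.
frequent = simulate(10, 1_000_000, 100)

print("rare refresh:    ", [sticky[p] for p in range(10)])
print("frequent refresh:", [frequent[p] for p in range(10)])
```

A short test run against a fast producer spans only a handful of refresh intervals, so most partitions stay nearly empty, which is consistent with the offsets in the original post. Lowering the interval makes the producer re-pick more often and evens out the distribution.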