Posted to users@kafka.apache.org by Asaf Mesika <as...@gmail.com> on 2016/07/05 04:01:30 UTC

Re: Partition size skew using default partitioner (without key)

As we continue to track down the cause, I'm pinging this thread again in
case someone new has an answer to the question below.

On Thu, Jun 16, 2016 at 12:39 PM Asaf Mesika <as...@gmail.com> wrote:

> Hi,
>
> We've noticed that we have some partitions receiving more messages than
> others. What I've done to learn that is:
> * In Kafka Manager, for a given topic, the list of Partition Information
> is displayed.
> * For each partition there's a column called Latest Offset - which I
> assume is the producer offset. As I understand, this is the total number of
> messages written to this partition since the topic was created (and/or the
> partition was added)
> * I plotted the two columns: Partition number and Latest Offset.
>
> This is what I got:
> [image: pasted1]
>
> I'm using Kafka 0.8.2.1 with the new Producer API. We're using the default
> partitioner, and we're *not* supplying a partition number or a key, so
> it should be round-robin.
>
> For some reason, we see the skew shown above.
>
> Has anyone in the community encountered this behaviour?
>
> Thanks!
>
> Asaf Mesika
> Logz.io
>
>
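
The check described in the quoted message (comparing Latest Offset across
partitions) can be sketched roughly as below. This is only an illustration:
the function name `partition_skew` and the sample numbers are made up, and it
assumes, as the message does, that the latest offset approximates the number
of messages written to each partition.

```python
def partition_skew(latest_offsets):
    """Summarise how unevenly messages are spread across partitions.

    latest_offsets maps partition number -> latest offset (hypothetical
    input, e.g. copied by hand from the Kafka Manager table).
    """
    if not latest_offsets:
        raise ValueError("need at least one partition")
    total = sum(latest_offsets.values())
    mean = total / len(latest_offsets)
    # Ratio of each partition's offset to the mean; 1.0 means perfectly even.
    ratios = {p: (off / mean if mean else 0.0)
              for p, off in latest_offsets.items()}
    heaviest = max(ratios, key=ratios.get)
    lightest = min(ratios, key=ratios.get)
    return {"mean": mean, "ratios": ratios,
            "heaviest": heaviest, "lightest": lightest}


# Made-up numbers for illustration only; they are not the values from the plot.
sample = {0: 1200, 1: 800, 2: 1000, 3: 1000}
report = partition_skew(sample)
print(report["heaviest"], report["lightest"], report["ratios"])
```

With true round-robin the ratios would all sit close to 1.0; a wide spread is
the kind of skew being asked about.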

Re: Partition size skew using default partitioner (without key)

Posted by Asaf Mesika <as...@gmail.com>.
Apparently it's documented in the FAQ, but I ignored it since it said
"0.8.0" and I was using 0.8.2.1. I have now read the whole lengthy forum
thread dating back to 2013: the problematic code there is in
DefaultEventHandler.scala. If I'm only using KafkaProducer.java (the Java
flavor), am I right that I won't be exposed to this behaviour, since you
moved to NIO and thus have a single socket per broker?
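
To make the difference being asked about concrete, here is a toy simulation,
not Kafka code. It assumes the FAQ behaviour is roughly "with no key, pick a
random partition and stick to it until the next metadata refresh", and
compares that with plain round-robin. The function names and the
`refresh_every` parameter (counted in messages rather than time) are
hypothetical simplifications.

```python
import random
from collections import Counter


def round_robin(num_messages, num_partitions):
    """Assign each message to the next partition in turn."""
    counts = Counter()
    for i in range(num_messages):
        counts[i % num_partitions] += 1
    return counts


def sticky_random(num_messages, num_partitions, refresh_every, seed=0):
    """Pick a random partition and keep it until the next refresh.

    refresh_every is counted in messages here, standing in for a
    time-based metadata refresh interval (a simplification).
    """
    rng = random.Random(seed)
    counts = Counter()
    current = rng.randrange(num_partitions)
    for i in range(num_messages):
        # Re-pick the target partition at every simulated refresh.
        if i > 0 and i % refresh_every == 0:
            current = rng.randrange(num_partitions)
        counts[current] += 1
    return counts


even = round_robin(10_000, 8)
skewed = sticky_random(10_000, 8, refresh_every=1_000)
print(sorted(even.values()))
print(sorted(skewed.values()))
```

Round-robin gives every partition the same count, while the sticky variant
cannot be even with a single producer and few refreshes. Whether the Java
producer in 0.8.2.1 can still show skew for other reasons is exactly the open
question here.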


Re: Partition size skew using default partitioner (without key)

Posted by Asaf Mesika <as...@gmail.com>.
Since the image is not shown, here's a direct link to it:
https://s32.postimg.org/xoet3vu2t/image.png
