Posted to users@kafka.apache.org by Churu Tang <ct...@rubiconproject.com> on 2014/03/06 02:21:35 UTC

Question concerning partitionNumber and Key

Hi,

I have 2 questions about the partition number and key. 
1. The produceRequest explicitly includes a partitionNumber and a messageSet, which contains messages with a key (can be NULL; used to calculate the partitionNumber when specified). I assume all the messages in the messageSet will be published to the specified partitionNumber. My question: since the partitionNumber is explicitly specified in the produce request, will the key still be used to calculate the partition number when the request is handled on the server side? If so, how?

2. The introduction mentions that we can customize the kafka.producer.Partitioner to influence the routing decision. Could you tell me where I can add my own Partitioner? I use the kafka_2.8.0-0.8.0 binary and a C++ client that supports Kafka 0.8.

Since I am kind of new to Kafka, my questions might be a little trivial. Thanks for helping out!

Cheers,
Churu

Re: Question concerning partitionNumber and Key

Posted by Joel Koshy <jj...@gmail.com>.
The key may in fact still be used, for log compaction. Can you read
this and let us know if it makes sense?

http://kafka.apache.org/081/documentation.html#compaction

If you don't want your messages to be compacted, you can explicitly
specify a separate key (called partitionKey) in the message. It is
used for partitioning and then discarded. If you specify only the key
field, it is used for partitioning, stored on the broker along with
the message, and then used for compaction (if compaction is enabled
on the broker).
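
To make the effect of compaction concrete, here is a minimal sketch in
Python. It is a toy model, not Kafka's log cleaner: it assumes every
record has a key and simply keeps the most recent record per key. The
function name compact and the sample keys are made up for illustration.

```python
def compact(log):
    """Keep only the latest record for each key, preserving log order."""
    last_offset = {}
    for offset, (key, _value) in enumerate(log):
        # Later records overwrite earlier ones, so this ends up holding
        # the offset of the most recent record for every key.
        last_offset[key] = offset
    return [rec for offset, rec in enumerate(log) if last_offset[rec[0]] == offset]


# Records are (key, value) pairs in append order.
log = [("user-1", "v1"), ("user-2", "v1"), ("user-1", "v2")]
compacted = compact(log)
```

In this model the older ("user-1", "v1") record is dropped because a
newer record with the same stored key exists. That is why the stored
key matters even though the broker does not use it for routing.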

Joel

On Thu, Mar 06, 2014 at 11:16:45AM -0800, Churu Tang wrote:
> Now I understand. The key in the messages will no longer be used after the partition number is specified in the produce request.
> Thanks! 
> 
> Cheers,
> Churu


Re: Question concerning partitionNumber and Key

Posted by Churu Tang <ct...@rubiconproject.com>.
Now I understand. The key in the messages will no longer be used after the partition number is specified in the produce request.
Thanks! 

Cheers,
Churu

On Mar 6, 2014, at 10:41 AM, Joel Koshy <jj...@gmail.com> wrote:

> It is done by the producer: it calls the partitioner before creating
> the produce request.


Re: Question concerning partitionNumber and Key

Posted by Joel Koshy <jj...@gmail.com>.
It is done by the producer: it calls the partitioner before creating
the produce request.
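
As a rough illustration of that producer-side step, here is a minimal
sketch in Python. It is not the Kafka client code: zlib.crc32 stands in
for whatever hash the real client uses, the random choice for a missing
key is an assumption, and the names default_partition and
build_produce_request are hypothetical.

```python
import random
import zlib


def default_partition(key, num_partitions):
    """Hash-mod partitioning, in the spirit of a default partitioner."""
    if key is None:
        # Assumption: with no key, any partition is acceptable.
        return random.randrange(num_partitions)
    # crc32 is a stand-in for the client's real hash function.
    return zlib.crc32(key) % num_partitions


def build_produce_request(topic, messages, num_partitions, partitioner=default_partition):
    """Group (key, value) pairs by the partition chosen on the producer side."""
    request = {}
    for key, value in messages:
        partition = partitioner(key, num_partitions)
        request.setdefault((topic, partition), []).append((key, value))
    return request


messages = [(b"user-1", b"a"), (b"user-2", b"b"), (b"user-1", b"c")]
request = build_produce_request("clicks", messages, num_partitions=4)
```

Passing a different function as partitioner (for example
lambda key, n: 0) is the sketch's equivalent of plugging in a custom
Partitioner: the routing decision changes, but it is still made before
the request leaves the producer.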

On Thu, Mar 06, 2014 at 10:17:22AM -0800, Churu Tang wrote:
> Thanks for the reply! If the broker does not make the decision, then where and how is the key used to calculate the partition number?


Re: Question concerning partitionNumber and Key

Posted by Churu Tang <ct...@rubiconproject.com>.
Thanks for the reply! If the broker does not make the decision, then where and how is the key used to calculate the partition number?

On Mar 5, 2014, at 5:42 PM, Joel Koshy <jj...@gmail.com> wrote:

>> I have 2 questions about the partition number and key. 
>> 1. The produceRequest explicitly includes a partitionNumber and a messageSet, which contains messages with a key (can be NULL; used to calculate the partitionNumber when specified). I assume all the messages in the messageSet will be published to the specified partitionNumber. My question: since the partitionNumber is explicitly specified in the produce request, will the key still be used to calculate the partition number when the request is handled on the server side? If so, how?
> 
> The broker does not make this decision; it will attempt to
> append the message to the partition specified in the produce request.
> 
>> 2. The introduction mentions that we can customize the kafka.producer.Partitioner to influence the routing decision. Could you tell me where I can add my own Partitioner? I use the kafka_2.8.0-0.8.0 binary and a C++ client that supports Kafka 0.8.
> 
> Which C++ library do you use? Non-Java clients are maintained outside
> the main Kafka code base. You will have to contact that client's
> maintainer to get an answer to this question.
> 
> Joel


Re: Question concerning partitionNumber and Key

Posted by Joel Koshy <jj...@gmail.com>.
> I have 2 questions about the partition number and key. 
> 1. The produceRequest explicitly includes a partitionNumber and a messageSet, which contains messages with a key (can be NULL; used to calculate the partitionNumber when specified). I assume all the messages in the messageSet will be published to the specified partitionNumber. My question: since the partitionNumber is explicitly specified in the produce request, will the key still be used to calculate the partition number when the request is handled on the server side? If so, how?

The broker does not make this decision; it will attempt to
append the message to the partition specified in the produce request.
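
A toy model of that broker-side behaviour, sketched in Python. The
Broker class and handle_produce method are hypothetical names, and real
brokers do far more (replication, validation, on-disk segments); the
only point shown is that the append is driven by the partition named in
the request, and the message key plays no part in routing.

```python
from collections import defaultdict


class Broker:
    def __init__(self):
        # (topic, partition) -> list of (key, value) records
        self.logs = defaultdict(list)

    def handle_produce(self, topic, partition, message_set):
        """Append the message set to the requested partition; return its base offset."""
        log = self.logs[(topic, partition)]
        base_offset = len(log)
        # Keys are stored with the messages but never inspected for routing.
        log.extend(message_set)
        return base_offset


broker = Broker()
first = broker.handle_produce("clicks", 2, [(b"user-1", b"a"), (None, b"b")])
second = broker.handle_produce("clicks", 2, [(b"user-9", b"c")])
```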

> 2. The introduction mentions that we can customize the kafka.producer.Partitioner to influence the routing decision. Could you tell me where I can add my own Partitioner? I use the kafka_2.8.0-0.8.0 binary and a C++ client that supports Kafka 0.8.

Which C++ library do you use? Non-Java clients are maintained outside
the main Kafka code base. You will have to contact that client's
maintainer to get an answer to this question.

Joel