
Posted to dev@kafka.apache.org by Dave Peterson <ds...@tagged.com> on 2013/05/21 18:54:49 UTC

produce request wire format question

In the version 0.8 wire format for a produce request, does a value of -1
still indicate "use a random partition" as it did for 0.7?

Thanks,
Dave

Re: produce request wire format question

Posted by Neha Narkhede <ne...@gmail.com>.

Yes. You can see the details of the wire protocol here:
https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-MetadataResponse

Thanks,
Neha
On May 23, 2013 9:43 AM, "Dave Peterson" <ds...@tagged.com> wrote:

> Ok, thanks for the information.  Looking at the wire format for the
> metadata response, I see that the right hand side of the TopicMetadata
> production contains a TopicErrorCode, and the right hand side of the
> PartitionMetadata production contains a PartitionErrorCode.  Are both
> of these 16-bit values?  In general, where it isn't stated explicitly in
> the documentation, can I assume that all error codes are 16-bit values?
>
> Thanks,
> Dave
>
>
> On Wed, May 22, 2013 at 4:29 PM, Neha Narkhede <ne...@gmail.com>
> wrote:
> > 1. Correct
> > 2. The producer does not use or depend on zookeeper anymore. It refreshes
> > its view of the cluster metadata by using a TopicMetadataRequest to any
> of
> > the kafka brokers. It maps a message to a partition using the following
> > rules -
> > 2.1 If a message has no key, use any available partition
> > 2.2 If a message has a key and the user has defined a custom partitioner,
> > use it to map the key to a partition id
> > 2.3 If a message has a key and the user has not defined a custom
> > partitioner, use the default hash based partitioner that ships with Kafka
> >
> > Thanks,
> > Neha
> >
> >
> > On Wed, May 22, 2013 at 1:33 PM, Dave Peterson <dspeterson@tagged.com
> >wrote:
> >
> >> Ok, the picture I have in my mind of how things work in 0.8 (from a
> >> producer's point of view) is as follows:
> >>
> >>     1.  An application program sends log messages to a producer.  Each
> >>         message is provided as a key/value pair, where the key is chosen
> >>         by the application and the value is the message contents.  By
> its
> >>         choice of key, the application may influence or control which
> >>         partition the message gets sent to.
> >>
> >>     2.  The producer receives messages as key/value pairs.  From talking
> >>         with zookeeper, it knows the set of available brokers and which
> >>         partitions each broker has.  If the sending application
> provided a
> >> key
> >>         for a given message, the contents of the key may optionally
> >>         influence the producer's choice of broker and partition to send
> the
> >>         message to, according to some convention understood by both
> >>         application program and producer.
> >>
> >> Is this correct?
> >>
> >> Thanks,
> >> Dave
> >>
> >> On Wed, May 22, 2013 at 9:28 AM, Jun Rao <ju...@gmail.com> wrote:
> >> > Dave,
> >> >
> >> > Currently, the broker expects each producer request to specify the
> exact
> >> > partition id (-1 is no longer valid). The mapping from a message to a
> >> > partition is done at the producer client. The producer can choose a
> >> random
> >> > partition (from the existing list of partitions) or deterministically
> >> > choose a partition based on the key.
> >> >
> >> > Thanks,
> >> >
> >> > Jun
> >> >
> >> >
> >> > On Tue, May 21, 2013 at 1:12 PM, Dave Peterson <dspeterson@tagged.com
> >> >wrote:
> >> >
> >> >> In my case, there is a load balancer between the producers and the
> >> >> brokers, so I want the behavior described for the Java client (null
> key
> >> >> specifies "any partition").  If the Key field of each individual
> message
> >> >> specifies the partition to send it to, then I don't understand the
> >> purpose
> >> >> of the 32-bit partition identifier that precedes each message set in
> a
> >> >> produce request: what if a produce request specifies "partition N"
> for a
> >> >> given message set, and then each individual message in the set
> >> >> specifies a different partition in its Key field?  Also, the above-
> >> >> mentioned partition identifier is a 32-bit integer and the Key field
> of
> >> >> each individual message can contain data of arbitrary length, which
> >> >> seems inconsistent.  Is a partition identifier a 32-bit integer, or
> can
> >> it
> >> >> be of arbitrary length?
> >> >>
> >> >> Thanks,
> >> >> Dave
> >> >>
> >> >> On Tue, May 21, 2013 at 12:30 PM, Neha Narkhede <
> >> neha.narkhede@gmail.com>
> >> >> wrote:
> >> >> > Dave,
> >> >> >
> >> >> > Colin described the producer behavior of picking the partition for
> a
> >> >> > message before it is sent to Kafka broker correctly. However, I'm
> >> >> > interested in knowing your use case a little before to see why you
> >> would
> >> >> > rather have the broker decide the partition?
> >> >> >
> >> >> > Thanks,
> >> >> > Neha
> >> >> >
> >> >> >
> >> >> > On Tue, May 21, 2013 at 12:05 PM, Colin Blower <
> cblower@barracuda.com
> >> >> >wrote:
> >> >> >
> >> >> >> The key is used by the client to decide which partition to send
> the
> >> >> >> message to. By the time the client is creating the produce
> request,
> >> it
> >> >> >> should be known which partition each message is being sent to. I
> >> believe
> >> >> >> Neha described the behavior of the Java client which sends
> messages
> >> with
> >> >> >> a null key to any partition.
> >> >> >>
> >> >> >> The key is described in past tense because of the use case for
> >> >> >> persisting keys with messages. The key is persisted through the
> >> broker
> >> >> >> so that a consumer knows what key was used to partition the
> message
> >> on
> >> >> >> the producer side.
> >> >> >>
> >> >> >> I don't believe that you can have the broker decide which
> partition a
> >> >> >> message goes to.
> >> >> >>
> >> >> >> --
> >> >> >> Colin B.
> >> >> >>
> >> >> >> On 05/21/2013 11:48 AM, Dave Peterson wrote:
> >> >> >> > I'm looking at the document entitled "A Guide to the Kafka
> >> Protocol"
> >> >> >> > located here:
> >> >> >> >
> >> >> >> >
> >> https://cwiki.apache.org/KAFKA/a-guide-to-the-kafka-protocol.html
> >> >> >> >
> >> >> >> > It shows a produce request as containing a number of message
> sets,
> >> >> which
> >> >> >> are
> >> >> >> > grouped first by topic and second by partition (a 32-bit
> integer).
> >> >> >> > However, each
> >> >> >> > message in a message set contains a Key field, which is
> described
> >> as
> >> >> >> follows:
> >> >> >> >
> >> >> >> >     The key is an optional message key that was used for
> partition
> >> >> >> assignment.
> >> >> >> >     The key can be null.
> >> >> >> >
> >> >> >> > I notice the use of "was" (past tense) above.  That seems to
> >> suggest
> >> >> >> that the
> >> >> >> > Key field was once used to specify a partition (at the
> granularity
> >> of
> >> >> >> each
> >> >> >> > individual message), but the plan for the future is to instead
> use
> >> the
> >> >> >> 32-bit
> >> >> >> > partition value preceding each message set.  Is this correct?
>  If
> >> so,
> >> >> >> when I am
> >> >> >> > creating a produce request for 0.8, what should I use for the
> >> 32-bit
> >> >> >> partition
> >> >> >> > value, and how does this relate to the Key field of each
> individual
> >> >> >> message?
> >> >> >> > Ideally, I would like to just send a produce request and let the
> >> >> broker
> >> >> >> choose
> >> >> >> > the partition.  How do I accomplish this in 0.8, and are there
> >> plans
> >> >> to
> >> >> >> change
> >> >> >> > this after 0.8?
> >> >> >> >
> >> >> >> > Thanks,
> >> >> >> > Dave
> >> >> >> >
> >> >> >> > On Tue, May 21, 2013 at 10:47 AM, Neha Narkhede <
> >> >> neha.narkhede@gmail.com>
> >> >> >> wrote:
> >> >> >> >> No. In 0.8, if you don't specify a key for a message, it is
> sent
> >> to
> >> >> any
> >> >> >> of
> >> >> >> >> the available partitions. In other words, the partition id is
> >> >> selected
> >> >> >> on
> >> >> >> >> the producer and the server doesn't get -1 as the partition
> id.
> >> >> >> >>
> >> >> >> >> Thanks,
> >> >> >> >> Neha
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> On Tue, May 21, 2013 at 9:54 AM, Dave Peterson <
> >> >> dspeterson@tagged.com
> >> >> >> >wrote:
> >> >> >> >>
> >> >> >> >>> In the version 0.8 wire format for a produce request, does a
> >> value
> >> >> of
> >> >> >> -1
> >> >> >> >>> still indicate "use a random partition" as it did for 0.7?
> >> >> >> >>>
> >> >> >> >>> Thanks,
> >> >> >> >>> Dave
> >> >> >> >>>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >>
> >>
>

Re: produce request wire format question

Posted by Colin Blower <cb...@barracuda.com>.

Yes, I believe the TopicErrorCode is also an int16. I will update the
documentation accordingly.

In general, if the protocol specification is not explicit, it should be
made explicit. This will help other client driver authors in the future.
If you find any other errors or omissions in the documentation, please
let us know. Also, in general, error codes are int16.
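
A minimal sketch of what this means for a client author, assuming the
error code is a signed 16-bit integer in network (big-endian) byte order
like the other fixed-width fields of the protocol. The function name
read_error_code and the sample bytes are hypothetical and are not taken
from any Kafka client:

```python
import struct


def read_error_code(buf, offset=0):
    """Read one signed 16-bit big-endian error code from buf.

    Returns (code, next_offset) so the caller can keep parsing the
    fields that follow. The helper name is illustrative only.
    """
    (code,) = struct.unpack_from(">h", buf, offset)
    return code, offset + 2


# Hypothetical response fragment: two bytes holding the value 3.
code, next_offset = read_error_code(b"\x00\x03")
print(code, next_offset)
```

Reading the field as signed matters because a negative code such as -1
would otherwise be decoded as 65535.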

--
Colin B.

On 05/23/2013 09:43 AM, Dave Peterson wrote:
> Ok, thanks for the information.  Looking at the wire format for the
> metadata response, I see that the right hand side of the TopicMetadata
> production contains a TopicErrorCode, and the right hand side of the
> PartitionMetadata production contains a PartitionErrorCode.  Are both
> of these 16-bit values?  In general, where it isn't stated explicitly in
> the documentation, can I assume that all error codes are 16-bit values?
>
> Thanks,
> Dave
>
>
> On Wed, May 22, 2013 at 4:29 PM, Neha Narkhede <ne...@gmail.com> wrote:
>> 1. Correct
>> 2. The producer does not use or depend on zookeeper anymore. It refreshes
>> its view of the cluster metadata by using a TopicMetadataRequest to any of
>> the kafka brokers. It maps a message to a partition using the following
>> rules -
>> 2.1 If a message has no key, use any available partition
>> 2.2 If a message has a key and the user has defined a custom partitioner,
>> use it to map the key to a partition id
>> 2.3 If a message has a key and the user has not defined a custom
>> partitioner, use the default hash based partitioner that ships with Kafka
>>
>> Thanks,
>> Neha
>>
>>
>> On Wed, May 22, 2013 at 1:33 PM, Dave Peterson <ds...@tagged.com>wrote:
>>
>>> Ok, the picture I have in my mind of how things work in 0.8 (from a
>>> producer's point of view) is as follows:
>>>
>>>     1.  An application program sends log messages to a producer.  Each
>>>         message is provided as a key/value pair, where the key is chosen
>>>         by the application and the value is the message contents.  By its
>>>         choice of key, the application may influence or control which
>>>         partition the message gets sent to.
>>>
>>>     2.  The producer receives messages as key/value pairs.  From talking
>>>         with zookeeper, it knows the set of available brokers and which
>>>         partitions each broker has.  If the sending application provided a
>>> key
>>>         for a given message, the contents of the key may optionally
>>>         influence the producer's choice of broker and partition to send the
>>>         message to, according to some convention understood by both
>>>         application program and producer.
>>>
>>> Is this correct?
>>>
>>> Thanks,
>>> Dave
>>>
>>> On Wed, May 22, 2013 at 9:28 AM, Jun Rao <ju...@gmail.com> wrote:
>>>> Dave,
>>>>
>>>> Currently, the broker expects each producer request to specify the exact
>>>> partition id (-1 is no longer valid). The mapping from a message to a
>>>> partition is done at the producer client. The producer can choose a
>>> random
>>>> partition (from the existing list of partitions) or deterministically
>>>> choose a partition based on the key.
>>>>
>>>> Thanks,
>>>>
>>>> Jun
>>>>
>>>>
>>>> On Tue, May 21, 2013 at 1:12 PM, Dave Peterson <dspeterson@tagged.com
>>>> wrote:
>>>>
>>>>> In my case, there is a load balancer between the producers and the
>>>>> brokers, so I want the behavior described for the Java client (null key
>>>>> specifies "any partition").  If the Key field of each individual message
>>>>> specifies the partition to send it to, then I don't understand the
>>> purpose
>>>>> of the 32-bit partition identifier that precedes each message set in a
>>>>> produce request: what if a produce request specifies "partition N" for a
>>>>> given message set, and then each individual message in the set
>>>>> specifies a different partition in its Key field?  Also, the above-
>>>>> mentioned partition identifier is a 32-bit integer and the Key field of
>>>>> each individual message can contain data of arbitrary length, which
>>>>> seems inconsistent.  Is a partition identifier a 32-bit integer, or can
>>> it
>>>>> be of arbitrary length?
>>>>>
>>>>> Thanks,
>>>>> Dave
>>>>>
>>>>> On Tue, May 21, 2013 at 12:30 PM, Neha Narkhede <
>>> neha.narkhede@gmail.com>
>>>>> wrote:
>>>>>> Dave,
>>>>>>
>>>>>> Colin described the producer behavior of picking the partition for a
>>>>>> message before it is sent to Kafka broker correctly. However, I'm
>>>>>> interested in knowing your use case a little before to see why you
>>> would
>>>>>> rather have the broker decide the partition?
>>>>>>
>>>>>> Thanks,
>>>>>> Neha
>>>>>>
>>>>>>
>>>>>> On Tue, May 21, 2013 at 12:05 PM, Colin Blower <cblower@barracuda.com
>>>>>> wrote:
>>>>>>
>>>>>>> The key is used by the client to decide which partition to send the
>>>>>>> message to. By the time the client is creating the produce request,
>>> it
>>>>>>> should be known which partition each message is being sent to. I
>>> believe
>>>>>>> Neha described the behavior of the Java client which sends messages
>>> with
>>>>>>> a null key to any partition.
>>>>>>>
>>>>>>> The key is described in past tense because of the use case for
>>>>>>> persisting keys with messages. The key is persisted through the
>>> broker
>>>>>>> so that a consumer knows what key was used to partition the message
>>> on
>>>>>>> the producer side.
>>>>>>>
>>>>>>> I don't believe that you can have the broker decide which partition a
>>>>>>> message goes to.
>>>>>>>
>>>>>>> --
>>>>>>> Colin B.
>>>>>>>
>>>>>>> On 05/21/2013 11:48 AM, Dave Peterson wrote:
>>>>>>>> I'm looking at the document entitled "A Guide to the Kafka
>>> Protocol"
>>>>>>>> located here:
>>>>>>>>
>>>>>>>>
>>> https://cwiki.apache.org/KAFKA/a-guide-to-the-kafka-protocol.html
>>>>>>>> It shows a produce request as containing a number of message sets,
>>>>> which
>>>>>>> are
>>>>>>>> grouped first by topic and second by partition (a 32-bit integer).
>>>>>>>> However, each
>>>>>>>> message in a message set contains a Key field, which is described
>>> as
>>>>>>> follows:
>>>>>>>>     The key is an optional message key that was used for partition
>>>>>>> assignment.
>>>>>>>>     The key can be null.
>>>>>>>>
>>>>>>>> I notice the use of "was" (past tense) above.  That seems to
>>> suggest
>>>>>>> that the
>>>>>>>> Key field was once used to specify a partition (at the granularity
>>> of
>>>>>>> each
>>>>>>>> individual message), but the plan for the future is to instead use
>>> the
>>>>>>> 32-bit
>>>>>>>> partition value preceding each message set.  Is this correct?  If
>>> so,
>>>>>>> when I am
>>>>>>>> creating a produce request for 0.8, what should I use for the
>>> 32-bit
>>>>>>> partition
>>>>>>>> value, and how does this relate to the Key field of each individual
>>>>>>> message?
>>>>>>>> Ideally, I would like to just send a produce request and let the
>>>>> broker
>>>>>>> choose
>>>>>>>> the partition.  How do I accomplish this in 0.8, and are there
>>> plans
>>>>> to
>>>>>>> change
>>>>>>>> this after 0.8?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Dave
>>>>>>>>
>>>>>>>> On Tue, May 21, 2013 at 10:47 AM, Neha Narkhede <
>>>>> neha.narkhede@gmail.com>
>>>>>>> wrote:
>>>>>>>>> No. In 0.8, if you don't specify a key for a message, it is sent
>>> to
>>>>> any
>>>>>>> of
>>>>>>>>> the available partitions. In other words, the partition id is
>>>>> selected
>>>>>>> on
>>>>>>>>> the producer and the server doesn't get -1 as the partition id.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Neha
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, May 21, 2013 at 9:54 AM, Dave Peterson <
>>>>> dspeterson@tagged.com
>>>>>>>> wrote:
>>>>>>>>>> In the version 0.8 wire format for a produce request, does a
>>> value
>>>>> of
>>>>>>> -1
>>>>>>>>>> still indicate "use a random partition" as it did for 0.7?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Dave
>>>>>>>>>>
>>>>>>>
>>>>>>>

Re: produce request wire format question

Posted by Dave Peterson <ds...@tagged.com>.

Ok, thanks for the information.  Looking at the wire format for the
metadata response, I see that the right hand side of the TopicMetadata
production contains a TopicErrorCode, and the right hand side of the
PartitionMetadata production contains a PartitionErrorCode.  Are both
of these 16-bit values?  In general, where it isn't stated explicitly in
the documentation, can I assume that all error codes are 16-bit values?

Thanks,
Dave


On Wed, May 22, 2013 at 4:29 PM, Neha Narkhede <ne...@gmail.com> wrote:
> 1. Correct
> 2. The producer does not use or depend on zookeeper anymore. It refreshes
> its view of the cluster metadata by using a TopicMetadataRequest to any of
> the kafka brokers. It maps a message to a partition using the following
> rules -
> 2.1 If a message has no key, use any available partition
> 2.2 If a message has a key and the user has defined a custom partitioner,
> use it to map the key to a partition id
> 2.3 If a message has a key and the user has not defined a custom
> partitioner, use the default hash based partitioner that ships with Kafka
>
> Thanks,
> Neha
>
>
> On Wed, May 22, 2013 at 1:33 PM, Dave Peterson <ds...@tagged.com>wrote:
>
>> Ok, the picture I have in my mind of how things work in 0.8 (from a
>> producer's point of view) is as follows:
>>
>>     1.  An application program sends log messages to a producer.  Each
>>         message is provided as a key/value pair, where the key is chosen
>>         by the application and the value is the message contents.  By its
>>         choice of key, the application may influence or control which
>>         partition the message gets sent to.
>>
>>     2.  The producer receives messages as key/value pairs.  From talking
>>         with zookeeper, it knows the set of available brokers and which
>>         partitions each broker has.  If the sending application provided a
>> key
>>         for a given message, the contents of the key may optionally
>>         influence the producer's choice of broker and partition to send the
>>         message to, according to some convention understood by both
>>         application program and producer.
>>
>> Is this correct?
>>
>> Thanks,
>> Dave
>>
>> On Wed, May 22, 2013 at 9:28 AM, Jun Rao <ju...@gmail.com> wrote:
>> > Dave,
>> >
>> > Currently, the broker expects each producer request to specify the exact
>> > partition id (-1 is no longer valid). The mapping from a message to a
>> > partition is done at the producer client. The producer can choose a
>> random
>> > partition (from the existing list of partitions) or deterministically
>> > choose a partition based on the key.
>> >
>> > Thanks,
>> >
>> > Jun
>> >
>> >
>> > On Tue, May 21, 2013 at 1:12 PM, Dave Peterson <dspeterson@tagged.com
>> >wrote:
>> >
>> >> In my case, there is a load balancer between the producers and the
>> >> brokers, so I want the behavior described for the Java client (null key
>> >> specifies "any partition").  If the Key field of each individual message
>> >> specifies the partition to send it to, then I don't understand the
>> purpose
>> >> of the 32-bit partition identifier that precedes each message set in a
>> >> produce request: what if a produce request specifies "partition N" for a
>> >> given message set, and then each individual message in the set
>> >> specifies a different partition in its Key field?  Also, the above-
>> >> mentioned partition identifier is a 32-bit integer and the Key field of
>> >> each individual message can contain data of arbitrary length, which
>> >> seems inconsistent.  Is a partition identifier a 32-bit integer, or can
>> it
>> >> be of arbitrary length?
>> >>
>> >> Thanks,
>> >> Dave
>> >>
>> >> On Tue, May 21, 2013 at 12:30 PM, Neha Narkhede <
>> neha.narkhede@gmail.com>
>> >> wrote:
>> >> > Dave,
>> >> >
>> >> > Colin described the producer behavior of picking the partition for a
>> >> > message before it is sent to Kafka broker correctly. However, I'm
>> >> > interested in knowing your use case a little before to see why you
>> would
>> >> > rather have the broker decide the partition?
>> >> >
>> >> > Thanks,
>> >> > Neha
>> >> >
>> >> >
>> >> > On Tue, May 21, 2013 at 12:05 PM, Colin Blower <cblower@barracuda.com
>> >> >wrote:
>> >> >
>> >> >> The key is used by the client to decide which partition to send the
>> >> >> message to. By the time the client is creating the produce request,
>> it
>> >> >> should be known which partition each message is being sent to. I
>> believe
>> >> >> Neha described the behavior of the Java client which sends messages
>> with
>> >> >> a null key to any partition.
>> >> >>
>> >> >> The key is described in past tense because of the use case for
>> >> >> persisting keys with messages. The key is persisted through the
>> broker
>> >> >> so that a consumer knows what key was used to partition the message
>> on
>> >> >> the producer side.
>> >> >>
>> >> >> I don't believe that you can have the broker decide which partition a
>> >> >> message goes to.
>> >> >>
>> >> >> --
>> >> >> Colin B.
>> >> >>
>> >> >> On 05/21/2013 11:48 AM, Dave Peterson wrote:
>> >> >> > I'm looking at the document entitled "A Guide to the Kafka
>> Protocol"
>> >> >> > located here:
>> >> >> >
>> >> >> >
>> https://cwiki.apache.org/KAFKA/a-guide-to-the-kafka-protocol.html
>> >> >> >
>> >> >> > It shows a produce request as containing a number of message sets,
>> >> which
>> >> >> are
>> >> >> > grouped first by topic and second by partition (a 32-bit integer).
>> >> >> > However, each
>> >> >> > message in a message set contains a Key field, which is described
>> as
>> >> >> follows:
>> >> >> >
>> >> >> >     The key is an optional message key that was used for partition
>> >> >> assignment.
>> >> >> >     The key can be null.
>> >> >> >
>> >> >> > I notice the use of "was" (past tense) above.  That seems to
>> suggest
>> >> >> that the
>> >> >> > Key field was once used to specify a partition (at the granularity
>> of
>> >> >> each
>> >> >> > individual message), but the plan for the future is to instead use
>> the
>> >> >> 32-bit
>> >> >> > partition value preceding each message set.  Is this correct?  If
>> so,
>> >> >> when I am
>> >> >> > creating a produce request for 0.8, what should I use for the
>> 32-bit
>> >> >> partition
>> >> >> > value, and how does this relate to the Key field of each individual
>> >> >> message?
>> >> >> > Ideally, I would like to just send a produce request and let the
>> >> broker
>> >> >> choose
>> >> >> > the partition.  How do I accomplish this in 0.8, and are there
>> plans
>> >> to
>> >> >> change
>> >> >> > this after 0.8?
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Dave
>> >> >> >
>> >> >> > On Tue, May 21, 2013 at 10:47 AM, Neha Narkhede <
>> >> neha.narkhede@gmail.com>
>> >> >> wrote:
>> >> >> >> No. In 0.8, if you don't specify a key for a message, it is sent
>> to
>> >> any
>> >> >> of
>> >> >> >> the available partitions. In other words, the partition id is
>> >> selected
>> >> >> on
>> >> >> >> the producer and the server doesn't get -1 as the partition id.
>> >> >> >>
>> >> >> >> Thanks,
>> >> >> >> Neha
>> >> >> >>
>> >> >> >>
>> >> >> >> On Tue, May 21, 2013 at 9:54 AM, Dave Peterson <
>> >> dspeterson@tagged.com
>> >> >> >wrote:
>> >> >> >>
>> >> >> >>> In the version 0.8 wire format for a produce request, does a
>> value
>> >> of
>> >> >> -1
>> >> >> >>> still indicate "use a random partition" as it did for 0.7?
>> >> >> >>>
>> >> >> >>> Thanks,
>> >> >> >>> Dave
>> >> >> >>>
>> >> >>
>> >> >>
>> >> >>
>> >>
>>

Re: produce request wire format question

Posted by Neha Narkhede <ne...@gmail.com>.

1. Correct
2. The producer does not use or depend on ZooKeeper anymore. It refreshes
its view of the cluster metadata by sending a TopicMetadataRequest to any of
the Kafka brokers. It maps a message to a partition using the following
rules:
2.1 If a message has no key, use any available partition
2.2 If a message has a key and the user has defined a custom partitioner,
use it to map the key to a partition id
2.3 If a message has a key and the user has not defined a custom
partitioner, use the default hash-based partitioner that ships with Kafka
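
The three rules above can be sketched roughly as follows. This is an
illustration only, not the code of the Kafka producer: the function name
choose_partition, its parameters, and the use of CRC32 as the hash are
all assumptions made for the example, and the hash that the default
partitioner really uses may differ.

```python
import random
import zlib


def choose_partition(key, partition_ids, partitioner=None):
    """Pick a partition id on the client side, following rules 2.1-2.3.

    key           -- bytes, or None when the message has no key
    partition_ids -- non-empty list of partition ids known from metadata
    partitioner   -- optional callable (key, num_partitions) -> index
    """
    if not partition_ids:
        raise ValueError("no partitions available")
    if key is None:
        # Rule 2.1: no key, so any available partition will do.
        return random.choice(partition_ids)
    if partitioner is not None:
        # Rule 2.2: a user-supplied partitioner maps the key.
        index = partitioner(key, len(partition_ids))
    else:
        # Rule 2.3: stand-in hash; CRC32 is an assumption for this sketch.
        index = zlib.crc32(key) % len(partition_ids)
    return partition_ids[index]


print(choose_partition(b"user-42", [0, 1, 2]))
```

Whatever the choice, the producer has to write a concrete partition id
into the produce request, since the broker no longer accepts -1.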

Thanks,
Neha


On Wed, May 22, 2013 at 1:33 PM, Dave Peterson <ds...@tagged.com>wrote:

> Ok, the picture I have in my mind of how things work in 0.8 (from a
> producer's point of view) is as follows:
>
>     1.  An application program sends log messages to a producer.  Each
>         message is provided as a key/value pair, where the key is chosen
>         by the application and the value is the message contents.  By its
>         choice of key, the application may influence or control which
>         partition the message gets sent to.
>
>     2.  The producer receives messages as key/value pairs.  From talking
>         with zookeeper, it knows the set of available brokers and which
>         partitions each broker has.  If the sending application provided a
> key
>         for a given message, the contents of the key may optionally
>         influence the producer's choice of broker and partition to send the
>         message to, according to some convention understood by both
>         application program and producer.
>
> Is this correct?
>
> Thanks,
> Dave
>
> On Wed, May 22, 2013 at 9:28 AM, Jun Rao <ju...@gmail.com> wrote:
> > Dave,
> >
> > Currently, the broker expects each producer request to specify the exact
> > partition id (-1 is no longer valid). The mapping from a message to a
> > partition is done at the producer client. The producer can choose a
> random
> > partition (from the existing list of partitions) or deterministically
> > choose a partition based on the key.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Tue, May 21, 2013 at 1:12 PM, Dave Peterson <dspeterson@tagged.com
> >wrote:
> >
> >> In my case, there is a load balancer between the producers and the
> >> brokers, so I want the behavior described for the Java client (null key
> >> specifies "any partition").  If the Key field of each individual message
> >> specifies the partition to send it to, then I don't understand the
> purpose
> >> of the 32-bit partition identifier that precedes each message set in a
> >> produce request: what if a produce request specifies "partition N" for a
> >> given message set, and then each individual message in the set
> >> specifies a different partition in its Key field?  Also, the above-
> >> mentioned partition identifier is a 32-bit integer and the Key field of
> >> each individual message can contain data of arbitrary length, which
> >> seems inconsistent.  Is a partition identifier a 32-bit integer, or can
> it
> >> be of arbitrary length?
> >>
> >> Thanks,
> >> Dave
> >>
> >> On Tue, May 21, 2013 at 12:30 PM, Neha Narkhede <
> neha.narkhede@gmail.com>
> >> wrote:
> >> > Dave,
> >> >
> >> > Colin described the producer behavior of picking the partition for a
> >> > message before it is sent to Kafka broker correctly. However, I'm
> >> > interested in knowing your use case a little before to see why you
> would
> >> > rather have the broker decide the partition?
> >> >
> >> > Thanks,
> >> > Neha
> >> >
> >> >
> >> > On Tue, May 21, 2013 at 12:05 PM, Colin Blower <cblower@barracuda.com
> >> >wrote:
> >> >
> >> >> The key is used by the client to decide which partition to send the
> >> >> message to. By the time the client is creating the produce request,
> it
> >> >> should be known which partition each message is being sent to. I
> believe
> >> >> Neha described the behavior of the Java client which sends messages
> with
> >> >> a null key to any partition.
> >> >>
> >> >> The key is described in past tense because of the use case for
> >> >> persisting keys with messages. The key is persisted through the
> broker
> >> >> so that a consumer knows what key was used to partition the message
> on
> >> >> the producer side.
> >> >>
> >> >> I don't believe that you can have the broker decide which partition a
> >> >> message goes to.
> >> >>
> >> >> --
> >> >> Colin B.
> >> >>
> >> >> On 05/21/2013 11:48 AM, Dave Peterson wrote:
> >> >> > I'm looking at the document entitled "A Guide to the Kafka
> Protocol"
> >> >> > located here:
> >> >> >
> >> >> >
> https://cwiki.apache.org/KAFKA/a-guide-to-the-kafka-protocol.html
> >> >> >
> >> >> > It shows a produce request as containing a number of message sets,
> >> which
> >> >> are
> >> >> > grouped first by topic and second by partition (a 32-bit integer).
> >> >> > However, each
> >> >> > message in a message set contains a Key field, which is described
> as
> >> >> follows:
> >> >> >
> >> >> >     The key is an optional message key that was used for partition
> >> >> assignment.
> >> >> >     The key can be null.
> >> >> >
> >> >> > I notice the use of "was" (past tense) above.  That seems to
> suggest
> >> >> that the
> >> >> > Key field was once used to specify a partition (at the granularity
> of
> >> >> each
> >> >> > individual message), but the plan for the future is to instead use
> the
> >> >> 32-bit
> >> >> > partition value preceding each message set.  Is this correct?  If
> so,
> >> >> when I am
> >> >> > creating a produce request for 0.8, what should I use for the
> 32-bit
> >> >> partition
> >> >> > value, and how does this relate to the Key field of each individual
> >> >> message?
> >> >> > Ideally, I would like to just send a produce request and let the
> >> broker
> >> >> choose
> >> >> > the partition.  How do I accomplish this in 0.8, and are there
> plans
> >> to
> >> >> change
> >> >> > this after 0.8?
> >> >> >
> >> >> > Thanks,
> >> >> > Dave
> >> >> >
> >> >> > On Tue, May 21, 2013 at 10:47 AM, Neha Narkhede <
> >> neha.narkhede@gmail.com>
> >> >> wrote:
> >> >> >> No. In 0.8, if you don't specify a key for a message, it is sent
> to
> >> any
> >> >> of
> >> >> >> the available partitions. In other words, the partition id is
> >> selected
> >> >> on
> >> >> >> the producer and the server doesn't get -1 as the partition id.
> >> >> >>
> >> >> >> Thanks,
> >> >> >> Neha
> >> >> >>
> >> >> >>
> >> >> >> On Tue, May 21, 2013 at 9:54 AM, Dave Peterson <
> >> dspeterson@tagged.com
> >> >> >wrote:
> >> >> >>
> >> >> >>> In the version 0.8 wire format for a produce request, does a
> value
> >> of
> >> >> -1
> >> >> >>> still indicate "use a random partition" as it did for 0.7?
> >> >> >>>
> >> >> >>> Thanks,
> >> >> >>> Dave
> >> >> >>>
> >> >>
> >> >>
> >> >>
> >>
>

Re: produce request wire format question

Posted by Dave Peterson <ds...@tagged.com>.

Ok, the picture I have in my mind of how things work in 0.8 (from a
producer's point of view) is as follows:

    1.  An application program sends log messages to a producer.  Each
        message is provided as a key/value pair, where the key is chosen
        by the application and the value is the message contents.  By its
        choice of key, the application may influence or control which
        partition the message gets sent to.

    2.  The producer receives messages as key/value pairs.  From talking
        with zookeeper, it knows the set of available brokers and which
        partitions each broker has.  If the sending application provided a key
        for a given message, the contents of the key may optionally
        influence the producer's choice of broker and partition to send the
        message to, according to some convention understood by both
        application program and producer.

Is this correct?

Thanks,
Dave

On Wed, May 22, 2013 at 9:28 AM, Jun Rao <ju...@gmail.com> wrote:
> Dave,
>
> Currently, the broker expects each producer request to specify the exact
> partition id (-1 is no longer valid). The mapping from a message to a
> partition is done at the producer client. The producer can choose a random
> partition (from the existing list of partitions) or deterministically
> choose a partition based on the key.
>
> Thanks,
>
> Jun
>
>
> On Tue, May 21, 2013 at 1:12 PM, Dave Peterson <ds...@tagged.com> wrote:
>
>> In my case, there is a load balancer between the producers and the
>> brokers, so I want the behavior described for the Java client (null key
>> specifies "any partition").  If the Key field of each individual message
>> specifies the partition to send it to, then I don't understand the purpose
>> of the 32-bit partition identifier that precedes each message set in a
>> produce request: what if a produce request specifies "partition N" for a
>> given message set, and then each individual message in the set
>> specifies a different partition in its Key field?  Also, the above-
>> mentioned partition identifier is a 32-bit integer and the Key field of
>> each individual message can contain data of arbitrary length, which
>> seems inconsistent.  Is a partition identifier a 32-bit integer, or can it
>> be of arbitrary length?
>>
>> Thanks,
>> Dave
>>
>> On Tue, May 21, 2013 at 12:30 PM, Neha Narkhede <ne...@gmail.com>
>> wrote:
>> > Dave,
>> >
>> > Colin described the producer behavior of picking the partition for a
>> > message before it is sent to Kafka broker correctly. However, I'm
>> > interested in knowing your use case a little better to see why you would
>> > rather have the broker decide the partition?
>> >
>> > Thanks,
>> > Neha
>> >
>> >
>> > On Tue, May 21, 2013 at 12:05 PM, Colin Blower <cblower@barracuda.com
>> >wrote:
>> >
>> >> The key is used by the client to decide which partition to send the
>> >> message to. By the time the client is creating the produce request, it
>> >> should be known which partition each message is being sent to. I believe
>> >> Neha described the behavior of the Java client which sends messages with
>> >> a null key to any partition.
>> >>
>> >> The key is described in past tense because of the use case for
>> >> persisting keys with messages. The key is persisted through the broker
>> >> so that a consumer knows what key was used to partition the message on
>> >> the producer side.
>> >>
>> >> I don't believe that you can have the broker decide which partition a
>> >> message goes to.
>> >>
>> >> --
>> >> Colin B.
>> >>
>> >> On 05/21/2013 11:48 AM, Dave Peterson wrote:
>> >> > I'm looking at the document entitled "A Guide to the Kafka Protocol"
>> >> > located here:
>> >> >
>> >> >     https://cwiki.apache.org/KAFKA/a-guide-to-the-kafka-protocol.html
>> >> >
>> >> > It shows a produce request as containing a number of message sets,
>> which
>> >> are
>> >> > grouped first by topic and second by partition (a 32-bit integer).
>> >> > However, each
>> >> > message in a message set contains a Key field, which is described as
>> >> follows:
>> >> >
>> >> >     The key is an optional message key that was used for partition
>> >> assignment.
>> >> >     The key can be null.
>> >> >
>> >> > I notice the use of "was" (past tense) above.  That seems to suggest
>> >> that the
>> >> > Key field was once used to specify a partition (at the granularity of
>> >> each
>> >> > individual message), but the plan for the future is to instead use the
>> >> 32-bit
>> >> > partition value preceding each message set.  Is this correct?  If so,
>> >> when I am
>> >> > creating a produce request for 0.8, what should I use for the 32-bit
>> >> partition
>> >> > value, and how does this relate to the Key field of each individual
>> >> message?
>> >> > Ideally, I would like to just send a produce request and let the
>> broker
>> >> choose
>> >> > the partition.  How do I accomplish this in 0.8, and are there plans
>> to
>> >> change
>> >> > this after 0.8?
>> >> >
>> >> > Thanks,
>> >> > Dave
>> >> >
>> >> > On Tue, May 21, 2013 at 10:47 AM, Neha Narkhede <
>> neha.narkhede@gmail.com>
>> >> wrote:
>> >> >> No. In 0.8, if you don't specify a key for a message, it is sent to
>> any
>> >> of
>> >> >> the available partitions. In other words, the partition id is
>> selected
>> >> on
>> >> >> the producer and the server doesn't get -1 as the partition id.
>> >> >>
>> >> >> Thanks,
>> >> >> Neha
>> >> >>
>> >> >>
>> >> >> On Tue, May 21, 2013 at 9:54 AM, Dave Peterson <
>> dspeterson@tagged.com
>> >> >wrote:
>> >> >>
>> >> >>> In the version 0.8 wire format for a produce request, does a value
>> of
>> >> -1
>> >> >>> still indicate "use a random partition" as it did for 0.7?
>> >> >>>
>> >> >>> Thanks,
>> >> >>> Dave
>> >> >>>
>> >>
>> >>
>> >>
>>

Re: produce request wire format question

Posted by Jun Rao <ju...@gmail.com>.

Dave,

Currently, the broker expects each producer request to specify the exact
partition id (-1 is no longer valid). The mapping from a message to a
partition is done at the producer client. The producer can choose a random
partition (from the existing list of partitions) or deterministically
choose a partition based on the key.
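
The selection described above can be sketched roughly as follows. This is
only an illustration, not the code of any actual Kafka client: the function
name choose_partition is hypothetical, and crc32 is a stand-in hash chosen
for the example.

```python
import random
import zlib


def choose_partition(key, partition_ids, rng=random):
    """Pick a partition id on the producer side (illustrative only)."""
    if not partition_ids:
        raise ValueError("no partitions available")
    if key is None:
        # No key: any existing partition will do.
        return rng.choice(partition_ids)
    # Key present: map it deterministically onto one partition.
    # crc32 is an assumed stand-in, not necessarily what a real client uses.
    return partition_ids[zlib.crc32(key) % len(partition_ids)]
```

Either way, the result is a concrete partition id taken from the existing
list, so the request that reaches the broker never carries -1.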

Thanks,

Jun


On Tue, May 21, 2013 at 1:12 PM, Dave Peterson <ds...@tagged.com> wrote:

> In my case, there is a load balancer between the producers and the
> brokers, so I want the behavior described for the Java client (null key
> specifies "any partition").  If the Key field of each individual message
> specifies the partition to send it to, then I don't understand the purpose
> of the 32-bit partition identifier that precedes each message set in a
> produce request: what if a produce request specifies "partition N" for a
> given message set, and then each individual message in the set
> specifies a different partition in its Key field?  Also, the above-
> mentioned partition identifier is a 32-bit integer and the Key field of
> each individual message can contain data of arbitrary length, which
> seems inconsistent.  Is a partition identifier a 32-bit integer, or can it
> be of arbitrary length?
>
> Thanks,
> Dave
>
> On Tue, May 21, 2013 at 12:30 PM, Neha Narkhede <ne...@gmail.com>
> wrote:
> > Dave,
> >
> > Colin described the producer behavior of picking the partition for a
> > message before it is sent to Kafka broker correctly. However, I'm
> > interested in knowing your use case a little better to see why you would
> > rather have the broker decide the partition?
> >
> > Thanks,
> > Neha
> >
> >
> > On Tue, May 21, 2013 at 12:05 PM, Colin Blower <cblower@barracuda.com
> >wrote:
> >
> >> The key is used by the client to decide which partition to send the
> >> message to. By the time the client is creating the produce request, it
> >> should be known which partition each message is being sent to. I believe
> >> Neha described the behavior of the Java client which sends messages with
> >> a null key to any partition.
> >>
> >> The key is described in past tense because of the use case for
> >> persisting keys with messages. The key is persisted through the broker
> >> so that a consumer knows what key was used to partition the message on
> >> the producer side.
> >>
> >> I don't believe that you can have the broker decide which partition a
> >> message goes to.
> >>
> >> --
> >> Colin B.
> >>
> >> On 05/21/2013 11:48 AM, Dave Peterson wrote:
> >> > I'm looking at the document entitled "A Guide to the Kafka Protocol"
> >> > located here:
> >> >
> >> >     https://cwiki.apache.org/KAFKA/a-guide-to-the-kafka-protocol.html
> >> >
> >> > It shows a produce request as containing a number of message sets,
> which
> >> are
> >> > grouped first by topic and second by partition (a 32-bit integer).
> >> > However, each
> >> > message in a message set contains a Key field, which is described as
> >> follows:
> >> >
> >> >     The key is an optional message key that was used for partition
> >> assignment.
> >> >     The key can be null.
> >> >
> >> > I notice the use of "was" (past tense) above.  That seems to suggest
> >> that the
> >> > Key field was once used to specify a partition (at the granularity of
> >> each
> >> > individual message), but the plan for the future is to instead use the
> >> 32-bit
> >> > partition value preceding each message set.  Is this correct?  If so,
> >> when I am
> >> > creating a produce request for 0.8, what should I use for the 32-bit
> >> partition
> >> > value, and how does this relate to the Key field of each individual
> >> message?
> >> > Ideally, I would like to just send a produce request and let the
> broker
> >> choose
> >> > the partition.  How do I accomplish this in 0.8, and are there plans
> to
> >> change
> >> > this after 0.8?
> >> >
> >> > Thanks,
> >> > Dave
> >> >
> >> > On Tue, May 21, 2013 at 10:47 AM, Neha Narkhede <
> neha.narkhede@gmail.com>
> >> wrote:
> >> >> No. In 0.8, if you don't specify a key for a message, it is sent to
> any
> >> of
> >> >> the available partitions. In other words, the partition id is
> selected
> >> on
> >> >> the producer and the server doesn't get -1 as the partition id.
> >> >>
> >> >> Thanks,
> >> >> Neha
> >> >>
> >> >>
> >> >> On Tue, May 21, 2013 at 9:54 AM, Dave Peterson <
> dspeterson@tagged.com
> >> >wrote:
> >> >>
> >> >>> In the version 0.8 wire format for a produce request, does a value
> of
> >> -1
> >> >>> still indicate "use a random partition" as it did for 0.7?
> >> >>>
> >> >>> Thanks,
> >> >>> Dave
> >> >>>
> >>
> >>
> >>
>

Re: produce request wire format question

Posted by Dave Peterson <ds...@tagged.com>.

In my case, there is a load balancer between the producers and the
brokers, so I want the behavior described for the Java client (null key
specifies "any partition").  If the Key field of each individual message
specifies the partition to send it to, then I don't understand the purpose
of the 32-bit partition identifier that precedes each message set in a
produce request: what if a produce request specifies "partition N" for a
given message set, and then each individual message in the set
specifies a different partition in its Key field?  Also, the above-
mentioned partition identifier is a 32-bit integer and the Key field of
each individual message can contain data of arbitrary length, which
seems inconsistent.  Is a partition identifier a 32-bit integer, or can it
be of arbitrary length?
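
To make the two fields in question concrete, here is a minimal sketch of how
each might be serialized. It assumes my reading of the protocol guide: the
partition id is a fixed-width signed 32-bit integer in network byte order,
and the Key is a variable-length byte field prefixed by an int32 length,
with -1 marking a null key. The function names are hypothetical, and the
details should be checked against the guide.

```python
import struct


def encode_partition_id(partition_id):
    """Fixed-width field: signed 32-bit integer, big-endian."""
    return struct.pack(">i", partition_id)


def encode_key(key):
    """Variable-length field: int32 length prefix followed by the bytes.

    A null key is written as length -1 with no payload (assumed encoding).
    """
    if key is None:
        return struct.pack(">i", -1)
    return struct.pack(">i", len(key)) + key
```

Under these assumptions the partition id always occupies four bytes, while
the Key is opaque data of any length that travels inside each message.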

Thanks,
Dave

On Tue, May 21, 2013 at 12:30 PM, Neha Narkhede <ne...@gmail.com> wrote:
> Dave,
>
> Colin described the producer behavior of picking the partition for a
> message before it is sent to Kafka broker correctly. However, I'm
> interested in knowing your use case a little better to see why you would
> rather have the broker decide the partition?
>
> Thanks,
> Neha
>
>
> On Tue, May 21, 2013 at 12:05 PM, Colin Blower <cb...@barracuda.com> wrote:
>
>> The key is used by the client to decide which partition to send the
>> message to. By the time the client is creating the produce request, it
>> should be known which partition each message is being sent to. I believe
>> Neha described the behavior of the Java client which sends messages with
>> a null key to any partition.
>>
>> The key is described in past tense because of the use case for
>> persisting keys with messages. The key is persisted through the broker
>> so that a consumer knows what key was used to partition the message on
>> the producer side.
>>
>> I don't believe that you can have the broker decide which partition a
>> message goes to.
>>
>> --
>> Colin B.
>>
>> On 05/21/2013 11:48 AM, Dave Peterson wrote:
>> > I'm looking at the document entitled "A Guide to the Kafka Protocol"
>> > located here:
>> >
>> >     https://cwiki.apache.org/KAFKA/a-guide-to-the-kafka-protocol.html
>> >
>> > It shows a produce request as containing a number of message sets, which
>> are
>> > grouped first by topic and second by partition (a 32-bit integer).
>> > However, each
>> > message in a message set contains a Key field, which is described as
>> follows:
>> >
>> >     The key is an optional message key that was used for partition
>> assignment.
>> >     The key can be null.
>> >
>> > I notice the use of "was" (past tense) above.  That seems to suggest
>> that the
>> > Key field was once used to specify a partition (at the granularity of
>> each
>> > individual message), but the plan for the future is to instead use the
>> 32-bit
>> > partition value preceding each message set.  Is this correct?  If so,
>> when I am
>> > creating a produce request for 0.8, what should I use for the 32-bit
>> partition
>> > value, and how does this relate to the Key field of each individual
>> message?
>> > Ideally, I would like to just send a produce request and let the broker
>> choose
>> > the partition.  How do I accomplish this in 0.8, and are there plans to
>> change
>> > this after 0.8?
>> >
>> > Thanks,
>> > Dave
>> >
>> > On Tue, May 21, 2013 at 10:47 AM, Neha Narkhede <ne...@gmail.com>
>> wrote:
>> >> No. In 0.8, if you don't specify a key for a message, it is sent to any
>> of
>> >> the available partitions. In other words, the partition id is selected
>> on
>> >> the producer and the server doesn't get -1 as the partition id.
>> >>
>> >> Thanks,
>> >> Neha
>> >>
>> >>
>> >> On Tue, May 21, 2013 at 9:54 AM, Dave Peterson <dspeterson@tagged.com
>> >wrote:
>> >>
>> >>> In the version 0.8 wire format for a produce request, does a value of
>> -1
>> >>> still indicate "use a random partition" as it did for 0.7?
>> >>>
>> >>> Thanks,
>> >>> Dave
>> >>>
>>
>>
>>

Re: produce request wire format question

Posted by Neha Narkhede <ne...@gmail.com>.

Dave,

Colin correctly described the producer behavior of picking the partition
for a message before it is sent to the Kafka broker. However, I'm
interested in understanding your use case a little better: why would you
rather have the broker decide the partition?

Thanks,
Neha


On Tue, May 21, 2013 at 12:05 PM, Colin Blower <cb...@barracuda.com> wrote:

> The key is used by the client to decide which partition to send the
> message to. By the time the client is creating the produce request, it
> should be known which partition each message is being sent to. I believe
> Neha described the behavior of the Java client which sends messages with
> a null key to any partition.
>
> The key is described in past tense because of the use case for
> persisting keys with messages. The key is persisted through the broker
> so that a consumer knows what key was used to partition the message on
> the producer side.
>
> I don't believe that you can have the broker decide which partition a
> message goes to.
>
> --
> Colin B.
>
> On 05/21/2013 11:48 AM, Dave Peterson wrote:
> > I'm looking at the document entitled "A Guide to the Kafka Protocol"
> > located here:
> >
> >     https://cwiki.apache.org/KAFKA/a-guide-to-the-kafka-protocol.html
> >
> > It shows a produce request as containing a number of message sets, which
> are
> > grouped first by topic and second by partition (a 32-bit integer).
> > However, each
> > message in a message set contains a Key field, which is described as
> follows:
> >
> >     The key is an optional message key that was used for partition
> assignment.
> >     The key can be null.
> >
> > I notice the use of "was" (past tense) above.  That seems to suggest
> that the
> > Key field was once used to specify a partition (at the granularity of
> each
> > individual message), but the plan for the future is to instead use the
> 32-bit
> > partition value preceding each message set.  Is this correct?  If so,
> when I am
> > creating a produce request for 0.8, what should I use for the 32-bit
> partition
> > value, and how does this relate to the Key field of each individual
> message?
> > Ideally, I would like to just send a produce request and let the broker
> choose
> > the partition.  How do I accomplish this in 0.8, and are there plans to
> change
> > this after 0.8?
> >
> > Thanks,
> > Dave
> >
> > On Tue, May 21, 2013 at 10:47 AM, Neha Narkhede <ne...@gmail.com>
> wrote:
> >> No. In 0.8, if you don't specify a key for a message, it is sent to any
> of
> >> the available partitions. In other words, the partition id is selected
> on
> >> the producer and the server doesn't get -1 as the partition id.
> >>
> >> Thanks,
> >> Neha
> >>
> >>
> >> On Tue, May 21, 2013 at 9:54 AM, Dave Peterson <dspeterson@tagged.com
> >wrote:
> >>
> >>> In the version 0.8 wire format for a produce request, does a value of
> -1
> >>> still indicate "use a random partition" as it did for 0.7?
> >>>
> >>> Thanks,
> >>> Dave
> >>>
>
>
>

Re: produce request wire format question

Posted by Colin Blower <cb...@barracuda.com>.

The key is used by the client to decide which partition to send the
message to. By the time the client is creating the produce request, it
should be known which partition each message is being sent to. I believe
Neha described the behavior of the Java client which sends messages with
a null key to any partition.

The key is described in past tense because of the use case for
persisting keys with messages. The key is persisted through the broker
so that a consumer knows what key was used to partition the message on
the producer side.

I don't believe that you can have the broker decide which partition a
message goes to.
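
As a rough sketch of that ordering (partition chosen first, request built
second), a client could group already-assigned messages like this. The
function name and the tuple layout are hypothetical, chosen only for the
example.

```python
from collections import defaultdict


def group_for_produce_request(records):
    """Group (topic, partition, key, value) tuples by topic, then partition.

    Every record must already carry its partition id. The key is kept next
    to the value so that it can be persisted along with the message.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for topic, partition, key, value in records:
        grouped[topic][partition].append((key, value))
    # Convert to plain dicts for easier inspection.
    return {topic: dict(parts) for topic, parts in grouped.items()}
```

Each inner list corresponds to one message set, addressed by the 32-bit
partition id that the client picked beforehand.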

--
Colin B.

On 05/21/2013 11:48 AM, Dave Peterson wrote:
> I'm looking at the document entitled "A Guide to the Kafka Protocol"
> located here:
>
>     https://cwiki.apache.org/KAFKA/a-guide-to-the-kafka-protocol.html
>
> It shows a produce request as containing a number of message sets, which are
> grouped first by topic and second by partition (a 32-bit integer).
> However, each
> message in a message set contains a Key field, which is described as follows:
>
>     The key is an optional message key that was used for partition assignment.
>     The key can be null.
>
> I notice the use of "was" (past tense) above.  That seems to suggest that the
> Key field was once used to specify a partition (at the granularity of each
> individual message), but the plan for the future is to instead use the 32-bit
> partition value preceding each message set.  Is this correct?  If so, when I am
> creating a produce request for 0.8, what should I use for the 32-bit partition
> value, and how does this relate to the Key field of each individual message?
> Ideally, I would like to just send a produce request and let the broker choose
> the partition.  How do I accomplish this in 0.8, and are there plans to change
> this after 0.8?
>
> Thanks,
> Dave
>
> On Tue, May 21, 2013 at 10:47 AM, Neha Narkhede <ne...@gmail.com> wrote:
>> No. In 0.8, if you don't specify a key for a message, it is sent to any of
>> the available partitions. In other words, the partition id is selected on
>> the producer and the server doesn't get -1 as the partition id.
>>
>> Thanks,
>> Neha
>>
>>
>> On Tue, May 21, 2013 at 9:54 AM, Dave Peterson <ds...@tagged.com> wrote:
>>
>>> In the version 0.8 wire format for a produce request, does a value of -1
>>> still indicate "use a random partition" as it did for 0.7?
>>>
>>> Thanks,
>>> Dave
>>>

Re: produce request wire format question

Posted by Dave Peterson <ds...@tagged.com>.

I'm looking at the document entitled "A Guide to the Kafka Protocol"
located here:

    https://cwiki.apache.org/KAFKA/a-guide-to-the-kafka-protocol.html

It shows a produce request as containing a number of message sets, which are
grouped first by topic and second by partition (a 32-bit integer).  However,
each message in a message set contains a Key field, which is described as
follows:

    The key is an optional message key that was used for partition assignment.
    The key can be null.

I notice the use of "was" (past tense) above.  That seems to suggest that the
Key field was once used to specify a partition (at the granularity of each
individual message), but the plan for the future is to instead use the 32-bit
partition value preceding each message set.  Is this correct?  If so, when I am
creating a produce request for 0.8, what should I use for the 32-bit partition
value, and how does this relate to the Key field of each individual message?
Ideally, I would like to just send a produce request and let the broker choose
the partition.  How do I accomplish this in 0.8, and are there plans to change
this after 0.8?

Thanks,
Dave

On Tue, May 21, 2013 at 10:47 AM, Neha Narkhede <ne...@gmail.com> wrote:
> No. In 0.8, if you don't specify a key for a message, it is sent to any of
> the available partitions. In other words, the partition id is selected on
> the producer and the server doesn't get -1 as the partition id.
>
> Thanks,
> Neha
>
>
> On Tue, May 21, 2013 at 9:54 AM, Dave Peterson <ds...@tagged.com> wrote:
>
>> In the version 0.8 wire format for a produce request, does a value of -1
>> still indicate "use a random partition" as it did for 0.7?
>>
>> Thanks,
>> Dave
>>

Re: produce request wire format question

Posted by Neha Narkhede <ne...@gmail.com>.

No. In 0.8, if you don't specify a key for a message, it is sent to any of
the available partitions. In other words, the partition id is selected on
the producer, and the server never receives -1 as the partition id.

Thanks,
Neha

On Tue, May 21, 2013 at 9:54 AM, Dave Peterson <ds...@tagged.com> wrote:

> In the version 0.8 wire format for a produce request, does a value of -1
> still indicate "use a random partition" as it did for 0.7?
>
> Thanks,
> Dave
>