Posted to users@kafka.apache.org by will martin <wm...@gmail.com> on 2012/08/20 17:02:23 UTC

7.1 support for List

This use case is defined by the following snippet from the Design section
of the doc pages.

class Producer {
  public void send(ProducerData)
  public void send(List<ProducerData>)
  public void close()
}

I've tried various composites for the List<ProducerData> argument,
including strings and Messages. All of these throw serialization errors
deep in the engine.

Is the list form of send supported in 7.1?

Thanks in advance,
mmartin

Re: 7.1 support for List

Posted by Felix GV <fe...@mate1inc.com>.
As for scalability being a fundamental aspect of Kafka's design and
implementation, besides the design doc, I guess this would be another
primary reference...

http://www.youtube.com/watch?v=Eq3i2m8aJBI

It's a pretty interesting video that touches on many aspects of Kafka, not
just scalability :)

--
Felix



On Tue, Aug 21, 2012 at 10:41 AM, Felix GV <fe...@mate1inc.com> wrote:

> What I meant is that Kafka has been designed first and foremost as a
> high-throughput system, and it achieves that with a couple of techniques,
> but mainly by batching a bunch of events together so that it can benefit
> from the lower overhead of writing sequentially (as opposed to random
> access).
>
> Whether you choose to publish synchronously or asynchronously should not
> change the fact that Kafka can achieve high throughput via batching.
>
> --
> Felix

Re: 7.1 support for List

Posted by Felix GV <fe...@mate1inc.com>.
What I meant is that Kafka has been designed first and foremost as a
high-throughput system, and it achieves that with a couple of techniques,
but mainly by batching a bunch of events together so that it can benefit
from the lower overhead of writing sequentially (as opposed to random
access).

Whether you choose to publish synchronously or asynchronously should not
change the fact that Kafka can achieve high throughput via batching.
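
A minimal sketch of the batching idea, in plain Java. It does not use
Kafka's classes: BatchingSender, batchSize and the sink (which stands in
for one sequential write) are hypothetical names chosen for illustration.
It only shows why buffering events reduces the number of writes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in, not a Kafka class: buffers events and hands them
// to the sink in groups, so one write carries many events.
class BatchingSender {
    private final int batchSize;
    private final Consumer<List<String>> sink; // stands in for one sequential write
    private final List<String> buffer = new ArrayList<>();
    private int writes = 0;

    BatchingSender(int batchSize, Consumer<List<String>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Buffers one event; flushes once the buffer holds batchSize events.
    void send(String event) {
        buffer.add(event);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Writes whatever is buffered as a single batch; does nothing when empty.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
        writes++;
    }

    // Number of batches handed to the sink so far.
    int writes() {
        return writes;
    }
}
```

With a batch size of 4, ten events followed by a final flush() reach the
sink in three writes instead of ten.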

--
Felix



On Mon, Aug 20, 2012 at 10:55 PM, wm <wm...@gmail.com> wrote:

> Felix. My regrets for confusing the matter.  Please inform me of a primary
> source for the canonical use case you reference, unless that was scoped to
> the kafka community only. That sort of statement should be clearly
> documented imho.
>
> I am considering the matter closed with respect to this list. I have 3
> publish options each with some degree of autonomy from the calling code's
> designed behavior.
>
> regards

Re: 7.1 support for List

Posted by wm <wm...@gmail.com>.
Felix. My regrets for confusing the matter.  Please inform me of a
primary source for the canonical use case you reference, unless that was 
scoped to the kafka community only. That sort of statement should be 
clearly documented imho.

I am considering the matter closed with respect to this list. I have 3 
publish options each with some degree of autonomy from the calling 
code's designed behavior.

regards

On 08/20/2012 02:39 PM, Felix GV wrote:
> I think the difference is merely that async publishing is a non-blocking
> call, whereas sync publishing is a blocking call, meaning that the code
> that does a sync publish call could choose to have an alternate behavior if
> the publish failed, whereas the code that does an async publish would never
> know whether the publish succeeded or not.
>
> But like I said, in both cases, you can configure the batching size at the
> producer level, and a batching size greater than 1 will provide you with
> better throughput capabilities... In fact, I think this is the canonical
> use case Kafka was originally built for.
>
> --
> Felix


Re: 7.1 support for List

Posted by Felix GV <fe...@mate1inc.com>.
I think the difference is merely that async publishing is a non-blocking
call, whereas sync publishing is a blocking call, meaning that the code
that does a sync publish call could choose to have an alternate behavior if
the publish failed, whereas the code that does an async publish would never
know whether the publish succeeded or not.

But like I said, in both cases, you can configure the batching size at the
producer level, and a batching size greater than 1 will provide you with
better throughput capabilities... In fact, I think this is the canonical
use case Kafka was originally built for.
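
A rough sketch of the blocking versus non-blocking difference described
above, in plain Java. Transport, SyncPublisher and AsyncPublisher are
hypothetical stand-ins and not Kafka's producer classes; the sketch only
illustrates where a failure ends up in each style.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for whatever actually delivers a message.
interface Transport {
    void write(String message) throws Exception;
}

// Blocking style: the caller waits for the write and sees any failure.
class SyncPublisher {
    private final Transport transport;

    SyncPublisher(Transport transport) {
        this.transport = transport;
    }

    void send(String message) throws Exception {
        transport.write(message);
    }
}

// Non-blocking style: send() only queues the work and returns.
class AsyncPublisher {
    private final Transport transport;
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final AtomicInteger failedCount = new AtomicInteger();

    AsyncPublisher(Transport transport) {
        this.transport = transport;
    }

    void send(String message) {
        worker.execute(() -> {
            try {
                transport.write(message);
            } catch (Exception e) {
                // The caller has already moved on, so it never sees this.
                failedCount.incrementAndGet();
            }
        });
    }

    // Drains queued sends, stops the worker, and reports how many failed.
    int close() {
        worker.shutdown();
        try {
            worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return failedCount.get();
    }
}
```

With a transport that always fails, SyncPublisher.send throws to the
caller, while AsyncPublisher.send returns normally and the failure is only
visible in the count returned by close().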

--
Felix



On Mon, Aug 20, 2012 at 2:24 PM, will martin <wm...@gmail.com> wrote:

> My understanding is that async is not meant to be an immediate send. As to
> batching, I've not delved into the code differences.
>
> But batching with the sync producer does not seem possible at the higher
> Producer level; at least, I've tried that and had no success. The default
> and string encoders cannot handle lists, although the documentation
> suggests they can.
>
> I'd be glad to be wrong on this, but I've had no luck getting the serializer
> deep in the Scala code tree to accept a composite of any type containing
> either Message or String. I can batch myself, but I doubt that is what any
> of us think is the design goal.

Re: 7.1 support for List

Posted by will martin <wm...@gmail.com>.
My understanding is that async is not meant to be an immediate send. As to
batching, I've not delved into the code differences.

But batching with the sync producer does not seem possible at the higher
Producer level; at least, I've tried that and had no success. The default
and string encoders cannot handle lists, although the documentation
suggests they can.

I'd be glad to be wrong on this, but I've had no luck getting the serializer
deep in the Scala code tree to accept a composite of any type containing
either Message or String. I can batch myself, but I doubt that is what any
of us think is the design goal.



On Mon, Aug 20, 2012 at 1:06 PM, Felix GV <fe...@mate1inc.com> wrote:

> This may not be entirely related to what you're talking about, but why
> would an async producer not be able to meet your throughput needs, and a
> sync producer be able to?
>
> Both sync and async producers can be configured to batch more than one
> message together, and that's pretty much the main thing that's required to
> be able to achieve good throughput, AFAIK.
>
> ...?
>
> --
> Felix

Re: 7.1 support for List

Posted by will martin <wm...@gmail.com>.
Resolved? Since a List is neither a Message nor a String, I realize it does
qualify as a different data type.

And the Design doc notes (I missed it earlier) that a "user defined encoder"
is needed for the composite forms of ProducerData.
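
A user-defined encoder for a list of strings might look roughly like the
sketch below. It is plain Java and makes assumptions: PayloadEncoder is a
hypothetical stand-in for an encoder contract (Kafka's own encoder
interface is not reproduced here and may differ), and the length-prefixed
layout is just one possible wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an encoder contract; not Kafka's interface.
interface PayloadEncoder<T> {
    byte[] encode(T value);
}

// Packs a list of strings into one payload: a count, then each string
// as a length-prefixed run of UTF-8 bytes.
class StringListEncoder implements PayloadEncoder<List<String>> {
    @Override
    public byte[] encode(List<String> values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.size());
            for (String value : values) {
                byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
                out.writeInt(utf8.length);
                out.write(utf8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for in-memory streams
        }
        return bytes.toByteArray();
    }

    // Inverse of encode, as a consumer-side decoder would need.
    static List<String> decode(byte[] payload) {
        List<String> values = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                byte[] utf8 = new byte[in.readInt()];
                in.readFully(utf8);
                values.add(new String(utf8, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }
}
```

The consumer side has to apply the matching decode step, since the broker
would only see one opaque payload per list.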

thanks for the help in working through this.



On Mon, Aug 20, 2012 at 1:06 PM, Felix GV <fe...@mate1inc.com> wrote:

> This may not be entirely related to what you're talking about, but why
> would an async producer not be able to meet your throughput needs, and a
> sync producer be able to?
>
> Both sync and async producers can be configured to batch more than one
> message together, and that's pretty much the main thing that's required to
> be able to achieve good throughput, AFAIK.
>
> ...?
>
> --
> Felix

Re: 7.1 support for List

Posted by Felix GV <fe...@mate1inc.com>.
This may not be entirely related to what you're talking about, but why
would an async producer not be able to meet your throughput needs, and a
sync producer be able to?

Both sync and async producers can be configured to batch more than one
message together, and that's pretty much the main thing that's required to
be able to achieve good throughput, AFAIK.

...?

--
Felix



On Mon, Aug 20, 2012 at 12:49 PM, will martin <wm...@gmail.com> wrote:

> Thanks Neha. All my data is of 1 type. The serializer in place doesn't seem
> to handle an array of String.
>
> The ProducerData I use is a collection of data of the same type wrapped in a
> single definition, as I read the spec. Am I to understand that having a
> producer batch records itself is unsupported? The async producer can't meet
> my throughput needs and, as I understand it, is targeted at implicit load
> balancing among different client machines.
>
> Additionally, the sync producer can meet my needs, but requires more use of
> the lower-level design features. For maintenance, it'd be great if I could
> create a list of Strings, create a ProducerData<String, List<String>> and
> have this be serialized.
>
> It occurs to me that the described serialization may need my attention?
>
> Thx

Re: 7.1 support for List

Posted by will martin <wm...@gmail.com>.
Thanks Neha. All my data is of 1 type. The serializer in place doesn't seem
to handle an array of String.

The ProducerData I use is a collection of data of the same type wrapped in a
single definition, as I read the spec. Am I to understand that having a
producer batch records itself is unsupported? The async producer can't meet
my throughput needs and, as I understand it, is targeted at implicit load
balancing among different client machines.

Additionally, the sync producer can meet my needs, but requires more use of
the lower-level design features. For maintenance, it'd be great if I could
create a list of Strings, create a ProducerData<String, List<String>> and
have this be serialized.

It occurs to me that the described serialization may need my attention?

Thx


On Mon, Aug 20, 2012 at 12:06 PM, Neha Narkhede <ne...@gmail.com> wrote:

> The producer takes in a "serializer.class" config that it uses to
> serialize data sent by the Producer. A Producer instance is tied to
> the type of data it is sending, so you won't be able to send data
> belonging to diverse types using the same Producer object.
>
> Thanks,
> Neha

Re: 7.1 support for List

Posted by Neha Narkhede <ne...@gmail.com>.
The producer takes in a "serializer.class" config that it uses to
serialize data sent by the Producer. A Producer instance is tied to
the type of data it is sending, so you won't be able to send data
belonging to diverse types using the same Producer object.
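
To illustrate the point, here is a sketch in plain Java of a producer that
is bound to one value type through the encoder named in its config. Only
the "serializer.class" key comes from the description above; ValueEncoder,
Utf8StringEncoder and TypedProducer are hypothetical stand-ins, and loading
the class by reflection is an assumption about how such a setting could be
resolved, not a statement about Kafka's internals.

```java
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical stand-ins, not Kafka classes.
interface ValueEncoder<T> {
    byte[] encode(T value);
}

class Utf8StringEncoder implements ValueEncoder<String> {
    @Override
    public byte[] encode(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }
}

// A producer bound to one value type V through the encoder it was built with.
class TypedProducer<V> {
    private final ValueEncoder<V> encoder;

    private TypedProducer(ValueEncoder<V> encoder) {
        this.encoder = encoder;
    }

    // Instantiates the class named by "serializer.class" via reflection.
    @SuppressWarnings("unchecked")
    static <V> TypedProducer<V> fromConfig(Properties config) {
        String className = config.getProperty("serializer.class");
        try {
            Object encoder = Class.forName(className).getDeclaredConstructor().newInstance();
            return new TypedProducer<>((ValueEncoder<V>) encoder);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot load encoder " + className, e);
        }
    }

    // Serializes one value; a value the encoder was not written for fails here.
    byte[] serialize(V value) {
        return encoder.encode(value);
    }
}
```

In this sketch, a TypedProducer<String> built with Utf8StringEncoder
serializes strings, while handing the same encoder a List<String> fails
with a ClassCastException inside the encoder call. That may resemble the
serialization errors reported in the original question, although the cause
there is not confirmed in this thread.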

Thanks,
Neha

On Mon, Aug 20, 2012 at 8:02 AM, will martin <wm...@gmail.com> wrote:
> This use case is defined by the following snippet from the Design section
> of the doc pages.
>
> class Producer {
>
> public void send (ProducerData)
>
> public void send (List<ProducerData>)
>
> public void close()
> }
>
> I've tried various composites for the List<ProducerData> argument,
> including strings and Messages. All of these throw serialization errors
> deep in the engine.
>
> Is the list form of send supported in 7.1?
>
> Thanks in advance,
> mmartin