Posted to user@storm.apache.org by Raphael Hsieh <ra...@gmail.com> on 2014/05/13 19:31:08 UTC

Are batches processed sequentially or in parallel?

In Storm Trident, are batches processed sequentially, or are they all
processed in parallel?
If they are processed in parallel, how does Trident handle multiple writers
to a datastore?

I can see parallel processing working for append-only implementations, but
for cases where we are updating values in a database, how does Trident make
sure that a value has been written to the database before another thread
reads it and tries to update it with different data?

Thanks
-- 
Raphael Hsieh

Re: Are batches processed sequentially or in parallel?

Posted by Nathan Marz <na...@nathanmarz.com>.
persistentAggregate uses partitionPersist under the hood.


On Wed, May 14, 2014 at 10:13 AM, Nathan Marz <na...@nathanmarz.com> wrote:

> Yes.



-- 
Twitter: @nathanmarz
http://nathanmarz.com

Re: Are batches processed sequentially or in parallel?

Posted by Nathan Marz <na...@nathanmarz.com>.
Yes.


On Tue, May 13, 2014 at 5:33 PM, Weide Zhang <we...@gmail.com> wrote:

> Hi Nathan,
>
> I have a follow-up question on this. If I'm using partitionPersist, will the
> partitionPersist for each partition also preserve the order of the batches
> and commit at the same time, coordinated by the master batch coordinator?
>
> Weide

Re: Are batches processed sequentially or in parallel?

Posted by Weide Zhang <we...@gmail.com>.
Hi Nathan,

I have a follow-up question on this. If I'm using partitionPersist, will the
partitionPersist for each partition also preserve the order of the batches
and commit at the same time, coordinated by the master batch coordinator?

Weide


On Tue, May 13, 2014 at 11:30 AM, Nathan Marz <na...@nathanmarz.com> wrote:

> Both. topology.max.spout.pending specifies how many batches are processed
> in parallel. However, for state updates, the batches are processed
> sequentially. So the state update for batch 2 won't be executed until the
> state update for batch 1 succeeds.

Re: Are batches processed sequentially or in parallel?

Posted by Nathan Marz <na...@nathanmarz.com>.
Both. topology.max.spout.pending specifies how many batches are processed
in parallel. However, for state updates, the batches are processed
sequentially. So the state update for batch 2 won't be executed until the
state update for batch 1 succeeds.
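
The distinction above between parallel processing and sequential state
updates can be illustrated with a small self-contained Java sketch. It does
not use the Storm API: the class BatchOrderingSketch, the methods
processBatch and run, and the maxPending parameter (a loose stand-in for
topology.max.spout.pending) are all hypothetical names, and a thread pool is
only a rough model of how Trident pipelines batches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustration only: a simplified model, not Storm or Trident code.
public class BatchOrderingSketch {

    /** Stand-in for the processing stage: sums the values in one batch. */
    static int processBatch(int[] tuples) {
        int sum = 0;
        for (int t : tuples) {
            sum += t;
        }
        return sum;
    }

    /**
     * Processes at most maxPending batches concurrently, but applies their
     * results to the shared state strictly in batch order.
     */
    static List<String> run(int[][] batches, int maxPending) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(maxPending);
        List<String> commitLog = new ArrayList<>();
        try {
            // Processing stage: batches run in parallel on the pool.
            List<Future<Integer>> inFlight = new ArrayList<>();
            for (int[] batch : batches) {
                Callable<Integer> task = () -> processBatch(batch);
                inFlight.add(pool.submit(task));
            }
            // State-update stage: one writer, one batch at a time, in order.
            long state = 0;
            for (int i = 0; i < inFlight.size(); i++) {
                // Blocks until batch i is done. A later batch that finished
                // early still waits here for its turn to update the state.
                state += inFlight.get(i).get();
                commitLog.add("batch " + (i + 1) + " -> state=" + state);
            }
        } finally {
            pool.shutdown();
        }
        return commitLog;
    }

    public static void main(String[] args) throws Exception {
        int[][] batches = { {1, 2, 3}, {10}, {4, 5} };
        for (String line : run(batches, 2)) {
            System.out.println(line);
        }
    }
}
```

Because the state is only ever touched by the ordered loop, the commit log for
these three batches reads state=6, then state=16, then state=25, whichever
batch happens to finish its processing first.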


On Tue, May 13, 2014 at 10:31 AM, Raphael Hsieh <ra...@gmail.com> wrote:

> In Storm Trident, are batches processed sequentially, or are they all
> processed in parallel?
> If they are processed in parallel, how does Trident handle multiple writers
> to a datastore?
>
> I can see parallel processing working for append-only implementations, but
> for cases where we are updating values in a database, how does Trident make
> sure that a value has been written to the database before another thread
> reads it and tries to update it with different data?
>
> Thanks
> --
> Raphael Hsieh