You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@storm.apache.org by "aka.fe2s" <ak...@gmail.com> on 2014/07/09 16:10:49 UTC

Re: scaling trident spout

Any ideas on the question above?




On Wed, Jun 18, 2014 at 6:13 PM, aka.fe2s <ak...@gmail.com> wrote:

> I'm implementing my own ITridentSpout and would like to make it scalable.
> When I set parallelism hint to 3 with pipelining for my spout, I see the
> following
>
> [Thread-16-$spoutcoord-spout0] MyEmitter - initializeTransaction() txid 1
> prevMetadata null currMetadata null generated metadata 9002537a
> [Thread-26-spout0] MyEmitter - emitBatch() txId 1 attempt id 0
> coordinatorMeta 9002537a this.hashCode() 45c8d2d9 emitted e5f7a9d3
> [Thread-30-spout0] MyEmitter - emitBatch txId 1 attempt id 0
> coordinatorMeta 9002537a this.hashCode() 38ac85a emitted 1f08bdab
> [Thread-28-spout0] MyEmitter - emitBatch txId 1 attempt id 0
> coordinatorMeta 9002537a this.hashCode() 58ed567b emitted ee35006c
>
> It creates 3 emitter instances and 3 threads as expected. But coordinator
> propagates the same tx id 1 to _all_ emitters, this doesn't make much sense
> since we should emit tuples for given tx id only once. I was expecting to
> get the following
>
> [Thread-16-$spoutcoord-spout0] MyEmitter - initializeTransaction() txid 1
> ..
> [Thread-26-spout0] MyEmitter - emitBatch() txId 1 ...
> [Thread-16-$spoutcoord-spout0] MyEmitter - initializeTransaction() txid 2
> ..
> [Thread-30-spout0] MyEmitter - emitBatch txId 2 ...
> [Thread-16-$spoutcoord-spout0] MyEmitter - initializeTransaction() txid 3
> ..
> [Thread-28-spout0] MyEmitter - emitBatch txId 3 ...
>
> Is there any way to achieve this?
>
>
>

Re: scaling trident spout

Posted by Tom Brown <to...@gmail.com>.
With a trident spout, data from all spouts forms a transaction. So each
spout should be using the same txid at the same time.

--Tom

On Wednesday, July 9, 2014, aka.fe2s <ak...@gmail.com> wrote:

> Any ideas on the question above?
>
>
>
>
> On Wed, Jun 18, 2014 at 6:13 PM, aka.fe2s <aka.fe2s@gmail.com
> <javascript:_e(%7B%7D,'cvml','aka.fe2s@gmail.com');>> wrote:
>
>> I'm implementing my own ITridentSpout and would like to make it scalable.
>> When I set parallelism hint to 3 with pipelining for my spout, I see the
>> following
>>
>> [Thread-16-$spoutcoord-spout0] MyEmitter - initializeTransaction() txid 1
>> prevMetadata null currMetadata null generated metadata 9002537a
>> [Thread-26-spout0] MyEmitter - emitBatch() txId 1 attempt id 0
>> coordinatorMeta 9002537a this.hashCode() 45c8d2d9 emitted e5f7a9d3
>> [Thread-30-spout0] MyEmitter - emitBatch txId 1 attempt id 0
>> coordinatorMeta 9002537a this.hashCode() 38ac85a emitted 1f08bdab
>> [Thread-28-spout0] MyEmitter - emitBatch txId 1 attempt id 0
>> coordinatorMeta 9002537a this.hashCode() 58ed567b emitted ee35006c
>>
>> It creates 3 emitter instances and 3 threads as expected. But coordinator
>> propagates the same tx id 1 to _all_ emitters, this doesn't make much sense
>> since we should emit tuples for given tx id only once. I was expecting to
>> get the following
>>
>> [Thread-16-$spoutcoord-spout0] MyEmitter - initializeTransaction() txid 1
>> ..
>> [Thread-26-spout0] MyEmitter - emitBatch() txId 1 ...
>> [Thread-16-$spoutcoord-spout0] MyEmitter - initializeTransaction() txid 2
>> ..
>> [Thread-30-spout0] MyEmitter - emitBatch txId 2 ...
>> [Thread-16-$spoutcoord-spout0] MyEmitter - initializeTransaction() txid 3
>> ..
>> [Thread-28-spout0] MyEmitter - emitBatch txId 3 ...
>>
>> Is there any way to achieve this?
>>
>>
>>
>