Posted to user@storm.apache.org by clay teahouse <cl...@gmail.com> on 2014/11/03 12:42:05 UTC

emitting batches of tuples

Hello All,
Is it possible to emit batches of tuples, as opposed to one tuple at a time?
In other words, is it possible to batch the tuples before emitting them? One
application for batching is writing the tuples to a TCP socket without
doing a flush after each tuple is written to the socket. Everything runs
locally.
Sorry if the answer is obvious.

thanks,
Clay
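
For the socket use case described above, one option that does not depend on Storm at all is to buffer the writes and flush once per batch. The following is only a minimal sketch under assumptions: BatchingWriter, its batchSize parameter and flushCount() are hypothetical names invented for illustration, and a ByteArrayOutputStream stands in for what would be socket.getOutputStream() in a real bolt.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: buffers tuples and flushes once per batch. */
final class BatchingWriter implements Closeable {
    private final BufferedOutputStream out;
    private final int batchSize;
    private int pending = 0;   // tuples written since the last flush
    private int flushes = 0;   // number of flushes performed, for illustration

    BatchingWriter(OutputStream sink, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        // A generous buffer so small writes stay in memory until flush() is called.
        this.out = new BufferedOutputStream(sink, 64 * 1024);
        this.batchSize = batchSize;
    }

    /** Writes one tuple as a line; flushes only when the batch is full. */
    void write(String tuple) throws IOException {
        out.write((tuple + "\n").getBytes(StandardCharsets.UTF_8));
        if (++pending >= batchSize) {
            flush();
        }
    }

    /** Pushes any buffered tuples to the underlying stream. */
    void flush() throws IOException {
        if (pending == 0) {
            return;
        }
        out.flush();
        pending = 0;
        flushes++;
    }

    int flushCount() {
        return flushes;
    }

    @Override
    public void close() throws IOException {
        flush();      // do not lose a partial batch
        out.close();
    }
}

class BatchingWriterDemo {
    public static void main(String[] args) throws IOException {
        // Assumption: in a real bolt the sink would be socket.getOutputStream().
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (BatchingWriter writer = new BatchingWriter(sink, 3)) {
            for (int i = 0; i < 7; i++) {
                writer.write("tuple-" + i);
            }
            System.out.println("flushes before close: " + writer.flushCount());
        }
        System.out.println("bytes written: " + sink.size());
    }
}
```

Note that a partial batch stays in the buffer until the next flush or until close() is called, so a low-traffic stream may also want a time-based flush.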

RE: emitting batches of tuples

Posted by "Brunner, Bill" <bi...@baml.com>.
Correct, a function would be the way to go.  Filtering and aggregating are very simple in Trident.
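
The idea of piping one stage into the next by applying a function can be sketched without any Storm dependency. This is a conceptual model only, not the Trident API: Pipeline and pipe are hypothetical names, and each stage is modelled as a plain function that takes one tuple and returns zero or more tuples. In Trident itself the equivalent would presumably be chaining .each() calls on a stream.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical stand-in for chained processing stages: one tuple in, zero or more out. */
final class Pipeline {
    private Pipeline() {}

    /** Feeds every tuple emitted by `first` into `second`. */
    static <A, B, C> Function<A, List<C>> pipe(
            Function<A, List<B>> first, Function<B, List<C>> second) {
        return input -> first.apply(input).stream()
                .flatMap(mid -> second.apply(mid).stream())
                .collect(Collectors.toList());
    }
}

class PipelineDemo {
    public static void main(String[] args) {
        // Stage 1: split a sentence into words.
        Function<String, List<String>> split = s -> Arrays.asList(s.split(" "));
        // Stage 2: drop words of three characters or fewer, upper-case the rest.
        Function<String, List<String>> shout = w ->
                w.length() > 3
                        ? Collections.singletonList(w.toUpperCase())
                        : Collections.<String>emptyList();

        Function<String, List<String>> chained = Pipeline.pipe(split, shout);
        System.out.println(chained.apply("pipe the output of one stage"));
        // prints [PIPE, OUTPUT, STAGE]
    }
}
```

Because the second stage may return an empty list, the same shape covers both filtering and transforming.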


Re: emitting batches of tuples

Posted by clay teahouse <cl...@gmail.com>.
Hello Bill,
But piping is different from joining and merging. I don't want to
join/merge two streams. I want to pipe the output of one stream to
another processor for further processing, as you can with bolts. I suppose
that in the case of Trident I could just apply a function to the stream and
change the stream the way I want.
thanks,
Clay


RE: emitting batches of tuples

Posted by "Brunner, Bill" <bi...@baml.com>.
Yes, you can use merge or join.

Merge combines streams assuming they have the same output tuple signature.  Join combines them using tuple values, similar to a db column join.

https://storm.incubator.apache.org/documentation/Trident-API-Overview.html
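
The difference between merge and join described above can be illustrated with plain Java collections. This is a conceptual sketch only and does not use the Trident API: StreamOps, merge, join and tuple are hypothetical names, a "stream" is modelled as a list, and a "tuple" as a map from field name to value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Conceptual model only: a "stream" is a list, a "tuple" is a field-to-value map. */
final class StreamOps {
    private StreamOps() {}

    /** Merge: concatenates two streams whose tuples share the same fields. */
    static List<Map<String, Object>> merge(
            List<Map<String, Object>> left, List<Map<String, Object>> right) {
        List<Map<String, Object>> out = new ArrayList<>(left);
        out.addAll(right);
        return out;
    }

    /** Join: pairs tuples whose values for `key` are equal, like an inner join. */
    static List<Map<String, Object>> join(
            List<Map<String, Object>> left, List<Map<String, Object>> right, String key) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> l : left) {
            for (Map<String, Object> r : right) {
                Object lk = l.get(key);
                if (lk != null && lk.equals(r.get(key))) {
                    Map<String, Object> joined = new LinkedHashMap<>(l);
                    joined.putAll(r);   // fields from both sides, key kept once
                    out.add(joined);
                }
            }
        }
        return out;
    }

    /** Small helper to build a tuple from alternating field names and values. */
    static Map<String, Object> tuple(Object... fieldsAndValues) {
        Map<String, Object> t = new LinkedHashMap<>();
        for (int i = 0; i + 1 < fieldsAndValues.length; i += 2) {
            t.put((String) fieldsAndValues[i], fieldsAndValues[i + 1]);
        }
        return t;
    }
}

class StreamOpsDemo {
    public static void main(String[] args) {
        List<Map<String, Object>> names = Arrays.asList(
                StreamOps.tuple("id", 1, "name", "ann"),
                StreamOps.tuple("id", 2, "name", "bob"));
        List<Map<String, Object>> more = Arrays.asList(
                StreamOps.tuple("id", 3, "name", "cy"));
        List<Map<String, Object>> ages = Arrays.asList(
                StreamOps.tuple("id", 2, "age", 41));

        // Merge: three tuples, all with the fields id and name.
        System.out.println(StreamOps.merge(names, more));
        // Join on id: one tuple carrying id, name and age.
        System.out.println(StreamOps.join(names, ages, "id"));
    }
}
```

Merge only makes sense here because both inputs carry the same fields, while join produces a wider tuple from inputs with different fields.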


Re: emitting batches of tuples

Posted by clay teahouse <cl...@gmail.com>.
I meant to address Bill. Sorry for the mix-up.

Clay.


Re: emitting batches of tuples

Posted by clay teahouse <cl...@gmail.com>.
Thanks Andrew. How would I chain the streams in Trident? I want to pipe the
output of one stream to another stream. Can I have a hierarchy of streams
with Trident?

Clay


RE: emitting batches of tuples

Posted by "Brunner, Bill" <bi...@baml.com>.
In Trident, every .each() call returns a stream object.  So from your spout “A”, you can just do

val stream1 = A.each()
val stream2 = A.each()

and now you have 2 streams from your spout.  You can then join or merge the streams later.


Re: emitting batches of tuples

Posted by clay teahouse <cl...@gmail.com>.
But I need to be able to chain multiple streams with different types of
records, and I need to be able to emit multiple streams from a single bolt.
I am not sure whether the same can be done as easily with Trident. Are there
examples of chaining and branching in Trident out there?

thanks
Clay


Re: emitting batches of tuples

Posted by Andrew Xor <an...@gmail.com>.
Hi,

I think you should take a look at the Trident API overview
<http://storm.incubator.apache.org/documentation/Trident-API-Overview.html>
if you want an easy way to process tuples in batches. Let me know if this
is what you are looking for.

Cheers.

Kindly yours,

Andrew Grammenos

-- PGP PKey --
<https://www.dropbox.com/s/2kcxe59zsi9nrdt/pgpsig.txt>
https://www.dropbox.com/s/ei2nqsen641daei/pgpsig.txt


Re: emitting batches of tuples

Posted by "Alberto J. Sánchez Sanz" <aj...@gmail.com>.
I think I can give you an idea.

You can emit several times while processing a single tuple.

Write a for loop and call the "emit" function inside the loop to emit
multiple tuples.
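
The loop suggested above might look roughly like the following. It is a sketch under assumptions rather than real Storm code: Collector is a hypothetical stand-in for Storm's output collector, and SplitProcessor and its execute method are invented names, so that the example runs with the standard library alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a Storm output collector. */
interface Collector {
    void emit(List<Object> tuple);
}

/** Splits one incoming line into several outgoing tuples. */
final class SplitProcessor {
    private final Collector collector;

    SplitProcessor(Collector collector) {
        this.collector = collector;
    }

    /** Called once per input tuple; emits one output tuple per word. */
    void execute(String line) {
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                collector.emit(Arrays.<Object>asList(word));
            }
        }
    }
}

class SplitProcessorDemo {
    public static void main(String[] args) {
        // Collect the emitted tuples in a list so they can be inspected.
        List<List<Object>> emitted = new ArrayList<>();
        SplitProcessor processor = new SplitProcessor(emitted::add);
        processor.execute("one input tuple, several outputs");
        System.out.println(emitted);
    }
}
```

This emits several tuples for one input, but each tuple still leaves through its own emit call, so it does not by itself group tuples into a batch downstream.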
