Posted to dev@samza.apache.org by Mallikarjuna Rao <av...@epiance.com> on 2015/09/18 06:38:21 UTC

Streaming question

Hi Samza Gurus,

 

I want to send the output of one task to the input of another task for further
processing. For example, in task one I want to read data from a Kafka queue and
emit messages grouped by key; in task two I want to read the messages emitted
by task one and process them further. Essentially, I need to link the output of
task one to the input of task two. Any pointers on this would be highly
appreciated.

 

This is very similar to Spark and Storm, where the output of one task can be fed
as input to another task. Please let me know whether tasks can be
chained or linked in Samza. An example from Storm is provided below to
illustrate the problem.

 

Please see the following snippet from a Storm topology, which shows how the
output of StepOne is linked to StepTwo.

 

builder.setBolt("StepOne", new StepIdentificationRichBolt(), 1)
       .setNumTasks(1)
       .globalGrouping("KafkaSpout");

builder.setBolt("StepTwo", new ProcessIdentificationRichBolt_With_TrimSequence(), 4)
       .setNumTasks(4)
       .fieldsGrouping("StepOne", new Fields("UserName"));

 

In the code above, the output of StepOne is sent as the input to StepTwo.


     

 

 

Thanks and regards,

 

Annadath rao


Re: Streaming question

Posted by Tommy Becker <to...@tivo.com>.
I'm not familiar with Storm, but Samza doesn't really have any abstractions above the job level: there is nothing I'm aware of that will automatically configure your jobs so that they link together in an order you specify. You have to manually configure task.inputs on the downstream jobs to read the streams that the upstream jobs write. That is a nice concept though!
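
To make the manual wiring concrete, here is a minimal sketch in plain Java, not a tested Samza deployment. It assumes the upstream job writes keyed messages to an intermediate Kafka topic and the downstream job names that same topic in task.inputs. The topic name "step-one-output", the job name "step-two", and the class name example.StepTwoTask are all hypothetical. The in-memory "topic" only illustrates why keying the output gives an effect similar to Storm's fieldsGrouping; the hash used is illustrative and is not Kafka's actual partitioner.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

final class ChainedJobsSketch {
    // Hypothetical names for the intermediate stream that links the two jobs.
    static final String SYSTEM = "kafka";
    static final String INTERMEDIATE_STREAM = "step-one-output";

    // Config for the downstream job: task.inputs points at the stream the
    // upstream job writes to, in "<system>.<stream>" form.
    static Properties stepTwoConfig() {
        Properties p = new Properties();
        p.setProperty("job.name", "step-two");              // hypothetical job name
        p.setProperty("task.class", "example.StepTwoTask"); // hypothetical task class
        p.setProperty("task.inputs", SYSTEM + "." + INTERMEDIATE_STREAM);
        return p;
    }

    // In-memory stand-in for a partitioned topic (partition -> messages).
    // In a real upstream StreamTask the send would be along the lines of
    //   collector.send(new OutgoingMessageEnvelope(
    //       new SystemStream(SYSTEM, INTERMEDIATE_STREAM), key, message));
    // which is not compiled here because it needs Samza on the classpath.
    static Map<Integer, List<String>> emitByKey(List<String[]> keyedMessages, int partitions) {
        Map<Integer, List<String>> topic = new HashMap<>();
        for (String[] km : keyedMessages) {
            // Illustrative hash only: messages with equal keys share a partition.
            int partition = Math.floorMod(km[0].hashCode(), partitions);
            topic.computeIfAbsent(partition, k -> new ArrayList<>()).add(km[0] + ":" + km[1]);
        }
        return topic;
    }

    public static void main(String[] args) {
        System.out.println("task.inputs=" + stepTwoConfig().getProperty("task.inputs"));

        List<String[]> out = new ArrayList<>();
        out.add(new String[] {"alice", "login"});
        out.add(new String[] {"bob", "login"});
        out.add(new String[] {"alice", "logout"});
        System.out.println(emitByKey(out, 4));
    }
}
```

Because all messages with the same key land in the same partition of the intermediate topic, the single downstream task instance that owns that partition sees every message for that key, which is roughly what fieldsGrouping("StepOne", new Fields("UserName")) gives you in Storm.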

--
Tommy Becker
Senior Software Engineer

