Posted to dev@samza.apache.org by Telles Nobrega <te...@gmail.com> on 2014/08/01 04:37:00 UTC

Creating Messages in Samza

Hi, I’m creating an application that is supposed to do the following:

Create messages at a fixed rate (a set number of messages per second). The messages don’t come from files or any other existing source; the idea is to create them in Samza and send them downstream to be processed.

I have looked at some examples, and it looks like the way to do this is to create a SystemConsumer that produces these messages and sends them to a Kafka topic. Does this sound right?

Can anyone give me a hint on how to start this implementation? I studied the hello-samza job, but I couldn’t figure out how to do it properly.

Thanks in advance.

Re: Creating Messages in Samza

Posted by Telles Nobrega <te...@gmail.com>.
Thanks, it does help.

Thanks for the clarification.





-- 
------------------------------------------
Telles Mota Vidal Nobrega
M.sc. Candidate at UFCG
B.sc. in Computer Science at UFCG
Software Engineer at OpenStack Project - HP/LSD-UFCG

Re: Creating Messages in Samza

Posted by Yan Fang <ya...@gmail.com>.
Hi Telles,

If my understanding is correct, you want to do two things: 1) create the
messages and 2) process the messages. For 1), I am not sure what kind of
messages you want to generate. If they are only for testing purposes, I would
suggest writing simple Java code that sends messages to Kafka using the Kafka
producer API. For 2), Samza can process your messages using Kafka as the input.
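
As a rough illustration of step 1), the pacing logic for "N messages per
second" could look like the sketch below. It is only a sketch under stated
assumptions: the class name FixedRateGenerator, the generate method, and the
"message-<i>" payload are all made up for illustration, and the sink parameter
is a stand-in for whatever actually delivers the message (in a real setup it
would wrap a Kafka producer's send call, which is not shown here).

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical helper: emits messages to a sink at a fixed rate. */
final class FixedRateGenerator {

    /**
     * Sends `total` messages, paced at `perSecond` messages per second.
     * The sink is an assumption standing in for the real delivery call,
     * for example a wrapper around a Kafka producer.
     */
    static void generate(int total, int perSecond, Consumer<String> sink) {
        if (perSecond <= 0) {
            throw new IllegalArgumentException("perSecond must be positive");
        }
        long intervalNanos = TimeUnit.SECONDS.toNanos(1) / perSecond;
        long start = System.nanoTime();
        for (int i = 0; i < total; i++) {
            // Schedule against the start time so sleep jitter does not accumulate.
            long due = start + i * intervalNanos;
            long wait = due - System.nanoTime();
            if (wait > 0) {
                try {
                    TimeUnit.NANOSECONDS.sleep(wait);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop generating.
                    Thread.currentThread().interrupt();
                    return;
                }
            }
            sink.accept("message-" + i);
        }
    }

    public static void main(String[] args) {
        // Print 10 messages at roughly 5 messages per second.
        generate(10, 5, System.out::println);
    }
}
```

Pacing against the start time, rather than sleeping a fixed interval after
each send, keeps the average rate close to the target even when individual
sleeps overshoot.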

If you use Kafka and Samza together, they should work seamlessly, and you
do not need to write a SystemConsumer. We provide the KafkaSystemConsumer
out of the box.

To better understand hello-samza, let's use the wikipedia-parser job as an
example. (The wikipedia-parser job is the first job in the Generate
Wikipedia Statistics
<https://samza.incubator.apache.org/startup/hello-samza/0.7.0/#generate-wikipedia-statistics>
part of the Hello Samza tutorial.) It consumes a Kafka topic called
wikipedia-raw, does some processing, and then sends messages to a Kafka
topic called wikipedia-edits. If you look at the WikipediaParserStreamTask
class, the process method is the most important part. Inside the process
method, we get the input message (envelope.getMessage()), parse it, and send
it (collector.send). The API overview
<https://samza.incubator.apache.org/learn/documentation/0.7.0/api/overview.html>
page gives a good explanation. So for testing purposes, you can modify the
class to send whatever messages you want (based on the input messages). Keep
in mind that the process method is called for every incoming message.
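
The "called once per incoming message" pattern can be sketched without any
Samza dependency, as below. To be clear, nothing in this block is Samza's
API: ProcessSketch, Task, ParserTask, and run are hypothetical stand-ins that
only mimic the shape of a per-message callback with a collector, and the
trim/upper-case step is a placeholder for real parsing. The actual interfaces
are described on the API overview page linked above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.BiConsumer;

/** Simplified stand-ins for illustration only; these are NOT Samza classes. */
final class ProcessSketch {

    /** Hypothetical callback shape: invoked once per incoming message. */
    interface Task {
        void process(String incoming, BiConsumer<String, String> collector);
    }

    /** Reads one raw message, "parses" it, and sends it to an output topic. */
    static final class ParserTask implements Task {
        @Override
        public void process(String incoming, BiConsumer<String, String> collector) {
            // Placeholder for real parsing logic.
            String parsed = incoming.trim().toUpperCase(Locale.ROOT);
            collector.accept("wikipedia-edits", parsed);
        }
    }

    /** Feeds each input message to the task, the way a framework would. */
    static List<String> run(Task task, List<String> input) {
        List<String> sent = new ArrayList<>();
        for (String message : input) {
            task.process(message, (topic, out) -> sent.add(topic + ":" + out));
        }
        return sent;
    }
}
```

Because the callback fires only when a message arrives, a task written this
way produces output in response to input rather than on its own schedule.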

Hope this helps. Thank you.

Best,

Fang, Yan
yanfang724@gmail.com
+1 (206) 849-4108

