Posted to users@kafka.apache.org by "M. Manna" <ma...@gmail.com> on 2020/01/12 20:02:30 UTC

Streams Newbie Question - Deployment and Management of Stream Processors

Hello,

Even though I have been using Kafka for a while, it has been primarily for
publish/subscribe event messaging (which I understand reasonably well).
I would now like to do more with streams.

For my initiative, I have been going through the code in the "examples"
folder. I apologise in advance for such newbie questions.

With reference to WordCountDemo.java, I wanted to understand something
about integrating a stream processor with business applications (i.e.
clients). Is it good practice to always keep the stream processor
topology separate from the actual client application that uses the
processed data?

My understanding (from what I can see at first glance) is that multiple
streams.start() calls need careful observation when scaling up/out in the
long term. To separate concerns, I would expect the stream processor to be
deployed separately (maybe as microservices?). But again, I am only just
entering this world of streams, so I could really use some insight into how
some of you have tackled this over the years.

Kindest Regards,

Re: Streams Newbie Question - Deployment and Management of Stream Processors

Posted by Sachin Mittal <sj...@gmail.com>.
I think the literature from Confluent/ASF, along with the community support
here, is the best way to learn about streaming.


Re: Streams Newbie Question - Deployment and Management of Stream Processors

Posted by "M. Manna" <ma...@gmail.com>.
Hey Sachin,

I totally get your point. My understanding has been the same. Stream
processing is about honouring what a stream is meant to be: stateless,
non-interfering (almost), and side-effect free.
Also, even though the terminal result of a stream topology can be stored,
maybe it is needed only for decision making. So storage is just one usage
(amongst many).

Thanks a lot for clarifying. I shall continue my endeavour to learn other
things. Apart from the Confluent and ASF examples, do you recommend anything
else for starters?

Regards,


Re: Streams Newbie Question - Deployment and Management of Stream Processors

Posted by Sachin Mittal <sj...@gmail.com>.
Hi,
The way I have used stream processing in the past: the use case for
processing streams is when you have a continuous stream of data that needs
to be processed and used by certain applications.
Since a Kafka Streams application can be a simple Java application, it can
run in its own JVM, separate from, say, the actual client application.
It can be on the same physical or virtual machine, but some degree of
separation is best.

As for streams, the way I look at it is that a stream processor is a
continuous process whose downstream data is used by microservices.
The downstream data can be stored in the Streams state stores or in some
external data store (say MongoDB, Cassandra, etc.).
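
To make the separation a bit more concrete, here is a minimal sketch in
plain Java. It deliberately does not use the Kafka Streams API: the class
WordCountProcessor and its methods are hypothetical names made up for
illustration, and the in-memory map only stands in for a state store. It
models the idea that the processor owns the running word counts, while
downstream code only reads a snapshot of them. The word-splitting rule is
also an assumption and may differ from WordCountDemo.java.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, dependency-free model; none of these names come from Kafka Streams.
final class WordCountProcessor {
    // Stands in for a state store: the processor owns it, clients only read from it.
    private final Map<String, Long> counts = new ConcurrentHashMap<>();

    // Process one incoming record (a line of text) and update the running counts.
    void process(String line) {
        // Assumption: words are lowercased and split on non-word characters.
        for (String word : line.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1L, Long::sum);
            }
        }
    }

    // Sorted copy of the current counts for downstream consumers.
    Map<String, Long> snapshot() {
        return new TreeMap<>(counts);
    }

    public static void main(String[] args) {
        WordCountProcessor processor = new WordCountProcessor();
        processor.process("all streams lead to Kafka");
        processor.process("hello Kafka streams");
        // A separate client would read results like this instead of touching the map.
        System.out.println(processor.snapshot());
    }
}
```

In a real deployment the processor side would be the Kafka Streams
application running in its own JVM, and the snapshot would correspond to
whatever the client reads from an output topic, a state store, or an
external data store.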

Hope this answers some of your questions.

Thanks
Sachin


