Posted to user@flink.apache.org by "Nikos R. Katsipoulakis" <ni...@gmail.com> on 2016/06/09 18:13:43 UTC

Join two streams using a count-based window

Hello all,

First, I have posted a question on Stack Overflow:
http://stackoverflow.com/questions/37732978/join-two-streams-using-a-count-based-window
I am re-posting it on the mailing list in case some of you are not on
SO.

In addition, I would like to know how Flink differs from other streaming
engines in the granularity at which data is transported and processed. To be
more precise, I am aware that Storm sends tuples using Netty (by filling up
queues) and that a Bolt's logic is executed per tuple. Spark employs
micro-batches to simulate streaming, and (I am not entirely certain) each
task processes one micro-batch. What about Flink? How are tuples
transferred and processed? Any explanation and/or article, blog post, or
link is more than welcome.

Thanks

-- 
Nikos R. Katsipoulakis,
Department of Computer Science
University of Pittsburgh

Re: Join two streams using a count-based window

Posted by "Nikos R. Katsipoulakis" <ni...@gmail.com>.

Thank you very much Matthias! Also, the link you provided is very helpful.

Cheers,
Nikos

On Fri, Jun 10, 2016 at 3:16 AM, Matthias J. Sax <mj...@apache.org> wrote:

> I just put an answer to SO.
>
> About the other questions: Flink processes tuple-by-tuple and does some
> internal buffering. You might be interested in
>
> https://cwiki.apache.org/confluence/display/FLINK/Data+exchange+between+tasks
>
> -Matthias
>


-- 
Nikos R. Katsipoulakis,
Department of Computer Science
University of Pittsburgh

Re: Join two streams using a count-based window

Posted by "Matthias J. Sax" <mj...@apache.org>.

I just posted an answer on SO.

About the other questions: Flink processes tuple-by-tuple and does some
internal buffering. You might be interested in
https://cwiki.apache.org/confluence/display/FLINK/Data+exchange+between+tasks
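
To illustrate the point about tuple-by-tuple processing with internal
buffering, here is a minimal sketch in plain Python. It is not Flink code
and does not use Flink's API: BufferedChannel, run_pipeline, and the
capacity parameter are hypothetical names made up for this example. The
sketch only shows the general idea that records can travel between tasks
in buffers while the operator logic still runs once per record. It flushes
only when the buffer is full or at the end of input; a real engine may
also flush on a timeout, which is left out here.

```python
from typing import Callable, Iterable, List, Tuple


class BufferedChannel:
    """Hypothetical stand-in for the link between two tasks.

    Records are appended to a buffer and shipped downstream as a group
    once the buffer holds `capacity` records.
    """

    def __init__(self, capacity: int, deliver: Callable[[List[int]], None]) -> None:
        self.capacity = capacity
        self.deliver = deliver
        self.buffer: List[int] = []
        self.flushes = 0  # number of buffers shipped so far

    def emit(self, record: int) -> None:
        self.buffer.append(record)
        if len(self.buffer) >= self.capacity:
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            batch, self.buffer = self.buffer, []
            self.flushes += 1
            self.deliver(batch)


def run_pipeline(records: Iterable[int], capacity: int) -> Tuple[List[int], int]:
    """Send records through a buffered channel into a per-record operator."""
    results: List[int] = []

    def downstream(batch: List[int]) -> None:
        # The buffer is only a transport unit: the operator logic
        # (here, doubling) is applied to each record individually.
        for record in batch:
            results.append(record * 2)

    channel = BufferedChannel(capacity, downstream)
    for record in records:
        channel.emit(record)
    channel.flush()  # ship the last, possibly partial, buffer
    return results, channel.flushes
```

With seven records and a capacity of three, the channel ships three
buffers (two full, one partial), yet the downstream logic is still applied
seven times, once per record. This differs from a micro-batch model, where
the batch itself is the unit of processing.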

-Matthias

On 06/09/2016 08:13 PM, Nikos R. Katsipoulakis wrote:
> Hello all,
> 
> At first, I have a question posted on
> http://stackoverflow.com/questions/37732978/join-two-streams-using-a-count-based-window
> . I am re-posting this on the mailing list in case some of you are not
> on SO.
> 
> In addition, I would like to know what is the difference between Flink
> and other Streaming engines on data-granularity transport and
> processing. To be more precise, I am aware that Storm sends tuples using
> Netty (by filling up queues) and a Bolt's logic is executed per tuple.
> Spark, employs micro-batches to simulate streaming and (I am not
> entirely certain) each task performs processing on a micro-batch. What
> about Flink? How are tuples transferred and processed. Any explanation
> and or article/blog-post/link is more than welcome.
> 
> Thanks
> 
> -- 
> Nikos R. Katsipoulakis, 
> Department of Computer Science 
> University of Pittsburgh