Posted to dev@flink.apache.org by Robert Metzger <rm...@apache.org> on 2015/06/01 21:46:04 UTC

Re: Storm compatibility layer currently does not support Storm's SimpleJoin example

Great to see that you two are working together on the storm compatibility
layer.

Please let the other Flink committers know when Matthias' PR is in a state
that we can review it again (= when you think it's ready).
Given the feedback from Peter, the long list of missing features, and the
current rework, I would suggest merging the storm compatibility layer
after the 0.9 release.
The issues from the document Stephan sent around two weeks ago are making
good progress, so the release will probably be forked off rather soon.

On Fri, May 29, 2015 at 2:10 PM, Szabó Péter <ne...@gmail.com>
wrote:

> Thank you very much, this explains a lot of things :)
> I'm aware that support for TopologyContext is currently limited, so I
> do not expect it to work smoothly. However, there was another issue with
> the grouping by the "id" field, which seemed very strange. Anyway, I will
> leave the SimpleJoin example to you, then. I will write if I find out
> something important.
>
> I think the Storm compatibility layer is a really nice extension of the
> Flink streaming API. It would be great if you could add more functionality
> to it in the future.
> By the way, I'm not completely sure if you know: I was asked by Marton to
> prepare your pull request for merging into the master. I restructured your
> commits, cleaned up the code and rebased the branch on the current master.
> Actually, to follow the changes in the behaviour of the Flink streaming
> sources, I'm doing a second refactor right now. If you are working on
> something, and have not rebased yet, you can use my storm branch to follow
> up.
>
> Here are the links to my storm branches:
> - storm-backup (examples and other experiments; currently a little bit
> outdated, because it is supposed to be a backup branch):
> https://github.com/mbalassi/flink/tree/storm-backup
> - storm (last clean state of my work on the flink-storm-compatibility pull
> request, including code cleanup & refactor and one or two simple examples):
> https://github.com/mbalassi/flink/tree/storm
>
> Peter
>
>
> 2015-05-29 10:36 GMT+02:00 Matthias J. Sax <mjsax@informatik.hu-berlin.de>:
>
> > Hi Peter,
> >
> > I started to look into the issue. However, I could not find the
> > following classes in the git repository:
> >
> > org.apache.flink.stormcompatibility.util.AbstractStormSpout
> > org.apache.flink.stormcompatibility.util.OutputFormatter
> > org.apache.flink.stormcompatibility.util.StormBoltFileSink
> > org.apache.flink.stormcompatibility.util.StormBoltPrintSink
> > org.apache.flink.stormcompatibility.util.TupleOutputFormatter
> >
> > Thus, I cannot compile and run the code. Can you please update the git
> > repository?
> >
> > However, I had a quick look into the code and see a few issues right
> > away. For example, the code uses "tuple.select(...)", which is not
> > supported so far. Right now, attributes can only be accessed via index.
> > Furthermore, the example uses a lot of meta information that cannot be
> > provided easily in Flink (i.e., support for TopologyContext is very
> > limited).
> >
> > To add those things, I will need much more time. I don't think it should
> > be part of the first pull request, but be added later. I will integrate
> > the SingleJoinBolt example as an ITCase if the functionality is there. It
> > seems to be a good idea to add more examples from storm-starter to
> > flink-storm-examples.
> >
> >
> > -Matthias
> >
> >
> > On 05/28/2015 09:37 AM, Szabó Péter wrote:
> > > Hi Matthias,
> > >
> > > Of course, here is the package that contains the example's source
> > > classes.
> > >
> > > https://github.com/mbalassi/flink/tree/storm-backup/flink-staging/flink-streaming/flink-storm-examples/src/main/java/org/apache/flink/stormcompatibility/singlejoin
> > > It is mostly a copy-paste of SimpleJoin from storm-starter, though I
> > > separated the spouts and the join bolt from the rest of the topology.
> > > I would be happy if you could fix it. Probably I'm overlooking
> > > something.
> > >
> > > Peter
> > >
> > > 2015-05-27 17:13 GMT+02:00 Matthias J. Sax <mjsax@informatik.hu-berlin.de>:
> > >
> > >> Hi Peter,
> > >>
> > >> Thanks a lot for your feedback. It's exciting to see that somebody
> > >> uses the layer already. :)
> > >>
> > >> The current prototype is going to be merged soon. However, I am more
> > >> than happy to extend the functionality of the layer. Can you please
> > >> share your example with me, so I can see what the problem is and fix
> > >> it?
> > >>
> > >> I am pretty sure that the fix will be merged later on, too. There are
> > >> many other limitations in the layer. Right now, it is still in beta
> > >> state. ;)
> > >>
> > >> -Matthias
> > >>
> > >>
> > >> On 05/27/2015 03:48 PM, Szabó Péter wrote:
> > >>> Hey everyone,
> > >>>
> > >>> I experimented with the Storm compatibility layer Matthias wrote, and
> > >>> ran some Storm examples on Flink. I found that Storm's SimpleJoin
> > >>> example does not work. I suppose it is because of the multiple input
> > >>> streams. I'm willing to add another example instead.
> > >>> Right now, I'm getting it through Aljoscha's streaming refactor.
> > >>>
> > >>> Peter
> > >>>
> > >>
> > >>
> > >
> >
> >
>
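
Editorial sketch (not part of the original messages): Matthias notes above
that "tuple.select(...)" is not supported and that attributes can only be
accessed via index. The following minimal Java sketch illustrates the
difference between the two access styles. It does not use Storm or Flink
classes; SelectSketch, NamedTuple, getValue, and select are hypothetical
names chosen for illustration, and the name-to-index lookup is only one
possible way such a projection could be built on top of index access.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SelectSketch {

    /** Hypothetical tuple with a declared schema; NOT Storm's or Flink's tuple class. */
    static final class NamedTuple {
        private final List<String> fieldNames;
        private final List<Object> values;

        NamedTuple(List<String> fieldNames, List<Object> values) {
            if (fieldNames.size() != values.size()) {
                throw new IllegalArgumentException("schema and values differ in length");
            }
            this.fieldNames = new ArrayList<>(fieldNames);
            this.values = new ArrayList<>(values);
        }

        /** Index-based access: the only style the thread reports as working. */
        Object getValue(int index) {
            return values.get(index);
        }

        /** Name-based projection: map each name to its index, then read by index. */
        List<Object> select(List<String> names) {
            List<Object> result = new ArrayList<>();
            for (String name : names) {
                int index = fieldNames.indexOf(name);
                if (index < 0) {
                    throw new IllegalArgumentException("unknown field: " + name);
                }
                result.add(values.get(index));
            }
            return result;
        }
    }

    public static void main(String[] args) {
        NamedTuple tuple = new NamedTuple(
                Arrays.asList("id", "gender"),
                Arrays.<Object>asList(7, "female"));
        System.out.println(tuple.getValue(0));
        System.out.println(tuple.select(Arrays.asList("gender", "id")));
    }
}
```

The sketch shows why a name-based projection needs the field schema to be
available at runtime, which index-only access does not.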

Re: Storm compatibility layer currently does not support Storm's SimpleJoin example

Posted by Robert Metzger <rm...@apache.org>.
It looks like there is now a pull request available for the storm
compatibility layer: https://github.com/apache/flink/pull/764

It seems we are not the only new stream processing system with
compatibility to Storm: http://dl.acm.org/citation.cfm?id=2742788

On Tue, Jun 2, 2015 at 11:09 AM, Szabó Péter <ne...@gmail.com>
wrote:

> @Robert
> Thanks! I think the PR will be ready to merge soon :)
>
> @Matthias
> I fixed the finite-source issue on my branch; now every example and ITCase
> runs and stops without throwing an exception. Also, in case of finite
> sources, the spout wrapper will not loop infinitely.
> I will study your branch and make comments in the afternoon.
>
> Peter

Re: Storm compatibility layer currently does not support Storm's SimpleJoin example

Posted by Szabó Péter <ne...@gmail.com>.
@Robert
Thanks! I think the PR will be ready to merge soon :)

@Matthias
I fixed the finite-source issue on my branch; now every example and ITCase
runs and stops without throwing an exception. Also, in case of finite
sources, the spout wrapper will not loop infinitely.
I will study your branch and make comments in the afternoon.

Peter
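
Editorial sketch (not part of the original message): the finite-source fix
mentioned above is not shown in the thread. As a rough illustration of the
idea only, the Java sketch below drives a source in a loop that stops once
the source reports that it is exhausted, instead of polling forever. It uses
no Storm or Flink classes; FiniteSourceSketch, FiniteSource, reachedEnd,
next, runUntilExhausted, and fromList are hypothetical names, and the actual
spout wrapper on the branch may work differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class FiniteSourceSketch {

    /** Hypothetical source contract; NOT Storm's spout interface. */
    interface FiniteSource<T> {
        boolean reachedEnd();
        T next();
    }

    /** Hypothetical wrapper loop: emits until the source reports its end. */
    static <T> List<T> runUntilExhausted(FiniteSource<T> source) {
        List<T> emitted = new ArrayList<>();
        while (!source.reachedEnd()) {
            emitted.add(source.next());
        }
        return emitted;
    }

    /** Helper that exposes a list as a finite source (for illustration only). */
    static <T> FiniteSource<T> fromList(List<T> items) {
        final Iterator<T> iterator = items.iterator();
        return new FiniteSource<T>() {
            @Override
            public boolean reachedEnd() {
                return !iterator.hasNext();
            }

            @Override
            public T next() {
                return iterator.next();
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(runUntilExhausted(fromList(Arrays.asList("a", "b", "c"))));
    }
}
```

The termination check lives in the wrapper loop, so a source with no more
data ends the run instead of being polled indefinitely.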
