Posted to dev@flink.apache.org by "Daniel Bali (JIRA)" <ji...@apache.org> on 2015/02/20 16:41:13 UTC

[jira] [Created] (FLINK-1594) DataStreams don't support self-join

Daniel Bali created FLINK-1594:
----------------------------------

             Summary: DataStreams don't support self-join
                 Key: FLINK-1594
                 URL: https://issues.apache.org/jira/browse/FLINK-1594
             Project: Flink
          Issue Type: Bug
          Components: Streaming
    Affects Versions: 0.9
         Environment: flink-0.9.0-SNAPSHOT
            Reporter: Daniel Bali


Trying to join a DataStream with itself results in an exception. I get the following stack trace:

{noformat}
    java.lang.Exception: Error setting up runtime environment: Union buffer reader must be initialized with at least two individual buffer readers
        at org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:173)
        at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419)
        at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261)
        at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
        at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
        at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
        at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44)
        at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
        at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
        at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
        at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
        at akka.actor.ActorCell.invoke(ActorCell.scala:487)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
        at akka.dispatch.Mailbox.run(Mailbox.scala:221)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
    Caused by: java.lang.IllegalArgumentException: Union buffer reader must be initialized with at least two individual buffer readers
        at org.apache.flink.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:125)
        at org.apache.flink.runtime.io.network.api.reader.UnionBufferReader.<init>(UnionBufferReader.java:69)
        at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setConfigInputs(CoStreamVertex.java:101)
        at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setInputsOutputs(CoStreamVertex.java:63)
        at org.apache.flink.streaming.api.streamvertex.StreamVertex.registerInputOutput(StreamVertex.java:65)
        at org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:170)
        ... 20 more
{noformat}
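
For readers who want to see the failing check in isolation: the "Caused by" section points at a precondition in the UnionBufferReader constructor. The sketch below is a simplified stand-in written for illustration, not Flink's actual class. The class name UnionReaderSketch, the helper tryCreate, and the use of plain strings as "readers" are all assumptions; only the error message is taken from the trace. It shows how a union-style reader that is handed fewer than two inputs would reject them with exactly this message.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified stand-in for a reader that merges several inputs; NOT Flink's real class. */
final class UnionReaderSketch {
    // Message copied from the stack trace in the report.
    static final String MESSAGE =
        "Union buffer reader must be initialized with at least two individual buffer readers";

    // Hypothetical: inputs are modelled as plain names instead of real buffer readers.
    private final List<String> readers;

    UnionReaderSketch(String... readers) {
        // Same kind of guard as the checkArgument call seen in the trace.
        if (readers.length < 2) {
            throw new IllegalArgumentException(MESSAGE);
        }
        this.readers = Arrays.asList(readers);
    }

    int readerCount() {
        return readers.size();
    }

    /** Hypothetical helper: returns "ok" or the message of the precondition failure. */
    static String tryCreate(String... readers) {
        try {
            new UnionReaderSketch(readers);
            return "ok";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Two distinct inputs pass the guard.
        System.out.println(tryCreate("input-a", "input-b"));
        // A single input trips the guard, as in the report.
        System.out.println(tryCreate("input-a"));
    }
}
```

Whether the self-join really ends up registering only one reader for the co-task is a guess based on the trace, not something the report confirms.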



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: [jira] [Created] (FLINK-1594) DataStreams don't support self-join

Posted by Gyula Fóra <gy...@gmail.com>.
We have a join operator that is defined over windows in the data stream. So
the problem is not with the join itself; rather, it seems that applying this
window join to a stream and itself throws an error.
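
To make the intended semantics concrete, here is a minimal sketch in plain Java of what a join over one window means, and what a self-join would be expected to produce. It does not use the Flink streaming API. The class WindowJoinSketch, the method join, and the encoding of elements as {key, value} int arrays are assumptions made only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: joins the contents of two finite windows on an integer key. */
final class WindowJoinSketch {

    /**
     * Each element is {key, value}. For every pair with equal keys the result
     * contains {key, leftValue, rightValue}.
     */
    static List<int[]> join(List<int[]> left, List<int[]> right) {
        List<int[]> out = new ArrayList<>();
        for (int[] l : left) {
            for (int[] r : right) {
                if (l[0] == r[0]) {
                    out.add(new int[] {l[0], l[1], r[1]});
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // One window of a hypothetical stream: two elements with key 1, one with key 2.
        List<int[]> window = Arrays.asList(
            new int[] {1, 10}, new int[] {1, 20}, new int[] {2, 30});

        // Self-join: the same window is used as both inputs.
        for (int[] row : join(window, window)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

In this model a self-join is simply the case where both arguments are the same window, so nothing in the semantics rules it out.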

On Fri, Feb 20, 2015 at 4:50 PM, Alexander Alexandrov <
alexander.s.alexandrov@gmail.com> wrote:

> I guess the intended behavior here is to just throw a nicer error, as you
> cannot really join two data streams.

Re: [jira] [Created] (FLINK-1594) DataStreams don't support self-join

Posted by Alexander Alexandrov <al...@gmail.com>.
I guess the intended behavior here is to just throw a nicer error, as you
cannot really join two data streams.
