Posted to dev@flink.apache.org by Szabó Péter <ne...@gmail.com> on 2015/02/27 09:54:00 UTC

Flink Streaming parallelism bug report

As far as I know, the point at which the execution environment is created
has changed slightly in the streaming API. As a result,
dataStream.getParallelism() and dataStream.env.getDegreeOfParallelism() may
return different values. Using the former is recommended.
In theory, the latter has been eliminated from the code, but a few usages
might still be hiding. I recently fixed one in WindowedDataStream. If you
run into problems with parallelism, this may be the cause.

Peter

Re: Flink Streaming parallelism bug report

Posted by Szabó Péter <ne...@gmail.com>.
No problem.
I will not commit the modification until it is clarified.

Peter

2015-02-27 10:48 GMT+01:00 Gyula Fóra <gy...@apache.org>:

> I can't look at it at the moment, I am on vacation and don't have my
> laptop.

Re: Flink Streaming parallelism bug report

Posted by Gyula Fóra <gy...@apache.org>.
I can't look at it at the moment, I am on vacation and don't have my
laptop.
On Feb 27, 2015 9:41 AM, "Szabó Péter" <ne...@gmail.com> wrote:

> Okay, thanks!
>
> In my case, I tried to run an ITCase test, the environment parallelism
> turned out to be -1, and an exception was thrown. The other ITCases ran
> properly, so I figured the problem is with the windowing.
> Can you check it out for me? (WindowedDataStream, line 348)
>
> Peter

Re: Flink Streaming parallelism bug report

Posted by Szabó Péter <ne...@gmail.com>.
Okay, thanks!

In my case, I tried to run an ITCase test, the environment parallelism
turned out to be -1, and an exception was thrown. The other ITCases ran
properly, so I figured the problem is with the windowing.
Can you check it out for me? (WindowedDataStream, line 348)

Peter
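
A minimal sketch of why a value of -1 would blow up where a parallelism is
expected. Nothing here is taken from Flink: ParallelismGuard, requireValid and
UNSET are hypothetical names, and treating -1 as "no parallelism configured"
is an assumption based only on the report above.

```java
/**
 * Hypothetical helper written for this note; it is not part of Flink.
 * It assumes -1 is a sentinel meaning "no parallelism configured".
 */
final class ParallelismGuard {

    static final int UNSET = -1; // assumed sentinel value

    /** Rejects values that cannot describe a number of parallel instances. */
    static int requireValid(int parallelism) {
        if (parallelism < 1) {
            throw new IllegalArgumentException(
                    "Parallelism must be at least 1, got " + parallelism);
        }
        return parallelism;
    }

    public static void main(String[] args) {
        int environmentDefault = UNSET; // as in the failing test run
        int operatorParallelism = 2;    // set on the operator itself

        System.out.println(requireValid(operatorParallelism)); // prints 2

        try {
            requireValid(environmentDefault);
        } catch (IllegalArgumentException e) {
            // prints "Parallelism must be at least 1, got -1"
            System.out.println(e.getMessage());
        }
    }
}
```

Under that assumption, code that reads the environment default before it has
been set fails, while code that reads the operator's own value does not.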

2015-02-27 10:06 GMT+01:00 Gyula Fóra <gy...@apache.org>:

> They should actually return different values in many cases.
>
> dataStream.env.getDegreeOfParallelism() returns the environment parallelism
> (the default).
>
> dataStream.getParallelism() returns the parallelism of the operator. There
> is a reason why one or the other is used in each place.
>
> Please watch out when you try to modify that because you might actually
> break functionality there :p

Re: Flink Streaming parallelism bug report

Posted by Gyula Fóra <gy...@apache.org>.
They should actually return different values in many cases.

dataStream.env.getDegreeOfParallelism() returns the environment parallelism
(the default).

dataStream.getParallelism() returns the parallelism of the operator. There
is a reason why one or the other is used in each place.

Please watch out when you try to modify that because you might actually
break functionality there :p
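
To make the distinction concrete, here is a toy model of the two values. It is
only a sketch: Env, Stream and withParallelism are hypothetical stand-ins
written for this note, not Flink's StreamExecutionEnvironment or DataStream,
and the assumption that a new stream inherits the environment default is part
of the model rather than a statement about Flink's internals.

```java
/**
 * Toy model only. Env and Stream are hypothetical stand-ins written for this
 * note; they are NOT Flink's StreamExecutionEnvironment or DataStream.
 */
final class ParallelismSketch {

    /** Holds the default parallelism that new streams start with. */
    static final class Env {
        private final int defaultParallelism;

        Env(int defaultParallelism) {
            this.defaultParallelism = defaultParallelism;
        }

        int getDegreeOfParallelism() {
            return defaultParallelism;
        }
    }

    /** A stream produced by one operator, with its own parallelism. */
    static final class Stream {
        final Env env;
        private final int parallelism;

        /** Assumption of this model: a new stream inherits the default. */
        Stream(Env env) {
            this(env, env.getDegreeOfParallelism());
        }

        private Stream(Env env, int parallelism) {
            this.env = env;
            this.parallelism = parallelism;
        }

        /** Hypothetical name: returns a copy whose operator runs p instances. */
        Stream withParallelism(int p) {
            return new Stream(env, p);
        }

        int getParallelism() {
            return parallelism;
        }
    }

    public static void main(String[] args) {
        Env env = new Env(4);
        Stream source = new Stream(env);
        Stream windowed = source.withParallelism(2);

        System.out.println(source.getParallelism());               // 4, inherited default
        System.out.println(windowed.getParallelism());             // 2, operator override
        System.out.println(windowed.env.getDegreeOfParallelism()); // 4, default unchanged
    }
}
```

In this model the two getters agree only until an operator overrides its
parallelism, which is why swapping one call for the other can change behavior.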
On Feb 27, 2015 8:55 AM, "Szabó Péter" <ne...@gmail.com> wrote:

> As far as I know, the point at which the execution environment is created
> has changed slightly in the streaming API. As a result,
> dataStream.getParallelism() and dataStream.env.getDegreeOfParallelism() may
> return different values. Using the former is recommended.
> In theory, the latter has been eliminated from the code, but a few usages
> might still be hiding. I recently fixed one in WindowedDataStream. If you
> run into problems with parallelism, this may be the cause.
>
> Peter
>