Posted to dev@flink.apache.org by Maximilian Michels <mx...@apache.org> on 2015/06/14 18:11:45 UTC

Testing Apache Flink 0.9.0-rc2

Dear Flink community,

Here's the second release candidate for the 0.9.0 release. We haven't had a
formal vote on the previous release candidate, but it received an implicit
-1 because of a couple of issues.

Thanks to the hard-working Flink devs, these issues should now be solved.
The following commits have been added to the second release candidate:

f5f0709 [FLINK-2194] [type extractor] Excludes Writable type from
WritableTypeInformation to be treated as an interface
40e2df5 [FLINK-2072] [ml] Adds quickstart guide
af0fee5 [FLINK-2207] Fix TableAPI conversion documenation and further
renamings for consistency.
e513be7 [FLINK-2206] Fix incorrect counts of finished, canceled, and failed
jobs in webinterface
ecfde6d [docs][release] update stable version to 0.9.0
4d8ae1c [docs] remove obsolete YARN link and cleanup download links
f27fc81 [FLINK-2195] Configure Configurable Hadoop InputFormats
ce3bc9c [streaming] [api-breaking] Minor DataStream cleanups
0edc0c8 [build] [streaming] Streaming parents dependencies pushed to
children
6380b95 [streaming] Logging update for checkpointed streaming topologies
5993e28 [FLINK-2199] Escape UTF characters in Scala Shell welcome squirrel.
80dd72d [FLINK-2196] [javaAPI] Moved misplaced SortPartitionOperator class
c8c2e2c [hotfix] Bring KMeansDataGenerator and KMeans quickstart in sync
77def9f [FLINK-2183][runtime] fix deadlock for concurrent slot release
87988ae [scripts] remove quickstart scripts
f3a96de [streaming] Fixed streaming example jars packaging and termination
255c554 [FLINK-2191] Fix inconsistent use of closure cleaner in Scala
Streaming
1343f26 [streaming] Allow force-enabling checkpoints for iterative jobs
c59d291 Fixed a few trivial issues:
e0e6f59 [streaming] Optional iteration feedback partitioning added
348ac86 [hotfix] Fix YARNSessionFIFOITCase
80cf2c5 [ml] Makes StandardScalers state package private and reduce
redundant code. Adjusts flink-ml readme.
c83ee8a [FLINK-1844] [ml] Add MinMaxScaler implementation in the
proprocessing package, test for the for the corresponding functionality and
documentation.
ee7c417 [docs] [streaming] Added states and fold to the streaming docs
fcca75c [docs] Fix some typos and grammar in the Streaming Programming
Guide.


Again, we need to test the new release candidate. Therefore, I've created a
new document where we keep track of our testing criteria for releases:
https://docs.google.com/document/d/162AZEX8lo0Njal10mmt9wzM5GYVL5WME-VfwGmwpBoA/edit

Everyone who tested previously could take a different task this time. Some
components probably don't need to be tested again, but, if in doubt,
testing twice doesn't hurt.

Happy testing :)

Cheers,
Max

Git branch: release-0.9.0-rc2
Release binaries: http://people.apache.org/~mxm/flink-0.9.0-rc2/
Maven artifacts:
https://repository.apache.org/content/repositories/orgapacheflink-1040/
PGP public key for verifying the signatures:
http://pgp.mit.edu/pks/lookup?op=vindex&search=0xDE976D18C2909CBF

Re: Testing Apache Flink 0.9.0-rc2

Posted by Till Rohrmann <ti...@gmail.com>.
I merged the legal and scheduler PR.


Re: Testing Apache Flink 0.9.0-rc2

Posted by Márton Balassi <ba...@gmail.com>.
@Max: The PR is good to go on my side. Does the job, could be a bit nicer
though. Added to the document.


Re: Testing Apache Flink 0.9.0-rc2

Posted by Aljoscha Krettek <al...@apache.org>.
I added the two relevant Table API commits to the release doc.


Re: Testing Apache Flink 0.9.0-rc2

Posted by Maximilian Michels <mx...@apache.org>.
+1 for adding the Table API fixes.

@Till: It seems like you fixed the bug in the scheduler. Is
https://github.com/apache/flink/pull/843 fixing it?

@Marton: What's the state of your pull request to solve the license issue
of the shaded dependencies? https://github.com/apache/flink/pull/837

I'm assuming all necessary fixes have been added to the document. The ones
in there now look ok to me.

Re: Testing Apache Flink 0.9.0-rc2

Posted by Fabian Hueske <fh...@gmail.com>.
Important fixes, IMO. +1 for adding.


Re: Testing Apache Flink 0.9.0-rc2

Posted by Aljoscha Krettek <al...@apache.org>.
I would like to include the fixes for the Table API: RowSerializer supports
null values, Aggregations properly deal with null values.

What do you think?
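
For context, the usual way a row serializer is made null-safe is to write a
null mask ahead of the field values, so deserialization knows which fields to
skip. The sketch below illustrates that general technique; it is not Flink's
actual RowSerializer, and all class and method names are invented:

```java
import java.io.*;
import java.util.Arrays;

// Illustrative sketch (not Flink code) of null-safe row serialization:
// a per-field null mask is written before the field values.
public class NullMaskRowDemo {
    static byte[] serialize(String[] row) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(row.length);
        // Null mask first: one boolean per field.
        for (String field : row) out.writeBoolean(field == null);
        // Then only the non-null field values.
        for (String field : row) if (field != null) out.writeUTF(field);
        return bytes.toByteArray();
    }

    static String[] deserialize(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        String[] row = new String[in.readInt()];
        boolean[] isNull = new boolean[row.length];
        for (int i = 0; i < row.length; i++) isNull[i] = in.readBoolean();
        for (int i = 0; i < row.length; i++) if (!isNull[i]) row[i] = in.readUTF();
        return row;
    }

    public static void main(String[] args) throws IOException {
        String[] row = {"a", null, "c"};
        String[] copy = deserialize(serialize(row));
        System.out.println(Arrays.toString(copy)); // [a, null, c]
    }
}
```

Without the mask, a naive serializer has no way to represent a null field,
which is the kind of limitation these fixes address.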


Re: Testing Apache Flink 0.9.0-rc2

Posted by Aljoscha Krettek <al...@apache.org>.
I created this to help with release testing:
https://github.com/aljoscha/FliRTT

You just start your cluster and then point the tool to the Flink directory.
It will then run all the examples with both built-in and external data.


Re: Testing Apache Flink 0.9.0-rc2

Posted by Maximilian Michels <mx...@apache.org>.
Hmm. Might be interesting to check out whether this is a regression from
rc1 to rc2. In any case, it is a serious release blocker and we need to fix
it.


Re: Testing Apache Flink 0.9.0-rc2

Posted by Till Rohrmann <tr...@apache.org>.
I might have found another release blocker. While running some cluster
tests I also tried to run the `ConnectedComponents` example. However,
sometimes the example couldn't be executed because the scheduler could not
schedule co-located tasks, `NoResourceAvailableException`, even though it
should have had enough slots. It also happened on a fresh Flink cluster. I
opened a corresponding JIRA issue [1].

The error seems to be similar to the error reported in FLINK-1952 [2].
However, I thought that this problem was already fixed. But maybe the error
was reintroduced by the latest change to the scheduler, FLINK-2183 [3, 4].

[1] https://issues.apache.org/jira/browse/FLINK-2225
[2] https://issues.apache.org/jira/browse/FLINK-1952
[3] https://issues.apache.org/jira/browse/FLINK-2183
[4]
https://github.com/apache/flink/commit/e966a0dd1c9f35ba6cb0ff4e09205c411fc4585d
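
The failure mode Till describes can be sketched with a toy model: a
co-located task must be placed in the same slot as its partner, so a request
can fail even though free slots exist elsewhere. This is not Flink's actual
scheduler; slot names and the capacity are invented for illustration:

```java
import java.util.*;

// Toy model of the co-location failure mode: "groupA-tail" is constrained
// to the slot of "groupA-head", so its request fails even though slot-2
// is completely free.
public class CoLocationDemo {
    static final int SLOT_CAPACITY = 2; // tasks per shared slot (illustrative)

    public static void main(String[] args) {
        Map<String, List<String>> slots = new LinkedHashMap<>();
        slots.put("slot-1", new ArrayList<>());
        slots.put("slot-2", new ArrayList<>());

        // The head tasks of two co-location groups both land in slot-1,
        // filling it to capacity; slot-2 stays empty.
        slots.get("slot-1").add("groupA-head");
        slots.get("slot-1").add("groupB-head");

        // groupA's tail task must go into the slot of groupA-head.
        String required = findSlotOf(slots, "groupA-head");
        boolean placed = false;
        if (slots.get(required).size() < SLOT_CAPACITY) {
            slots.get(required).add("groupA-tail");
            placed = true;
        }
        // Free capacity exists, but the constraint cannot use it.
        System.out.println("placed=" + placed + " freeSlots=" + countFree(slots));
    }

    static String findSlotOf(Map<String, List<String>> slots, String task) {
        for (Map.Entry<String, List<String>> e : slots.entrySet())
            if (e.getValue().contains(task)) return e.getKey();
        throw new NoSuchElementException(task);
    }

    static long countFree(Map<String, List<String>> slots) {
        return slots.values().stream().filter(t -> t.size() < SLOT_CAPACITY).count();
    }
}
```

In this toy run the placement fails with one slot still free, which mirrors
how a NoResourceAvailableException can surface despite enough total slots.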


Re: Testing Apache Flink 0.9.0-rc2

Posted by Ufuk Celebi <uc...@apache.org>.
Please continue the discussion in the issue Aljoscha opened: https://issues.apache.org/jira/browse/FLINK-2221

I think it is better to only point to issues in this mail thread. Otherwise the discussions are very hard to follow.


Re: Testing Apache Flink 0.9.0-rc2

Posted by Gyula Fóra <gy...@gmail.com>.
The checkpoint cleanup works for HDFS, right? I assume the job manager
should see that as well.

This is not a trivial problem in general; the assumption we were making
was that the JM can actually execute the cleanup logic.
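
The locality problem behind this (state handles pointing at paths that exist
only on the machine that wrote them) can be sketched with a toy example. This
is an illustration of the failure mode, not Flink code; the two "machines"
are faked with two separate temp directories:

```java
import java.io.IOException;
import java.nio.file.*;

// Toy illustration of why JobManager-side cleanup of a "file://" state
// backend fails: the path in the state handle is only meaningful on the
// machine that wrote it.
public class LocalStateCleanupDemo {
    public static void main(String[] args) throws IOException {
        Path taskManagerRoot = Files.createTempDirectory("taskmanager");
        Path jobManagerRoot = Files.createTempDirectory("jobmanager");

        // The TaskManager writes checkpoint state under ITS local root.
        Path relative = Paths.get("checkpoints", "chk-1");
        Path tmState = taskManagerRoot.resolve(relative);
        Files.createDirectories(tmState.getParent());
        Files.write(tmState, new byte[]{1, 2, 3});

        // The JobManager only holds the handle and "discards" it against
        // its own filesystem, where nothing was ever written.
        Path jmView = jobManagerRoot.resolve(relative);
        boolean deleted = Files.deleteIfExists(jmView);

        System.out.println("JobManager deleted state: " + deleted);
        System.out.println("State still on TaskManager: " + Files.exists(tmState));
    }
}
```

The discard succeeds (trivially, on nothing) from the JobManager's point of
view, while the actual checkpoint data lingers on the TaskManager, which is
why a shared filesystem like HDFS sidesteps the problem.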

Aljoscha Krettek <al...@apache.org> wrote (on Mon, 15 Jun 2015, 15:40):

> @Ufuk The cleanup bug for file:// checkpoints is not easy to fix IMHO.
>
> On Mon, 15 Jun 2015 at 15:39 Aljoscha Krettek <al...@apache.org> wrote:
>
> > Oh yes, on that I agree. I'm just saying that the checkpoint setting
> > should maybe be a central setting.
> >
> > On Mon, 15 Jun 2015 at 15:38 Matthias J. Sax
> > <mjsax@informatik.hu-berlin.de> wrote:
> >
> > > Hi,
> > >
> > > IMHO, it is very common for workers to have their own config files
> > > (e.g., Storm works the same way), and I think it makes a lot of sense.
> > > You might run Flink in a heterogeneous cluster and want to assign
> > > different memory and slots to different hardware. This would not be
> > > possible using a single config file (specified at the master and
> > > distributed from there).
> > >
> > > -Matthias
> > >
> > > On 06/15/2015 03:30 PM, Aljoscha Krettek wrote:
> > > > Regarding 1), that's why I said "bugs and features". :D But I think
> > > > of it as a bug, since people will normally set it in the
> > > > flink-conf.yaml on the master and assume that it works. That's what
> > > > I assumed, and it took me a while to figure out that the task
> > > > managers don't respect this setting.
> > > >
> > > > Regarding 3), if you think about it, this could never work. The
> > > > state handle cleanup logic happens purely on the JobManager. So what
> > > > happens is that the TaskManagers create state in some directory,
> > > > let's say /tmp/checkpoints, on the TaskManager. For cleanup, the
> > > > JobManager gets the state handle and calls discard (on the
> > > > JobManager); this tries to clean up the state in /tmp/checkpoints,
> > > > but of course there is nothing there since we are still on the
> > > > JobManager.
> > > >
> > > > On Mon, 15 Jun 2015 at 15:23 Márton Balassi
> > > > <balassi.marton@gmail.com> wrote:
> > > >
> > > > > @Aljoscha:
> > > > > 1) I think this just means that you can set the state backend on
> > > > > a per-taskmanager basis.
> > > > > 3) This is a serious issue then. Does it work when you set it in
> > > > > the flink-conf.yaml?
> > > > >
> > > > > On Mon, Jun 15, 2015 at 3:17 PM, Aljoscha Krettek
> > > > > <aljoscha@apache.org> wrote:
> > > > >
> > > > > > So, during my testing of state checkpointing on a cluster I
> > > > > > discovered several things (bugs and features):
> > > > > >
> > > > > > - If you have a setup where the configuration is not synced to
> > > > > > the workers, they do not pick up the state back-end
> > > > > > configuration. The workers do not respect the setting in the
> > > > > > flink-conf.yaml on the master.
> > > > > > - HDFS checkpointing works fine if you manually set it as the
> > > > > > per-job state backend using setStateHandleProvider().
> > > > > > - If you manually set the state handle provider to a "file://"
> > > > > > backend, old checkpoints will not be cleaned up; they will also
> > > > > > not be cleaned up when a job is finished.
> > > > > >
> > > > > > On Sun, 14 Jun 2015 at 23:22 Maximilian Michels
> > > > > > <mx...@apache.org> wrote:
> > > > > >
> > > > > > > Hi Henry,
> > > > > > >
> > > > > > > This is just a dry run. The goal is to get everything in shape
> > > > > > > for a proper vote.
> > > > > > >
> > > > > > > Kind regards,
> > > > > > > Max
> > > > > > >
> > > > > > > On Sun, Jun 14, 2015 at 7:58 PM, Henry Saputra
> > > > > > > <henry.saputra@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi Max,
> > > > > > > >
> > > > > > > > Are you doing an official VOTE on the RC for the 0.9
> > > > > > > > release, or is this just a dry run?
> > > > > > > >
> > > > > > > > - Henry
> >> >>> time.
> >> >>>>> For
> >> >>>>>> some components we probably don't have to test again but, if in
> >> >>> doubt,
> >> >>>>>> testing twice doesn't hurt.
> >> >>>>>>
> >> >>>>>> Happy testing :)
> >> >>>>>>
> >> >>>>>> Cheers,
> >> >>>>>> Max
> >> >>>>>>
> >> >>>>>> Git branch: release-0.9.0-rc2
> >> >>>>>> Release binaries: http://people.apache.org/~mxm/flink-0.9.0-rc2/
> >> >>>>>> Maven artifacts:
> >> >>>>>>
> >> >>>>
> >> >>
> >> https://repository.apache.org/content/repositories/orgapacheflink-1040/
> >> >>>>>> PGP public key for verifying the signatures:
> >> >>>>>>
> http://pgp.mit.edu/pks/lookup?op=vindex&search=0xDE976D18C2909CBF
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >> >
> >>
> >>
>

Re: Testing Apache Flink 0.9.0-rc2

Posted by Aljoscha Krettek <al...@apache.org>.
@Ufuk The cleanup bug for file:// checkpoints is not easy to fix IMHO.

On Mon, 15 Jun 2015 at 15:39 Aljoscha Krettek <al...@apache.org> wrote:

> Oh yes, on that I agree. I'm just saying that the checkpoint setting
> should maybe be a central setting.

Re: Testing Apache Flink 0.9.0-rc2

Posted by Aljoscha Krettek <al...@apache.org>.
Oh yes, on that I agree. I'm just saying that the checkpoint setting should
maybe be a central setting.

On Mon, 15 Jun 2015 at 15:38 Matthias J. Sax <mj...@informatik.hu-berlin.de>
wrote:

> Hi,
>
> IMHO, it is very common that Workers do have their own config files (eg,
> Storm works the same way). And I think it make a lot of senses. You
> might run Flink in an heterogeneous cluster and you want to assign
> different memory and slots for different hardware. This would not be
> possible using a single config file (specified at the master and
> distribute it).
>
>
> -Matthias

Re: Testing Apache Flink 0.9.0-rc2

Posted by "Matthias J. Sax" <mj...@informatik.hu-berlin.de>.
Hi,

IMHO, it is very common for workers to have their own config files (e.g.,
Storm works the same way), and I think it makes a lot of sense. You
might run Flink in a heterogeneous cluster and want to assign
different amounts of memory and different slot counts to different
hardware. That would not be possible with a single config file
specified at the master and distributed to the workers.


-Matthias
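
To make this concrete, a heterogeneous setup would give each worker its own
flink-conf.yaml with hardware-appropriate values. The keys below are the
0.9-era names; the values are purely illustrative:

```yaml
# flink-conf.yaml on a large TaskManager (illustrative values)
taskmanager.heap.mb: 8192
taskmanager.numberOfTaskSlots: 8

# flink-conf.yaml on a small TaskManager would instead say, e.g.:
# taskmanager.heap.mb: 2048
# taskmanager.numberOfTaskSlots: 2
```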

On 06/15/2015 03:30 PM, Aljoscha Krettek wrote:
> Regarding 1), thats why I said "bugs and features". :D But I think of it as
> a bug, since people will normally set in in the flink-conf.yaml on the
> master and assume that it works. That's what I assumed and it took me a
> while to figure out that the task managers don't respect this setting.
> 
> Regarding 3), if you think about it, this could never work. The state
> handle cleanup logic happens purely on the JobManager. So what happens is
> that the TaskManagers create state in some directory, let's say
> /tmp/checkpoints, on the TaskManager. For cleanup, the JobManager gets the
> state handle and calls discard (on the JobManager), this tries to cleanup
> the state in /tmp/checkpoints, but of course, there is nothing there since
> we are still on the JobManager.


Re: Testing Apache Flink 0.9.0-rc2

Posted by Ufuk Celebi <uc...@apache.org>.
On Mon, Jun 15, 2015 at 3:30 PM, Aljoscha Krettek <al...@apache.org>
wrote:

> Regarding 1), thats why I said "bugs and features". :D But I think of it as
> a bug, since people will normally set in in the flink-conf.yaml on the
> master and assume that it works. That's what I assumed and it took me a
> while to figure out that the task managers don't respect this setting.
>
> Regarding 3), if you think about it, this could never work. The state
> handle cleanup logic happens purely on the JobManager. So what happens is
> that the TaskManagers create state in some directory, let's say
> /tmp/checkpoints, on the TaskManager. For cleanup, the JobManager gets the
> state handle and calls discard (on the JobManager), this tries to cleanup
> the state in /tmp/checkpoints, but of course, there is nothing there since
> we are still on the JobManager.
>

Release blocker, are you going to fix it today?

Re: Testing Apache Flink 0.9.0-rc2

Posted by Aljoscha Krettek <al...@apache.org>.
Regarding 1), that's why I said "bugs and features". :D But I think of it as
a bug, since people will normally set it in the flink-conf.yaml on the
master and assume that it works. That's what I assumed, and it took me a
while to figure out that the task managers don't respect this setting.

Regarding 3), if you think about it, this could never work. The state
handle cleanup logic happens purely on the JobManager. What happens is
that the TaskManagers create state in some directory, say
/tmp/checkpoints, on the TaskManager. For cleanup, the JobManager gets the
state handle and calls discard (on the JobManager). This tries to clean up
the state in /tmp/checkpoints, but of course there is nothing there, since
we are still on the JobManager.
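
The failure mode above can be reproduced in miniature without Flink. The
sketch below is purely illustrative (LocalFileHandle-style deletion is
modeled directly with java.io.File, not with Flink's actual state handle
classes): discarding a local path is a silent no-op on a machine where that
path was never written.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class DiscardSketch {

    /**
     * Simulates checkpoint cleanup across two machines, modeled here as two
     * separate local directories.
     * Returns {cleanedOnJobManager, stateLeakedOnTaskManager}.
     */
    static boolean[] simulate() throws IOException {
        // Each "machine" has its own local /tmp/checkpoints.
        File tmCheckpoints = Files.createTempDirectory("taskmanager").toFile();
        File jmCheckpoints = Files.createTempDirectory("jobmanager").toFile();

        // The TaskManager writes checkpoint state to ITS local disk.
        File state = new File(tmCheckpoints, "chk-1");
        Files.write(state.toPath(), new byte[] {1, 2, 3});

        // The JobManager only receives the handle (effectively the path) and
        // calls discard locally: File.delete() returns false because the
        // file was never written on the JobManager's disk.
        boolean cleaned = new File(jmCheckpoints, "chk-1").delete();

        // Meanwhile the TaskManager's copy is never removed.
        boolean leaked = state.exists();
        return new boolean[] {cleaned, leaked};
    }

    public static void main(String[] args) throws IOException {
        boolean[] r = simulate();
        System.out.println("discard on JobManager succeeded: " + r[0]); // false
        System.out.println("state leaked on TaskManager: " + r[1]);     // true
    }
}
```

With a shared file system such as HDFS, both sides resolve the same path,
which is why the per-job HDFS backend worked in the cluster tests.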

On Mon, 15 Jun 2015 at 15:23 Márton Balassi <ba...@gmail.com>
wrote:

> @Aljoscha:
> 1) I think this just means that you can set the state backend on a
> taskmanager basis.
> 3) This is a serious issue then. Is it work when you set it in the
> flink-conf.yaml?

Re: Testing Apache Flink 0.9.0-rc2

Posted by Márton Balassi <ba...@gmail.com>.
@Aljoscha:
1) I think this just means that you can set the state backend on a
per-TaskManager basis.
3) This is a serious issue then. Does it work when you set it in the
flink-conf.yaml?
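
For reference, the per-job backend that is reported as working is set on the
execution environment. A minimal sketch against the 0.9-era API (the method
and class names are taken from this thread and assumed from the docs of that
time, so verify them against the release; this fragment needs the Flink
dependencies to compile):

```java
// Sketch only: assumes the Flink 0.9 streaming API.
StreamExecutionEnvironment env =
    StreamExecutionEnvironment.getExecutionEnvironment();

// Checkpoint every 5 seconds.
env.enableCheckpointing(5000);

// Per-job state backend on a shared file system, so the JobManager can
// reach (and discard) the checkpoint files as well.
env.setStateHandleProvider(
    FileStateHandle.createProvider("hdfs:///flink/checkpoints"));
```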

On Mon, Jun 15, 2015 at 3:17 PM, Aljoscha Krettek <al...@apache.org>
wrote:

> So, during my testing of the state checkpointing on a cluster I discovered
> several things (bugs and features):
>
>  - If you have a setup where the configuration is not synced to the workers
> they do not pick up the state back-end configuration. The workers do not
> respect the setting in the flink-conf.yaml on the master
> - HDFS checkpointing works fine if you manually set it as the per-job
> state-backend using setStateHandleProvider()
> - If you manually set the stateHandleProvider to a "file://" backend, old
> checkpoints will not be cleaned up, they will also not be cleaned up when a
> job is finished.
>

Re: Testing Apache Flink 0.9.0-rc2

Posted by Aljoscha Krettek <al...@apache.org>.
So, during my testing of the state checkpointing on a cluster I discovered
several things (bugs and features):

- If you have a setup where the configuration is not synced to the workers,
they do not pick up the state backend configuration. The workers do not
respect the setting in the flink-conf.yaml on the master.
- HDFS checkpointing works fine if you manually set it as the per-job
state backend using setStateHandleProvider().
- If you manually set the stateHandleProvider to a "file://" backend, old
checkpoints are not cleaned up; they are also not cleaned up when a
job is finished.
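For reference, the cluster-wide setting the first point refers to lives in
flink-conf.yaml, which each worker reads locally. A minimal sketch of what
that section might look like; the key names state.backend and
state.backend.fs.checkpointdir are my assumption of the 0.9 configuration
keys, so verify them against the configuration docs for your build:

```yaml
# Use the filesystem state backend instead of the default
# jobmanager-held state. This file must be present on every
# worker, not only on the master.
state.backend: filesystem

# Directory for checkpoint data, given as an HDFS URI so that
# all task managers can reach it (host/port are placeholders).
state.backend.fs.checkpointdir: hdfs://namenode:9000/flink/checkpoints
```

Because each worker consults its own copy of flink-conf.yaml, the setting
only takes effect once the file is synced to all machines, which is
consistent with the behaviour described above.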

On Sun, 14 Jun 2015 at 23:22 Maximilian Michels <mx...@apache.org> wrote:

> Hi Henry,
>
> This is just a dry run. The goal is to get everything in shape for a proper
> vote.
>
> Kind regards,
> Max
>
>

Re: Testing Apache Flink 0.9.0-rc2

Posted by Maximilian Michels <mx...@apache.org>.
Hi Henry,

This is just a dry run. The goal is to get everything in shape for a proper
vote.

Kind regards,
Max


On Sun, Jun 14, 2015 at 7:58 PM, Henry Saputra <he...@gmail.com>
wrote:

> Hi Max,
>
> Are you doing an official VOTE on the RC for the 0.9 release, or is this
> just a dry run?
>
>
> - Henry
>

Re: Testing Apache Flink 0.9.0-rc2

Posted by Henry Saputra <he...@gmail.com>.
Hi Max,

Are you doing an official VOTE on the RC for the 0.9 release, or is this just a dry run?


- Henry

On Sun, Jun 14, 2015 at 9:11 AM, Maximilian Michels <mx...@apache.org> wrote:
> Dear Flink community,
>
> Here's the second release candidate for the 0.9.0 release. We haven't had a
> formal vote on the previous release candidate but it received an implicit
> -1 because of a couple of issues.
>
> Thanks to the hard-working Flink devs, these issues should be solved now.
> The following commits have been added to the second release candidate:
>
> f5f0709 [FLINK-2194] [type extractor] Excludes Writable type from
> WritableTypeInformation to be treated as an interface
> 40e2df5 [FLINK-2072] [ml] Adds quickstart guide
> af0fee5 [FLINK-2207] Fix TableAPI conversion documenation and further
> renamings for consistency.
> e513be7 [FLINK-2206] Fix incorrect counts of finished, canceled, and failed
> jobs in webinterface
> ecfde6d [docs][release] update stable version to 0.9.0
> 4d8ae1c [docs] remove obsolete YARN link and cleanup download links
> f27fc81 [FLINK-2195] Configure Configurable Hadoop InputFormats
> ce3bc9c [streaming] [api-breaking] Minor DataStream cleanups
> 0edc0c8 [build] [streaming] Streaming parents dependencies pushed to
> children
> 6380b95 [streaming] Logging update for checkpointed streaming topologies
> 5993e28 [FLINK-2199] Escape UTF characters in Scala Shell welcome squirrel.
> 80dd72d [FLINK-2196] [javaAPI] Moved misplaced SortPartitionOperator class
> c8c2e2c [hotfix] Bring KMeansDataGenerator and KMeans quickstart in sync
> 77def9f [FLINK-2183][runtime] fix deadlock for concurrent slot release
> 87988ae [scripts] remove quickstart scripts
> f3a96de [streaming] Fixed streaming example jars packaging and termination
> 255c554 [FLINK-2191] Fix inconsistent use of closure cleaner in Scala
> Streaming
> 1343f26 [streaming] Allow force-enabling checkpoints for iterative jobs
> c59d291 Fixed a few trivial issues:
> e0e6f59 [streaming] Optional iteration feedback partitioning added
> 348ac86 [hotfix] Fix YARNSessionFIFOITCase
> 80cf2c5 [ml] Makes StandardScalers state package private and reduce
> redundant code. Adjusts flink-ml readme.
> c83ee8a [FLINK-1844] [ml] Add MinMaxScaler implementation in the
> proprocessing package, test for the for the corresponding functionality and
> documentation.
> ee7c417 [docs] [streaming] Added states and fold to the streaming docs
> fcca75c [docs] Fix some typos and grammar in the Streaming Programming
> Guide.
>
>
> Again, we need to test the new release candidate. Therefore, I've created a
> new document where we keep track of our testing criteria for releases:
> https://docs.google.com/document/d/162AZEX8lo0Njal10mmt9wzM5GYVL5WME-VfwGmwpBoA/edit
>
> Everyone who tested previously could take a different task this time. For
> some components we probably don't have to test again but, if in doubt,
> testing twice doesn't hurt.
>
> Happy testing :)
>
> Cheers,
> Max
>
> Git branch: release-0.9.0-rc2
> Release binaries: http://people.apache.org/~mxm/flink-0.9.0-rc2/
> Maven artifacts:
> https://repository.apache.org/content/repositories/orgapacheflink-1040/
> PGP public key for verifying the signatures:
> http://pgp.mit.edu/pks/lookup?op=vindex&search=0xDE976D18C2909CBF