Posted to dev@beam.apache.org by Ahmet Altay <al...@google.com> on 2018/10/19 00:26:21 UTC

[VOTE] Release 2.8.0, release candidate #1

Hi everyone,

Please review and vote on release candidate #1 for version 2.8.0,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which is signed with the key with fingerprint 6096FA00 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "v2.8.0-RC1" [5],
* website pull request listing the release and publishing the API reference
manual [6].
* Python artifacts are deployed along with the source release to
dist.apache.org [2].
* Validation sheet with a tab for 2.8.0 release to help with validation [7].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Ahmet

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
[2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
[3] https://dist.apache.org/repos/dist/dev/beam/KEYS
[4] https://repository.apache.org/content/repositories/orgapachebeam-1049/
[5] https://github.com/apache/beam/tree/v2.8.0-RC1
[6] https://github.com/apache/beam-site/pull/583 and
https://github.com/apache/beam/pull/6745
[7]
https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
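
The adoption rule stated in the announcement (majority approval, with at
least 3 PMC affirmative votes) can be sketched in a few lines of Python. This
is only an illustration and not a tool the project uses: the Vote class, the
function names, and the voter labels are hypothetical, and it assumes that
"majority approval" means more binding +1 than binding -1 votes. The sample
tally mirrors a result of five +1 votes, four of them binding, and no -1
votes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str     # hypothetical label, not a real address
    value: int     # +1 to approve, -1 to reject
    binding: bool  # True when the voter is a PMC member


def tally(votes):
    """Return (total +1, binding +1, total -1) for a list of votes."""
    plus = sum(1 for v in votes if v.value > 0)
    binding_plus = sum(1 for v in votes if v.value > 0 and v.binding)
    minus = sum(1 for v in votes if v.value < 0)
    return plus, binding_plus, minus


def is_adopted(votes, min_binding_approvals=3):
    """Assumed rule: at least 3 binding +1 and more binding +1 than binding -1."""
    binding_plus = sum(1 for v in votes if v.binding and v.value > 0)
    binding_minus = sum(1 for v in votes if v.binding and v.value < 0)
    return binding_plus >= min_binding_approvals and binding_plus > binding_minus


# Illustrative tally: four binding +1 votes and one non-binding +1 vote.
example_votes = [
    Vote("pmc-a", +1, True),
    Vote("pmc-b", +1, True),
    Vote("pmc-c", +1, True),
    Vote("pmc-d", +1, True),
    Vote("contributor-e", +1, False),
]

print(tally(example_votes))
print(is_adopted(example_votes))
```

Non-binding votes are counted in the tally but do not affect the outcome in
this sketch.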

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
+1 (binding)

Quickly tested with beam-samples.

Regards
JB

On 26/10/2018 17:05, Tim Robertson wrote:
> A colleague and I tested on 2.7.0 and 2.8.0RC1:
> 
> 1. Quickstart on Spark/YARN/HDFS (CDH 5.12.0) (commented in spreadsheet)
> 2. Our Avro to Avro pipelines on Spark/YARN/HDFS (note we backport the
> un-merged BEAM-5036 fix in our code)
> 3. Our Avro to Elasticsearch pipelines on Spark/YARN/HDFS
> 
> Everything worked, and performance was similar on both.
> We built using maven pointing
> at https://repository.apache.org/content/repositories/orgapachebeam-1049/  
> 
> Based on this limited testing: +1 
> 
> Thank you to the release managers,
> Tim
> 
> 
> On Thu, Oct 25, 2018 at 7:21 PM Tim <timrobertson100@gmail.com> wrote:
> 
>     I can do some tests on Spark / YARN tomorrow (CEST timezone). Sorry
>     I’ve just been too busy to assist.
> 
>     Tim
> 
>     On 25 Oct 2018, at 18:59, Kenneth Knowles <kenn@apache.org> wrote:
> 
>>     I tried to do a more thorough job on this.
>>
>>      - I could not reproduce the slowdown in Query 9. I believe the
>>     variance was simply high given the parameters and environment
>>      - I saw the same slowdown in Query 8 when running as part of the
>>     suite, but it vanished when I ran repeatedly on its own, so again
>>     it is not good methodology probably
>>
>>     We do have the dashboard
>>     at https://apache-beam-testing.appspot.com/dashboard-admin though
>>     no anomaly detection set up AFAIK.
>>
>>      - There is no issue easily visible in
>>     DirectRunner: https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>>      - There is a notable degradation in Spark runner on 10/5 for many
>>     queries. https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>>      - Something minor happened for Dataflow around
>>     10/1: https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
>>      - Flink runner seems to have had some fantastic improvements
>>     :-) https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
>>
>>     So if there is a blocker it would really be the Spark runner perf
>>     changes. Of course, all these except Dataflow are using local
>>     instances so may not be representative of larger scale AFAIK.
>>
>>     Kenn
>>
>>     On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels <mxm@apache.org>
>>     wrote:
>>
>>         I've run WordCount using Quickstart with the FlinkRunner
>>         (locally and against a Flink cluster).
>>
>>         Would give a +1 but waiting for what Kenn finds.
>>
>>         -Max
>>
>>         On 23.10.18 07:11, Ahmet Altay wrote:
>>         >
>>         >
>>         > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles
>>         > <kenn@apache.org> wrote:
>>         >
>>         >     You two did so much verification I had a hard time finding
>>         >     something where my help was meaningful! :-)
>>         >
>>         >     I did run the Nexmark suite on the DirectRunner against 2.7.0
>>         >     and 2.8.0 following
>>         >     <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
>>         >
>>         >     It is admittedly a very silly test - the instructions leave
>>         >     immutability enforcement on, etc. But it does appear that
>>         >     there is a 30% degradation in query 8 and 15% in query 9.
>>         >     These are the pure Java tests, not the SQL variants. The rest
>>         >     of the queries are close enough that differences are not
>>         >     meaningful.
>>         >
>>         >
>>         > (It would be a good improvement for us to have alerts on daily
>>         > benchmarks if we do not have such a concept already.)
>>         >
>>         >
>>         >     I would ask a little more time to see what is going on here -
>>         >     is it a real performance issue or an artifact of how the tests
>>         >     are invoked, or ...?
>>         >
>>         >
>>         > Thank you! Much appreciated. Please let us know when you are done
>>         > with your investigation.
>>         >
>>         >
>>         >     Kenn
>>         >
>>         >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay
>>         >     <altay@google.com> wrote:
>>         >
>>         >         Hi all,
>>         >
>>         >         Did you have a chance to review this RC? Between me and
>>         >         Robert we ran a significant chunk of the validations. Let
>>         >         me know if you have any questions.
>>         >
>>         >         Ahmet
>>         >
>>

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Ahmet Altay <al...@google.com>.
I pushed binaries to the repositories.

I started a blog post draft; please feel free to make any changes directly, or
comment on it [1]. I plan to publish the blog post along with an email to
user@ on Monday 10/29.

Ahmet

[1] https://github.com/apache/beam/pull/6852

On Fri, Oct 26, 2018 at 10:16 AM, Ahmet Altay <al...@google.com> wrote:

> +1 (binding)
>
> Thank you all for running validations and voting.
>
> I'm pleased to announce that the 2.8.0 RC1 is approved for release with 5
> +1 votes (4 binding) and no -1 votes. I will start pushing the bits
> around.
>
> On Fri, Oct 26, 2018 at 9:20 AM, Maximilian Michels <mx...@apache.org>
> wrote:
>
>> +1 (binding)
>>
>> On 26.10.18 17:45, Kenneth Knowles wrote:
>>
>>> Nice. Thanks.
>>>
>>> +1
>>>
>>>
>>> On Fri, Oct 26, 2018 at 8:44 AM Robert Bradshaw <robertwb@google.com>
>>> wrote:
>>>
>>>     Thanks Tim!
>>>
>>>     This was my only hesitation, and sounds like we're in the clear here.
>>>
>>>     +1 (binding)
>

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Ahmet Altay <al...@google.com>.
+1 (binding)

Thank you all for running validations and voting.

I'm pleased to announce that the 2.8.0 RC1 is approved for release with 5 +1
votes (4 binding) and no -1 votes. I will start pushing the bits around.


Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Ahmet Altay <al...@google.com>.
On Mon, Oct 29, 2018 at 12:40 PM, Ismaël Mejía <ie...@gmail.com> wrote:

> From the Apache point of view nothing impedes anyone from doing
> intermediate releases for non LTS releases, only needed thing is
> someone willing to do the release and the due vote process.
>

Agreed. I was not suggesting that we skip a release. I wanted to understand
the cost/benefit.


>
> I don’t know, however, how we will decide this. We are exactly in the
> middle of the release cycle and in 3 weeks we will be cutting the next
> version, so I am not sure it is worth it. Any thoughts?
>

My suggestion is to look at this from a user perspective. Are we affecting a
significant chunk of users? And could those users stay on 2.7 until we release
2.9? From there we can decide whether this warrants a patch release or not.
I do not have information on how large a user base we are affecting. I
assume the answer to the second question is yes, and we can suggest that they
stay on 2.7 until a new release is out. From that perspective, I would
suggest skipping a patch release and waiting for the next regular release.



Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Ismaël Mejía <ie...@gmail.com>.
From the Apache point of view, nothing prevents anyone from doing
intermediate releases for non-LTS releases; the only thing needed is
someone willing to do the release and the due vote process.

However, I don’t know how we will decide this. We are exactly in the
middle of the release cycle, and in 3 weeks we will be cutting the next
version, so I am not sure it is worth it. Any thoughts?

On Mon, Oct 29, 2018 at 6:08 PM Ahmet Altay <al...@google.com> wrote:
>
>
>
> On Mon, Oct 29, 2018 at 8:55 AM, Kenneth Knowles <ke...@apache.org> wrote:
>>
>> I think we should definitely open a cherry-pick PR to a 2.8.x branch. I think we must not corrupt Maven Central, so if it is published to users this has to be 2.8.1. Ahmet - we are at this point, right?
>
>
> Yes, if someone is willing to make a new release this would be a 2.8.1 release. (2.8.0 is already on Maven Central.)
>
> Side question about the initial LTS discussion: we have decided not to make 2.8.0 an LTS release. Should we wait until the next release to patch this issue? What is the cost/benefit of maintaining this branch?
>
>>
>>
>> Kenn
>>
>> On Mon, Oct 29, 2018 at 8:40 AM Ismaël Mejía <ie...@gmail.com> wrote:
>>>
>>> First, thanks Etienne and Kenn for noting the performance issue. I
>>> reviewed the discussed PR. It introduced a new ‘@Experimental’ option
>>> to the Spark runner to change the default source partitioning and
>>> enable users to control it via a predefined size (a prerequisite for
>>> Spark’s dynamicAllocation).
>>>
>>> However, this must not be the default behavior. After looking at the
>>> PR, it seems that things are not as expected and the default is now the
>>> new behavior. I will provide a PR to fix this quickly. The question,
>>> however, is: should I cherry-pick it so that we do a new RC (since the
>>> release has already 'passed')?
>>> On Mon, Oct 29, 2018 at 2:51 PM Kenneth Knowles <ke...@apache.org> wrote:
>>> >
>>> > I didn't isolate it to a cause and commit, so that is extremely useful to know. To bring some details onto the thread:
>>> >
>>> > query 4: a single aggregation in sliding windows
>>> > query 8: a single join with no other interesting logic
>>> > query 9 (prefix of query 6*): find the winning bid for each auction
>>> > query 6: query 9 followed by a single aggregation
>>> >
>>> > Kenn
>>> >
>>> > * they seem out of order because the original queries were 1-8 and we added 9 later to benchmark the baseline without the aggregation
>>> >
>>> > On Mon, Oct 29, 2018 at 3:28 AM Etienne Chauchot <ec...@apache.org> wrote:
>>> >>
>>> >> Oops, I just saw that Kenn already mentioned the perf degradation on the Spark runner around 10/05. Sorry for the repetition.
>>> >> Nevertheless, IMHO, it is still worth checking PR #6181.
>>> >>
>>> >> Etienne
>>> >>
>>> >> On Monday, 29 October 2018 at 10:42 +0100, Etienne Chauchot wrote:
>>> >>
>>> >> Hey,
>>> >> I would vote -0; here is the explanation:
>>> >>
>>> >> I took a look at the Nexmark dashboards for output size and performance for all the runners in all the modes around the date of the release cut to search for regressions.
>>> >>
>>> >> I noted a regression in the performance of the Spark runner. Query4, Query6, Query8 and Query9 running times were multiplied by 2 to 3 around 10/05/18. See https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>>> >> So I searched the commit history of the Spark runner module for what happened around 10/05/18, and I found this commit:
>>> >>
>>> >> e4a1ccbaa10808d88c6ad2a687fe9f6d52392d90: Merge pull request #6181: [BEAM-4783] Add bundleSize for splitting BoundedSources
>>> >>
>>> >> I don't know if it should be considered a blocker, but we should definitely take another look at pull request #6181, which seems to change the way we split sources on the Spark runner.
>>> >>
>>> >> Best
>>> >> Etienne
>>> >>
>>> >>
>>> >> On Friday, 26 October 2018 at 18:20 +0200, Maximilian Michels wrote:
>>> >>
>>> >> +1 (binding)
>>> >>
>>> >>
>>> >> On 26.10.18 17:45, Kenneth Knowles wrote:
>>> >>
>>> >> Nice. Thanks.
>>> >>
>>> >>
>>> >> +1
>>> >>
>>> >>
>>> >>
>>> >> On Fri, Oct 26, 2018 at 8:44 AM Robert Bradshaw <robertwb@google.com> wrote:
>>> >>
>>> >>
>>> >>     Thanks Tim!
>>> >>
>>> >>
>>> >>     This was my only hesitation, and sounds like we're in the clear here.
>>> >>
>>> >>
>>> >>     +1 (binding)
>>> >>
>>> >>     On Fri, Oct 26, 2018 at 5:05 PM Tim Robertson <timrobertson100@gmail.com> wrote:
>>> >>
>>> >>      > A colleague and I tested on 2.7.0 and 2.8.0RC1:
>>> >>      >
>>> >>      > 1. Quickstart on Spark/YARN/HDFS (CDH 5.12.0) (commented in spreadsheet)
>>> >>      > 2. Our Avro to Avro pipelines on Spark/YARN/HDFS (note we backport the un-merged BEAM-5036 fix in our code)
>>> >>      > 3. Our Avro to Elasticsearch pipelines on Spark/YARN/HDFS
>>> >>      >
>>> >>      > Everything worked, and performance was similar on both.
>>> >>      > We built using maven pointing at https://repository.apache.org/content/repositories/orgapachebeam-1049/
>>> >>      >
>>> >>      > Based on this limited testing: +1
>>> >>      >
>>> >>      > Thank you to the release managers,
>>> >>      > Tim
>>> >>
>>> >>      >
>>> >>
>>> >>      >
>>> >>
>>> >>      > On Thu, Oct 25, 2018 at 7:21 PM Tim <timrobertson100@gmail.com> wrote:
>>> >>      >>
>>> >>      >> I can do some tests on Spark / YARN tomorrow (CEST timezone). Sorry I’ve just been too busy to assist.
>>> >>
>>> >>      >>
>>> >>
>>> >>      >> Tim
>>> >>
>>> >>      >>
>>> >>
>>> >>      >> On 25 Oct 2018, at 18:59, Kenneth Knowles <kenn@apache.org> wrote:
>>> >>      >>
>>> >>      >> I tried to do a more thorough job on this.
>>> >>      >>
>>> >>      >>  - I could not reproduce the slowdown in Query 9. I believe the variance was simply high given the parameters and environment
>>> >>      >>  - I saw the same slowdown in Query 8 when running as part of the suite, but it vanished when I ran it repeatedly on its own, so again it is probably not good methodology
>>> >>      >>
>>> >>      >> We do have the dashboard at https://apache-beam-testing.appspot.com/dashboard-admin though no anomaly detection is set up AFAIK.
>>> >>      >>
>>> >>      >>  - There is no issue easily visible in DirectRunner: https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>>> >>      >>  - There is a notable degradation in Spark runner on 10/5 for many queries. https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>>> >>      >>  - Something minor happened for Dataflow around 10/1: https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
>>> >>      >>  - Flink runner seems to have had some fantastic improvements :-) https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
>>> >>      >>
>>> >>      >> So if there is a blocker it would really be the Spark runner perf changes. Of course, all these except Dataflow are using local instances so may not be representative of larger scale AFAIK.
>>> >>
>>> >>      >>
>>> >>
>>> >>      >> Kenn
>>> >>
>>> >>      >>
>>> >>
>>> >>      >> On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels
>>> >>
>>> >>     <mxm@apache.org <ma...@apache.org>> wrote:
>>> >>
>>> >>      >>>
>>> >>
>>> >>      >>> I've run WordCount using Quickstart with the FlinkRunner
>>> >>
>>> >>     (locally and
>>> >>
>>> >>      >>> against a Flink cluster).
>>> >>
>>> >>      >>>
>>> >>
>>> >>      >>> Would give a +1 but waiting what Kenn finds.
>>> >>
>>> >>      >>>
>>> >>
>>> >>      >>> -Max
>>> >>
>>> >>      >>>
>>> >>
>>> >>      >>> On 23.10.18 07:11, Ahmet Altay wrote:
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles
>>> >>
>>> >>     <kenn@apache.org <ma...@apache.org>
>>> >>
>>> >>      >>> > <mailto:kenn@apache.org <ma...@apache.org>>> wrote:
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >     You two did so much verification I had a hard time
>>> >>
>>> >>     finding something
>>> >>
>>> >>      >>> >     where my help was meaningful! :-)
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >     I did run the Nexmark suite on the DirectRunner against
>>> >>
>>> >>     2.7.0 and
>>> >>
>>> >>      >>> >     2.8.0 following
>>> >>
>>> >>      >>> >
>>> >>
>>> >>     https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
>>> >>
>>> >>      >>> >
>>> >>
>>> >>       <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >     It is admittedly a very silly test - the instructions leave
>>> >>
>>> >>      >>> >     immutability enforcement on, etc. But it does appear that
>>> >>
>>> >>     there is a
>>> >>
>>> >>      >>> >     30% degradation in query 8 and 15% in query 9. These are
>>> >>
>>> >>     the pure
>>> >>
>>> >>      >>> >     Java tests, not the SQL variants. The rest of the queries
>>> >>
>>> >>     are close
>>> >>
>>> >>      >>> >     enough that differences are not meaningful.
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> > (It would be a good improvement for us to have alerts on daily
>>> >>
>>> >>      >>> > benchmarks if we do not have such a concept already.)
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >     I would ask a little more time to see what is going on
>>> >>
>>> >>     here - is it
>>> >>
>>> >>      >>> >     a real performance issue or an artifact of how the tests are
>>> >>
>>> >>      >>> >     invoked, or ...?
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> > Thank you! Much appreciated. Please let us know when you are
>>> >>
>>> >>     done with
>>> >>
>>> >>      >>> > your investigation.
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >     Kenn
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay
>>> >>
>>> >>     <altay@google.com <ma...@google.com>
>>> >>
>>> >>      >>> >     <mailto:altay@google.com <ma...@google.com>>> wrote:
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >         Hi all,
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >         Did you have a chance to review this RC? Between me
>>> >>
>>> >>     and Robert
>>> >>
>>> >>      >>> >         we ran a significant chunk of the validations. Let me
>>> >>
>>> >>     know if
>>> >>
>>> >>      >>> >         you have any questions.
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >         Ahmet
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >         On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay
>>> >>
>>> >>     <altay@google.com <ma...@google.com>
>>> >>
>>> >>      >>> >         <mailto:altay@google.com <ma...@google.com>>>
>>> >>
>>> >>     wrote:
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >             Hi everyone,
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >             Please review and vote on the release candidate
>>> >>
>>> >>     #1 for the
>>> >>
>>> >>      >>> >             version 2.8.0, as follows:
>>> >>
>>> >>      >>> >             [ ] +1, Approve the release
>>> >>
>>> >>      >>> >             [ ] -1, Do not approve the release (please
>>> >>
>>> >>     provide specific
>>> >>
>>> >>      >>> >             comments)
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >             The complete staging area is available for your
>>> >>
>>> >>     review,
>>> >>
>>> >>      >>> >             which includes:
>>> >>
>>> >>      >>> >             * JIRA release notes [1],
>>> >>
>>> >>      >>> >             * the official Apache source release to be
>>> >>
>>> >>     deployed to
>>> >>
>>> >>      >>> > dist.apache.org <http://dist.apache.org>
>>> >>
>>> >>     <http://dist.apache.org> [2], which is
>>> >>
>>> >>      >>> >             signed with the key with fingerprint 6096FA00 [3],
>>> >>
>>> >>      >>> >             * all artifacts to be deployed to the Maven Central
>>> >>
>>> >>      >>> >             Repository [4],
>>> >>
>>> >>      >>> >             * source code tag "v2.8.0-RC1" [5],
>>> >>
>>> >>      >>> >             * website pull request listing the release and
>>> >>
>>> >>     publishing
>>> >>
>>> >>      >>> >             the API reference manual [6].
>>> >>
>>> >>      >>> >             * Python artifacts are deployed along with the source
>>> >>
>>> >>      >>> >             release to the dist.apache.org
>>> >>
>>> >>     <http://dist.apache.org> <http://dist.apache.org> [2].
>>> >>
>>> >>      >>> >             * Validation sheet with a tab for 2.8.0 release
>>> >>
>>> >>     to help with
>>> >>
>>> >>      >>> >             validation [7].
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >             The vote will be open for at least 72 hours. It
>>> >>
>>> >>     is adopted
>>> >>
>>> >>      >>> >             by majority approval, with at least 3 PMC
>>> >>
>>> >>     affirmative votes.
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >             Thanks,
>>> >>
>>> >>      >>> >             Ahmet
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >             [1]
>>> >>
>>> >>      >>> >
>>> >>
>>> >>     https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
>>> >>
>>> >>      >>> >
>>> >>
>>> >>       <https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985>
>>> >>
>>> >>      >>> >             [2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
>>> >>
>>> >>      >>> >             <https://dist.apache.org/repos/dist/dev/beam/2.8.0>
>>> >>
>>> >>      >>> >             [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
>>> >>
>>> >>      >>> >             <https://dist.apache.org/repos/dist/dev/beam/KEYS>
>>> >>
>>> >>      >>> >             [4]
>>> >>
>>> >>      >>> >
>>> >>
>>> >>     https://repository.apache.org/content/repositories/orgapachebeam-1049/
>>> >>
>>> >>      >>> >
>>> >>
>>> >>       <https://repository.apache.org/content/repositories/orgapachebeam-1049/>
>>> >>
>>> >>      >>> >             [5] https://github.com/apache/beam/tree/v2.8.0-RC1
>>> >>
>>> >>      >>> >             <https://github.com/apache/beam/tree/v2.8.0-RC1>
>>> >>
>>> >>      >>> >             [6] https://github.com/apache/beam-site/pull/583
>>> >>
>>> >>      >>> >             <https://github.com/apache/beam-site/pull/583> and
>>> >>
>>> >>      >>> > https://github.com/apache/beam/pull/6745
>>> >>
>>> >>      >>> >             <https://github.com/apache/beam/pull/6745>
>>> >>
>>> >>      >>> >             [7]
>>> >>
>>> >>      >>> >
>>> >>
>>> >>     https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
>>> >>
>>> >>      >>> >
>>> >>
>>> >>       <https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816>
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >
>>> >>
>>> >>      >>> >
>>> >>
>>> >>
>
>

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Ahmet Altay <al...@google.com>.
On Mon, Oct 29, 2018 at 8:55 AM, Kenneth Knowles <ke...@apache.org> wrote:

> I think we should definitely open a cherry-pick PR to a 2.8.x branch. We
> must not corrupt Maven Central, so if it is published to users this has to
> be 2.8.1. Ahmet - we are at this point, right?
>

Yes, if someone is willing to make a new release, this would be the 2.8.1
release. (2.8.0 is already on Maven Central.)

Side question about the initial LTS discussion. We have decided not to make
2.8.0 an LTS release. Should we wait until the next release to patch this issue?
What is the cost/benefit of maintaining this branch?


>
> Kenn
>
> On Mon, Oct 29, 2018 at 8:40 AM Ismaël Mejía <ie...@gmail.com> wrote:
>
>> First, thanks Etienne and Kenn for noting the performance issue. I
>> reviewed the discussed PR. It introduced a new ‘@Experimental’ option
>> to the Spark runner to change the default source partitioning and
>> enable users to control it via a predefined size (a prerequisite for
>> Spark’s dynamicAllocation).
>>
>> This, however, must not be the default behavior. After looking at the
>> PR, it seems that things are not as expected and the default is now the
>> new behavior. I will provide a PR to fix this quickly. However, the
>> question is: should I cherry-pick it so that we do a new RC (since the
>> release has already 'passed')?
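
A minimal sketch of the partitioning idea described above, not the Spark runner's actual code. The function names, the parameters, and the convention that a bundle size of 0 means "option unset" are assumptions made for illustration; the option added in the PR may behave differently.

```python
import math


def desired_bundle_size(estimated_size_bytes, default_parallelism, bundle_size=0):
    """Hypothetical model of picking the split size for a bounded source.

    bundle_size > 0 : the user asked for fixed-size bundles (the new behavior).
    bundle_size == 0: fall back to dividing the input by the parallelism
                      (the behavior that is supposed to remain the default).
    """
    if bundle_size > 0:
        return bundle_size
    return max(1, estimated_size_bytes // max(1, default_parallelism))


def num_splits(estimated_size_bytes, default_parallelism, bundle_size=0):
    """Number of partitions the source would be cut into under this model."""
    size = desired_bundle_size(estimated_size_bytes, default_parallelism, bundle_size)
    return max(1, math.ceil(estimated_size_bytes / size))
```

Under this model, a fixed size applied by default cuts a large input into many more partitions than the cluster parallelism would suggest, which is one plausible way a changed default could show up in the benchmarks.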
>> On Mon, Oct 29, 2018 at 2:51 PM Kenneth Knowles <ke...@apache.org> wrote:
>> >
>> > I didn't isolate it to a cause and commit, so that is extremely useful to know. To bring some details on thread:
>> >
>> > query 4: a single aggregation in sliding windows
>> > query 8: a single join with no other interesting logic
>> > query 9 (prefix of query 6*): find the winning bid for each auction
>> > query 6: query 9 followed by a single aggregation
>> >
>> > Kenn
>> >
>> > * they seem out of order because the original queries were 1-8 and we added 9 later to benchmark the baseline without the aggregation
>> >
>> > On Mon, Oct 29, 2018 at 3:28 AM Etienne Chauchot <ec...@apache.org> wrote:
>> >>
>> >> Oops, just saw that Kenn already mentioned the perf degradation on the spark runner around 10/05. Sorry for the repetition.
>> >> Nevertheless, IMHO, it is still worth checking PR #6181.
>> >>
>> >> Etienne
>> >>
>> >> On Monday, 29 October 2018 at 10:42 +0100, Etienne Chauchot wrote:
>> >>
>> >> Hey,
>> >> I would vote -0; here is the explanation:
>> >>
>> >> I took a look at the Nexmark dashboards for output size and performance for all the runners in all the modes around the date of the release cut to search for regressions.
>> >>
>> >> I noted a regression in the performance of the spark runner. Query4, Query6, Query8 and Query9 running times were multiplied by 2 to 3 around the date of 10/05/18. See https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>> >> So I searched the commit history of the spark runner module for what happened around 10/05/18, and I found this commit:
>> >>
>> >> e4a1ccbaa10808d88c6ad2a687fe9f6d52392d90: Merge pull request #6181: [BEAM-4783] Add bundleSize for splitting BoundedSources
>> >>
>> >> I don't know if it should be considered a blocker, but we should definitely take another look at pull request #6181, which seems to change the way we split on the spark runner.
>> >>
>> >> Best
>> >> Etienne
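
The kind of check described above, and the daily-benchmark alerting Kenn suggested earlier in the thread, could be sketched roughly as follows. This is only an illustration: the function name, the window length, the 2x threshold, and the sample numbers are all assumptions, not data from the Nexmark dashboards.

```python
from statistics import median


def find_regressions(runtimes, window=5, threshold=2.0):
    """Return the indices of runs that are at least `threshold` times slower
    than the median of the `window` runs immediately before them."""
    flagged = []
    for i in range(window, len(runtimes)):
        baseline = median(runtimes[i - window:i])
        if baseline > 0 and runtimes[i] / baseline >= threshold:
            flagged.append(i)
    return flagged


# Made-up daily runtimes in seconds with a step change at index 5.
sample = [10.0, 10.5, 9.8, 10.2, 10.1, 25.0, 26.0, 24.5]
regressed_days = find_regressions(sample)
```

Using the median of a trailing window rather than the previous run alone keeps a single noisy run from raising an alert, which matters given the variance reported for some queries in this thread.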
>> >>
>> >>
>> >> On Friday, 26 October 2018 at 18:20 +0200, Maximilian Michels wrote:
>> >>
>> >> +1 (binding)
>> >>
>> >>
>> >> On 26.10.18 17:45, Kenneth Knowles wrote:
>> >>
>> >> Nice. Thanks.
>> >>
>> >>
>> >> +1
>> >>
>> >>
>> >>
>> >> On Fri, Oct 26, 2018 at 8:44 AM Robert Bradshaw <robertwb@google.com
>> >>
>> >> <ma...@google.com>> wrote:
>> >>
>> >>
>> >>     Thanks Tim!
>> >>
>> >>
>> >>     This was my only hesitation, and sounds like we're in the clear here.
>> >>
>> >>
>> >>     +1 (binding)
>> >>
>> >>     On Fri, Oct 26, 2018 at 5:05 PM Tim Robertson
>> >>
>> >>     <timrobertson100@gmail.com <ma...@gmail.com>> wrote:
>> >>
>> >>      >
>> >>
>> >>      > A colleague and I tested on 2.7.0 and 2.8.0RC1:
>> >>
>> >>      >
>> >>
>> >>      > 1. Quickstart on Spark/YARN/HDFS (CDH 5.12.0) (commented in
>> >>
>> >>     spreadsheet)
>> >>
>> >>      > 2. Our Avro to Avro pipelines on Spark/YARN/HDFS (note we
>> >>
>> >>     backport the un-merged BEAM-5036 fix in our code)
>> >>
>> >>      > 3. Our Avro to Elasticsearch pipelines on Spark/YARN/HDFS
>> >>
>> >>      >
>> >>
>> >>      > Everything worked, and performance was similar on both.
>> >>
>> >>      > We built using maven pointing at
>> >>
>> >>     https://repository.apache.org/content/repositories/orgapachebeam-1049/
>> >>
>> >>      >
>> >>
>> >>      > Based on this limited testing: +1
>> >>
>> >>      >
>> >>
>> >>      > Thank you to the release managers,
>> >>
>> >>      > Tim
>> >>
>> >>      >
>> >>
>> >>      >
>> >>
>> >>      > On Thu, Oct 25, 2018 at 7:21 PM Tim <timrobertson100@gmail.com
>> >>
>> >>     <ma...@gmail.com>> wrote:
>> >>
>> >>      >>
>> >>
>> >>      >> I can do some tests on Spark / YARN tomorrow (CEST timezone).
>> >>
>> >>     Sorry I’ve just been too busy to assist.
>> >>
>> >>      >>
>> >>
>> >>      >> Tim
>> >>
>> >>      >>
>> >>
>> >>      >> On 25 Oct 2018, at 18:59, Kenneth Knowles <kenn@apache.org
>> >>
>> >>     <ma...@apache.org>> wrote:
>> >>
>> >>      >>
>> >>
>> >>      >> I tried to do a more thorough job on this.
>> >>
>> >>      >>
>> >>
>> >>      >>  - I could not reproduce the slowdown in Query 9. I believe the
>> >>
>> >>     variance was simply high given the parameters and environment
>> >>
>> >>      >>  - I saw the same slowdown in Query 8 when running as part of
>> >>
>> >>     the suite, but it vanished when I ran repeatedly on its own, so
>> >>
>> >>     again it is not good methodology probably
>> >>
>> >>      >>
>> >>
>> >>      >> We do have the dashboard at
>> >>
>> >>     https://apache-beam-testing.appspot.com/dashboard-admin though no
>> >>
>> >>     anomaly detection set up AFAIK.
>> >>
>> >>      >>
>> >>
>> >>      >>  - There is no issue easily visible in DirectRunner:
>> >>
>> >>     https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>> >>
>> >>      >>  - There is a notable degradation in Spark runner on 10/5 for
>> >>
>> >>     many queries.
>> >>
>> >>     https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>> >>
>> >>      >>  - Something minor happened for Dataflow around 10/1:
>> >>
>> >>     https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
>> >>
>> >>      >>  - Flink runner seems to have had some fantastic improvements
>> >>
>> >>     :-)
>> >>
>> >>     https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
>> >>
>> >>      >>
>> >>
>> >>      >> So if there is a blocker it would really be the Spark runner
>> >>
>> >>     perf changes. Of course, all these except Dataflow are using local
>> >>
>> >>     instances so may not be representative of larger scale AFAIK.
>> >>
>> >>      >>
>> >>
>> >>      >> Kenn
>> >>
>> >>      >>
>> >>
>> >>      >> On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels
>> >>
>> >>     <mxm@apache.org <ma...@apache.org>> wrote:
>> >>
>> >>      >>>
>> >>
>> >>      >>> I've run WordCount using Quickstart with the FlinkRunner
>> >>
>> >>     (locally and
>> >>
>> >>      >>> against a Flink cluster).
>> >>
>> >>      >>>
>> >>
>> >>      >>> Would give a +1 but waiting what Kenn finds.
>> >>
>> >>      >>>
>> >>
>> >>      >>> -Max
>> >>
>> >>      >>>
>> >>
>> >>      >>> On 23.10.18 07:11, Ahmet Altay wrote:
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >
>> >>
>> >>      >>> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles
>> >>
>> >>     <kenn@apache.org <ma...@apache.org>
>> >>
>> >>      >>> > <mailto:kenn@apache.org <ma...@apache.org>>> wrote:
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >     You two did so much verification I had a hard time
>> >>
>> >>     finding something
>> >>
>> >>      >>> >     where my help was meaningful! :-)
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >     I did run the Nexmark suite on the DirectRunner against
>> >>
>> >>     2.7.0 and
>> >>
>> >>      >>> >     2.8.0 following
>> >>
>> >>      >>> >
>> >>
>> >>     https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
>> >>
>> >>      >>> >
>> >>
>> >>       <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >     It is admittedly a very silly test - the instructions leave
>> >>
>> >>      >>> >     immutability enforcement on, etc. But it does appear that
>> >>
>> >>     there is a
>> >>
>> >>      >>> >     30% degradation in query 8 and 15% in query 9. These are
>> >>
>> >>     the pure
>> >>
>> >>      >>> >     Java tests, not the SQL variants. The rest of the queries
>> >>
>> >>     are close
>> >>
>> >>      >>> >     enough that differences are not meaningful.
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >
>> >>
>> >>      >>> > (It would be a good improvement for us to have alerts on daily
>> >>
>> >>      >>> > benchmarks if we do not have such a concept already.)
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >     I would ask a little more time to see what is going on
>> >>
>> >>     here - is it
>> >>
>> >>      >>> >     a real performance issue or an artifact of how the tests are
>> >>
>> >>      >>> >     invoked, or ...?
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >
>> >>
>> >>      >>> > Thank you! Much appreciated. Please let us know when you are
>> >>
>> >>     done with
>> >>
>> >>      >>> > your investigation.
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >     Kenn
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay
>> >>
>> >>     <altay@google.com <ma...@google.com>
>> >>
>> >>      >>> >     <mailto:altay@google.com <ma...@google.com>>> wrote:
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >         Hi all,
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >         Did you have a chance to review this RC? Between me
>> >>
>> >>     and Robert
>> >>
>> >>      >>> >         we ran a significant chunk of the validations. Let me
>> >>
>> >>     know if
>> >>
>> >>      >>> >         you have any questions.
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >         Ahmet
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >         On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay
>> >>
>> >>     <altay@google.com <ma...@google.com>
>> >>
>> >>      >>> >         <mailto:altay@google.com <mailto:altay@google.com>>>
>> >>
>> >>     wrote:
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >             Hi everyone,
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >             Please review and vote on the release candidate
>> >>
>> >>     #1 for the
>> >>
>> >>      >>> >             version 2.8.0, as follows:
>> >>
>> >>      >>> >             [ ] +1, Approve the release
>> >>
>> >>      >>> >             [ ] -1, Do not approve the release (please
>> >>
>> >>     provide specific
>> >>
>> >>      >>> >             comments)
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >             The complete staging area is available for your
>> >>
>> >>     review,
>> >>
>> >>      >>> >             which includes:
>> >>
>> >>      >>> >             * JIRA release notes [1],
>> >>
>> >>      >>> >             * the official Apache source release to be
>> >>
>> >>     deployed to
>> >>
>> >>      >>> > dist.apache.org <http://dist.apache.org>
>> >>
>> >>     <http://dist.apache.org> [2], which is
>> >>
>> >>      >>> >             signed with the key with fingerprint 6096FA00 [3],
>> >>
>> >>      >>> >             * all artifacts to be deployed to the Maven Central
>> >>
>> >>      >>> >             Repository [4],
>> >>
>> >>      >>> >             * source code tag "v2.8.0-RC1" [5],
>> >>
>> >>      >>> >             * website pull request listing the release and
>> >>
>> >>     publishing
>> >>
>> >>      >>> >             the API reference manual [6].
>> >>
>> >>      >>> >             * Python artifacts are deployed along with the source
>> >>
>> >>      >>> >             release to the dist.apache.org
>> >>
>> >>     <http://dist.apache.org> <http://dist.apache.org> [2].
>> >>
>> >>      >>> >             * Validation sheet with a tab for 2.8.0 release
>> >>
>> >>     to help with
>> >>
>> >>      >>> >             validation [7].
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >             The vote will be open for at least 72 hours. It
>> >>
>> >>     is adopted
>> >>
>> >>      >>> >             by majority approval, with at least 3 PMC
>> >>
>> >>     affirmative votes.
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >             Thanks,
>> >>
>> >>      >>> >             Ahmet
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >             [1]
>> >>
>> >>      >>> >
>> >>
>> >>     https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
>> >>
>> >>      >>> >
>> >>
>> >>       <https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985>
>> >>
>> >>      >>> >             [2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
>> >>
>> >>      >>> >             <https://dist.apache.org/repos/dist/dev/beam/2.8.0>
>> >>
>> >>      >>> >             [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
>> >>
>> >>      >>> >             <https://dist.apache.org/repos/dist/dev/beam/KEYS>
>> >>
>> >>      >>> >             [4]
>> >>
>> >>      >>> >
>> >>
>> >>     https://repository.apache.org/content/repositories/orgapachebeam-1049/
>> >>
>> >>      >>> >
>> >>
>> >>       <https://repository.apache.org/content/repositories/orgapachebeam-1049/>
>> >>
>> >>      >>> >             [5] https://github.com/apache/beam/tree/v2.8.0-RC1
>> >>
>> >>      >>> >             <https://github.com/apache/beam/tree/v2.8.0-RC1>
>> >>
>> >>      >>> >             [6] https://github.com/apache/beam-site/pull/583
>> >>
>> >>      >>> >             <https://github.com/apache/beam-site/pull/583> and
>> >>
>> >>      >>> > https://github.com/apache/beam/pull/6745
>> >>
>> >>      >>> >             <https://github.com/apache/beam/pull/6745>
>> >>
>> >>      >>> >             [7]
>> >>
>> >>      >>> >
>> >>
>> >>     https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
>> >>
>> >>      >>> >
>> >>
>> >>       <https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816>
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >
>> >>
>> >>
>>
>

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Ismaël Mejía <ie...@gmail.com>.
Hmm, 2.8.0 is already in Maven Central, so it is probably worth discussing
whether other backports are needed too.

On Mon, Oct 29, 2018 at 4:55 PM Kenneth Knowles <ke...@apache.org> wrote:
>
> I think we should definitely open a cherry-pick PR to a 2.8.x branch. We must not corrupt Maven Central, so if it is published to users this has to be 2.8.1. Ahmet - we are at this point, right?
>
> Kenn
>
> On Mon, Oct 29, 2018 at 8:40 AM Ismaël Mejía <ie...@gmail.com> wrote:
>>
>> First, thanks Etienne and Kenn for noting the performance issue. I
>> reviewed the discussed PR. It introduced a new ‘@Experimental’ option
>> to the Spark runner to change the default source partitioning and
>> enable users to control it via a predefined size (a prerequisite for
>> Spark’s dynamicAllocation).
>>
>> This, however, must not be the default behavior. After looking at the
>> PR, it seems that things are not as expected and the default is now the
>> new behavior. I will provide a PR to fix this quickly. However, the
>> question is: should I cherry-pick it so that we do a new RC (since the
>> release has already 'passed')?
>> On Mon, Oct 29, 2018 at 2:51 PM Kenneth Knowles <ke...@apache.org> wrote:
>> >
>> > I didn't isolate it to a cause and commit, so that is extremely useful to know. To bring some details on thread:
>> >
>> > query 4: a single aggregation in sliding windows
>> > query 8: a single join with no other interesting logic
>> > query 9 (prefix of query 6*): find the winning bid for each auction
>> > query 6: query 9 followed by a single aggregation
>> >
>> > Kenn
>> >
>> > * they seem out of order because the original queries were 1-8 and we added 9 later to benchmark the baseline without the aggregation
>> >
>> > On Mon, Oct 29, 2018 at 3:28 AM Etienne Chauchot <ec...@apache.org> wrote:
>> >>
>> >> Oops, just saw that Kenn already mentioned the perf degradation on the spark runner around 10/05. Sorry for the repetition.
>> >> Nevertheless, IMHO, it is still worth checking PR #6181.
>> >>
>> >> Etienne
>> >>
>> >> On Monday, 29 October 2018 at 10:42 +0100, Etienne Chauchot wrote:
>> >>
>> >> Hey,
>> >> I would vote -0; here is the explanation:
>> >>
>> >> I took a look at the Nexmark dashboards for output size and performance for all the runners in all the modes around the date of the release cut to search for regressions.
>> >>
>> >> I noted a regression in the performance of the spark runner. Query4, Query6, Query8 and Query9 running times were multiplied by 2 to 3 around the date of 10/05/18. See https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>> >> So I searched the commit history of the spark runner module for what happened around 10/05/18, and I found this commit:
>> >>
>> >> e4a1ccbaa10808d88c6ad2a687fe9f6d52392d90: Merge pull request #6181: [BEAM-4783] Add bundleSize for splitting BoundedSources
>> >>
>> >> I don't know if it should be considered a blocker, but we should definitely take another look at pull request #6181, which seems to change the way we split on the spark runner.
>> >>
>> >> Best
>> >> Etienne
>> >>
>> >>
>> >> On Friday, 26 October 2018 at 18:20 +0200, Maximilian Michels wrote:
>> >>
>> >> +1 (binding)
>> >>
>> >>
>> >> On 26.10.18 17:45, Kenneth Knowles wrote:
>> >>
>> >> Nice. Thanks.
>> >>
>> >>
>> >> +1
>> >>
>> >>
>> >>
>> >> On Fri, Oct 26, 2018 at 8:44 AM Robert Bradshaw <robertwb@google.com
>> >>
>> >> <ma...@google.com>> wrote:
>> >>
>> >>
>> >>     Thanks Tim!
>> >>
>> >>
>> >>     This was my only hesitation, and sounds like we're in the clear here.
>> >>
>> >>
>> >>     +1 (binding)
>> >>
>> >>     On Fri, Oct 26, 2018 at 5:05 PM Tim Robertson
>> >>
>> >>     <timrobertson100@gmail.com <ma...@gmail.com>> wrote:
>> >>
>> >>      >
>> >>
>> >>      > A colleague and I tested on 2.7.0 and 2.8.0RC1:
>> >>
>> >>      >
>> >>
>> >>      > 1. Quickstart on Spark/YARN/HDFS (CDH 5.12.0) (commented in
>> >>
>> >>     spreadsheet)
>> >>
>> >>      > 2. Our Avro to Avro pipelines on Spark/YARN/HDFS (note we
>> >>
>> >>     backport the un-merged BEAM-5036 fix in our code)
>> >>
>> >>      > 3. Our Avro to Elasticsearch pipelines on Spark/YARN/HDFS
>> >>
>> >>      >
>> >>
>> >>      > Everything worked, and performance was similar on both.
>> >>
>> >>      > We built using maven pointing at
>> >>
>> >>     https://repository.apache.org/content/repositories/orgapachebeam-1049/
>> >>
>> >>      >
>> >>
>> >>      > Based on this limited testing: +1
>> >>
>> >>      >
>> >>
>> >>      > Thank you to the release managers,
>> >>
>> >>      > Tim
>> >>
>> >>      >
>> >>
>> >>      >
>> >>
>> >>      > On Thu, Oct 25, 2018 at 7:21 PM Tim <timrobertson100@gmail.com
>> >>
>> >>     <ma...@gmail.com>> wrote:
>> >>
>> >>      >>
>> >>
>> >>      >> I can do some tests on Spark / YARN tomorrow (CEST timezone).
>> >>
>> >>     Sorry I’ve just been too busy to assist.
>> >>
>> >>      >>
>> >>
>> >>      >> Tim
>> >>
>> >>      >>
>> >>
>> >>      >> On 25 Oct 2018, at 18:59, Kenneth Knowles <kenn@apache.org
>> >>
>> >>     <ma...@apache.org>> wrote:
>> >>
>> >>      >>
>> >>
>> >>      >> I tried to do a more thorough job on this.
>> >>
>> >>      >>
>> >>
>> >>      >>  - I could not reproduce the slowdown in Query 9. I believe the
>> >>
>> >>     variance was simply high given the parameters and environment
>> >>
>> >>      >>  - I saw the same slowdown in Query 8 when running as part of
>> >>
>> >>     the suite, but it vanished when I ran repeatedly on its own, so
>> >>
>> >>     again it is not good methodology probably
>> >>
>> >>      >>
>> >>
>> >>      >> We do have the dashboard at
>> >>
>> >>     https://apache-beam-testing.appspot.com/dashboard-admin though no
>> >>
>> >>     anomaly detection set up AFAIK.
>> >>
>> >>      >>
>> >>
>> >>      >>  - There is no issue easily visible in DirectRunner:
>> >>
>> >>     https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>> >>
>> >>      >>  - There is a notable degradation in Spark runner on 10/5 for
>> >>
>> >>     many queries.
>> >>
>> >>     https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>> >>
>> >>      >>  - Something minor happened for Dataflow around 10/1:
>> >>
>> >>     https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
>> >>
>> >>      >>  - Flink runner seems to have had some fantastic improvements
>> >>
>> >>     :-)
>> >>
>> >>     https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
>> >>
>> >>      >>
>> >>
>> >>      >> So if there is a blocker it would really be the Spark runner
>> >>
>> >>     perf changes. Of course, all these except Dataflow are using local
>> >>
>> >>     instances so may not be representative of larger scale AFAIK.
>> >>
>> >>      >>
>> >>
>> >>      >> Kenn
>> >>
>> >>      >>
>> >>
>> >>      >> On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels
>> >>
>> >>     <mxm@apache.org <ma...@apache.org>> wrote:
>> >>
>> >>      >>>
>> >>
>> >>      >>> I've run WordCount using Quickstart with the FlinkRunner
>> >>
>> >>     (locally and
>> >>
>> >>      >>> against a Flink cluster).
>> >>
>> >>      >>>
>> >>
>> >>      >>> Would give a +1 but waiting what Kenn finds.
>> >>
>> >>      >>>
>> >>
>> >>      >>> -Max
>> >>
>> >>      >>>
>> >>
>> >>      >>> On 23.10.18 07:11, Ahmet Altay wrote:
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >
>> >>
>> >>      >>> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles
>> >>
>> >>     <kenn@apache.org <ma...@apache.org>
>> >>
>> >>      >>> > <mailto:kenn@apache.org <ma...@apache.org>>> wrote:
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >     You two did so much verification I had a hard time
>> >>
>> >>     finding something
>> >>
>> >>      >>> >     where my help was meaningful! :-)
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >     I did run the Nexmark suite on the DirectRunner against
>> >>
>> >>     2.7.0 and
>> >>
>> >>      >>> >     2.8.0 following
>> >>
>> >>      >>> >
>> >>
>> >>     https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
>> >>
>> >>      >>> >
>> >>
>> >>       <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >     It is admittedly a very silly test - the instructions leave
>> >>
>> >>      >>> >     immutability enforcement on, etc. But it does appear that
>> >>
>> >>     there is a
>> >>
>> >>      >>> >     30% degradation in query 8 and 15% in query 9. These are
>> >>
>> >>     the pure
>> >>
>> >>      >>> >     Java tests, not the SQL variants. The rest of the queries
>> >>
>> >>     are close
>> >>
>> >>      >>> >     enough that differences are not meaningful.
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >
>> >>
>> >>      >>> > (It would be a good improvement for us to have alerts on daily
>> >>
>> >>      >>> > benchmarks if we do not have such a concept already.)
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >     I would ask a little more time to see what is going on
>> >>
>> >>     here - is it
>> >>
>> >>      >>> >     a real performance issue or an artifact of how the tests are
>> >>
>> >>      >>> >     invoked, or ...?
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >
>> >>
>> >>      >>> > Thank you! Much appreciated. Please let us know when you are
>> >>
>> >>     done with
>> >>
>> >>      >>> > your investigation.
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >     Kenn
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay
>> >>
>> >>     <altay@google.com <ma...@google.com>
>> >>
>> >>      >>> >     <mailto:altay@google.com <ma...@google.com>>> wrote:
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >         Hi all,
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >         Did you have a chance to review this RC? Between me
>> >>
>> >>     and Robert
>> >>
>> >>      >>> >         we ran a significant chunk of the validations. Let me
>> >>
>> >>     know if
>> >>
>> >>      >>> >         you have any questions.
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >         Ahmet
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >         On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay
>> >>
>> >>     <altay@google.com <ma...@google.com>
>> >>
>> >>      >>> >         <mailto:altay@google.com <ma...@google.com>>>
>> >>
>> >>     wrote:
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >             Hi everyone,
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >             Please review and vote on the release candidate
>> >>
>> >>     #1 for the
>> >>
>> >>      >>> >             version 2.8.0, as follows:
>> >>
>> >>      >>> >             [ ] +1, Approve the release
>> >>
>> >>      >>> >             [ ] -1, Do not approve the release (please
>> >>
>> >>     provide specific
>> >>
>> >>      >>> >             comments)
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >             The complete staging area is available for your
>> >>
>> >>     review,
>> >>
>> >>      >>> >             which includes:
>> >>
>> >>      >>> >             * JIRA release notes [1],
>> >>
>> >>      >>> >             * the official Apache source release to be
>> >>
>> >>     deployed to
>> >>
>> >>      >>> > dist.apache.org <http://dist.apache.org>
>> >>
>> >>     <http://dist.apache.org> [2], which is
>> >>
>> >>      >>> >             signed with the key with fingerprint 6096FA00 [3],
>> >>
>> >>      >>> >             * all artifacts to be deployed to the Maven Central
>> >>
>> >>      >>> >             Repository [4],
>> >>
>> >>      >>> >             * source code tag "v2.8.0-RC1" [5],
>> >>
>> >>      >>> >             * website pull request listing the release and
>> >>
>> >>     publishing
>> >>
>> >>      >>> >             the API reference manual [6].
>> >>
>> >>      >>> >             * Python artifacts are deployed along with the source
>> >>
>> >>      >>> >             release to the dist.apache.org
>> >>
>> >>     <http://dist.apache.org> <http://dist.apache.org> [2].
>> >>
>> >>      >>> >             * Validation sheet with a tab for 2.8.0 release
>> >>
>> >>     to help with
>> >>
>> >>      >>> >             validation [7].
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >             The vote will be open for at least 72 hours. It
>> >>
>> >>     is adopted
>> >>
>> >>      >>> >             by majority approval, with at least 3 PMC
>> >>
>> >>     affirmative votes.
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >             Thanks,
>> >>
>> >>      >>> >             Ahmet
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >             [1]
>> >>
>> >>      >>> >
>> >>
>> >>     https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
>> >>
>> >>      >>> >
>> >>
>> >>       <https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985>
>> >>
>> >>      >>> >             [2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
>> >>
>> >>      >>> >             <https://dist.apache.org/repos/dist/dev/beam/2.8.0>
>> >>
>> >>      >>> >             [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
>> >>
>> >>      >>> >             <https://dist.apache.org/repos/dist/dev/beam/KEYS>
>> >>
>> >>      >>> >             [4]
>> >>
>> >>      >>> >
>> >>
>> >>     https://repository.apache.org/content/repositories/orgapachebeam-1049/
>> >>
>> >>      >>> >
>> >>
>> >>       <https://repository.apache.org/content/repositories/orgapachebeam-1049/>
>> >>
>> >>      >>> >             [5] https://github.com/apache/beam/tree/v2.8.0-RC1
>> >>
>> >>      >>> >             <https://github.com/apache/beam/tree/v2.8.0-RC1>
>> >>
>> >>      >>> >             [6] https://github.com/apache/beam-site/pull/583
>> >>
>> >>      >>> >             <https://github.com/apache/beam-site/pull/583> and
>> >>
>> >>      >>> > https://github.com/apache/beam/pull/6745
>> >>
>> >>      >>> >             <https://github.com/apache/beam/pull/6745>
>> >>
>> >>      >>> >             [7]
>> >>
>> >>      >>> >
>> >>
>> >>     https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
>> >>
>> >>      >>> >
>> >>
>> >>       <https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816>
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >
>> >>
>> >>      >>> >
>> >>
>> >>

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Kenneth Knowles <ke...@apache.org>.
I think we should definitely open a cherry-pick PR to a 2.8.x branch. We must
not corrupt Maven Central, so if it is published to users this has to be
2.8.1. Ahmet - we are at this point, right?

Kenn

On Mon, Oct 29, 2018 at 8:40 AM Ismaël Mejía <ie...@gmail.com> wrote:

> First, thanks Etienne and Kenn for noting the performance issue. I
> reviewed the discussed PR. It introduced a new ‘@Experimental’ option
> to the Spark runner to change the default source partitioning and
> enable users to control it via a predefined size (a prerequisite for
> Spark’s dynamicAllocation).
>
> This, however, must not be the default behavior. After looking at the
> PR, it seems that things are not as expected and the default is now the
> new behavior. I will provide a PR to fix this quickly. However, the
> question is: should I cherry-pick it so that we do a new RC (since the
> release has already 'passed')?
> On Mon, Oct 29, 2018 at 2:51 PM Kenneth Knowles <ke...@apache.org> wrote:
> >
> > I didn't isolate it to a cause and commit, so that is extremely useful to know. To bring some details on thread:
> >
> > query 4: a single aggregation in sliding windows
> > query 8: a single join with no other interesting logic
> > query 9 (prefix of query 6*): find the winning bid for each auction
> > query 6: query 9 followed by a single aggregation
> >
> > Kenn
> >
> > * they seem out of order because the original queries were 1-8 and we added 9 later to benchmark the baseline without the aggregation
> >
> > On Mon, Oct 29, 2018 at 3:28 AM Etienne Chauchot <ec...@apache.org>
> wrote:
> >>
> >> Oops, just saw than Kenn already mentioned spark perf degradation on
> spark runner around 10/05. Sorry for the repetition.
> >> Nevertheless, IMHO, I think it will be still worth checking PR #6181.
> >>
> >> Etienne
> >>
> >> Le lundi 29 octobre 2018 à 10:42 +0100, Etienne Chauchot a écrit :
> >>
> >> Hey,
> >> I would vote -0 : here is the explanation:
> >>
> >> I took a look at Nexmark dashboards for output size and performance for
> all the runners in all the modes around the date of the release cut to
> search for regressions.
> >>
> >> I noted a regression on the performance of the spark runner. Query4,
> Query6, Query8 and Query9 running times were multiplied by 2 to 3 around
> the date of 10/05/18. See
> https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> >> So I searched in the commit history of the spark runner module for what
> happened around 10/05/18. And I found this commit
> >>
> >> e4a1ccbaa10808d88c6ad2a687fe9f6d52392d90: Merge pull request #6181:
> [BEAM-4783] Add bundleSize for splitting BoundedSources
> >>
> >> I don't know if it should be considered a blocker, but we should
> definitely take another look at pull request #6181, which seems to change the
> way we split on the Spark runner.
> >>
> >> Best
> >> Etienne
> >>
> >>
> >> On Friday, 26 October 2018 at 18:20 +0200, Maximilian Michels wrote:
> >>
> >> +1 (binding)
> >>
> >>
> >> On 26.10.18 17:45, Kenneth Knowles wrote:
> >>
> >> Nice. Thanks.
> >>
> >>
> >> +1
> >>
> >>
> >>
> >> On Fri, Oct 26, 2018 at 8:44 AM Robert Bradshaw <robertwb@google.com
> >>
> >> <ma...@google.com>> wrote:
> >>
> >>
> >>     Thanks Tim!
> >>
> >>
> >>     This was my only hesitation, and sounds like we're in the clear
> here.
> >>
> >>
> >>     +1 (binding)
> >>
> >>     On Fri, Oct 26, 2018 at 5:05 PM Tim Robertson
> >>
> >>     <timrobertson100@gmail.com <ma...@gmail.com>>
> wrote:
> >>
> >>      >
> >>
> >>      > A colleague and I tested on 2.7.0 and 2.8.0RC1:
> >>
> >>      >
> >>
> >>      > 1. Quickstart on Spark/YARN/HDFS (CDH 5.12.0) (commented in
> >>
> >>     spreadsheet)
> >>
> >>      > 2. Our Avro to Avro pipelines on Spark/YARN/HDFS (note we
> >>
> >>     backport the un-merged BEAM-5036 fix in our code)
> >>
> >>      > 3. Our Avro to Elasticsearch pipelines on Spark/YARN/HDFS
> >>
> >>      >
> >>
> >>      > Everything worked, and performance was similar on both.
> >>
> >>      > We built using maven pointing at
> >>
> >>
> https://repository.apache.org/content/repositories/orgapachebeam-1049/
> >>
> >>      >
> >>
> >>      > Based on this limited testing: +1
> >>
> >>      >
> >>
> >>      > Thank you to the release managers,
> >>
> >>      > Tim
> >>
> >>      >
> >>
> >>      >
> >>
> >>      > On Thu, Oct 25, 2018 at 7:21 PM Tim <timrobertson100@gmail.com
> >>
> >>     <ma...@gmail.com>> wrote:
> >>
> >>      >>
> >>
> >>      >> I can do some tests on Spark / YARN tomorrow (CEST timezone).
> >>
> >>     Sorry I’ve just been too busy to assist.
> >>
> >>      >>
> >>
> >>      >> Tim
> >>
> >>      >>
> >>
> >>      >> On 25 Oct 2018, at 18:59, Kenneth Knowles <kenn@apache.org
> >>
> >>     <ma...@apache.org>> wrote:
> >>
> >>      >>
> >>
> >>      >> I tried to do a more thorough job on this.
> >>
> >>      >>
> >>
> >>      >>  - I could not reproduce the slowdown in Query 9. I believe the
> >>
> >>     variance was simply high given the parameters and environment
> >>
> >>      >>  - I saw the same slowdown in Query 8 when running as part of
> >>
> >>     the suite, but it vanished when I ran repeatedly on its own, so
> >>
> >>     again it is not good methodology probably
> >>
> >>      >>
> >>
> >>      >> We do have the dashboard at
> >>
> >>     https://apache-beam-testing.appspot.com/dashboard-admin though no
> >>
> >>     anomaly detection set up AFAIK.
> >>
> >>      >>
> >>
> >>      >>  - There is no issue easily visible in DirectRunner:
> >>
> >>
> https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> >>
> >>      >>  - There is a notable degradation in Spark runner on 10/5 for
> >>
> >>     many queries.
> >>
> >>
> https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> >>
> >>      >>  - Something minor happened for Dataflow around 10/1:
> >>
> >>
> https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
> >>
> >>      >>  - Flink runner seems to have had some fantastic improvements
> >>
> >>     :-)
> >>
> >>
> https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> >>
> >>      >>
> >>
> >>      >> So if there is a blocker it would really be the Spark runner
> >>
> >>     perf changes. Of course, all these except Dataflow are using local
> >>
> >>     instances so may not be representative of larger scale AFAIK.
> >>
> >>      >>
> >>
> >>      >> Kenn
> >>
> >>      >>
> >>
> >>      >> On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels
> >>
> >>     <mxm@apache.org <ma...@apache.org>> wrote:
> >>
> >>      >>>
> >>
> >>      >>> I've run WordCount using Quickstart with the FlinkRunner
> >>
> >>     (locally and
> >>
> >>      >>> against a Flink cluster).
> >>
> >>      >>>
> >>
> >>      >>> Would give a +1 but waiting what Kenn finds.
> >>
> >>      >>>
> >>
> >>      >>> -Max
> >>
> >>      >>>
> >>
> >>      >>> On 23.10.18 07:11, Ahmet Altay wrote:
> >>
> >>      >>> >
> >>
> >>      >>> >
> >>
> >>      >>> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles
> >>
> >>     <kenn@apache.org <ma...@apache.org>
> >>
> >>      >>> > <mailto:kenn@apache.org <ma...@apache.org>>> wrote:
> >>
> >>      >>> >
> >>
> >>      >>> >     You two did so much verification I had a hard time
> >>
> >>     finding something
> >>
> >>      >>> >     where my help was meaningful! :-)
> >>
> >>      >>> >
> >>
> >>      >>> >     I did run the Nexmark suite on the DirectRunner against
> >>
> >>     2.7.0 and
> >>
> >>      >>> >     2.8.0 following
> >>
> >>      >>> >
> >>
> >>
> https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
> >>
> >>      >>> >
> >>
> >>       <
> https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
> >.
> >>
> >>      >>> >
> >>
> >>      >>> >     It is admittedly a very silly test - the instructions
> leave
> >>
> >>      >>> >     immutability enforcement on, etc. But it does appear that
> >>
> >>     there is a
> >>
> >>      >>> >     30% degradation in query 8 and 15% in query 9. These are
> >>
> >>     the pure
> >>
> >>      >>> >     Java tests, not the SQL variants. The rest of the queries
> >>
> >>     are close
> >>
> >>      >>> >     enough that differences are not meaningful.
> >>
> >>      >>> >
> >>
> >>      >>> >
> >>
> >>      >>> > (It would be a good improvement for us to have alerts on
> daily
> >>
> >>      >>> > benchmarks if we do not have such a concept already.)
> >>
> >>      >>> >
> >>
> >>      >>> >
> >>
> >>      >>> >     I would ask a little more time to see what is going on
> >>
> >>     here - is it
> >>
> >>      >>> >     a real performance issue or an artifact of how the tests
> are
> >>
> >>      >>> >     invoked, or ...?
> >>
> >>      >>> >
> >>
> >>      >>> >
> >>
> >>      >>> > Thank you! Much appreciated. Please let us know when you are
> >>
> >>     done with
> >>
> >>      >>> > your investigation.
> >>
> >>      >>> >
> >>
> >>      >>> >
> >>
> >>      >>> >     Kenn
> >>
> >>      >>> >
> >>
> >>      >>> >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay
> >>
> >>     <altay@google.com <ma...@google.com>
> >>
> >>      >>> >     <mailto:altay@google.com <ma...@google.com>>>
> wrote:
> >>
> >>      >>> >
> >>
> >>      >>> >         Hi all,
> >>
> >>      >>> >
> >>
> >>      >>> >         Did you have a chance to review this RC? Between me
> >>
> >>     and Robert
> >>
> >>      >>> >         we ran a significant chunk of the validations. Let me
> >>
> >>     know if
> >>
> >>      >>> >         you have any questions.
> >>
> >>      >>> >
> >>
> >>      >>> >         Ahmet
> >>
> >>      >>> >
> >>
> >>      >>> >         On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay
> >>
> >>     <altay@google.com <ma...@google.com>
> >>
> >>      >>> >         <mailto:altay@google.com <ma...@google.com>>>
> >>
> >>     wrote:
> >>
> >>      >>> >
> >>
> >>      >>> >             Hi everyone,
> >>
> >>      >>> >
> >>
> >>      >>> >             Please review and vote on the release candidate
> >>
> >>     #1 for the
> >>
> >>      >>> >             version 2.8.0, as follows:
> >>
> >>      >>> >             [ ] +1, Approve the release
> >>
> >>      >>> >             [ ] -1, Do not approve the release (please
> >>
> >>     provide specific
> >>
> >>      >>> >             comments)
> >>
> >>      >>> >
> >>
> >>      >>> >             The complete staging area is available for your
> >>
> >>     review,
> >>
> >>      >>> >             which includes:
> >>
> >>      >>> >             * JIRA release notes [1],
> >>
> >>      >>> >             * the official Apache source release to be
> >>
> >>     deployed to
> >>
> >>      >>> > dist.apache.org <http://dist.apache.org>
> >>
> >>     <http://dist.apache.org> [2], which is
> >>
> >>      >>> >             signed with the key with fingerprint 6096FA00
> [3],
> >>
> >>      >>> >             * all artifacts to be deployed to the Maven
> Central
> >>
> >>      >>> >             Repository [4],
> >>
> >>      >>> >             * source code tag "v2.8.0-RC1" [5],
> >>
> >>      >>> >             * website pull request listing the release and
> >>
> >>     publishing
> >>
> >>      >>> >             the API reference manual [6].
> >>
> >>      >>> >             * Python artifacts are deployed along with the
> source
> >>
> >>      >>> >             release to the dist.apache.org
> >>
> >>     <http://dist.apache.org> <http://dist.apache.org> [2].
> >>
> >>      >>> >             * Validation sheet with a tab for 2.8.0 release
> >>
> >>     to help with
> >>
> >>      >>> >             validation [7].
> >>
> >>      >>> >
> >>
> >>      >>> >             The vote will be open for at least 72 hours. It
> >>
> >>     is adopted
> >>
> >>      >>> >             by majority approval, with at least 3 PMC
> >>
> >>     affirmative votes.
> >>
> >>      >>> >
> >>
> >>      >>> >             Thanks,
> >>
> >>      >>> >             Ahmet
> >>
> >>      >>> >
> >>
> >>      >>> >             [1]
> >>
> >>      >>> >
> >>
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
> >>
> >>      >>> >
> >>
> >>       <
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
> >
> >>
> >>      >>> >             [2]
> https://dist.apache.org/repos/dist/dev/beam/2.8.0
> >>
> >>      >>> >             <
> https://dist.apache.org/repos/dist/dev/beam/2.8.0>
> >>
> >>      >>> >             [3]
> https://dist.apache.org/repos/dist/dev/beam/KEYS
> >>
> >>      >>> >             <
> https://dist.apache.org/repos/dist/dev/beam/KEYS>
> >>
> >>      >>> >             [4]
> >>
> >>      >>> >
> >>
> >>
> https://repository.apache.org/content/repositories/orgapachebeam-1049/
> >>
> >>      >>> >
> >>
> >>       <
> https://repository.apache.org/content/repositories/orgapachebeam-1049/>
> >>
> >>      >>> >             [5]
> https://github.com/apache/beam/tree/v2.8.0-RC1
> >>
> >>      >>> >             <https://github.com/apache/beam/tree/v2.8.0-RC1>
> >>
> >>      >>> >             [6] https://github.com/apache/beam-site/pull/583
> >>
> >>      >>> >             <https://github.com/apache/beam-site/pull/583>
> and
> >>
> >>      >>> > https://github.com/apache/beam/pull/6745
> >>
> >>      >>> >             <https://github.com/apache/beam/pull/6745>
> >>
> >>      >>> >             [7]
> >>
> >>      >>> >
> >>
> >>
> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
> >>
> >>      >>> >
> >>
> >>       <
> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
> >
> >>
> >>      >>> >
> >>
> >>      >>> >
> >>
> >>      >>> >
> >>
> >>
>

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Ismaël Mejía <ie...@gmail.com>.
First, thanks Etienne and Kenn for noting the performance issue. I
reviewed the discussed PR. It introduced a new ‘@Experimental’ option
to the Spark runner to change the default source partitioning and
enable users to control it via a predefined size (a prerequisite for
Spark’s dynamicAllocation).

This, however, must not be the default behavior. After looking at the
PR, it seems that things are not as expected and the default is now the
new behavior. I will provide a PR to fix this quickly. However, the
question is: should I cherry-pick it, and should we do a new RC (since
the release has already 'passed')?
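
To illustrate the intended opt-in behavior, here is a minimal sketch in
plain Python. It is not the Spark runner's actual code. It assumes a
hypothetical `bundle_size` option where 0 means "not set", and it assumes
the previous behavior was an even division of the estimated input size
across the default parallelism. All function and parameter names are made
up for the example.

```python
def desired_split_bytes(estimated_size_bytes: int,
                        default_parallelism: int,
                        bundle_size: int = 0) -> int:
    """Pick the target size of each source split.

    bundle_size is a hypothetical opt-in option; 0 means "not set".
    """
    if bundle_size > 0:
        # Opt-in: the user controls partitioning with a fixed split size.
        return bundle_size
    # Default: keep the previous behavior (assumed here to be an even
    # division of the input across the default parallelism).
    return max(1, estimated_size_bytes // max(1, default_parallelism))


# Option not set: 1000 bytes spread over 4 partitions.
print(desired_split_bytes(1000, 4))      # 250
# Option explicitly set: the fixed size wins.
print(desired_split_bytes(1000, 4, 64))  # 64
```

The point of the sketch is only that the new code path should be reached
when the option is explicitly set, so existing pipelines keep their old
partitioning.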
On Mon, Oct 29, 2018 at 2:51 PM Kenneth Knowles <ke...@apache.org> wrote:
>
> I didn't isolate it to a cause and commit, so that is extremely useful to know. To bring some details on thread:
>
> query 4: a single aggregation in sliding windows
> query 8: a single join with no other interesting logic
> query 9 (prefix of query 6*): find the winning bid for each auction
> query 6: query 9 followed by a single aggregation
>
> Kenn
>
> * they seem out of order because the original queries were 1-8 and we added 9 later to benchmark the baseline without the aggregation
>
> On Mon, Oct 29, 2018 at 3:28 AM Etienne Chauchot <ec...@apache.org> wrote:
>>
>> Oops, I just saw that Kenn already mentioned the perf degradation on the Spark runner around 10/05. Sorry for the repetition.
>> Nevertheless, IMHO, it is still worth checking PR #6181.
>>
>> Etienne
>>
>> On Monday, 29 October 2018 at 10:42 +0100, Etienne Chauchot wrote:
>>
>> Hey,
>> I would vote -0 : here is the explanation:
>>
>> I took a look at Nexmark dashboards for output size and performance for all the runners in all the modes around the date of the release cut to search for regressions.
>>
>> I noted a regression on the performance of the spark runner. Query4, Query6, Query8 and Query9 running times were multiplied by 2 to 3 around the date of 10/05/18. See https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>> So I searched in the commit history of the spark runner module for what happened around 10/05/18. And I found this commit
>>
>> e4a1ccbaa10808d88c6ad2a687fe9f6d52392d90: Merge pull request #6181: [BEAM-4783] Add bundleSize for splitting BoundedSources
>>
>> I don't know if it should be considered a blocker, but we should definitely take another look at pull request #6181, which seems to change the way we split on the Spark runner.
>>
>> Best
>> Etienne
>>
>>
>> On Friday, 26 October 2018 at 18:20 +0200, Maximilian Michels wrote:
>>
>> +1 (binding)
>>
>>
>> On 26.10.18 17:45, Kenneth Knowles wrote:
>>
>> Nice. Thanks.
>>
>>
>> +1
>>
>>
>>
>> On Fri, Oct 26, 2018 at 8:44 AM Robert Bradshaw <robertwb@google.com
>>
>> <ma...@google.com>> wrote:
>>
>>
>>     Thanks Tim!
>>
>>
>>     This was my only hesitation, and sounds like we're in the clear here.
>>
>>
>>     +1 (binding)
>>
>>     On Fri, Oct 26, 2018 at 5:05 PM Tim Robertson
>>
>>     <timrobertson100@gmail.com <ma...@gmail.com>> wrote:
>>
>>      >
>>
>>      > A colleague and I tested on 2.7.0 and 2.8.0RC1:
>>
>>      >
>>
>>      > 1. Quickstart on Spark/YARN/HDFS (CDH 5.12.0) (commented in
>>
>>     spreadsheet)
>>
>>      > 2. Our Avro to Avro pipelines on Spark/YARN/HDFS (note we
>>
>>     backport the un-merged BEAM-5036 fix in our code)
>>
>>      > 3. Our Avro to Elasticsearch pipelines on Spark/YARN/HDFS
>>
>>      >
>>
>>      > Everything worked, and performance was similar on both.
>>
>>      > We built using maven pointing at
>>
>>     https://repository.apache.org/content/repositories/orgapachebeam-1049/
>>
>>      >
>>
>>      > Based on this limited testing: +1
>>
>>      >
>>
>>      > Thank you to the release managers,
>>
>>      > Tim
>>
>>      >
>>
>>      >
>>
>>      > On Thu, Oct 25, 2018 at 7:21 PM Tim <timrobertson100@gmail.com
>>
>>     <ma...@gmail.com>> wrote:
>>
>>      >>
>>
>>      >> I can do some tests on Spark / YARN tomorrow (CEST timezone).
>>
>>     Sorry I’ve just been too busy to assist.
>>
>>      >>
>>
>>      >> Tim
>>
>>      >>
>>
>>      >> On 25 Oct 2018, at 18:59, Kenneth Knowles <kenn@apache.org
>>
>>     <ma...@apache.org>> wrote:
>>
>>      >>
>>
>>      >> I tried to do a more thorough job on this.
>>
>>      >>
>>
>>      >>  - I could not reproduce the slowdown in Query 9. I believe the
>>
>>     variance was simply high given the parameters and environment
>>
>>      >>  - I saw the same slowdown in Query 8 when running as part of
>>
>>     the suite, but it vanished when I ran repeatedly on its own, so
>>
>>     again it is not good methodology probably
>>
>>      >>
>>
>>      >> We do have the dashboard at
>>
>>     https://apache-beam-testing.appspot.com/dashboard-admin though no
>>
>>     anomaly detection set up AFAIK.
>>
>>      >>
>>
>>      >>  - There is no issue easily visible in DirectRunner:
>>
>>     https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>>
>>      >>  - There is a notable degradation in Spark runner on 10/5 for
>>
>>     many queries.
>>
>>     https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>>
>>      >>  - Something minor happened for Dataflow around 10/1:
>>
>>     https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
>>
>>      >>  - Flink runner seems to have had some fantastic improvements
>>
>>     :-)
>>
>>     https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
>>
>>      >>
>>
>>      >> So if there is a blocker it would really be the Spark runner
>>
>>     perf changes. Of course, all these except Dataflow are using local
>>
>>     instances so may not be representative of larger scale AFAIK.
>>
>>      >>
>>
>>      >> Kenn
>>
>>      >>
>>
>>      >> On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels
>>
>>     <mxm@apache.org <ma...@apache.org>> wrote:
>>
>>      >>>
>>
>>      >>> I've run WordCount using Quickstart with the FlinkRunner
>>
>>     (locally and
>>
>>      >>> against a Flink cluster).
>>
>>      >>>
>>
>>      >>> Would give a +1 but waiting what Kenn finds.
>>
>>      >>>
>>
>>      >>> -Max
>>
>>      >>>
>>
>>      >>> On 23.10.18 07:11, Ahmet Altay wrote:
>>
>>      >>> >
>>
>>      >>> >
>>
>>      >>> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles
>>
>>     <kenn@apache.org <ma...@apache.org>
>>
>>      >>> > <mailto:kenn@apache.org <ma...@apache.org>>> wrote:
>>
>>      >>> >
>>
>>      >>> >     You two did so much verification I had a hard time
>>
>>     finding something
>>
>>      >>> >     where my help was meaningful! :-)
>>
>>      >>> >
>>
>>      >>> >     I did run the Nexmark suite on the DirectRunner against
>>
>>     2.7.0 and
>>
>>      >>> >     2.8.0 following
>>
>>      >>> >
>>
>>     https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
>>
>>      >>> >
>>
>>       <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
>>
>>      >>> >
>>
>>      >>> >     It is admittedly a very silly test - the instructions leave
>>
>>      >>> >     immutability enforcement on, etc. But it does appear that
>>
>>     there is a
>>
>>      >>> >     30% degradation in query 8 and 15% in query 9. These are
>>
>>     the pure
>>
>>      >>> >     Java tests, not the SQL variants. The rest of the queries
>>
>>     are close
>>
>>      >>> >     enough that differences are not meaningful.
>>
>>      >>> >
>>
>>      >>> >
>>
>>      >>> > (It would be a good improvement for us to have alerts on daily
>>
>>      >>> > benchmarks if we do not have such a concept already.)
>>
>>      >>> >
>>
>>      >>> >
>>
>>      >>> >     I would ask a little more time to see what is going on
>>
>>     here - is it
>>
>>      >>> >     a real performance issue or an artifact of how the tests are
>>
>>      >>> >     invoked, or ...?
>>
>>      >>> >
>>
>>      >>> >
>>
>>      >>> > Thank you! Much appreciated. Please let us know when you are
>>
>>     done with
>>
>>      >>> > your investigation.
>>
>>      >>> >
>>
>>      >>> >
>>
>>      >>> >     Kenn
>>
>>      >>> >
>>
>>      >>> >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay
>>
>>     <altay@google.com <ma...@google.com>
>>
>>      >>> >     <mailto:altay@google.com <ma...@google.com>>> wrote:
>>
>>      >>> >
>>
>>      >>> >         Hi all,
>>
>>      >>> >
>>
>>      >>> >         Did you have a chance to review this RC? Between me
>>
>>     and Robert
>>
>>      >>> >         we ran a significant chunk of the validations. Let me
>>
>>     know if
>>
>>      >>> >         you have any questions.
>>
>>      >>> >
>>
>>      >>> >         Ahmet
>>
>>      >>> >
>>
>>      >>> >         On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay
>>
>>     <altay@google.com <ma...@google.com>
>>
>>      >>> >         <mailto:altay@google.com <ma...@google.com>>>
>>
>>     wrote:
>>
>>      >>> >
>>
>>      >>> >             Hi everyone,
>>
>>      >>> >
>>
>>      >>> >             Please review and vote on the release candidate
>>
>>     #1 for the
>>
>>      >>> >             version 2.8.0, as follows:
>>
>>      >>> >             [ ] +1, Approve the release
>>
>>      >>> >             [ ] -1, Do not approve the release (please
>>
>>     provide specific
>>
>>      >>> >             comments)
>>
>>      >>> >
>>
>>      >>> >             The complete staging area is available for your
>>
>>     review,
>>
>>      >>> >             which includes:
>>
>>      >>> >             * JIRA release notes [1],
>>
>>      >>> >             * the official Apache source release to be
>>
>>     deployed to
>>
>>      >>> > dist.apache.org <http://dist.apache.org>
>>
>>     <http://dist.apache.org> [2], which is
>>
>>      >>> >             signed with the key with fingerprint 6096FA00 [3],
>>
>>      >>> >             * all artifacts to be deployed to the Maven Central
>>
>>      >>> >             Repository [4],
>>
>>      >>> >             * source code tag "v2.8.0-RC1" [5],
>>
>>      >>> >             * website pull request listing the release and
>>
>>     publishing
>>
>>      >>> >             the API reference manual [6].
>>
>>      >>> >             * Python artifacts are deployed along with the source
>>
>>      >>> >             release to the dist.apache.org
>>
>>     <http://dist.apache.org> <http://dist.apache.org> [2].
>>
>>      >>> >             * Validation sheet with a tab for 2.8.0 release
>>
>>     to help with
>>
>>      >>> >             validation [7].
>>
>>      >>> >
>>
>>      >>> >             The vote will be open for at least 72 hours. It
>>
>>     is adopted
>>
>>      >>> >             by majority approval, with at least 3 PMC
>>
>>     affirmative votes.
>>
>>      >>> >
>>
>>      >>> >             Thanks,
>>
>>      >>> >             Ahmet
>>
>>      >>> >
>>
>>      >>> >             [1]
>>
>>      >>> >
>>
>>     https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
>>
>>      >>> >
>>
>>       <https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985>
>>
>>      >>> >             [2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
>>
>>      >>> >             <https://dist.apache.org/repos/dist/dev/beam/2.8.0>
>>
>>      >>> >             [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
>>
>>      >>> >             <https://dist.apache.org/repos/dist/dev/beam/KEYS>
>>
>>      >>> >             [4]
>>
>>      >>> >
>>
>>     https://repository.apache.org/content/repositories/orgapachebeam-1049/
>>
>>      >>> >
>>
>>       <https://repository.apache.org/content/repositories/orgapachebeam-1049/>
>>
>>      >>> >             [5] https://github.com/apache/beam/tree/v2.8.0-RC1
>>
>>      >>> >             <https://github.com/apache/beam/tree/v2.8.0-RC1>
>>
>>      >>> >             [6] https://github.com/apache/beam-site/pull/583
>>
>>      >>> >             <https://github.com/apache/beam-site/pull/583> and
>>
>>      >>> > https://github.com/apache/beam/pull/6745
>>
>>      >>> >             <https://github.com/apache/beam/pull/6745>
>>
>>      >>> >             [7]
>>
>>      >>> >
>>
>>     https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
>>
>>      >>> >
>>
>>       <https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816>
>>
>>      >>> >
>>
>>      >>> >
>>
>>      >>> >
>>
>>

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Kenneth Knowles <ke...@apache.org>.
I didn't isolate it to a cause and commit, so that is extremely useful to
know. To bring some details onto the thread:

query 4: a single aggregation in sliding windows
query 8: a single join with no other interesting logic
query 9 (prefix of query 6*): find the winning bid for each auction
query 6: query 9 followed by a single aggregation

Kenn

* they seem out of order because the original queries were 1-8 and we added
9 later to benchmark the baseline without the aggregation
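
As a toy illustration of how queries 9 and 6 relate (plain Python over an
in-memory list, not the Nexmark or Beam implementation): query 9 can be
thought of as picking the highest bid per auction, and query 6 as that same
step followed by one aggregation. The record layout and the choice of
averaging per seller are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical bid records: (auction_id, seller_id, price).
bids = [
    ("a1", "s1", 10), ("a1", "s1", 30),
    ("a2", "s1", 20), ("a3", "s2", 5),
]


def winning_bids(bids):
    """Query 9 analogue: keep the highest bid for each auction."""
    best = {}
    for auction, seller, price in bids:
        if auction not in best or price > best[auction][1]:
            best[auction] = (seller, price)
    return best


def average_winning_price_per_seller(bids):
    """Query 6 analogue: query 9 followed by a single aggregation."""
    prices = defaultdict(list)
    for seller, price in winning_bids(bids).values():
        prices[seller].append(price)
    return {seller: sum(p) / len(p) for seller, p in prices.items()}


print(winning_bids(bids))
print(average_winning_price_per_seller(bids))
```

Timing the first function on its own versus the second is the kind of
comparison that the separate query 9 baseline makes possible.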

On Mon, Oct 29, 2018 at 3:28 AM Etienne Chauchot <ec...@apache.org>
wrote:

> Oops, I just saw that Kenn already mentioned the perf degradation on the Spark
> runner around 10/05. Sorry for the repetition.
> Nevertheless, IMHO, it is still worth checking PR #6181.
>
> Etienne
>
> On Monday, 29 October 2018 at 10:42 +0100, Etienne Chauchot wrote:
>
> Hey,
> I would vote -0 : here is the explanation:
>
> I took a look at Nexmark dashboards for output size and performance for
> all the runners in all the modes around the date of the release cut to
> search for regressions.
>
> I noted a regression on the performance of the spark runner. Query4,
> Query6, Query8 and Query9 running times were multiplied by 2 to 3 around
> the date of 10/05/18. See
> https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> So I searched in the commit history of the spark runner module for what
> happened around 10/05/18. And I found this commit
>
> e4a1ccbaa10808d88c6ad2a687fe9f6d52392d90: Merge pull request #6181:
> [BEAM-4783] Add bundleSize for splitting BoundedSources
>
> I don't know if it should be considered a blocker, but we should definitely
> take another look at pull request #6181, which seems to change the way we
> split on the Spark runner.
>
> Best
> Etienne
>
>
> On Friday, 26 October 2018 at 18:20 +0200, Maximilian Michels wrote:
>
> +1 (binding)
>
>
> On 26.10.18 17:45, Kenneth Knowles wrote:
>
> Nice. Thanks.
>
>
> +1
>
>
>
> On Fri, Oct 26, 2018 at 8:44 AM Robert Bradshaw <robertwb@google.com
>
> <ma...@google.com>> wrote:
>
>
>     Thanks Tim!
>
>
>     This was my only hesitation, and sounds like we're in the clear here.
>
>
>     +1 (binding)
>
>     On Fri, Oct 26, 2018 at 5:05 PM Tim Robertson
>
>     <timrobertson100@gmail.com <ma...@gmail.com>> wrote:
>
>      >
>
>      > A colleague and I tested on 2.7.0 and 2.8.0RC1:
>
>      >
>
>      > 1. Quickstart on Spark/YARN/HDFS (CDH 5.12.0) (commented in
>
>     spreadsheet)
>
>      > 2. Our Avro to Avro pipelines on Spark/YARN/HDFS (note we
>
>     backport the un-merged BEAM-5036 fix in our code)
>
>      > 3. Our Avro to Elasticsearch pipelines on Spark/YARN/HDFS
>
>      >
>
>      > Everything worked, and performance was similar on both.
>
>      > We built using maven pointing at
>
>     https://repository.apache.org/content/repositories/orgapachebeam-1049/
>
>      >
>
>      > Based on this limited testing: +1
>
>      >
>
>      > Thank you to the release managers,
>
>      > Tim
>
>      >
>
>      >
>
>      > On Thu, Oct 25, 2018 at 7:21 PM Tim <timrobertson100@gmail.com
>
>     <ma...@gmail.com>> wrote:
>
>      >>
>
>      >> I can do some tests on Spark / YARN tomorrow (CEST timezone).
>
>     Sorry I’ve just been too busy to assist.
>
>      >>
>
>      >> Tim
>
>      >>
>
>      >> On 25 Oct 2018, at 18:59, Kenneth Knowles <kenn@apache.org
>
>     <ma...@apache.org>> wrote:
>
>      >>
>
>      >> I tried to do a more thorough job on this.
>
>      >>
>
>      >>  - I could not reproduce the slowdown in Query 9. I believe the
>
>     variance was simply high given the parameters and environment
>
>      >>  - I saw the same slowdown in Query 8 when running as part of
>
>     the suite, but it vanished when I ran repeatedly on its own, so
>
>     again it is not good methodology probably
>
>      >>
>
>      >> We do have the dashboard at
>
>     https://apache-beam-testing.appspot.com/dashboard-admin though no
>
>     anomaly detection set up AFAIK.
>
>      >>
>
>      >>  - There is no issue easily visible in DirectRunner:
>
>     https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>
>      >>  - There is a notable degradation in Spark runner on 10/5 for
>
>     many queries.
>
>     https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>
>      >>  - Something minor happened for Dataflow around 10/1:
>
>     https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
>
>      >>  - Flink runner seems to have had some fantastic improvements
>
>     :-)
>
>     https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
>
>      >>
>
>      >> So if there is a blocker it would really be the Spark runner
>
>     perf changes. Of course, all these except Dataflow are using local
>
>     instances so may not be representative of larger scale AFAIK.
>
>      >>
>
>      >> Kenn
>
>      >>
>
>      >> On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels
>
>     <mxm@apache.org <ma...@apache.org>> wrote:
>
>      >>>
>
>      >>> I've run WordCount using Quickstart with the FlinkRunner
>
>     (locally and
>
>      >>> against a Flink cluster).
>
>      >>>
>
>      >>> Would give a +1 but waiting what Kenn finds.
>
>      >>>
>
>      >>> -Max
>
>      >>>
>
>      >>> On 23.10.18 07:11, Ahmet Altay wrote:
>
>      >>> >
>
>      >>> >
>
>      >>> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles
>
>     <kenn@apache.org <ma...@apache.org>
>
>      >>> > <mailto:kenn@apache.org <ma...@apache.org>>> wrote:
>
>      >>> >
>
>      >>> >     You two did so much verification I had a hard time
>
>     finding something
>
>      >>> >     where my help was meaningful! :-)
>
>      >>> >
>
>      >>> >     I did run the Nexmark suite on the DirectRunner against
>
>     2.7.0 and
>
>      >>> >     2.8.0 following
>
>      >>> >
>
>     https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
>
>      >>> >     <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
>      >>> >
>      >>> >     It is admittedly a very silly test - the instructions leave
>      >>> >     immutability enforcement on, etc. But it does appear that there is a
>      >>> >     30% degradation in query 8 and 15% in query 9. These are the pure
>      >>> >     Java tests, not the SQL variants. The rest of the queries are close
>      >>> >     enough that differences are not meaningful.
>      >>> >
>      >>> >
>      >>> > (It would be a good improvement for us to have alerts on daily
>      >>> > benchmarks if we do not have such a concept already.)
>      >>> >
>      >>> >
>      >>> >     I would ask a little more time to see what is going on here - is it
>      >>> >     a real performance issue or an artifact of how the tests are
>      >>> >     invoked, or ...?
>      >>> >
>      >>> >
>      >>> > Thank you! Much appreciated. Please let us know when you are done with
>      >>> > your investigation.
>      >>> >
>      >>> >
>      >>> >     Kenn
>      >>> >
>      >>> >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay <altay@google.com <ma...@google.com>
>      >>> >     <mailto:altay@google.com <ma...@google.com>>> wrote:
>      >>> >
>      >>> >         Hi all,
>      >>> >
>      >>> >         Did you have a chance to review this RC? Between me and Robert
>      >>> >         we ran a significant chunk of the validations. Let me know if
>      >>> >         you have any questions.
>      >>> >
>      >>> >         Ahmet
>      >>> >
>      >>> >         On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay <altay@google.com <ma...@google.com>
>      >>> >         <mailto:altay@google.com <ma...@google.com>>> wrote:
>      >>> >
>      >>> >             Hi everyone,
>      >>> >
>      >>> >             Please review and vote on the release candidate #1 for the
>      >>> >             version 2.8.0, as follows:
>      >>> >             [ ] +1, Approve the release
>      >>> >             [ ] -1, Do not approve the release (please provide specific
>      >>> >             comments)
>      >>> >
>      >>> >             The complete staging area is available for your review,
>      >>> >             which includes:
>      >>> >             * JIRA release notes [1],
>      >>> >             * the official Apache source release to be deployed to
>      >>> >             dist.apache.org <http://dist.apache.org> <http://dist.apache.org> [2], which is
>      >>> >             signed with the key with fingerprint 6096FA00 [3],
>      >>> >             * all artifacts to be deployed to the Maven Central
>      >>> >             Repository [4],
>      >>> >             * source code tag "v2.8.0-RC1" [5],
>      >>> >             * website pull request listing the release and publishing
>      >>> >             the API reference manual [6].
>      >>> >             * Python artifacts are deployed along with the source
>      >>> >             release to the dist.apache.org <http://dist.apache.org> <http://dist.apache.org> [2].
>      >>> >             * Validation sheet with a tab for 2.8.0 release to help with
>      >>> >             validation [7].
>      >>> >
>      >>> >             The vote will be open for at least 72 hours. It is adopted
>      >>> >             by majority approval, with at least 3 PMC affirmative votes.
>      >>> >
>      >>> >             Thanks,
>      >>> >             Ahmet
>      >>> >
>      >>> >             [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
>      >>> >             <https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985>
>      >>> >             [2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
>      >>> >             <https://dist.apache.org/repos/dist/dev/beam/2.8.0>
>      >>> >             [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
>      >>> >             <https://dist.apache.org/repos/dist/dev/beam/KEYS>
>      >>> >             [4] https://repository.apache.org/content/repositories/orgapachebeam-1049/
>      >>> >             <https://repository.apache.org/content/repositories/orgapachebeam-1049/>
>      >>> >             [5] https://github.com/apache/beam/tree/v2.8.0-RC1
>      >>> >             <https://github.com/apache/beam/tree/v2.8.0-RC1>
>      >>> >             [6] https://github.com/apache/beam-site/pull/583
>      >>> >             <https://github.com/apache/beam-site/pull/583> and
>      >>> >             https://github.com/apache/beam/pull/6745
>      >>> >             <https://github.com/apache/beam/pull/6745>
>      >>> >             [7] https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
>      >>> >             <https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816>
>      >>> >
>      >>> >
>      >>> >
>

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Etienne Chauchot <ec...@apache.org>.
Oops, I just saw that Kenn already mentioned the performance degradation on the Spark runner around 10/05. Sorry for the
repetition. Nevertheless, IMHO it is still worth checking PR #6181.
Etienne
On Monday, 29 October 2018 at 10:42 +0100, Etienne Chauchot wrote:
> Hey, I would vote -0. Here is the explanation:
> I took a look at the Nexmark dashboards for output size and performance for all the runners in all the modes around the
> date of the release cut to search for regressions.
> I noted a performance regression on the Spark runner. Query4, Query6, Query8 and Query9 running times were
> multiplied by 2 to 3 around 10/05/18. See
> https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> So I searched the commit history of the Spark runner module for what happened around 10/05/18, and I
> found this commit:
> e4a1ccbaa10808d88c6ad2a687fe9f6d52392d90: Merge pull request #6181: [BEAM-4783] Add bundleSize for splitting
> BoundedSources
> I don't know if it should be considered a blocker, but we should definitely take another look at pull request #6181,
> which seems to change the way we split on the Spark runner.
> Best,
> Etienne
> 

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Etienne Chauchot <ec...@apache.org>.
Hey, I would vote -0. Here is the explanation:
I took a look at the Nexmark dashboards for output size and performance for all the runners in all the modes around the date
of the release cut to search for regressions.
I noted a performance regression on the Spark runner. Query4, Query6, Query8 and Query9 running times were
multiplied by 2 to 3 around 10/05/18. See
https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
So I searched the commit history of the Spark runner module for what happened around 10/05/18, and I found
this commit:
e4a1ccbaa10808d88c6ad2a687fe9f6d52392d90: Merge pull request #6181: [BEAM-4783] Add bundleSize for splitting
BoundedSources
I don't know if it should be considered a blocker, but we should definitely take another look at pull request #6181,
which seems to change the way we split on the Spark runner.
Best,
Etienne


Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Maximilian Michels <mx...@apache.org>.
+1 (binding)

On 26.10.18 17:45, Kenneth Knowles wrote:
> Nice. Thanks.
> 
> +1
> 
> 
> On Fri, Oct 26, 2018 at 8:44 AM Robert Bradshaw <robertwb@google.com 
> <ma...@google.com>> wrote:
> 
>     Thanks Tim!
> 
>     This was my only hesitation, and sounds like we're in the clear here.
> 
>     +1 (binding)
>     On Fri, Oct 26, 2018 at 5:05 PM Tim Robertson
>     <timrobertson100@gmail.com <ma...@gmail.com>> wrote:
>      >
>      > A colleague and I tested on 2.7.0 and 2.8.0RC1:
>      >
>      > 1. Quickstart on Spark/YARN/HDFS (CDH 5.12.0) (commented in
>     spreadsheet)
>      > 2. Our Avro to Avro pipelines on Spark/YARN/HDFS (note we
>     backport the un-merged BEAM-5036 fix in our code)
>      > 3. Our Avro to Elasticsearch pipelines on Spark/YARN/HDFS
>      >
>      > Everything worked, and performance was similar on both.
>      > We built using maven pointing at
>     https://repository.apache.org/content/repositories/orgapachebeam-1049/
>      >
>      > Based on this limited testing: +1
>      >
>      > Thank you to the release managers,
>      > Tim
>      >
>      >
>      > On Thu, Oct 25, 2018 at 7:21 PM Tim <timrobertson100@gmail.com
>     <ma...@gmail.com>> wrote:
>      >>
>      >> I can do some tests on Spark / YARN tomorrow (CEST timezone).
>     Sorry I’ve just been too busy to assist.
>      >>
>      >> Tim
>      >>
>      >> On 25 Oct 2018, at 18:59, Kenneth Knowles <kenn@apache.org
>     <ma...@apache.org>> wrote:
>      >>
>      >> I tried to do a more thorough job on this.
>      >>
>      >>  - I could not reproduce the slowdown in Query 9. I believe the
>     variance was simply high given the parameters and environment
>      >>  - I saw the same slowdown in Query 8 when running as part of
>     the suite, but it vanished when I ran repeatedly on its own, so
>     again it is not good methodology probably
>      >>
>      >> We do have the dashboard at
>     https://apache-beam-testing.appspot.com/dashboard-admin though no
>     anomaly detection set up AFAIK.
>      >>
>      >>  - There is no issue easily visible in DirectRunner:
>     https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>      >>  - There is a notable degradation in Spark runner on 10/5 for
>     many queries.
>     https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>      >>  - Something minor happened for Dataflow around 10/1:
>     https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
>      >>  - Flink runner seems to have had some fantastic improvements
>     :-)
>     https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
>      >>
>      >> So if there is a blocker it would really be the Spark runner
>     perf changes. Of course, all these except Dataflow are using local
>     instances so may not be representative of larger scale AFAIK.
>      >>
>      >> Kenn
>      >>
>      >> On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels
>     <mxm@apache.org <ma...@apache.org>> wrote:
>      >>>
>      >>> I've run WordCount using Quickstart with the FlinkRunner
>     (locally and
>      >>> against a Flink cluster).
>      >>>
>      >>> Would give a +1 but waiting what Kenn finds.
>      >>>
>      >>> -Max
>      >>>
>      >>> On 23.10.18 07:11, Ahmet Altay wrote:
>      >>> >
>      >>> >
>      >>> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles
>     <kenn@apache.org <ma...@apache.org>
>      >>> > <mailto:kenn@apache.org <ma...@apache.org>>> wrote:
>      >>> >
>      >>> >     You two did so much verification I had a hard time
>     finding something
>      >>> >     where my help was meaningful! :-)
>      >>> >
>      >>> >     I did run the Nexmark suite on the DirectRunner against
>     2.7.0 and
>      >>> >     2.8.0 following
>      >>> >
>     https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
>      >>> >   
>       <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
>      >>> >
>      >>> >     It is admittedly a very silly test - the instructions leave
>      >>> >     immutability enforcement on, etc. But it does appear that
>     there is a
>      >>> >     30% degradation in query 8 and 15% in query 9. These are
>     the pure
>      >>> >     Java tests, not the SQL variants. The rest of the queries
>     are close
>      >>> >     enough that differences are not meaningful.
>      >>> >
>      >>> >
>      >>> > (It would be a good improvement for us to have alerts on daily
>      >>> > benchmarks if we do not have such a concept already.)
>      >>> >
>      >>> >
>      >>> >     I would ask a little more time to see what is going on
>     here - is it
>      >>> >     a real performance issue or an artifact of how the tests are
>      >>> >     invoked, or ...?
>      >>> >
>      >>> >
>      >>> > Thank you! Much appreciated. Please let us know when you are
>     done with
>      >>> > your investigation.
>      >>> >
>      >>> >
>      >>> >     Kenn
>      >>> >
>      >>> >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay
>     <altay@google.com <ma...@google.com>
>      >>> >     <mailto:altay@google.com <ma...@google.com>>> wrote:
>      >>> >
>      >>> >         Hi all,
>      >>> >
>      >>> >         Did you have a chance to review this RC? Between me
>     and Robert
>      >>> >         we ran a significant chunk of the validations. Let me
>     know if
>      >>> >         you have any questions.
>      >>> >
>      >>> >         Ahmet
>      >>> >
>      >>> >         On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay
>     <altay@google.com <ma...@google.com>
>      >>> >         <mailto:altay@google.com <ma...@google.com>>>
>     wrote:
>      >>> >
>      >>> >             Hi everyone,
>      >>> >
>      >>> >             Please review and vote on the release candidate
>     #1 for the
>      >>> >             version 2.8.0, as follows:
>      >>> >             [ ] +1, Approve the release
>      >>> >             [ ] -1, Do not approve the release (please
>     provide specific
>      >>> >             comments)
>      >>> >
>      >>> >             The complete staging area is available for your
>     review,
>      >>> >             which includes:
>      >>> >             * JIRA release notes [1],
>      >>> >             * the official Apache source release to be
>     deployed to
>      >>> > dist.apache.org <http://dist.apache.org>
>     <http://dist.apache.org> [2], which is
>      >>> >             signed with the key with fingerprint 6096FA00 [3],
>      >>> >             * all artifacts to be deployed to the Maven Central
>      >>> >             Repository [4],
>      >>> >             * source code tag "v2.8.0-RC1" [5],
>      >>> >             * website pull request listing the release and
>     publishing
>      >>> >             the API reference manual [6].
>      >>> >             * Python artifacts are deployed along with the source
>      >>> >             release to the dist.apache.org
>     <http://dist.apache.org> <http://dist.apache.org> [2].
>      >>> >             * Validation sheet with a tab for 2.8.0 release
>     to help with
>      >>> >             validation [7].
>      >>> >
>      >>> >             The vote will be open for at least 72 hours. It
>     is adopted
>      >>> >             by majority approval, with at least 3 PMC
>     affirmative votes.
>      >>> >
>      >>> >             Thanks,
>      >>> >             Ahmet
>      >>> >
>      >>> >             [1]
>      >>> >             https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
>      >>> >             <https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985>
>      >>> >             [2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
>      >>> >             <https://dist.apache.org/repos/dist/dev/beam/2.8.0>
>      >>> >             [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
>      >>> >             <https://dist.apache.org/repos/dist/dev/beam/KEYS>
>      >>> >             [4]
>      >>> >             https://repository.apache.org/content/repositories/orgapachebeam-1049/
>      >>> >             <https://repository.apache.org/content/repositories/orgapachebeam-1049/>
>      >>> >             [5] https://github.com/apache/beam/tree/v2.8.0-RC1
>      >>> >             <https://github.com/apache/beam/tree/v2.8.0-RC1>
>      >>> >             [6] https://github.com/apache/beam-site/pull/583
>      >>> >             <https://github.com/apache/beam-site/pull/583> and
>      >>> >             https://github.com/apache/beam/pull/6745
>      >>> >             <https://github.com/apache/beam/pull/6745>
>      >>> >             [7]
>      >>> >             https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
>      >>> >             <https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816>
>      >>> >
>      >>> >
>      >>> >
> 

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Kenneth Knowles <ke...@apache.org>.
Nice. Thanks.

+1


On Fri, Oct 26, 2018 at 8:44 AM Robert Bradshaw <ro...@google.com> wrote:

> Thanks Tim!
>
> This was my only hesitation, and sounds like we're in the clear here.
>
> +1 (binding)
> On Fri, Oct 26, 2018 at 5:05 PM Tim Robertson <ti...@gmail.com> wrote:
> >
> > A colleague and I tested on 2.7.0 and 2.8.0RC1:
> >
> > 1. Quickstart on Spark/YARN/HDFS (CDH 5.12.0) (commented in spreadsheet)
> > 2. Our Avro to Avro pipelines on Spark/YARN/HDFS (note we backport the un-merged BEAM-5036 fix in our code)
> > 3. Our Avro to Elasticsearch pipelines on Spark/YARN/HDFS
> >
> > Everything worked, and performance was similar on both.
> > We built using maven pointing at https://repository.apache.org/content/repositories/orgapachebeam-1049/
> >
> > Based on this limited testing: +1
> >
> > Thank you to the release managers,
> > Tim
> >
> >
> > On Thu, Oct 25, 2018 at 7:21 PM Tim <ti...@gmail.com> wrote:
> >>
> >> I can do some tests on Spark / YARN tomorrow (CEST timezone). Sorry I’ve just been too busy to assist.
> >>
> >> Tim
> >>
> >> On 25 Oct 2018, at 18:59, Kenneth Knowles <ke...@apache.org> wrote:
> >>
> >> I tried to do a more thorough job on this.
> >>
> >>  - I could not reproduce the slowdown in Query 9. I believe the variance was simply high given the parameters and environment
> >>  - I saw the same slowdown in Query 8 when running as part of the suite, but it vanished when I ran repeatedly on its own, so again it is not good methodology probably
> >>
> >> We do have the dashboard at https://apache-beam-testing.appspot.com/dashboard-admin though no anomaly detection set up AFAIK.
> >>
> >>  - There is no issue easily visible in DirectRunner: https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> >>  - There is a notable degradation in Spark runner on 10/5 for many queries. https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> >>  - Something minor happened for Dataflow around 10/1: https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
> >>  - Flink runner seems to have had some fantastic improvements :-) https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> >>
> >> So if there is a blocker it would really be the Spark runner perf changes. Of course, all these except Dataflow are using local instances so may not be representative of larger scale AFAIK.
> >>
> >> Kenn
> >>
> >> On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels <mx...@apache.org> wrote:
> >>>
> >>> I've run WordCount using Quickstart with the FlinkRunner (locally and
> >>> against a Flink cluster).
> >>>
> >>> Would give a +1 but waiting what Kenn finds.
> >>>
> >>> -Max
> >>>
> >>> On 23.10.18 07:11, Ahmet Altay wrote:
> >>> >
> >>> >
> >>> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles <kenn@apache.org
> >>> > <ma...@apache.org>> wrote:
> >>> >
> >>> >     You two did so much verification I had a hard time finding something
> >>> >     where my help was meaningful! :-)
> >>> >
> >>> >     I did run the Nexmark suite on the DirectRunner against 2.7.0 and
> >>> >     2.8.0 following
> >>> >     https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
> >>> >     <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
> >>> >
> >>> >     It is admittedly a very silly test - the instructions leave
> >>> >     immutability enforcement on, etc. But it does appear that there is a
> >>> >     30% degradation in query 8 and 15% in query 9. These are the pure
> >>> >     Java tests, not the SQL variants. The rest of the queries are close
> >>> >     enough that differences are not meaningful.
> >>> >
> >>> >
> >>> > (It would be a good improvement for us to have alerts on daily
> >>> > benchmarks if we do not have such a concept already.)
> >>> >
> >>> >
> >>> >     I would ask a little more time to see what is going on here - is it
> >>> >     a real performance issue or an artifact of how the tests are
> >>> >     invoked, or ...?
> >>> >
> >>> >
> >>> > Thank you! Much appreciated. Please let us know when you are done with
> >>> > your investigation.
> >>> >
> >>> >
> >>> >     Kenn
> >>> >
> >>> >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay <altay@google.com
> >>> >     <ma...@google.com>> wrote:
> >>> >
> >>> >         Hi all,
> >>> >
> >>> >         Did you have a chance to review this RC? Between me and Robert
> >>> >         we ran a significant chunk of the validations. Let me know if
> >>> >         you have any questions.
> >>> >
> >>> >         Ahmet
> >>> >
>

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Robert Bradshaw <ro...@google.com>.
Thanks Tim!

This was my only hesitation, and sounds like we're in the clear here.

+1 (binding)

On Fri, Oct 26, 2018 at 5:05 PM Tim Robertson <ti...@gmail.com> wrote:
>
> A colleague and I tested on 2.7.0 and 2.8.0RC1:
>
> 1. Quickstart on Spark/YARN/HDFS (CDH 5.12.0) (commented in spreadsheet)
> 2. Our Avro to Avro pipelines on Spark/YARN/HDFS (note we backport the un-merged BEAM-5036 fix in our code)
> 3. Our Avro to Elasticsearch pipelines on Spark/YARN/HDFS
>
> Everything worked, and performance was similar on both.
> We built using maven pointing at https://repository.apache.org/content/repositories/orgapachebeam-1049/
>
> Based on this limited testing: +1
>
> Thank you to the release managers,
> Tim
>
>
> On Thu, Oct 25, 2018 at 7:21 PM Tim <ti...@gmail.com> wrote:
>>
>> I can do some tests on Spark / YARN tomorrow (CEST timezone). Sorry I’ve just been too busy to assist.
>>
>> Tim
>>
>> On 25 Oct 2018, at 18:59, Kenneth Knowles <ke...@apache.org> wrote:
>>
>> I tried to do a more thorough job on this.
>>
>>  - I could not reproduce the slowdown in Query 9. I believe the variance was simply high given the parameters and environment
>>  - I saw the same slowdown in Query 8 when running as part of the suite, but it vanished when I ran repeatedly on its own, so again it is not good methodology probably
>>
>> We do have the dashboard at https://apache-beam-testing.appspot.com/dashboard-admin though no anomaly detection set up AFAIK.
>>
>>  - There is no issue easily visible in DirectRunner: https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>>  - There is a notable degradation in Spark runner on 10/5 for many queries. https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>>  - Something minor happened for Dataflow around 10/1: https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
>>  - Flink runner seems to have had some fantastic improvements :-) https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
>>
>> So if there is a blocker it would really be the Spark runner perf changes. Of course, all these except Dataflow are using local instances so may not be representative of larger scale AFAIK.
>>
>> Kenn
>>
>> On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels <mx...@apache.org> wrote:
>>>
>>> I've run WordCount using Quickstart with the FlinkRunner (locally and
>>> against a Flink cluster).
>>>
>>> Would give a +1 but waiting what Kenn finds.
>>>
>>> -Max
>>>
>>> On 23.10.18 07:11, Ahmet Altay wrote:
>>> >
>>> >
>>> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles <kenn@apache.org
>>> > <ma...@apache.org>> wrote:
>>> >
>>> >     You two did so much verification I had a hard time finding something
>>> >     where my help was meaningful! :-)
>>> >
>>> >     I did run the Nexmark suite on the DirectRunner against 2.7.0 and
>>> >     2.8.0 following
>>> >     https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
>>> >     <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
>>> >
>>> >     It is admittedly a very silly test - the instructions leave
>>> >     immutability enforcement on, etc. But it does appear that there is a
>>> >     30% degradation in query 8 and 15% in query 9. These are the pure
>>> >     Java tests, not the SQL variants. The rest of the queries are close
>>> >     enough that differences are not meaningful.
>>> >
>>> >
>>> > (It would be a good improvement for us to have alerts on daily
>>> > benchmarks if we do not have such a concept already.)
>>> >
>>> >
>>> >     I would ask a little more time to see what is going on here - is it
>>> >     a real performance issue or an artifact of how the tests are
>>> >     invoked, or ...?
>>> >
>>> >
>>> > Thank you! Much appreciated. Please let us know when you are done with
>>> > your investigation.
>>> >
>>> >
>>> >     Kenn
>>> >
>>> >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay <altay@google.com
>>> >     <ma...@google.com>> wrote:
>>> >
>>> >         Hi all,
>>> >
>>> >         Did you have a chance to review this RC? Between me and Robert
>>> >         we ran a significant chunk of the validations. Let me know if
>>> >         you have any questions.
>>> >
>>> >         Ahmet
>>> >

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Tim Robertson <ti...@gmail.com>.
A colleague and I tested on 2.7.0 and 2.8.0RC1:

1. Quickstart on Spark/YARN/HDFS (CDH 5.12.0) (commented in spreadsheet)
2. Our Avro to Avro pipelines on Spark/YARN/HDFS (note we backport the
un-merged BEAM-5036 fix in our code)
3. Our Avro to Elasticsearch pipelines on Spark/YARN/HDFS

Everything worked, and performance was similar on both.
We built with Maven pointed at the staging repository:
https://repository.apache.org/content/repositories/orgapachebeam-1049/
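
For anyone who wants to spot-check that a particular artifact is present in a
staging repository like this one, here is a small Python sketch that builds the
expected URL from the standard Maven repository layout. The helper name and the
choice of beam-sdks-java-core as the example artifact are only for
illustration; they are not taken from the validation sheet.

```python
def staged_artifact_url(repo_url, group_id, artifact_id, version, ext="jar"):
    """Build an artifact URL using the standard Maven repository layout:
    <repo>/<group path>/<artifactId>/<version>/<artifactId>-<version>.<ext>

    Hypothetical helper for illustration; it only formats a string and
    does not check that the artifact exists.
    """
    group_path = group_id.replace(".", "/")
    return (
        f"{repo_url.rstrip('/')}/{group_path}/{artifact_id}/{version}/"
        f"{artifact_id}-{version}.{ext}"
    )


# Staging repository URL from this thread; the artifact is an example choice.
STAGING = "https://repository.apache.org/content/repositories/orgapachebeam-1049/"
url = staged_artifact_url(STAGING, "org.apache.beam", "beam-sdks-java-core", "2.8.0")
print(url)
```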

Based on this limited testing: +1

Thank you to the release managers,
Tim


On Thu, Oct 25, 2018 at 7:21 PM Tim <ti...@gmail.com> wrote:

> I can do some tests on Spark / YARN tomorrow (CEST timezone). Sorry I’ve
> just been too busy to assist.
>
> Tim
>
> On 25 Oct 2018, at 18:59, Kenneth Knowles <ke...@apache.org> wrote:
>
> I tried to do a more thorough job on this.
>
>  - I could not reproduce the slowdown in Query 9. I believe the variance
> was simply high given the parameters and environment
>  - I saw the same slowdown in Query 8 when running as part of the suite,
> but it vanished when I ran repeatedly on its own, so again it is not good
> methodology probably
>
> We do have the dashboard at
> https://apache-beam-testing.appspot.com/dashboard-admin though no anomaly
> detection set up AFAIK.
>
>  - There is no issue easily visible in DirectRunner:
> https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>  - There is a notable degradation in Spark runner on 10/5 for many
> queries.
> https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>  - Something minor happened for Dataflow around 10/1:
> https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
>  - Flink runner seems to have had some fantastic improvements :-)
> https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
>
> So if there is a blocker it would really be the Spark runner perf changes.
> Of course, all these except Dataflow are using local instances so may not
> be representative of larger scale AFAIK.
>
> Kenn
>
> On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels <mx...@apache.org> wrote:
>
>> I've run WordCount using Quickstart with the FlinkRunner (locally and
>> against a Flink cluster).
>>
>> Would give a +1 but waiting what Kenn finds.
>>
>> -Max
>>
>> On 23.10.18 07:11, Ahmet Altay wrote:
>> >
>> >
>> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles <kenn@apache.org
>> > <ma...@apache.org>> wrote:
>> >
>> >     You two did so much verification I had a hard time finding something
>> >     where my help was meaningful! :-)
>> >
>> >     I did run the Nexmark suite on the DirectRunner against 2.7.0 and
>> >     2.8.0 following
>> >     https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
>> >     <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
>> >
>> >     It is admittedly a very silly test - the instructions leave
>> >     immutability enforcement on, etc. But it does appear that there is a
>> >     30% degradation in query 8 and 15% in query 9. These are the pure
>> >     Java tests, not the SQL variants. The rest of the queries are close
>> >     enough that differences are not meaningful.
>> >
>> >
>> > (It would be a good improvement for us to have alerts on daily
>> > benchmarks if we do not have such a concept already.)
>> >
>> >
>> >     I would ask a little more time to see what is going on here - is it
>> >     a real performance issue or an artifact of how the tests are
>> >     invoked, or ...?
>> >
>> >
>> > Thank you! Much appreciated. Please let us know when you are done with
>> > your investigation.
>> >
>> >
>> >     Kenn
>> >
>> >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay <altay@google.com
>> >     <ma...@google.com>> wrote:
>> >
>> >         Hi all,
>> >
>> >         Did you have a chance to review this RC? Between me and Robert
>> >         we ran a significant chunk of the validations. Let me know if
>> >         you have any questions.
>> >
>> >         Ahmet
>> >
>>
>

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Tim <ti...@gmail.com>.
I can do some tests on Spark / YARN tomorrow (CEST timezone). Sorry I’ve just been too busy to assist.

Tim

> On 25 Oct 2018, at 18:59, Kenneth Knowles <ke...@apache.org> wrote:
> 
> I tried to do a more thorough job on this.
> 
>  - I could not reproduce the slowdown in Query 9. I believe the variance was simply high given the parameters and environment
>  - I saw the same slowdown in Query 8 when running as part of the suite, but it vanished when I ran repeatedly on its own, so again it is not good methodology probably
> 
> We do have the dashboard at https://apache-beam-testing.appspot.com/dashboard-admin though no anomaly detection set up AFAIK.
> 
>  - There is no issue easily visible in DirectRunner: https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>  - There is a notable degradation in Spark runner on 10/5 for many queries. https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>  - Something minor happened for Dataflow around 10/1: https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
>  - Flink runner seems to have had some fantastic improvements :-) https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> 
> So if there is a blocker it would really be the Spark runner perf changes. Of course, all these except Dataflow are using local instances so may not be representative of larger scale AFAIK.
> 
> Kenn
> 
>> On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels <mx...@apache.org> wrote:
>> I've run WordCount using Quickstart with the FlinkRunner (locally and 
>> against a Flink cluster).
>> 
>> Would give a +1 but waiting what Kenn finds.
>> 
>> -Max
>> 
>> On 23.10.18 07:11, Ahmet Altay wrote:
>> > 
>> > 
>> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles <kenn@apache.org 
>> > <ma...@apache.org>> wrote:
>> > 
>> >     You two did so much verification I had a hard time finding something
>> >     where my help was meaningful! :-)
>> > 
>> >     I did run the Nexmark suite on the DirectRunner against 2.7.0 and
>> >     2.8.0 following
>> >     https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
>> >     <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
>> > 
>> >     It is admittedly a very silly test - the instructions leave
>> >     immutability enforcement on, etc. But it does appear that there is a
>> >     30% degradation in query 8 and 15% in query 9. These are the pure
>> >     Java tests, not the SQL variants. The rest of the queries are close
>> >     enough that differences are not meaningful.
>> > 
>> > 
>> > (It would be a good improvement for us to have alerts on daily 
>> > benchmarks if we do not have such a concept already.)
>> > 
>> > 
>> >     I would ask a little more time to see what is going on here - is it
>> >     a real performance issue or an artifact of how the tests are
>> >     invoked, or ...?
>> > 
>> > 
>> > Thank you! Much appreciated. Please let us know when you are done with 
>> > your investigation.
>> > 
>> > 
>> >     Kenn
>> > 
>> >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay <altay@google.com
>> >     <ma...@google.com>> wrote:
>> > 
>> >         Hi all,
>> > 
>> >         Did you have a chance to review this RC? Between me and Robert
>> >         we ran a significant chunk of the validations. Let me know if
>> >         you have any questions.
>> > 
>> >         Ahmet
>> > 

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Kenneth Knowles <ke...@apache.org>.
I tried to do a more thorough job on this.

 - I could not reproduce the slowdown in Query 9. I believe the variance
was simply high given the parameters and environment.
 - I saw the same slowdown in Query 8 when running it as part of the suite,
but it vanished when I ran the query repeatedly on its own, so again the
methodology is probably not sound.
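
One rough way to separate a real regression from run-to-run noise, sketched in
Python under the assumption that each query is timed several times per
version. The function names, the 2-sigma rule, and the timings are all
placeholders for illustration; none of the numbers are Nexmark measurements.

```python
from statistics import mean, stdev


def relative_slowdown(baseline, candidate):
    """Fractional change in mean runtime (0.30 means 30% slower)."""
    return (mean(candidate) - mean(baseline)) / mean(baseline)


def exceeds_noise(baseline, candidate, sigmas=2.0):
    """True if the gap between the means is larger than `sigmas` times the
    larger of the two sample standard deviations."""
    gap = abs(mean(candidate) - mean(baseline))
    noise = max(stdev(baseline), stdev(candidate))
    return gap > sigmas * noise


# Hypothetical runtimes in seconds for one query; NOT measured values.
runs_old = [10.0, 12.0, 9.0, 11.0]
runs_new = [13.0, 10.0, 14.0, 11.0]

print(round(relative_slowdown(runs_old, runs_new), 3))
print(exceeds_noise(runs_old, runs_new))
```

With these made-up samples the means differ by about 14%, yet the gap stays
inside the noise band, which is the kind of result a single pass over the
suite cannot distinguish from a regression.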

We do have the dashboard at
https://apache-beam-testing.appspot.com/dashboard-admin though there is no
anomaly detection set up, AFAIK.

 - There is no issue easily visible in DirectRunner:
https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
 - There is a notable degradation in Spark runner on 10/5 for many queries.
https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
 - Something minor happened for Dataflow around 10/1:
https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
 - Flink runner seems to have had some fantastic improvements :-)
https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
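
Since the dashboards have no anomaly detection, a check along the following
lines could flag step changes in a daily benchmark series. This is only a
sketch in Python: the function name, the 7-day window, the 20% threshold, and
the sample series are assumptions for illustration, not anything the
dashboards currently do.

```python
from statistics import median


def flag_regressions(daily_runtimes, window=7, threshold=0.20):
    """Return indices of days whose runtime exceeds the median of the
    preceding `window` days by more than `threshold` (as a fraction)."""
    flagged = []
    for i in range(window, len(daily_runtimes)):
        baseline = median(daily_runtimes[i - window:i])
        if daily_runtimes[i] > baseline * (1 + threshold):
            flagged.append(i)
    return flagged


# Hypothetical daily runtimes in seconds: stable near 10s, then a step up
# starting at index 8. NOT real dashboard data.
series = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1, 10.0, 13.0, 13.2]
print(flag_regressions(series))
```

Using a trailing median rather than a mean keeps one noisy day from moving the
baseline, at the cost of flagging a sustained step on several consecutive days
until the window catches up.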

So if there is a blocker, it would really be the Spark runner perf changes.
Of course, all of these except Dataflow use local instances, so they may not
be representative of larger-scale behavior, AFAIK.

Kenn

On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels <mx...@apache.org> wrote:

> I've run WordCount using Quickstart with the FlinkRunner (locally and
> against a Flink cluster).
>
> Would give a +1 but waiting what Kenn finds.
>
> -Max
>
> On 23.10.18 07:11, Ahmet Altay wrote:
> >
> >
> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles <kenn@apache.org
> > <ma...@apache.org>> wrote:
> >
> >     You two did so much verification I had a hard time finding something
> >     where my help was meaningful! :-)
> >
> >     I did run the Nexmark suite on the DirectRunner against 2.7.0 and
> >     2.8.0 following
> >     https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
> >     <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
> >
> >     It is admittedly a very silly test - the instructions leave
> >     immutability enforcement on, etc. But it does appear that there is a
> >     30% degradation in query 8 and 15% in query 9. These are the pure
> >     Java tests, not the SQL variants. The rest of the queries are close
> >     enough that differences are not meaningful.
> >
> >
> > (It would be a good improvement for us to have alerts on daily
> > benchmarks if we do not have such a concept already.)
> >
> >
> >     I would ask for a little more time to see what is going on here - is
> >     it a real performance issue or an artifact of how the tests are
> >     invoked, or ...?
> >
> >
> > Thank you! Much appreciated. Please let us know when you are done with
> > your investigation.
> >
> >
> >     Kenn
> >
> >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay <altay@google.com
> >     <ma...@google.com>> wrote:
> >
> >         Hi all,
> >
> >         Did you have a chance to review this RC? Between me and Robert
> >         we ran a significant chunk of the validations. Let me know if
> >         you have any questions.
> >
> >         Ahmet
> >
> >         On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay <altay@google.com
> >         <ma...@google.com>> wrote:
> >
> >             Hi everyone,
> >
> >             Please review and vote on the release candidate #1 for the
> >             version 2.8.0, as follows:
> >             [ ] +1, Approve the release
> >             [ ] -1, Do not approve the release (please provide specific
> >             comments)
> >
> >             The complete staging area is available for your review,
> >             which includes:
> >             * JIRA release notes [1],
> >             * the official Apache source release to be deployed to
> >             dist.apache.org <http://dist.apache.org> [2], which is
> >             signed with the key with fingerprint 6096FA00 [3],
> >             * all artifacts to be deployed to the Maven Central
> >             Repository [4],
> >             * source code tag "v2.8.0-RC1" [5],
> >             * website pull request listing the release and publishing
> >             the API reference manual [6].
> >             * Python artifacts are deployed along with the source
> >             release to the dist.apache.org <http://dist.apache.org> [2].
> >             * Validation sheet with a tab for 2.8.0 release to help with
> >             validation [7].
> >
> >             The vote will be open for at least 72 hours. It is adopted
> >             by majority approval, with at least 3 PMC affirmative votes.
> >
> >             Thanks,
> >             Ahmet
> >
> >             [1]
> >             https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
> >             <https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985>
> >             [2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
> >             <https://dist.apache.org/repos/dist/dev/beam/2.8.0>
> >             [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
> >             <https://dist.apache.org/repos/dist/dev/beam/KEYS>
> >             [4]
> >             https://repository.apache.org/content/repositories/orgapachebeam-1049/
> >             <https://repository.apache.org/content/repositories/orgapachebeam-1049/>
> >             [5] https://github.com/apache/beam/tree/v2.8.0-RC1
> >             <https://github.com/apache/beam/tree/v2.8.0-RC1>
> >             [6] https://github.com/apache/beam-site/pull/583
> >             <https://github.com/apache/beam-site/pull/583> and
> >             https://github.com/apache/beam/pull/6745
> >             <https://github.com/apache/beam/pull/6745>
> >             [7]
> >             https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
> >             <https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816>
> >
> >
> >
>

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Maximilian Michels <mx...@apache.org>.
I've run WordCount using Quickstart with the FlinkRunner (locally and 
against a Flink cluster).

Would give a +1, but I'm waiting to see what Kenn finds.

-Max

On 23.10.18 07:11, Ahmet Altay wrote:
> 
> 
> On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles <kenn@apache.org 
> <ma...@apache.org>> wrote:
> 
>     You two did so much verification I had a hard time finding something
>     where my help was meaningful! :-)
> 
>     I did run the Nexmark suite on the DirectRunner against 2.7.0 and
>     2.8.0 following
>     https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
>     <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
> 
>     It is admittedly a very silly test - the instructions leave
>     immutability enforcement on, etc. But it does appear that there is a
>     30% degradation in query 8 and 15% in query 9. These are the pure
>     Java tests, not the SQL variants. The rest of the queries are close
>     enough that differences are not meaningful.
> 
> 
> (It would be a good improvement for us to have alerts on daily 
> benchmarks if we do not have such a concept already.)
> 
> 
>     I would ask for a little more time to see what is going on here - is
>     it a real performance issue or an artifact of how the tests are
>     invoked, or ...?
> 
> 
> Thank you! Much appreciated. Please let us know when you are done with 
> your investigation.
> 
> 
>     Kenn
> 
>     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay <altay@google.com
>     <ma...@google.com>> wrote:
> 
>         Hi all,
> 
>         Did you have a chance to review this RC? Between me and Robert
>         we ran a significant chunk of the validations. Let me know if
>         you have any questions.
> 
>         Ahmet
> 
>         On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay <altay@google.com
>         <ma...@google.com>> wrote:
> 
>             Hi everyone,
> 
>             Please review and vote on the release candidate #1 for the
>             version 2.8.0, as follows:
>             [ ] +1, Approve the release
>             [ ] -1, Do not approve the release (please provide specific
>             comments)
> 
>             The complete staging area is available for your review,
>             which includes:
>             * JIRA release notes [1],
>             * the official Apache source release to be deployed to
>             dist.apache.org <http://dist.apache.org> [2], which is
>             signed with the key with fingerprint 6096FA00 [3],
>             * all artifacts to be deployed to the Maven Central
>             Repository [4],
>             * source code tag "v2.8.0-RC1" [5],
>             * website pull request listing the release and publishing
>             the API reference manual [6].
>             * Python artifacts are deployed along with the source
>             release to the dist.apache.org <http://dist.apache.org> [2].
>             * Validation sheet with a tab for 2.8.0 release to help with
>             validation [7].
> 
>             The vote will be open for at least 72 hours. It is adopted
>             by majority approval, with at least 3 PMC affirmative votes.
> 
>             Thanks,
>             Ahmet
> 
>             [1]
>             https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
>             <https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985>
>             [2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
>             <https://dist.apache.org/repos/dist/dev/beam/2.8.0>
>             [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
>             <https://dist.apache.org/repos/dist/dev/beam/KEYS>
>             [4]
>             https://repository.apache.org/content/repositories/orgapachebeam-1049/
>             <https://repository.apache.org/content/repositories/orgapachebeam-1049/>
>             [5] https://github.com/apache/beam/tree/v2.8.0-RC1
>             <https://github.com/apache/beam/tree/v2.8.0-RC1>
>             [6] https://github.com/apache/beam-site/pull/583
>             <https://github.com/apache/beam-site/pull/583> and
>             https://github.com/apache/beam/pull/6745
>             <https://github.com/apache/beam/pull/6745>
>             [7]
>             https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
>             <https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816>
> 
> 
> 

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Ahmet Altay <al...@google.com>.
On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles <ke...@apache.org> wrote:

> You two did so much verification I had a hard time finding something where
> my help was meaningful! :-)
>
> I did run the Nexmark suite on the DirectRunner against 2.7.0 and 2.8.0
> following
> https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
> .
>
> It is admittedly a very silly test - the instructions leave immutability
> enforcement on, etc. But it does appear that there is a 30% degradation in
> query 8 and 15% in query 9. These are the pure Java tests, not the SQL
> variants. The rest of the queries are close enough that differences are not
> meaningful.
>

(It would be a good improvement for us to have alerts on daily benchmarks
if we do not have such a concept already.)
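
As a rough illustration of the kind of alert meant here: the sketch below
compares the latest daily runtime of one benchmark query against the median
of the previous few runs and flags it when it is slower by more than a
threshold. The function name, the 7-run window, the 15% threshold, and the
sample runtimes are all assumptions made up for this example; none of them
come from Beam's existing tooling or from measurements in this thread.

```python
from statistics import median


def should_alert(history, latest, window=7, threshold_pct=15.0):
    """Return True if `latest` is slower than the recent median by threshold_pct.

    history: earlier daily runtimes in seconds, oldest first (hypothetical data).
    latest:  today's runtime in seconds.
    """
    recent = history[-window:]
    if not recent:
        # No baseline yet, so there is nothing to compare against.
        return False
    baseline = median(recent)
    if baseline <= 0:
        return False
    slowdown_pct = (latest - baseline) / baseline * 100.0
    return slowdown_pct >= threshold_pct


# Made-up daily runtimes for a single query, in seconds.
daily_runtimes = [10.0, 10.2, 9.9, 10.1, 10.0]
print(should_alert(daily_runtimes, 12.5))  # well above the recent median
print(should_alert(daily_runtimes, 10.3))  # within normal day-to-day noise
```

Using a median over a short window rather than the previous day's value is
one way to keep a single noisy run from either triggering or masking an
alert.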


>
> I would ask for a little more time to see what is going on here - is it a
> real performance issue or an artifact of how the tests are invoked, or ...?
>

Thank you! Much appreciated. Please let us know when you are done with your
investigation.


>
> Kenn
>
> On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay <al...@google.com> wrote:
>
>> Hi all,
>>
>> Did you have a chance to review this RC? Between me and Robert we ran a
>> significant chunk of the validations. Let me know if you have any questions.
>>
>> Ahmet
>>
>> On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay <al...@google.com> wrote:
>>
>>> Hi everyone,
>>>
>>> Please review and vote on the release candidate #1 for the version
>>> 2.8.0, as follows:
>>> [ ] +1, Approve the release
>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>
>>> The complete staging area is available for your review, which includes:
>>> * JIRA release notes [1],
>>> * the official Apache source release to be deployed to dist.apache.org
>>> [2], which is signed with the key with fingerprint 6096FA00 [3],
>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>> * source code tag "v2.8.0-RC1" [5],
>>> * website pull request listing the release and publishing the API
>>> reference manual [6].
>>> * Python artifacts are deployed along with the source release to the
>>> dist.apache.org [2].
>>> * Validation sheet with a tab for 2.8.0 release to help with validation
>>> [7].
>>>
>>> The vote will be open for at least 72 hours. It is adopted by majority
>>> approval, with at least 3 PMC affirmative votes.
>>>
>>> Thanks,
>>> Ahmet
>>>
>>> [1]
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
>>> [2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
>>> [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
>>> [4]
>>> https://repository.apache.org/content/repositories/orgapachebeam-1049/
>>> [5] https://github.com/apache/beam/tree/v2.8.0-RC1
>>> [6] https://github.com/apache/beam-site/pull/583 and
>>> https://github.com/apache/beam/pull/6745
>>> [7]
>>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
>>>
>>
>>

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Kenneth Knowles <ke...@apache.org>.
You two did so much verification I had a hard time finding something where
my help was meaningful! :-)

I did run the Nexmark suite on the DirectRunner against 2.7.0 and 2.8.0
following
https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
.

It is admittedly a very silly test - the instructions leave immutability
enforcement on, etc. But it does appear that there is a 30% degradation in
query 8 and 15% in query 9. These are the pure Java tests, not the SQL
variants. The rest of the queries are close enough that differences are not
meaningful.
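
For anyone repeating this comparison, a minimal sketch of the arithmetic
follows: given per-query runtimes from two releases, it computes the relative
slowdown and lists the queries above a chosen threshold. The function names,
the 10% threshold, and every runtime value are hypothetical and chosen only
so that the output mirrors the 30% and 15% figures above; they are not the
actual Nexmark measurements.

```python
def percent_change(baseline, candidate):
    """Relative change of candidate vs. baseline in percent (positive = slower)."""
    if baseline <= 0:
        raise ValueError("baseline runtime must be positive")
    return (candidate - baseline) / baseline * 100.0


def find_regressions(baseline_runs, candidate_runs, threshold_pct=10.0):
    """Return {query: slowdown_pct} for queries slower by at least threshold_pct."""
    regressions = {}
    for query, base in baseline_runs.items():
        if query not in candidate_runs:
            # Skip queries that were not run on the candidate release.
            continue
        pct = percent_change(base, candidate_runs[query])
        if pct >= threshold_pct:
            regressions[query] = round(pct, 1)
    return regressions


# Hypothetical runtimes in seconds, NOT the real numbers from this run.
runtimes_2_7_0 = {"query7": 20.0, "query8": 10.0, "query9": 20.0}
runtimes_2_8_0 = {"query7": 20.4, "query8": 13.0, "query9": 23.0}

print(find_regressions(runtimes_2_7_0, runtimes_2_8_0))
```

A threshold of this kind is what separates "close enough that differences
are not meaningful" from a change worth investigating; the right value
depends on how noisy the runner is.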

I would ask for a little more time to see what is going on here - is it a
real performance issue or an artifact of how the tests are invoked, or ...?

Kenn

On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay <al...@google.com> wrote:

> Hi all,
>
> Did you have a chance to review this RC? Between me and Robert we ran a
> significant chunk of the validations. Let me know if you have any questions.
>
> Ahmet
>
> On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay <al...@google.com> wrote:
>
>> Hi everyone,
>>
>> Please review and vote on the release candidate #1 for the version 2.8.0,
>> as follows:
>> [ ] +1, Approve the release
>> [ ] -1, Do not approve the release (please provide specific comments)
>>
>> The complete staging area is available for your review, which includes:
>> * JIRA release notes [1],
>> * the official Apache source release to be deployed to dist.apache.org
>> [2], which is signed with the key with fingerprint 6096FA00 [3],
>> * all artifacts to be deployed to the Maven Central Repository [4],
>> * source code tag "v2.8.0-RC1" [5],
>> * website pull request listing the release and publishing the API
>> reference manual [6].
>> * Python artifacts are deployed along with the source release to the
>> dist.apache.org [2].
>> * Validation sheet with a tab for 2.8.0 release to help with validation
>> [7].
>>
>> The vote will be open for at least 72 hours. It is adopted by majority
>> approval, with at least 3 PMC affirmative votes.
>>
>> Thanks,
>> Ahmet
>>
>> [1]
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
>> [2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
>> [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
>> [4]
>> https://repository.apache.org/content/repositories/orgapachebeam-1049/
>> [5] https://github.com/apache/beam/tree/v2.8.0-RC1
>> [6] https://github.com/apache/beam-site/pull/583 and
>> https://github.com/apache/beam/pull/6745
>> [7]
>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
>>
>
>

Re: [VOTE] Release 2.8.0, release candidate #1

Posted by Ahmet Altay <al...@google.com>.
Hi all,

Did you have a chance to review this RC? Between me and Robert we ran a
significant chunk of the validations. Let me know if you have any questions.

Ahmet

On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay <al...@google.com> wrote:

> Hi everyone,
>
> Please review and vote on the release candidate #1 for the version 2.8.0,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release to be deployed to dist.apache.org
> [2], which is signed with the key with fingerprint 6096FA00 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "v2.8.0-RC1" [5],
> * website pull request listing the release and publishing the API
> reference manual [6].
> * Python artifacts are deployed along with the source release to the
> dist.apache.org [2].
> * Validation sheet with a tab for 2.8.0 release to help with validation
> [7].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Ahmet
>
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
> [2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
> [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
> [4] https://repository.apache.org/content/repositories/orgapachebeam-1049/
> [5] https://github.com/apache/beam/tree/v2.8.0-RC1
> [6] https://github.com/apache/beam-site/pull/583 and
> https://github.com/apache/beam/pull/6745
> [7]
> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
>