Posted to dev@mxnet.apache.org by "Skalicky, Sam" <ss...@amazon.com.INVALID> on 2020/09/26 06:24:30 UTC

[VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

Dear MXNet community,

This is the vote to release Apache MXNet (incubating) version 1.8.0. Voting will start September 26, 23:59:59 PDT and close on September 29, 23:59:59 PDT.

Link to release notes:
https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes

Link to release candidate:
https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0

Link to source and signatures on apache dist server:
https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/

Please remember to TEST first before voting accordingly:
+1 = approve
+0 = no opinion
-1 = disapprove (provide reason)

Best regards,
Sam Skalicky


Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

Posted by "Skalicky, Sam" <ss...@amazon.com.INVALID>.
Hi MXNet Community,

Quick update on the progress of the fixes for the release:

[1] - Thanks to Leonard for finding the missing PRs that were in v1.7.x but missing in v1.x, and backporting them to v1.8.x
[2] - I've backported the combined commit with the missing PRs to v1.x to make sure this doesn’t happen in a future release
[3] - Fix for the oneDNN/intgemm problem
[4] - Fix for split_and_save 
[5] - Fix for setting attributes in reviewSubgraph

Tentatively we'll shoot for restarting the vote early next week once the remaining PRs with fixes are merged (we're making sure these fixes are backports from v1.x :-D).

Please reply if there are other PRs with fixes that need to be included in the v1.8.0 release.

Thanks and have a great weekend!
Sam

[1] https://github.com/apache/incubator-mxnet/pull/19262
[2] https://github.com/apache/incubator-mxnet/pull/19281 
[3] https://github.com/apache/incubator-mxnet/pull/19251
[4] https://github.com/apache/incubator-mxnet/pull/19267 
[5] https://github.com/apache/incubator-mxnet/pull/19278 

On 9/30/20, 9:53 PM, "Skalicky, Sam" <ss...@amazon.com> wrote:

    Thanks Leonard for picking up this work. Are you planning to open another PR that commits these PRs into v1.x too so this doesn’t happen again (if we ever release a 1.9 version)? 

    Other than these 2 PRs are there any others that are required for the v1.8.0 release?

    https://github.com/apache/incubator-mxnet/pull/19251
    https://github.com/apache/incubator-mxnet/pull/19262

    Sam

    On 9/30/20, 9:03 PM, "Leonard Lausen" <la...@apache.org> wrote:

        Thank you Sam for driving the release!

        I took a quick look at the missing commits from v1.7.x in v1.8.x via the git
        cherry mode and applied them to v1.8.x. Please see
        https://github.com/apache/incubator-mxnet/pull/19262

        The missing code in v1.8.x is substantial (+675 −518) and I thus change my vote
        for the rc0 release to -1.

        I hope we can include checking for missing commits via git cherry mode in the
        release manager process going forward. It just takes a few minutes. If we want
        to streamline the process, we can do so by not squashing commits when
        porting from one branch to another, which reduces false positives in git cherry
        mode (commits detected as missing that were actually ported).

        Best regards
        Leonard

        On Wed, 2020-09-30 at 21:52 +0000, Skalicky, Sam wrote:
        > Hi MXNet Community,
        >
        > Quick summary on the status of the vote:
        >
        > 2  +1
        > 1 -0.9
        >
        > I spoke with Leonard offline, and the problem only impacts the specific
        > instance when running MKLDNN/oneDNN immediately after intgemm. We don’t expect
        > users to fall into this specific edge case, and so far the problem hasn’t been
        > reproduced on 1.8.x (even though it contains the same oneDNN and intgemm
        > components that are in the master branch). He proposed to not postpone the
        > release for this issue, but if other issues arise we should fix this one at
        > the same time.
        >
        > There are also still missing PRs that were in v1.7.x that were never committed
        > to v1.x branch. And so when branching from v1.x to create the v1.8.x branch
        > these PRs do not exist. Unfortunately no one has volunteered to port these to
        > v1.x and v1.8.x branches.
        >
        > I propose extending the vote until Friday October 2, 23:59:59 PDT to conclude
        > the discussion and get the remaining votes necessary.
        >
        > Thanks!
        > Sam
        >
        > On 9/29/20, 12:41 PM, "Skalicky, Sam" <ss...@amazon.com.INVALID> wrote:
        >
        >     There was no response from the community on the discussion thread [1]. So
        > the current state is the same.
        >
        >     [1]
        > https://lists.apache.org/thread.html/r31d491150029734c6041c1ae21929cd667eed27f590262c3f501c6b7%40%3Cdev.mxnet.apache.org%3E
        >
        >     On 9/29/20, 11:36 AM, "Xingjian SHI" <xs...@connect.ust.hk> wrote:
        >
        >         Just one question regarding the 1.8.0.rc0. Are all PRs that are in
        > 1.7.0 included in 1.8.0? For example,
        > https://github.com/apache/incubator-mxnet/pull/18653
        >
        >         Thanks,
        >         Xingjian
        >
        >         On 9/29/20, 10:20 AM, "Leonard Lausen" <la...@apache.org> wrote:
        >
        >             Thank you Aaron for trying the build and pointing out the issues.
        >
        >             On Mon, 2020-09-28 at 18:30 -0700, Aaron Markham wrote:
        >             > 2) Tried just doing a make. This fails because none of the
        > submodules are
        >             > there. [...]
        >
        >             I downloaded the rc from the link shared by Sam [1] and it does
        > include the
        >             submodules. Could you provide more details on your issue?
        >
        >             > Downloaded the tar.gz for the release and looked at the build
        > from
        >             source directions on the website, but these have you use cmake and
        > don't
        >             really tell you what to do...
        >
        >             The docs refer users to version-controlled files, as the build-
        > from-source guide
        >             on the website is shared among all versions, however the actual
        > build steps
        >             differ between versions. I think the best way to improve it
        > is to provide
        >             version-specific build from source instructions via the "version
        > selector"
        >             feature on the get started page. Contributions towards this goal
        > or other
        >             improvements would be great [2].
        >
        >             Thanks
        >             Leonard
        >
        >             [1]:
        > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
        >             [2]: https://github.com/apache/incubator-mxnet/issues/18666
        >
        >
        > On 9/29/20, 10:09 AM, "Leonard Lausen" <la...@apache.org> wrote:
        >
        >     Vote -0.9.
        >
        >     Piotr has clarified that onednn 1.6.3 (included in MXNet 1.8 rc0) wrongly
        >     handles zmm registers. Together with MXNet intgemm feature (also included
        > in 1.8
        >     rc0) this can yield NaN results if onednn gemm is executed some time after
        >     intgemm. [1]
        >
        >     Thanks
        >     Leonard
        >
        >     [1]:
        > https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-700603056
        >             >
        >             >
        >             > On Mon, Sep 28, 2020 at 11:36 AM Skalicky, Sam <
        > sskalic@amazon.com.invalid>
        >             > wrote:
        >             >
        >             > > Thanks for pointing this out Leonard. Has anyone been able to
        > reproduce
        >             > > the problem on 1.8.0.rc0?
        >             > >
        >             > > Either way, I would propose that we continue validating the
        > release as-is
        >             > > and see if we can find any other issues.
        >             > >
        >             > > Sam
        >             > >
        >             > > On 9/28/20, 10:22 AM, "Leonard Lausen" <la...@apache.org>
        > wrote:
        >             > >
        >             > >     Thank you Sam for driving the 1.8 release!
        >             > >
        >             > >     As the included oneDNN package is known to produce nan
        > results on the
        >             > > master
        >             > >     branch [1] and is pending an upstream fix by Intel, I'd
        > suggest to
        >             > > extend the
        >             > >     vote until we have clarity if the bug also affects the 1.8
        > release,
        >             > > given that
        >             > >     oneDNN is enabled in the default configuration [2].
        >             > >
        >             > >     [1]:
        >             > >
        > https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
        >             > >     [2]:
        >             > >
        >             > >
        > https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E
        >             > >
        >             > >     On Sun, 2020-09-27 at 18:28 -0700, sandeep krishnamurthy
        > wrote:
        >             > >     > Sam,
        >             > >     >
        >             > >     > Thank you for driving the v1.8.0 release of MXNet. This
        > is exciting
        >             > > given
        >             > >     > it is coming with CUDA11 and cuDNN8!!
        >             > >     >
        >             > >     > Fixing the release candidate link:
        >             > >     >
        > https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
        >             > >     >
        >             > >     > Best,
        >             > >     > Sandeep
        >             > >     >
        >             > >     >
        >             > >     > On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam
        >             > > <ss...@amazon.com.invalid>
        >             > >     > wrote:
        >             > >     >
        >             > >     > > Dear MXNet community,
        >             > >     > >
        >             > >     > > This is the vote to release Apache MXNet (incubating)
        > version
        >             > > 1.8.0.
        >             > >     > > Voting will start September 26, 23:59:59 PDT and close
        > on
        >             > > September 29,
        >             > >     > > 23:59:59 PDT.
        >             > >     > >
        >             > >     > > Link to release notes:
        >             > >     > >
        >             > >
        > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
        >             > >     > >
        >             > >     > > Link to release candidate:
        >             > >     > >
        > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
        >             > >     > >
        >             > >     > > Link to source and signatures on apache dist server:
        >             > >     > >
        > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
        >             > >     > >
        >             > >     > > Please remember to TEST first before voting
        > accordingly:
        >             > >     > > +1 = approve
        >             > >     > > +0 = no opinion
        >             > >     > > -1 = disapprove (provide reason)
        >             > >     > >
        >             > >     > > Best regards,
        >             > >     > > Sam Skalicky
        >             > >     > >
        >             > >     > >
        >             > >     >
        >             > >     > --
        >             > >     > Sandeep Krishnamurthy
        >             > >
        >             > >
        >             > >
        >
        >
        >
        >




Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

Posted by "Skalicky, Sam" <ss...@amazon.com.INVALID>.
Thanks Leonard for picking up this work. Are you planning to open another PR that commits these PRs into v1.x too so this doesn’t happen again (if we ever release a 1.9 version)? 

Other than these 2 PRs are there any others that are required for the v1.8.0 release?

https://github.com/apache/incubator-mxnet/pull/19251
https://github.com/apache/incubator-mxnet/pull/19262

Sam




Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

Posted by Leonard Lausen <la...@apache.org>.
Thank you Sam for driving the release!

I took a quick look at the missing commits from v1.7.x in v1.8.x via the git
cherry mode and applied them to v1.8.x. Please see 
https://github.com/apache/incubator-mxnet/pull/19262

The missing code in v1.8.x is substantial (+675 −518) and I thus change my vote
for the rc0 release to -1.

I hope we can include checking for missing commits via git cherry mode in the
release manager process going forward. It just takes a few minutes. If we want
to streamline the process, we can do so by not squashing commits when
porting from one branch to another, which reduces false positives in git cherry
mode (commits detected as missing that were actually ported).

Best regards
Leonard
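
The missing-commit check described above can be sketched roughly as follows. This is a minimal illustration, not the release manager's actual tooling: the helper names `parse_cherry` and `commits_missing_from` are hypothetical, and the only thing assumed about git is the documented output of `git cherry -v <upstream> <head>`, where a line starting with "+" is a commit on <head> with no equivalent patch on <upstream> and "-" is one that already has an equivalent.

```python
import subprocess


def parse_cherry(output):
    """Split `git cherry -v` output into (missing, ported) lists.

    Each line looks like "+ <sha> <subject>" or "- <sha> <subject>".
    "+" marks a commit with no equivalent patch on the upstream branch,
    "-" marks one whose patch is already present there.
    """
    missing, ported = [], []
    for line in output.splitlines():
        parts = line.split(maxsplit=2)
        if len(parts) < 2 or parts[0] not in ("+", "-"):
            continue  # skip blank or unexpected lines
        sign, sha = parts[0], parts[1]
        subject = parts[2] if len(parts) > 2 else ""
        (missing if sign == "+" else ported).append((sha, subject))
    return missing, ported


def commits_missing_from(upstream, head, repo="."):
    """Hypothetical helper: run `git cherry -v <upstream> <head>` in `repo`
    and return the commits reported with "+" (not found on upstream)."""
    result = subprocess.run(
        ["git", "cherry", "-v", upstream, head],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return parse_cherry(result.stdout)[0]
```

Inside a clone that has both branches, commits_missing_from("v1.8.x", "v1.7.x") would list the v1.7.x commits that git cherry cannot match on v1.8.x. Because git cherry matches commits by comparing their diffs, a port that squashes several commits into one will typically show the originals as "+", which is the false-positive case mentioned above.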



Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

Posted by "Skalicky, Sam" <ss...@amazon.com.INVALID>.
Hi MXNet Community, 

Quick summary on the status of the vote:

2  +1
1 -0.9

I spoke with Leonard offline. The problem only occurs in the specific case where MKLDNN/oneDNN runs immediately after intgemm. We don’t expect users to hit this edge case, and so far the problem hasn’t been reproduced on 1.8.x (even though it contains the same oneDNN and intgemm components as the master branch). He proposed not postponing the release for this issue, but if other issues arise, we should fix this one at the same time.

There are also still PRs that were merged into v1.7.x but never committed to the v1.x branch, so they were not carried over when the v1.8.x branch was created from v1.x. Unfortunately, no one has volunteered to port them to the v1.x and v1.8.x branches.

I propose extending the vote until Friday, October 2, 23:59:59 PDT to conclude the discussion and get the remaining votes necessary.

Thanks!
Sam

On 9/29/20, 12:41 PM, "Skalicky, Sam" <ss...@amazon.com.INVALID> wrote:

    There was no response from the community on the discussion thread [1]. So the current state is the same.

    [1] https://lists.apache.org/thread.html/r31d491150029734c6041c1ae21929cd667eed27f590262c3f501c6b7%40%3Cdev.mxnet.apache.org%3E

    On 9/29/20, 11:36 AM, "Xingjian SHI" <xs...@connect.ust.hk> wrote:

        Just one question regarding the 1.8.0.rc0. Are all PRs that are in 1.7.0 included in 1.8.0? For example, https://github.com/apache/incubator-mxnet/pull/18653

        Thanks,
        Xingjian

        On 9/29/20, 10:20 AM, "Leonard Lausen" <la...@apache.org> wrote:

            Thank you Aaron for trying the build and pointing out the issues.

            On Mon, 2020-09-28 at 18:30 -0700, Aaron Markham wrote:
            > 2) Tried just doing a make. This fails because none of the submodules are
            > there. [...]

            I downloaded the rc from the link shared by Sam [1] and it does include the
            submodules. Could you provide more details on your issue?

            > Downloaded the tar.gz for the release and looked at the build from
            > source directions on the website, but these have you use cmake and don't
            > really tell you what to do...

            The docs refer users to version-controlled files, as the build-from-source guide
            on the website is shared among all versions, however the actual build steps
            differ between versions. I think the best way to improve it is to provide
            version-specific build from source instructions via the "version selector"
            feature on the get started page. Contributions towards this goal or other
            improvements would be great [2].

            Thanks
            Leonard

            [1]: https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
            [2]: https://github.com/apache/incubator-mxnet/issues/18666


On 9/29/20, 10:09 AM, "Leonard Lausen" <la...@apache.org> wrote:

    Vote -0.9.

    Piotr has clarified that oneDNN 1.6.3 (included in MXNet 1.8 rc0) mishandles
    zmm registers. Together with the MXNet intgemm feature (also included in 1.8
    rc0), this can yield NaN results if oneDNN gemm is executed some time after
    intgemm. [1]

    Thanks
    Leonard

    [1]: https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-700603056
            >
            >
            > On Mon, Sep 28, 2020 at 11:36 AM Skalicky, Sam <ss...@amazon.com.invalid>
            > wrote:
            >
            > > Thanks for pointing this out Leonard. Has anyone been able to reproduce
            > > the problem on 1.8.0.rc0?
            > >
            > > Either way, I would propose that we continue validating the release as-is
            > > and see if we can find any other issues.
            > >
            > > Sam
            > >
            > > On 9/28/20, 10:22 AM, "Leonard Lausen" <la...@apache.org> wrote:
            > >
            > >     Thank you Sam for driving the 1.8 release!
            > >
            > >     As the included oneDNN package is known to produce nan results on the
            > > master
            > >     branch [1] and is pending an upstream fix by Intel, I'd suggest to
            > > extend the
            > >     vote until we have clarity if the bug also affects the 1.8 release,
            > > given that
            > >     oneDNN is enabled in the default configuration [2].
            > >
            > >     [1]:
            > > https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
            > >     [2]:
            > >
            > > https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E
            > >
            > >     On Sun, 2020-09-27 at 18:28 -0700, sandeep krishnamurthy wrote:
            > >     > Sam,
            > >     >
            > >     > Thank you for driving the v1.8.0 release of MXNet. This is exciting
            > > given
            > >     > it is coming with CUDA11 and cuDNN8!!
            > >     >
            > >     > Fixing the release candidate link:
            > >     > https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
            > >     >
            > >     > Best,
            > >     > Sandeep
            > >     >
            > >     >
            > >     > On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam
            > > <ss...@amazon.com.invalid>
            > >     > wrote:
            > >     >
            > >     > > Dear MXNet community,
            > >     > >
            > >     > > This is the vote to release Apache MXNet (incubating) version
            > > 1.8.0.
            > >     > > Voting will start September 26, 23:59:59 PDT and close on
            > > September 29,
            > >     > > 23:59:59 PDT.
            > >     > >
            > >     > > Link to release notes:
            > >     > >
            > > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
            > >     > >
            > >     > > Link to release candidate:
            > >     > > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
            > >     > >
            > >     > > Link to source and signatures on apache dist server:
            > >     > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
            > >     > >
            > >     > > Please remember to TEST first before voting accordingly:
            > >     > > +1 = approve
            > >     > > +0 = no opinion
            > >     > > -1 = disapprove (provide reason)
            > >     > >
            > >     > > Best regards,
            > >     > > Sam Skalicky
            > >     > >
            > >     > >
            > >     >
            > >     > --
            > >     > Sandeep Krishnamurthy
            > >
            > >
            > >





Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

Posted by "Skalicky, Sam" <ss...@amazon.com.INVALID>.
There was no response from the community on the discussion thread [1], so the current state is unchanged.

[1] https://lists.apache.org/thread.html/r31d491150029734c6041c1ae21929cd667eed27f590262c3f501c6b7%40%3Cdev.mxnet.apache.org%3E

On 9/29/20, 11:36 AM, "Xingjian SHI" <xs...@connect.ust.hk> wrote:

    Just one question regarding the 1.8.0.rc0. Are all PRs that are in 1.7.0 included in 1.8.0? For example, https://github.com/apache/incubator-mxnet/pull/18653

    Thanks,
    Xingjian

    On 9/29/20, 10:20 AM, "Leonard Lausen" <la...@apache.org> wrote:

        Thank you Aaron for trying the build and pointing out the issues.

        On Mon, 2020-09-28 at 18:30 -0700, Aaron Markham wrote:
        > 2) Tried just doing a make. This fails because none of the submodules are
        > there. [...]

        I downloaded the rc from the link shared by Sam [1] and it does include the
        submodules. Could you provide more details on your issue?

        > Downloaded the tar.gz for the release and looked at the build from
        > source directions on the website, but these have you use cmake and don't
        > really tell you what to do...

        The docs refer users to version-controlled files, as the build-from-source guide
        on the website is shared among all versions, however the actual build steps
        differ between versions. I think the best way to improve it is to provide
        version-specific build from source instructions via the "version selector"
        feature on the get started page. Contributions towards this goal or other
        improvements would be great [2].

        Thanks
        Leonard

        [1]: https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
        [2]: https://github.com/apache/incubator-mxnet/issues/18666

        >
        >
        > On Mon, Sep 28, 2020 at 11:36 AM Skalicky, Sam <ss...@amazon.com.invalid>
        > wrote:
        >
        > > Thanks for pointing this out Leonard. Has anyone been able to reproduce
        > > the problem on 1.8.0.rc0?
        > >
        > > Either way, I would propose that we continue validating the release as-is
        > > and see if we can find any other issues.
        > >
        > > Sam
        > >
        > > On 9/28/20, 10:22 AM, "Leonard Lausen" <la...@apache.org> wrote:
        > >
        > >     Thank you Sam for driving the 1.8 release!
        > >
        > >     As the included oneDNN package is known to produce nan results on the
        > > master
        > >     branch [1] and is pending an upstream fix by Intel, I'd suggest to
        > > extend the
        > >     vote until we have clarity if the bug also affects the 1.8 release,
        > > given that
        > >     oneDNN is enabled in the default configuration [2].
        > >
        > >     [1]:
        > > https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
        > >     [2]:
        > >
        > > https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E
        > >
        > >     On Sun, 2020-09-27 at 18:28 -0700, sandeep krishnamurthy wrote:
        > >     > Sam,
        > >     >
        > >     > Thank you for driving the v1.8.0 release of MXNet. This is exciting
        > > given
        > >     > it is coming with CUDA11 and cuDNN8!!
        > >     >
        > >     > Fixing the release candidate link:
        > >     > https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
        > >     >
        > >     > Best,
        > >     > Sandeep
        > >     >
        > >     >
        > >     > On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam
        > > <ss...@amazon.com.invalid>
        > >     > wrote:
        > >     >
        > >     > > Dear MXNet community,
        > >     > >
        > >     > > This is the vote to release Apache MXNet (incubating) version
        > > 1.8.0.
        > >     > > Voting will start September 26, 23:59:59 PDT and close on
        > > September 29,
        > >     > > 23:59:59 PDT.
        > >     > >
        > >     > > Link to release notes:
        > >     > >
        > > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
        > >     > >
        > >     > > Link to release candidate:
        > >     > > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
        > >     > >
        > >     > > Link to source and signatures on apache dist server:
        > >     > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
        > >     > >
        > >     > > Please remember to TEST first before voting accordingly:
        > >     > > +1 = approve
        > >     > > +0 = no opinion
        > >     > > -1 = disapprove (provide reason)
        > >     > >
        > >     > > Best regards,
        > >     > > Sam Skalicky
        > >     > >
        > >     > >
        > >     >
        > >     > --
        > >     > Sandeep Krishnamurthy
        > >
        > >
        > >




Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

Posted by Xingjian SHI <xs...@connect.ust.hk>.
Just one question regarding the 1.8.0.rc0. Are all PRs that are in 1.7.0 included in 1.8.0? For example, https://github.com/apache/incubator-mxnet/pull/18653

Thanks,
Xingjian

On 9/29/20, 10:20 AM, "Leonard Lausen" <la...@apache.org> wrote:

    Thank you Aaron for trying the build and pointing out the issues.

    On Mon, 2020-09-28 at 18:30 -0700, Aaron Markham wrote:
    > 2) Tried just doing a make. This fails because none of the submodules are
    > there. [...]

    I downloaded the rc from the link shared by Sam [1] and it does include the
    submodules. Could you provide more details on your issue?

    > Downloaded the tar.gz for the release and looked at the build from
    > source directions on the website, but these have you use cmake and don't
    > really tell you what to do...

    The docs refer users to version-controlled files, as the build-from-source guide
    on the website is shared among all versions, however the actual build steps
    differ between versions. I think the best way to improve it is to provide
    version-specific build from source instructions via the "version selector"
    feature on the get started page. Contributions towards this goal or other
    improvements would be great [2].

    Thanks
    Leonard

    [1]: https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
    [2]: https://github.com/apache/incubator-mxnet/issues/18666

    > 
    > 
    > On Mon, Sep 28, 2020 at 11:36 AM Skalicky, Sam <ss...@amazon.com.invalid>
    > wrote:
    > 
    > > Thanks for pointing this out Leonard. Has anyone been able to reproduce
    > > the problem on 1.8.0.rc0?
    > > 
    > > Either way, I would propose that we continue validating the release as-is
    > > and see if we can find any other issues.
    > > 
    > > Sam
    > > 
    > > On 9/28/20, 10:22 AM, "Leonard Lausen" <la...@apache.org> wrote:
    > > 
    > >     Thank you Sam for driving the 1.8 release!
    > > 
    > >     As the included oneDNN package is known to produce nan results on the
    > > master
    > >     branch [1] and is pending an upstream fix by Intel, I'd suggest to
    > > extend the
    > >     vote until we have clarity if the bug also affects the 1.8 release,
    > > given that
    > >     oneDNN is enabled in the default configuration [2].
    > > 
    > >     [1]:
    > > https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
    > >     [2]:
    > > 
    > > https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E
    > > 
    > >     On Sun, 2020-09-27 at 18:28 -0700, sandeep krishnamurthy wrote:
    > >     > Sam,
    > >     >
    > >     > Thank you for driving the v1.8.0 release of MXNet. This is exciting
    > > given
    > >     > it is coming with CUDA11 and cuDNN8!!
    > >     >
    > >     > Fixing the release candidate link:
    > >     > https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
    > >     >
    > >     > Best,
    > >     > Sandeep
    > >     >
    > >     >
    > >     > On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam
    > > <ss...@amazon.com.invalid>
    > >     > wrote:
    > >     >
    > >     > > Dear MXNet community,
    > >     > >
    > >     > > This is the vote to release Apache MXNet (incubating) version
    > > 1.8.0.
    > >     > > Voting will start September 26, 23:59:59 PDT and close on
    > > September 29,
    > >     > > 23:59:59 PDT.
    > >     > >
    > >     > > Link to release notes:
    > >     > >
    > > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
    > >     > >
    > >     > > Link to release candidate:
    > >     > > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
    > >     > >
    > >     > > Link to source and signatures on apache dist server:
    > >     > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
    > >     > >
    > >     > > Please remember to TEST first before voting accordingly:
    > >     > > +1 = approve
    > >     > > +0 = no opinion
    > >     > > -1 = disapprove (provide reason)
    > >     > >
    > >     > > Best regards,
    > >     > > Sam Skalicky
    > >     > >
    > >     > >
    > >     >
    > >     > --
    > >     > Sandeep Krishnamurthy
    > > 
    > > 
    > > 



Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

Posted by Leonard Lausen <la...@apache.org>.
Thank you Aaron for trying the build and pointing out the issues.

On Mon, 2020-09-28 at 18:30 -0700, Aaron Markham wrote:
> 2) Tried just doing a make. This fails because none of the submodules are
> there. [...]

I downloaded the rc from the link shared by Sam [1] and it does include the
submodules. Could you provide more details on your issue?

> Downloaded the tar.gz for the release and looked at the build from
> source directions on the website, but these have you use cmake and don't
> really tell you what to do...

The docs refer users to version-controlled files because the build-from-source
guide on the website is shared among all versions, while the actual build steps
differ between versions. I think the best way to improve this is to provide
version-specific build-from-source instructions via the "version selector"
feature on the get started page. Contributions towards this goal or other
improvements would be great [2].

Thanks
Leonard

[1]: https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
[2]: https://github.com/apache/incubator-mxnet/issues/18666

> 
> 
> On Mon, Sep 28, 2020 at 11:36 AM Skalicky, Sam <ss...@amazon.com.invalid>
> wrote:
> 
> > Thanks for pointing this out Leonard. Has anyone been able to reproduce
> > the problem on 1.8.0.rc0?
> > 
> > Either way, I would propose that we continue validating the release as-is
> > and see if we can find any other issues.
> > 
> > Sam
> > 
> > On 9/28/20, 10:22 AM, "Leonard Lausen" <la...@apache.org> wrote:
> > 
> >     Thank you Sam for driving the 1.8 release!
> > 
> >     As the included oneDNN package is known to produce nan results on the
> > master
> >     branch [1] and is pending an upstream fix by Intel, I'd suggest to
> > extend the
> >     vote until we have clarity if the bug also affects the 1.8 release,
> > given that
> >     oneDNN is enabled in the default configuration [2].
> > 
> >     [1]:
> > https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
> >     [2]:
> > 
> > https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E
> > 
> >     On Sun, 2020-09-27 at 18:28 -0700, sandeep krishnamurthy wrote:
> >     > Sam,
> >     >
> >     > Thank you for driving the v1.8.0 release of MXNet. This is exciting
> > given
> >     > it is coming with CUDA11 and cuDNN8!!
> >     >
> >     > Fixing the release candidate link:
> >     > https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
> >     >
> >     > Best,
> >     > Sandeep
> >     >
> >     >
> >     > On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam
> > <ss...@amazon.com.invalid>
> >     > wrote:
> >     >
> >     > > Dear MXNet community,
> >     > >
> >     > > This is the vote to release Apache MXNet (incubating) version
> > 1.8.0.
> >     > > Voting will start September 26, 23:59:59 PDT and close on
> > September 29,
> >     > > 23:59:59 PDT.
> >     > >
> >     > > Link to release notes:
> >     > >
> > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
> >     > >
> >     > > Link to release candidate:
> >     > > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
> >     > >
> >     > > Link to source and signatures on apache dist server:
> >     > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
> >     > >
> >     > > Please remember to TEST first before voting accordingly:
> >     > > +1 = approve
> >     > > +0 = no opinion
> >     > > -1 = disapprove (provide reason)
> >     > >
> >     > > Best regards,
> >     > > Sam Skalicky
> >     > >
> >     > >
> >     >
> >     > --
> >     > Sandeep Krishnamurthy
> > 
> > 
> > 


Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

Posted by Qing Lan <la...@live.com>.
+1 (binding)

Built from source from the latest rc0 tag on Mac. I was able to build the whole Scala package, and all tests passed.

Thanks,
Qing
________________________________
From: Manu Seth <ma...@gmail.com>
Sent: Monday, September 28, 2020 23:55
To: dev@mxnet.apache.org <de...@mxnet.apache.org>
Subject: Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

+1

Was able to build from source by following the instructions at [1] and then
the comments in the config_gpu.cmake file. I agree with Aaron that the
instructions should all be in one place. Users should not have to go inside
cmake config files and follow the comments there.

I built MXNet on an Ubuntu 18.04 Deep Learning Base AMI with CUDA v11.0
and cuDNN v8.0.2, and tested it by running all non-operator tests in the
tests/python/gpu/ folder.

[1]
https://github.com/apache/incubator-mxnet/blob/1.8.0.rc0/docs/static_site/src/pages/get_started/build_from_source.md#building-mxnet

Manu

On Mon, Sep 28, 2020 at 6:30 PM Aaron Markham <aa...@gmail.com>
wrote:

> Couple of issues with instructions:
> 1) Downloaded the tar.gz for the release and looked at the build from
> source directions on the website, but these have you use cmake and don't
> really tell you what to do... just look at the cmake config files. I mean
> sure, I guess I can look inside a config file's comments for build
> instructions. But these don't even work. (Could be related to #2, but IDK
> since I haven't really tried using the cmake route as it used to be
> incompatible with the docs/website builds.)
> 2) Tried just doing a make. This fails because none of the submodules are
> there. So where are the instructions for how to use an official
> distribution release now that so much stuff has been removed?
>
>
>
> On Mon, Sep 28, 2020 at 11:36 AM Skalicky, Sam <sskalic@amazon.com.invalid>
> wrote:
>
> > Thanks for pointing this out Leonard. Has anyone been able to reproduce
> > the problem on 1.8.0.rc0?
> >
> > Either way, I would propose that we continue validating the release as-is
> > and see if we can find any other issues.
> >
> > Sam
> >
> > On 9/28/20, 10:22 AM, "Leonard Lausen" <la...@apache.org> wrote:
> >
> >     Thank you Sam for driving the 1.8 release!
> >
> >     As the included oneDNN package is known to produce nan results on the
> > master
> >     branch [1] and is pending an upstream fix by Intel, I'd suggest to
> > extend the
> >     vote until we have clarity if the bug also affects the 1.8 release,
> > given that
> >     oneDNN is enabled in the default configuration [2].
> >
> >     [1]:
> > https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
> >     [2]:
> >
> > https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E
> >
> >     On Sun, 2020-09-27 at 18:28 -0700, sandeep krishnamurthy wrote:
> >     > Sam,
> >     >
> >     > Thank you for driving the v1.8.0 release of MXNet. This is exciting
> > given
> >     > it is coming with CUDA11 and cuDNN8!!
> >     >
> >     > Fixing the release candidate link:
> >     > https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
> >     >
> >     > Best,
> >     > Sandeep
> >     >
> >     >
> >     > On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam
> > <ss...@amazon.com.invalid>
> >     > wrote:
> >     >
> >     > > Dear MXNet community,
> >     > >
> >     > > This is the vote to release Apache MXNet (incubating) version
> > 1.8.0.
> >     > > Voting will start September 26, 23:59:59 PDT and close on
> > September 29,
> >     > > 23:59:59 PDT.
> >     > >
> >     > > Link to release notes:
> >     > >
> > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
> >     > >
> >     > > Link to release candidate:
> >     > > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
> >     > >
> >     > > Link to source and signatures on apache dist server:
> >     > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
> >     > >
> >     > > Please remember to TEST first before voting accordingly:
> >     > > +1 = approve
> >     > > +0 = no opinion
> >     > > -1 = disapprove (provide reason)
> >     > >
> >     > > Best regards,
> >     > > Sam Skalicky
> >     > >
> >     > >
> >     >
> >     > --
> >     > Sandeep Krishnamurthy
> >
> >
> >
>

Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

Posted by Manu Seth <ma...@gmail.com>.
+1

Was able to build from source by following the instructions at [1] and then
the comments in the config_gpu.cmake file. I agree with Aaron that the
instructions should all be in one place. Users should not have to go inside
cmake config files and follow the comments there.

I built MXNet on an Ubuntu 18.04 Deep Learning Base AMI with CUDA v11.0
and cuDNN v8.0.2, and tested it by running all non-operator tests in the
tests/python/gpu/ folder.

[1]
https://github.com/apache/incubator-mxnet/blob/1.8.0.rc0/docs/static_site/src/pages/get_started/build_from_source.md#building-mxnet

Manu

On Mon, Sep 28, 2020 at 6:30 PM Aaron Markham <aa...@gmail.com>
wrote:

> Couple of issues with instructions:
> 1) Downloaded the tar.gz for the release and looked at the build from
> source directions on the website, but these have you use cmake and don't
> really tell you what to do... just look at the cmake config files. I mean
> sure, I guess I can look inside a config file's comments for build
> instructions. But these don't even work. (Could be related to #2, but IDK
> since I haven't really tried using the cmake route as it used to be
> incompatible with the docs/website builds.)
> 2) Tried just doing a make. This fails because none of the submodules are
> there. So where are the instructions for how to use an official
> distribution release now that so much stuff has been removed?
>
>
>
> On Mon, Sep 28, 2020 at 11:36 AM Skalicky, Sam <sskalic@amazon.com.invalid>
> wrote:
>
> > Thanks for pointing this out Leonard. Has anyone been able to reproduce
> > the problem on 1.8.0.rc0?
> >
> > Either way, I would propose that we continue validating the release as-is
> > and see if we can find any other issues.
> >
> > Sam
> >
> > On 9/28/20, 10:22 AM, "Leonard Lausen" <la...@apache.org> wrote:
> >
> >     Thank you Sam for driving the 1.8 release!
> >
> >     As the included oneDNN package is known to produce nan results on the
> > master
> >     branch [1] and is pending an upstream fix by Intel, I'd suggest to
> > extend the
> >     vote until we have clarity if the bug also affects the 1.8 release,
> > given that
> >     oneDNN is enabled in the default configuration [2].
> >
> >     [1]:
> > https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
> >     [2]:
> >
> > https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E
> >
> >     On Sun, 2020-09-27 at 18:28 -0700, sandeep krishnamurthy wrote:
> >     > Sam,
> >     >
> >     > Thank you for driving the v1.8.0 release of MXNet. This is exciting
> > given
> >     > it is coming with CUDA11 and cuDNN8!!
> >     >
> >     > Fixing the release candidate link:
> >     > https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
> >     >
> >     > Best,
> >     > Sandeep
> >     >
> >     >
> >     > On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam
> > <ss...@amazon.com.invalid>
> >     > wrote:
> >     >
> >     > > Dear MXNet community,
> >     > >
> >     > > This is the vote to release Apache MXNet (incubating) version
> > 1.8.0.
> >     > > Voting will start September 26, 23:59:59 PDT and close on
> > September 29,
> >     > > 23:59:59 PDT.
> >     > >
> >     > > Link to release notes:
> >     > >
> > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
> >     > >
> >     > > Link to release candidate:
> >     > > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
> >     > >
> >     > > Link to source and signatures on apache dist server:
> >     > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
> >     > >
> >     > > Please remember to TEST first before voting accordingly:
> >     > > +1 = approve
> >     > > +0 = no opinion
> >     > > -1 = disapprove (provide reason)
> >     > >
> >     > > Best regards,
> >     > > Sam Skalicky
> >     > >
> >     > >
> >     >
> >     > --
> >     > Sandeep Krishnamurthy
> >
> >
> >
>

Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

Posted by Aaron Markham <aa...@gmail.com>.
Couple of issues with instructions:
1) Downloaded the tar.gz for the release and looked at the build from
source directions on the website, but these have you use cmake and don't
really tell you what to do... just look at the cmake config files. I mean
sure, I guess I can look inside a config file's comments for build
instructions. But these don't even work. (Could be related to #2, but IDK
since I haven't really tried using the cmake route as it used to be
incompatible with the docs/website builds.)
2) Tried just doing a make. This fails because none of the submodules are
there. So where are the instructions for how to use an official
distribution release now that so much stuff has been removed?
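One way to confirm the second point before running make is to check whether the submodule directories in the extracted tarball are actually populated. A minimal sketch, assuming the submodules live under a `3rdparty/` directory of the source tree (an assumption, not something stated above); `find_empty_dirs` is a hypothetical helper:

```python
import os
import sys


def find_empty_dirs(root):
    """Return immediate subdirectories of `root` that contain no entries.

    An empty directory where a git submodule is expected suggests the
    submodule sources were not shipped or not checked out.
    """
    empty = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if os.path.isdir(path) and not os.listdir(path):
            empty.append(name)
    return empty


if __name__ == "__main__":
    # "3rdparty" is an assumed location; pass the real path as an argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "3rdparty"
    if not os.path.isdir(root):
        print(f"{root}: not found")
    else:
        missing = find_empty_dirs(root)
        print("empty:", ", ".join(missing) if missing else "none")
```

If the script reports empty directories, a plain make would be expected to fail on the corresponding dependencies, which matches the behaviour described above.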



On Mon, Sep 28, 2020 at 11:36 AM Skalicky, Sam <ss...@amazon.com.invalid>
wrote:

> Thanks for pointing this out Leonard. Has anyone been able to reproduce
> the problem on 1.8.0.rc0?
>
> Either way, I would propose that we continue validating the release as-is
> and see if we can find any other issues.
>
> Sam
>
> On 9/28/20, 10:22 AM, "Leonard Lausen" <la...@apache.org> wrote:
>
>     Thank you Sam for driving the 1.8 release!
>
>     As the included oneDNN package is known to produce NaN results on the master
>     branch [1] and is pending an upstream fix by Intel, I'd suggest extending the
>     vote until we have clarity on whether the bug also affects the 1.8 release,
>     given that oneDNN is enabled in the default configuration [2].
>
>     [1]: https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
>     [2]: https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E
>
>     On Sun, 2020-09-27 at 18:28 -0700, sandeep krishnamurthy wrote:
>     > Sam,
>     >
>     > Thank you for driving the v1.8.0 release of MXNet. This is exciting given
>     > it is coming with CUDA11 and cuDNN8!!
>     >
>     > Fixing the release candidate link:
>     > https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
>     >
>     > Best,
>     > Sandeep
>     >
>     >
>     > On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam <ss...@amazon.com.invalid>
>     > wrote:
>     >
>     > > Dear MXNet community,
>     > >
>     > > This is the vote to release Apache MXNet (incubating) version 1.8.0.
>     > > Voting will start September 26, 23:59:59 PDT and close on September 29,
>     > > 23:59:59 PDT.
>     > >
>     > > Link to release notes:
>     > > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
>     > >
>     > > Link to release candidate:
>     > > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
>     > >
>     > > Link to source and signatures on apache dist server:
>     > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
>     > >
>     > > Please remember to TEST first before voting accordingly:
>     > > +1 = approve
>     > > +0 = no opinion
>     > > -1 = disapprove (provide reason)
>     > >
>     > > Best regards,
>     > > Sam Skalicky
>     > >
>     > >
>     >
>     > --
>     > Sandeep Krishnamurthy
>
>
>

Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

Posted by Leonard Lausen <la...@apache.org>.
Vote -0.9.

Piotr has clarified that oneDNN 1.6.3 (included in MXNet 1.8 rc0) wrongly
handles zmm registers. Together with the MXNet intgemm feature (also included
in 1.8 rc0), this can yield NaN results if a oneDNN gemm is executed some time
after intgemm. [1]

Thanks
Leonard

[1]: https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-700603056
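
The failure mode described above is order-dependent, so a reproduction has to run the two operations in sequence and inspect the second result. A minimal sketch of that check, using only the Python standard library; `run_first` and `run_second` are hypothetical stand-ins for an intgemm call and a oneDNN gemm call, and no MXNet API is assumed:

```python
import math


def contains_nan(values):
    """Return True if any float in a (possibly nested) list is NaN."""
    for v in values:
        if isinstance(v, (list, tuple)):
            if contains_nan(v):
                return True
        elif isinstance(v, float) and math.isnan(v):
            return True
    return False


def nan_after_sequence(run_first, run_second):
    """Run `run_first`, then `run_second`, and report NaNs in the second result.

    Both arguments are placeholders: zero-argument callables returning
    nested lists of floats.
    """
    run_first()
    return contains_nan(run_second())
```

Running the second operation on its own and comparing the outcome with `nan_after_sequence` would show whether the NaNs depend on the ordering.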

On Mon, 2020-09-28 at 18:35 +0000, Skalicky, Sam wrote:
> Thanks for pointing this out Leonard. Has anyone been able to reproduce the
> problem on 1.8.0.rc0? 
> 
> Either way, I would propose that we continue validating the release as-is and
> see if we can find any other issues. 
> 
> Sam
> 
> On 9/28/20, 10:22 AM, "Leonard Lausen" <la...@apache.org> wrote:
> 
>     Thank you Sam for driving the 1.8 release!
> 
>     As the included oneDNN package is known to produce NaN results on the master
>     branch [1] and is pending an upstream fix by Intel, I'd suggest extending the
>     vote until we have clarity on whether the bug also affects the 1.8 release,
>     given that oneDNN is enabled in the default configuration [2].
> 
>     [1]: https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
>     [2]: https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E
> 
>     On Sun, 2020-09-27 at 18:28 -0700, sandeep krishnamurthy wrote:
>     > Sam,
>     >
>     > Thank you for driving the v1.8.0 release of MXNet. This is exciting given
>     > it is coming with CUDA11 and cuDNN8!!
>     >
>     > Fixing the release candidate link:
>     > https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
>     >
>     > Best,
>     > Sandeep
>     >
>     >
>     > On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam <
> sskalic@amazon.com.invalid>
>     > wrote:
>     >
>     > > Dear MXNet community,
>     > >
>     > > This is the vote to release Apache MXNet (incubating) version 1.8.0.
>     > > Voting will start September 26, 23:59:59 PDT and close on September 29,
>     > > 23:59:59 PDT.
>     > >
>     > > Link to release notes:
>     > > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
>     > >
>     > > Link to release candidate:
>     > > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
>     > >
>     > > Link to source and signatures on apache dist server:
>     > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
>     > >
>     > > Please remember to TEST first before voting accordingly:
>     > > +1 = approve
>     > > +0 = no opinion
>     > > -1 = disapprove (provide reason)
>     > >
>     > > Best regards,
>     > > Sam Skalicky
>     > >
>     > >
>     >
>     > --
>     > Sandeep Krishnamurthy
> 
> 


Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

Posted by "Skalicky, Sam" <ss...@amazon.com.INVALID>.
Thanks for pointing this out Leonard. Has anyone been able to reproduce the problem on 1.8.0.rc0? 

Either way, I would propose that we continue validating the release as-is and see if we can find any other issues.

Sam

On 9/28/20, 10:22 AM, "Leonard Lausen" <la...@apache.org> wrote:

    Thank you Sam for driving the 1.8 release!

    As the included oneDNN package is known to produce NaN results on the master
    branch [1] and is pending an upstream fix by Intel, I'd suggest extending the
    vote until we have clarity on whether the bug also affects the 1.8 release,
    given that oneDNN is enabled in the default configuration [2].

    [1]: https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
    [2]: https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E

    On Sun, 2020-09-27 at 18:28 -0700, sandeep krishnamurthy wrote:
    > Sam,
    >
    > Thank you for driving the v1.8.0 release of MXNet. This is exciting given
    > it is coming with CUDA11 and cuDNN8!!
    >
    > Fixing the release candidate link:
    > https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
    >
    > Best,
    > Sandeep
    >
    >
    > On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam <ss...@amazon.com.invalid>
    > wrote:
    >
    > > Dear MXNet community,
    > >
    > > This is the vote to release Apache MXNet (incubating) version 1.8.0.
    > > Voting will start September 26, 23:59:59 PDT and close on September 29,
    > > 23:59:59 PDT.
    > >
    > > Link to release notes:
    > > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
    > >
    > > Link to release candidate:
    > > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
    > >
    > > Link to source and signatures on apache dist server:
    > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
    > >
    > > Please remember to TEST first before voting accordingly:
    > > +1 = approve
    > > +0 = no opinion
    > > -1 = disapprove (provide reason)
    > >
    > > Best regards,
    > > Sam Skalicky
    > >
    > >
    >
    > --
    > Sandeep Krishnamurthy



Re: [EXTERNAL] [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

Posted by Leonard Lausen <la...@apache.org>.
Thank you Sam for driving the 1.8 release!

As the included oneDNN package is known to produce NaN results on the master
branch [1] and is pending an upstream fix by Intel, I'd suggest extending the
vote until we have clarity on whether the bug also affects the 1.8 release,
given that oneDNN is enabled in the default configuration [2].

[1]: https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
[2]: https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E

On Sun, 2020-09-27 at 18:28 -0700, sandeep krishnamurthy wrote:
> Sam,
> 
> Thank you for driving the v1.8.0 release of MXNet. This is exciting given
> it is coming with CUDA11 and cuDNN8!!
> 
> Fixing the release candidate link:
> https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
> 
> Best,
> Sandeep
> 
> 
> On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam <ss...@amazon.com.invalid>
> wrote:
> 
> > Dear MXNet community,
> > 
> > This is the vote to release Apache MXNet (incubating) version 1.8.0.
> > Voting will start September 26, 23:59:59 PDT and close on September 29,
> > 23:59:59 PDT.
> > 
> > Link to release notes:
> > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
> > 
> > Link to release candidate:
> > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
> > 
> > Link to source and signatures on apache dist server:
> > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
> > 
> > Please remember to TEST first before voting accordingly:
> > +1 = approve
> > +0 = no opinion
> > -1 = disapprove (provide reason)
> > 
> > Best regards,
> > Sam Skalicky
> > 
> > 
> 
> --
> Sandeep Krishnamurthy


Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0

Posted by sandeep krishnamurthy <sa...@gmail.com>.
Sam,

Thank you for driving the v1.8.0 release of MXNet. This is exciting given
it is coming with CUDA11 and cuDNN8!!

Fixing the release candidate link:
https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0

Best,
Sandeep


On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam <ss...@amazon.com.invalid>
wrote:

> Dear MXNet community,
>
> This is the vote to release Apache MXNet (incubating) version 1.8.0.
> Voting will start September 26, 23:59:59 PDT and close on September 29,
> 23:59:59 PDT.
>
> Link to release notes:
> https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
>
> Link to release candidate:
> https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
>
> Link to source and signatures on apache dist server:
> https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
>
> Please remember to TEST first before voting accordingly:
> +1 = approve
> +0 = no opinion
> -1 = disapprove (provide reason)
>
> Best regards,
> Sam Skalicky
>
>

-- 
Sandeep Krishnamurthy