Posted to dev@mxnet.apache.org by "Skalicky, Sam" <ss...@amazon.com.INVALID> on 2020/09/26 06:24:30 UTC
[VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Dear MXNet community,
This is the vote to release Apache MXNet (incubating) version 1.8.0. Voting will start September 26, 23:59:59 PDT and close on September 29, 23:59:59 PDT.
Link to release notes:
https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
Link to release candidate:
https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
Link to source and signatures on apache dist server:
https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
Please remember to TEST first before voting accordingly:
+1 = approve
+0 = no opinion
-1 = disapprove (provide reason)
Best regards,
Sam Skalicky
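One concrete way to act on the request to test before voting is to compare a downloaded source archive against its published SHA-512 checksum. The Python sketch below is only an illustration: the helper names `sha512_of` and `checksum_matches` are hypothetical, and it assumes the checksum file follows the `sha512sum` layout (hex digest first, then the file name). It does not verify the GPG signature, which is a separate step.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact, checksum_file):
    """Compare an artifact against a checksum file.

    Assumption: the first whitespace-separated token in the checksum
    file is the expected hex digest (the `sha512sum` output layout).
    """
    expected = Path(checksum_file).read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

A call such as `checksum_matches("mxnet-src.tar.gz", "mxnet-src.tar.gz.sha512")` would return `True` on a match; both file names here are placeholders for whatever the dist directory actually contains.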
Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Posted by "Skalicky, Sam" <ss...@amazon.com.INVALID>.
Hi MXNet Community,
Quick update on the progress of the fixes for the release:
[1] - Thanks to Leonard for finding the missing PRs that were in v1.7.x but missing in v1.x, and backporting them to v1.8.x
[2] - I've backported the combined commit with the missing PRs to v1.x to make sure this doesn’t happen in a future release
[3] - Fix for the oneDNN/intgemm problem
[4] - Fix for split_and_save
[5] - Fix for setting attributes in reviewSubgraph
Tentatively we'll shoot for restarting the vote early next week once the remaining PRs with fixes are merged (we're making sure these fixes are backports from v1.x :-D).
Please reply if there are other PRs with fixes that need to be included in the v1.8.0 release.
Thanks and have a great weekend!
Sam
[1] https://github.com/apache/incubator-mxnet/pull/19262
[2] https://github.com/apache/incubator-mxnet/pull/19281
[3] https://github.com/apache/incubator-mxnet/pull/19251
[4] https://github.com/apache/incubator-mxnet/pull/19267
[5] https://github.com/apache/incubator-mxnet/pull/19278
On 9/30/20, 9:53 PM, "Skalicky, Sam" <ss...@amazon.com> wrote:
Thanks Leonard for picking up this work. Are you planning to open another PR that commits these PRs into v1.x too so this doesn’t happen again (if we ever release a 1.9 version)?
Other than these 2 PRs, are there any others that are required for the v1.8.0 release?
https://github.com/apache/incubator-mxnet/pull/19251
https://github.com/apache/incubator-mxnet/pull/19262
Sam
On 9/30/20, 9:03 PM, "Leonard Lausen" <la...@apache.org> wrote:
Thank you Sam for driving the release!
I took a quick look at the missing commits from v1.7.x in v1.8.x via the git
cherry mode and applied them to v1.8.x. Please see
https://github.com/apache/incubator-mxnet/pull/19262
The missing code in v1.8.x is substantial (+675 −518) and I thus change my vote
for the rc0 release to -1.
I hope we can include checking for missing commits via git cherry mode in the
release manager process going forward. It just takes a few minutes. If we want
to streamline the process, we can do so by not squashing commits when porting
from one branch to another, which reduces false positives in git cherry mode
(commits detected as missing that were actually ported).
Best regards
Leonard
On Wed, 2020-09-30 at 21:52 +0000, Skalicky, Sam wrote:
> Hi MXNet Community,
>
> Quick summary on the status of the vote:
>
> 2 +1
> 1 -0.9
>
> I spoke with Leonard offline, and the problem only impacts the specific
> instance when running MKLDNN/oneDNN immediately after intgemm. We don’t expect
> users to fall into this specific edge case, and so far the problem hasn’t been
> reproduced on 1.8.x (even though it contains the same oneDNN and intgemm
> components that are in the master branch). He proposed to not postpone the
> release for this issue, but if other issues arise we should fix this one at
> the same time.
>
> There are also still PRs from v1.7.x that were never committed to the v1.x
> branch, so the v1.8.x branch, which was created from v1.x, does not contain
> them. Unfortunately, no one has volunteered to port these to the v1.x and
> v1.8.x branches.
>
> I propose extending the vote until Friday October 2, 23:59:59 PDT to conclude
> the discussion and get the remaining votes necessary.
>
> Thanks!
> Sam
Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Posted by "Skalicky, Sam" <ss...@amazon.com.INVALID>.
Thanks Leonard for picking up this work. Are you planning to open another PR that commits these PRs into v1.x too so this doesn’t happen again (if we ever release a 1.9 version)?
Other than these 2 PRs, are there any others that are required for the v1.8.0 release?
https://github.com/apache/incubator-mxnet/pull/19251
https://github.com/apache/incubator-mxnet/pull/19262
Sam
Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Posted by Leonard Lausen <la...@apache.org>.
Thank you Sam for driving the release!
I took a quick look at the missing commits from v1.7.x in v1.8.x via the git
cherry mode and applied them to v1.8.x. Please see
https://github.com/apache/incubator-mxnet/pull/19262
The missing code in v1.8.x is substantial (+675 −518) and I thus change my vote
for the rc0 release to -1.
I hope we can include checking for missing commits via git cherry mode in the
release manager process going forward. It just takes a few minutes. If we want
to streamline the process, we can do so by not squashing commits when porting
from one branch to another, which reduces false positives in git cherry mode
(commits detected as missing that were actually ported).
Best regards
Leonard
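The check described above can be sketched in Python. This is a rough illustration, not the release manager's actual tooling: it assumes that "git cherry mode" refers to the `git cherry` command (the `--cherry` option of `git log` would be an alternative), and the helper names `parse_git_cherry` and `missing_commits` are made up for this example. `git cherry -v <upstream> <head>` prefixes each commit on `<head>` with `+` when no equivalent patch exists on `<upstream>` and with `-` when one does.

```python
import subprocess


def parse_git_cherry(output):
    """Split `git cherry -v` output into (missing, ported) lists.

    Each entry is a (sha, subject) tuple. Lines marked "+" have no
    equivalent patch upstream; lines marked "-" already do.
    """
    missing, ported = [], []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        marker, _, rest = line.partition(" ")
        sha, _, subject = rest.partition(" ")
        if marker == "+":
            missing.append((sha, subject))
        elif marker == "-":
            ported.append((sha, subject))
    return missing, ported


def missing_commits(upstream, head, repo="."):
    """Run `git cherry -v` in `repo` and return commits missing upstream."""
    result = subprocess.run(
        ["git", "cherry", "-v", upstream, head],
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_git_cherry(result.stdout)[0]
```

Under these assumptions, `missing_commits("v1.8.x", "v1.7.x")` would list the commits on v1.7.x that have no equivalent on v1.8.x. A squashed port produces a different patch, so its original commits still appear with `+`, which is the false-positive case mentioned in the message.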
Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Posted by "Skalicky, Sam" <ss...@amazon.com.INVALID>.
Hi MXNet Community,
Quick summary on the status of the vote:
2 +1
1 -0.9
I spoke with Leonard offline, and the problem only impacts the specific instance when running MKLDNN/oneDNN immediately after intgemm. We don’t expect users to fall into this specific edge case, and so far the problem hasn’t been reproduced on 1.8.x (even though it contains the same oneDNN and intgemm components that are in the master branch). He proposed to not postpone the release for this issue, but if other issues arise we should fix this one at the same time.
There are also still PRs from v1.7.x that were never committed to the v1.x branch, so the v1.8.x branch, which was created from v1.x, does not contain them. Unfortunately, no one has volunteered to port these to the v1.x and v1.8.x branches.
I propose extending the vote until Friday October 2, 23:59:59 PDT to conclude the discussion and get the remaining votes necessary.
Thanks!
Sam
On 9/29/20, 12:41 PM, "Skalicky, Sam" <ss...@amazon.com.INVALID> wrote:
There was no response from the community on the discussion thread [1]. So the current state is the same.
[1] https://lists.apache.org/thread.html/r31d491150029734c6041c1ae21929cd667eed27f590262c3f501c6b7%40%3Cdev.mxnet.apache.org%3E
On 9/29/20, 11:36 AM, "Xingjian SHI" <xs...@connect.ust.hk> wrote:
Just one question regarding the 1.8.0.rc0. Are all PRs that are in 1.7.0 included in 1.8.0? For example, https://github.com/apache/incubator-mxnet/pull/18653
Thanks,
Xingjian
On 9/29/20, 10:20 AM, "Leonard Lausen" <la...@apache.org> wrote:
Thank you Aaron for trying the build and pointing out the issues.
On Mon, 2020-09-28 at 18:30 -0700, Aaron Markham wrote:
> 2) Tried just doing a make. This fails because none of the submodules are
> there. [...]
I downloaded the rc from the link shared by Sam [1] and it does include the
submodules. Could you provide more details on your issue?
> Downloaded the tar.gz for the release and looked at the build from
> source directions on the website, but these have you use cmake and don't
> really tell you what to do...
The docs refer users to version-controlled files, as the build-from-source guide
on the website is shared among all versions; however, the actual build steps
differ between versions. I think the best way to improve it is to provide
version-specific build from source instructions via the "version selector"
feature on the get started page. Contributions towards this goal or other
improvements would be great [2].
Thanks
Leonard
[1]: https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
[2]: https://github.com/apache/incubator-mxnet/issues/18666
On 9/29/20, 10:09 AM, "Leonard Lausen" <la...@apache.org> wrote:
Vote -0.9.
Piotr has clarified that oneDNN 1.6.3 (included in MXNet 1.8 rc0) handles zmm
registers incorrectly. Together with the MXNet intgemm feature (also included in
1.8 rc0), this can yield NaN results if oneDNN gemm is executed some time after
intgemm. [1]
Thanks
Leonard
[1]: https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-700603056
>
>
> On Mon, Sep 28, 2020 at 11:36 AM Skalicky, Sam <ss...@amazon.com.invalid>
> wrote:
>
> > Thanks for pointing this out Leonard. Has anyone been able to reproduce
> > the problem on 1.8.0.rc0?
> >
> > Either way, I would propose that we continue validating the release as-is
> > and see if we can find any other issues.
> >
> > Sam
> >
> > On 9/28/20, 10:22 AM, "Leonard Lausen" <la...@apache.org> wrote:
> >
> > Thank you Sam for driving the 1.8 release!
> >
> > As the included oneDNN package is known to produce NaN results on the master
> > branch [1] and is pending an upstream fix by Intel, I'd suggest extending the
> > vote until we have clarity on whether the bug also affects the 1.8 release,
> > given that oneDNN is enabled in the default configuration [2].
> >
> > [1]: https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
> > [2]: https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E
> >
> > On Sun, 2020-09-27 at 18:28 -0700, sandeep krishnamurthy wrote:
> > > Sam,
> > >
> > > Thank you for driving the v1.8.0 release of MXNet. This is exciting given
> > > it is coming with CUDA11 and cuDNN8!!
> > >
> > > Fixing the release candidate link:
> > > https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
> > >
> > > Best,
> > > Sandeep
> > >
> > >
> > > On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam <ss...@amazon.com.invalid> wrote:
> > >
> > > > Dear MXNet community,
> > > >
> > > > This is the vote to release Apache MXNet (incubating) version 1.8.0.
> > > > Voting will start September 26, 23:59:59 PDT and close on September 29,
> > > > 23:59:59 PDT.
> > > >
> > > > Link to release notes:
> > > > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
> > > >
> > > > Link to release candidate:
> > > > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
> > > >
> > > > Link to source and signatures on apache dist server:
> > > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
> > > >
> > > > Please remember to TEST first before voting accordingly:
> > > > +1 = approve
> > > > +0 = no opinion
> > > > -1 = disapprove (provide reason)
> > > >
> > > > Best regards,
> > > > Sam Skalicky
> > > >
> > > >
> > >
> > > --
> > > Sandeep Krishnamurthy
> >
> >
> >
Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Posted by "Skalicky, Sam" <ss...@amazon.com.INVALID>.
There was no response from the community on the discussion thread [1]. So the current state is the same.
[1] https://lists.apache.org/thread.html/r31d491150029734c6041c1ae21929cd667eed27f590262c3f501c6b7%40%3Cdev.mxnet.apache.org%3E
On 9/29/20, 11:36 AM, "Xingjian SHI" <xs...@connect.ust.hk> wrote:
Just one question regarding the 1.8.0.rc0. Are all PRs that are in 1.7.0 included in 1.8.0? For example, https://github.com/apache/incubator-mxnet/pull/18653
Thanks,
Xingjian
On 9/29/20, 10:20 AM, "Leonard Lausen" <la...@apache.org> wrote:
Thank you Aaron for trying the build and pointing out the issues.
On Mon, 2020-09-28 at 18:30 -0700, Aaron Markham wrote:
> 2) Tried just doing a make. This fails because none of the submodules are
> there. [...]
I downloaded the rc from the link shared by Sam [1] and it does include the
submodules. Could you provide more details on your issue?
> Downloaded the tar.gz for the release and looked at the build from
source directions on the website, but these have you use cmake and don't
really tell you what to do...
The docs refer users to version-controlled files, as the build-from-source guide
on the website is shared among all versions, however the actual build steps
differes on different versions. I think the best way to improve it is to provide
version-specific build from source instructions via the "version selector"
feature on the get started page. Contributions towards this goal or other
improvements would be great [2].
Thanks
Leonard
[1]: https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
[2]: https://github.com/apache/incubator-mxnet/issues/18666
>
>
> On Mon, Sep 28, 2020 at 11:36 AM Skalicky, Sam <ss...@amazon.com.invalid>
> wrote:
>
> > Thanks for pointing this out Leonard. Has anyone been able to reproduce
> > the problem on 1.8.0.rc0?
> >
> > Either way, I would proposed that we continue validating the release as-is
> > and see if we can find any other issues.
> >
> > Sam
> >
> > On 9/28/20, 10:22 AM, "Leonard Lausen" <la...@apache.org> wrote:
> >
> > CAUTION: This email originated from outside of the organization. Do
> > not click links or open attachments unless you can confirm the sender and
> > know the content is safe.
> >
> >
> >
> > Thank you Sam for driving the 1.8 release!
> >
> > As the included oneDNN package is known to produce nan results on the
> > master
> > branch [1] and is pending an upstream fix by Intel, I'd suggest to
> > extend the
> > vote until we have clarity if the bug also affects the 1.8 release,
> > given that
> > oneDNN is enabled in the default configuration [2].
> >
> > [1]:
> > https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
> > [2]:
> >
> > https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E
> >
> > On Sun, 2020-09-27 at 18:28 -0700, sandeep krishnamurthy wrote:
> > > Sam,
> > >
> > > Thank you for driving the v1.8.0 release of MXNet. This is exciting
> > given
> > > it is coming with CUDA11 and cuDNN8!!
> > >
> > > Fixing the release candidate link:
> > > https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
> > >
> > > Best,
> > > Sandeep
> > >
> > >
> > > On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam
> > <ss...@amazon.com.invalid>
> > > wrote:
> > >
> > > > Dear MXNet community,
> > > >
> > > > This is the vote to release Apache MXNet (incubating) version
> > 1.8.0.
> > > > Voting will start September 26, 23:59:59 PDT and close on
> > September 29,
> > > > 23:59:59 PDT.
> > > >
> > > > Link to release notes:
> > > >
> > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
> > > >
> > > > Link to release candidate:
> > > > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
> > > >
> > > > Link to source and signatures on apache dist server:
> > > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
> > > >
> > > > Please remember to TEST first before voting accordingly:
> > > > +1 = approve
> > > > +0 = no opinion
> > > > -1 = disapprove (provide reason)
> > > >
> > > > Best regards,
> > > > Sam Skalicky
> > > >
> > > >
> > >
> > > --
> > > Sandeep Krishnamurthy
> >
> >
> >
Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Posted by Xingjian SHI <xs...@connect.ust.hk>.
Just one question regarding the 1.8.0.rc0. Are all PRs that are in 1.7.0 included in 1.8.0? For example, https://github.com/apache/incubator-mxnet/pull/18653
Thanks,
Xingjian
Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Posted by Leonard Lausen <la...@apache.org>.
Thank you Aaron for trying the build and pointing out the issues.
On Mon, 2020-09-28 at 18:30 -0700, Aaron Markham wrote:
> 2) Tried just doing a make. This fails because none of the submodules are
> there. [...]
I downloaded the rc from the link shared by Sam [1] and it does include the
submodules. Could you provide more details on your issue?
> Downloaded the tar.gz for the release and looked at the build from
> source directions on the website, but these have you use cmake and don't
> really tell you what to do...
The docs refer users to version-controlled files because the build-from-source
guide on the website is shared among all versions, while the actual build steps
differ between versions. I think the best way to improve this is to provide
version-specific build-from-source instructions via the "version selector"
feature on the get started page. Contributions towards this goal or other
improvements would be great [2].
Thanks
Leonard
[1]: https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
[2]: https://github.com/apache/incubator-mxnet/issues/18666
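To narrow down whether a given copy of the source archive really lacks the submodules, one quick check is to look for empty directories where third-party code is expected. A minimal sketch, assuming the submodules live under a single directory of the extracted tree (for example a `3rdparty/` folder, which is an assumption and not stated in this thread); the helper name `empty_subdirs` is hypothetical:

```python
from pathlib import Path


def empty_subdirs(root):
    """Return the names of immediate subdirectories of `root` that have no entries.

    Hypothetical helper: an empty directory where a submodule is expected
    suggests the archive was packaged without that submodule.
    """
    root = Path(root)
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and not any(child.iterdir())
    )
```

Calling `empty_subdirs` on the third-party directory of the extracted tarball and getting an empty list back would be consistent with the submodules being present; any names returned would be worth including in a follow-up report.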
Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Posted by Qing Lan <la...@live.com>.
+1 (binding)
Built from source with the latest rc0 tag on Mac. I was able to build the whole Scala package, and all tests passed.
Thanks,
Qing
Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Posted by Manu Seth <ma...@gmail.com>.
+1
Was able to build from source following the instructions at [1] and then the
comments in the config_gpu.cmake file. I agree with Aaron that the instructions
should all be in one place. Users should not have to go inside cmake config
files and follow comments there.
I built MXNet on an Ubuntu 18.04 Deep Learning Base AMI, with CUDA v11.0
and cuDNN v8.0.2, and tested it by running all non-operator tests in the
tests/python/gpu/ folder.
[1]
https://github.com/apache/incubator-mxnet/blob/1.8.0.rc0/docs/static_site/src/pages/get_started/build_from_source.md#building-mxnet
Manu
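For others who want to repeat a similar test pass, the selection of test files could be sketched as below. This is only an illustration: it assumes that "non-operator tests" can be read as "test_*.py files without 'operator' in the file name", which the message above does not spell out, and the helper name `non_operator_tests` is hypothetical.

```python
from pathlib import Path


def non_operator_tests(test_dir):
    """List test files in `test_dir` whose names do not contain 'operator'.

    Assumption: "non-operator tests" is interpreted by file name only;
    the thread does not define the term.
    """
    return sorted(
        path.name
        for path in Path(test_dir).glob("test_*.py")
        if "operator" not in path.name
    )
```

The returned file names can then be handed to whatever test runner the release branch uses.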
Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Posted by Aaron Markham <aa...@gmail.com>.
Couple of issues with instructions:
1) Downloaded the tar.gz for the release and looked at the build from
source directions on the website, but these have you use cmake and don't
really tell you what to do... just look at the cmake config files. I mean
sure, I guess I can look inside a config file's comments for build
instructions. But these don't even work. (Could be related to #2, but IDK
since I haven't really tried using the cmake route as it used to be
incompatible with the docs/website builds.)
2) Tried just doing a make. This fails because none of the submodules are
there. So where are the instructions for how to use an official
distribution release now that so much stuff has been removed?
Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Posted by Leonard Lausen <la...@apache.org>.
Vote -0.9.
Piotr has clarified that oneDNN 1.6.3 (included in MXNet 1.8 rc0) handles zmm
registers incorrectly. Together with the MXNet intgemm feature (also included in
1.8 rc0), this can yield NaN results if a oneDNN gemm is executed some time
after intgemm. [1]
Thanks
Leonard
[1]: https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-700603056
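When trying to reproduce this on a release candidate, it helps to report exactly where NaN values show up in an output rather than only that they exist. A minimal, library-independent sketch; the helper name `nan_positions` is hypothetical, and it assumes the output has already been flattened into a plain sequence of floats:

```python
import math


def nan_positions(values):
    """Return the indices of NaN entries in a flat iterable of floats."""
    return [index for index, value in enumerate(values) if math.isnan(value)]
```

An empty list means no NaN was found in that output. Running the same check on the result of a oneDNN-backed matrix multiplication before and after an intgemm call would be one way to see whether the ordering described above matters on a given machine.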
Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Posted by "Skalicky, Sam" <ss...@amazon.com.INVALID>.
Thanks for pointing this out Leonard. Has anyone been able to reproduce the problem on 1.8.0.rc0?
Either way, I would propose that we continue validating the release as-is and see if we can find any other issues.
Sam
Re: [EXTERNAL] [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Posted by Leonard Lausen <la...@apache.org>.
Thank you Sam for driving the 1.8 release!
As the included oneDNN package is known to produce NaN results on the master
branch [1] and is pending an upstream fix by Intel, I'd suggest extending the
vote until we know whether the bug also affects the 1.8 release, given that
oneDNN is enabled in the default configuration [2].
[1]: https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
[2]: https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E
Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Posted by sandeep krishnamurthy <sa...@gmail.com>.
Sam,
Thank you for driving the v1.8.0 release of MXNet. This is exciting given
it is coming with CUDA11 and cuDNN8!!
Fixing the release candidate link:
https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
Best,
Sandeep
On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam <ss...@amazon.com.invalid>
wrote:
> Dear MXNet community,
>
> This is the vote to release Apache MXNet (incubating) version 1.8.0.
> Voting will start September 26, 23:59:59 PDT and close on September 29,
> 23:59:59 PDT.
>
> Link to release notes:
> https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
>
> Link to release candidate:
> https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
>
> Link to source and signatures on apache dist server:
> https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
>
> Please remember to TEST first before voting accordingly:
> +1 = approve
> +0 = no opinion
> -1 = disapprove (provide reason)
>
> Best regards,
> Sam Skalicky
>
>
--
Sandeep Krishnamurthy
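The vote email quoted above asks voters to test the artifacts from the dist server before voting. One part of that is confirming that the downloaded source archive matches its published checksum. A minimal sketch, assuming a SHA-512 digest is published next to the archive (the thread only mentions "source and signatures", so the checksum file is an assumption); the function names are hypothetical:

```python
import hashlib


def sha512_hex(path, chunk_size=1 << 20):
    """Compute the SHA-512 hex digest of the file at `path`, reading in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published_hex):
    """Compare a file's SHA-512 with a published digest, ignoring case and whitespace."""
    normalized = "".join(published_hex.split()).lower()
    return sha512_hex(path) == normalized
```

A matching digest only shows that the download is intact; it does not replace checking the release signature against the project's published keys.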