You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mxnet.apache.org by Pedro Larroy <pe...@gmail.com> on 2019/06/14 23:59:33 UTC

[VOTE] Remove conflicting OpenMP from CMake builds

Hi all

This is a 5-day vote to act and wrap up an outstanding PR that removes
linkage with multiple OpenMP from 3rdparty and uses the system
provided one which might resolve a number of difficult to debug issues
and possible undefined behaviour.

https://github.com/apache/incubator-mxnet/pull/12160

See the comments in the thread for more details but in short, linking
with multiple openmp versions seems to lead to undefined behaviour,
plus it would simplify not having to deal with a custom openmp version
and rely on the platform provided one.

This is expected to simplify builds and address a number of problems.
Seems it doesn't cause any performance degradation, (the Gluon tests
run almost 4x faster in my 64 core machine).

There has been in depth study of performance implications by
contributors like Stanislav Tsukrov and Anton Chernov.  All the
concerns and comments from the reviewers have been addressed and we
can't keep asking open ended questions to block PRs. Reviewers are
expected to be proactive and responsive to contributors so we keep
encouraging active contributors.

please vote to merge this PR accordingly:

+1 = approve
+0 = no opinion
-1 = disapprove (provide reason)

If we observe regressions reported by any internal performance systems
or by contributors the PR can be reverted easily. So it's not a one
way door. But it will be useful to try this in master for a while.

Thank you.

Pedro.

Re: [VOTE] Remove conflicting OpenMP from CMake builds

Posted by Pedro Larroy <pe...@gmail.com>.
+1

This is a link to the benchmarks in the wiki:

https://cwiki.apache.org/confluence/display/MXNET/Benchmarking+MXNet+with+different+OpenMP+implementations

Pedro.

On Fri, Jun 14, 2019 at 4:59 PM Pedro Larroy
<pe...@gmail.com> wrote:
>
> Hi all
>
> This is a 5-day vote to act and wrap up an outstanding PR that removes
> linkage with multiple OpenMP from 3rdparty and uses the system
> provided one which might resolve a number of difficult to debug issues
> and possible undefined behaviour.
>
> https://github.com/apache/incubator-mxnet/pull/12160
>
> See the comments in the thread for more details but in short, linking
> with multiple openmp versions seems to lead to undefined behaviour,
> plus it would simplify not having to deal with a custom openmp version
> and rely on the platform provided one.
>
> This is expected to simplify builds and address a number of problems.
> Seems it doesn't cause any performance degradation, (the Gluon tests
> run almost 4x faster in my 64 core machine).
>
> There has been in depth study of performance implications by
> contributors like Stanislav Tsukrov and Anton Chernov.  All the
> concerns and comments from the reviewers have been addressed and we
> can't keep asking open ended questions to block PRs. Reviewers are
> expected to be proactive and responsive to contributors so we keep
> encouraging active contributors.
>
> please vote to merge this PR accordingly:
>
> +1 = approve
> +0 = no opinion
> -1 = disapprove (provide reason)
>
> If we observe regressions reported by any internal performance systems
> or by contributors the PR can be reverted easily. So it's not a one
> way door. But it will be useful to try this in master for a while.
>
> Thank you.
>
> Pedro.

Re: [VOTE] Remove conflicting OpenMP from CMake builds

Posted by Sheng Zha <zh...@apache.org>.
Vetos stand until and unless withdrawn by their casters [1]. The only path forward would be to convince the caster to withdraw the veto.

-sz

[1] https://www.apache.org/foundation/voting.html#Veto

On 2019/06/17 17:46:59, Pedro Larroy <pe...@gmail.com> wrote: 
> Thanks.
> 
> How do we go on advancing this PR then? all the questions have been
> answered, performance numbers provided and more. Until how long can a
> veto stand? Also without replies to contributors.
> 
> Pedro.
> 
> On Fri, Jun 14, 2019 at 5:44 PM Sheng Zha <zh...@apache.org> wrote:
> >
> > This vote is invalid as the original PR has been vetoed by a committer. A vote on dev@ won't help you circumvent a veto.
> >
> > -sz
> >
> > On 2019/06/14 23:59:33, Pedro Larroy <pe...@gmail.com> wrote:
> > > Hi all
> > >
> > > This is a 5-day vote to act and wrap up an outstanding PR that removes
> > > linkage with multiple OpenMP from 3rdparty and uses the system
> > > provided one which might resolve a number of difficult to debug issues
> > > and possible undefined behaviour.
> > >
> > > https://github.com/apache/incubator-mxnet/pull/12160
> > >
> > > See the comments in the thread for more details but in short, linking
> > > with multiple openmp versions seems to lead to undefined behaviour,
> > > plus it would simplify not having to deal with a custom openmp version
> > > and rely on the platform provided one.
> > >
> > > This is expected to simplify builds and address a number of problems.
> > > Seems it doesn't cause any performance degradation, (the Gluon tests
> > > run almost 4x faster in my 64 core machine).
> > >
> > > There has been in depth study of performance implications by
> > > contributors like Stanislav Tsukrov and Anton Chernov.  All the
> > > concerns and comments from the reviewers have been addressed and we
> > > can't keep asking open ended questions to block PRs. Reviewers are
> > > expected to be proactive and responsive to contributors so we keep
> > > encouraging active contributors.
> > >
> > > please vote to merge this PR accordingly:
> > >
> > > +1 = approve
> > > +0 = no opinion
> > > -1 = disapprove (provide reason)
> > >
> > > If we observe regressions reported by any internal performance systems
> > > or by contributors the PR can be reverted easily. So it's not a one
> > > way door. But it will be useful to try this in master for a while.
> > >
> > > Thank you.
> > >
> > > Pedro.
> > >
> 

Re: [VOTE] Remove conflicting OpenMP from CMake builds

Posted by Per da Silva <pe...@gmail.com>.
+1

On Tue., 18 Jun. 2019, 12:20 pm Anton Chernov, <me...@gmail.com> wrote:

> I would like to cite Apache Glossary [1]:
>
> Veto
> According to the Apache methodology, a change which has been made or
> proposed may be made moot through the exercise of a veto by a committer to
> the codebase in question. If the R-T-C commit policy is in effect, a veto
> prevents the change from being made. In either the R-T-C or C-T-R
> environments, a veto applied to a change that has already been made forces
> it to be reverted. Vetos may not be overridden nor voted down, and only
> cease to apply when the committer who issued the veto withdraws it. All
> vetos must be accompanied by a valid technical justification; a veto
> without such a justification is invalid; in case of doubt, deciding whether
> a technical justification is valid is up to the PMC. Vetos force discussion
> and, if supported, version control rollback or appropriate code changes.
> Vetoed code commits are best reverted by the original committer, unless an
> urgent solution is needed (e.g., build breakers). Vetos only apply to code
> changes; they do not apply to procedural issues such as software releases.
>
> Therefore the vote opened by Pedro has it's right to exist as a voting of
> PMC members on whether the veto of Chris has technical justification. It
> would be great if the committee could cast a vote on this to clarify the
> situation.
>
> Best
> Anton
>
> [1] https://www.apache.org/foundation/glossary.html
>
>
> On Mon, 17 Jun 2019 at 22:01, Pedro Larroy <pe...@gmail.com>
> wrote:
>
> > Regarding your paste of ldd, not sure what's your point. This is the
> > output with the patch from PR applied, the openmp version used is the
> > one provided by MKL:
> >
> >
> > Version of the PR that removes openmp from 3rdparty comes from MKL:
> >
> >         linux-vdso.so.1 (0x00007ffc4c993000)
> >         libmkldnn.so.0 =>
> > /home/piotr/mxnet_openmp/build/3rdparty/mkldnn/src/libmkldnn.so.0
> > (0x00007f7874519000)
> >         libopenblas.so.0 => /usr/lib/x86_64-linux-gnu/libopenblas.so.0
> > (0x00007f7872273000)
> >         librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1
> (0x00007f787206b000)
> >         libjemalloc.so.1 => /usr/lib/x86_64-linux-gnu/libjemalloc.so.1
> > (0x00007f7871e35000)
> >         libmklml_intel.so =>
> >
> >
> /home/piotr/mxnet_openmp/build/mklml/mklml_lnx_2019.0.5.20190502/lib/libmklml_intel.so
> > (0x00007f786a3a3000)
> >         libiomp5.so =>
> >
> >
> /home/piotr/mxnet_openmp/build/mklml/mklml_lnx_2019.0.5.20190502/lib/libiomp5.so
> > (0x00007f7869fae000)
> >
> >
> >
> > In master, ldd shows that openmp is coming from 3rdparty:
> >
> >         libomp.so =>
> > /home/piotr/mxnet_master/build/3rdparty/openmp/runtime/src/libomp.so
> > (0x00007f1ea2353000)
> >
> >
> >
> > Could you please explain your argument on what's the recommended way
> > and what's the divergence?  As explained in the PR there seems to be
> > no performance advantage on adding our own version of openmp.
> >
> > Thanks.
> >
> > On Mon, Jun 17, 2019 at 12:55 PM Pedro Larroy
> > <pe...@gmail.com> wrote:
> > >
> > > Thanks for your answer Chris. I'm not militant in regards to one
> > > option or another nor belong of any club that will accept me as
> > > member, but I think we should drive this to conclusion and understand
> > > why we keep it or if we should remove it and use the platform provided
> > > version. There are open issues linked on that PR that are causing
> > > problems, that remain unaddressed it has been said that linking with
> > > two openmp versions causes undefined behaviour, I had the experience
> > > of having MXNet hang when using OpenMP running the tests in plain
> > > ubuntu CPU, I know we are reentering OpenMP initialization when
> > > creating threads in the engine and getting an assertion, etc, so I
> > > have serious concerns when running MXNet with OpenMP enabled in
> > > production.
> > >
> > > I think you are the expert in OpenMP and HPC and it would be good to
> > > have a documented and explainable outcome of one option or another
> > > either in the mailing list or in the wiki.
> > >
> > > I also feel bad for contributors spending time measuring and
> > > benchmarking without conclusion, I feel they have given up working on
> > > this issue since the PR keeps getting closed and is not moved forward.
> > >
> > > Please help us understand whats the best way forward with this issue
> > > and not just close the PR without explanations.
> > >
> > > Pedro.
> > >
> > > On Mon, Jun 17, 2019 at 11:15 AM Chris Olivier <cj...@gmail.com>
> > wrote:
> > > >
> > > > I am curious why you're being so militant troll about this.  libomp
> is
> > used
> > > > in every MKL build (download mxnet-mkl yourself and see).  I don't
> see
> > any
> > > > convincing reason to change it and so far as I can tell, no real
> issue
> > has
> > > > been proven to be related.  Anyway, I am reluctant to feed trolls any
> > more
> > > > than this, so I don't really have much else to add.
> > > >
> > > > ldd /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so
> > > >         linux-vdso.so.1 (0x00007ffc989cf000)
> > > >         libmklml_intel.so =>
> > > > /usr/local/lib/python3.6/dist-packages/mxnet/libmklml_intel.so
> > > > (0x00007f0afb7c1000)
> > > >        * libiomp5.so =>
> > > > /usr/local/lib/python3.6/dist-packages/mxnet/libiomp5.so
> > > > (0x00007f0afb3e5000)*
> > > >         librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1
> > (0x00007f0afb1dd000)
> > > >         libmkldnn.so.0 =>
> > > > /usr/local/lib/python3.6/dist-packages/mxnet/libmkldnn.so.0
> > > > (0x00007f0afa7ba000)
> > > >         libgfortran.so.3 =>
> > > > /usr/local/lib/python3.6/dist-packages/mxnet/libgfortran.so.3
> > > > (0x00007f0afa493000)
> > > >         libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2
> > (0x00007f0afa28f000)
> > > >         libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> > > > (0x00007f0af9f06000)
> > > >         libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6
> > (0x00007f0af9b68000)
> > > >         libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1
> > > > (0x00007f0af9950000)
> > > >         libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
> > > > (0x00007f0af9731000)
> > > >         libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
> > (0x00007f0af9340000)
> > > >         /lib64/ld-linux-x86-64.so.2 (0x00007f0b073f4000)
> > > >         libquadmath.so.0 =>
> > > > /usr/local/lib/python3.6/dist-packages/mxnet/libquadmath.so.0
> > > > (0x00007f0af9100000)
> > > >
> > > >
> > > > On Mon, Jun 17, 2019 at 10:58 AM Pedro Larroy <
> > pedro.larroy.lists@gmail.com>
> > > > wrote:
> > > >
> > > > > I had read the "Apache Voting Process" guide here:
> > > > > https://www.apache.org/foundation/voting.html  and I thought code
> > > > > changes could be discussed on the mailing list in cases where the
> PR
> > > > > is stuck or there's no response for a long time, also I understood
> > > > > that -1's have to be justified.  Could you, or someone more
> familiar
> > > > > which the Apache way enlighten on how to move this issue forward
> in a
> > > > > constructive way?
> > > > >
> > > > > Thanks a lot.
> > > > >
> > > > > Pedro.
> > > > >
> > > > > On Mon, Jun 17, 2019 at 10:46 AM Pedro Larroy
> > > > > <pe...@gmail.com> wrote:
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > How do we go on advancing this PR then? all the questions have
> been
> > > > > > answered, performance numbers provided and more. Until how long
> > can a
> > > > > > veto stand? Also without replies to contributors.
> > > > > >
> > > > > > Pedro.
> > > > > >
> > > > > > On Fri, Jun 14, 2019 at 5:44 PM Sheng Zha <zh...@apache.org>
> > wrote:
> > > > > > >
> > > > > > > This vote is invalid as the original PR has been vetoed by a
> > > > > committer. A vote on dev@ won't help you circumvent a veto.
> > > > > > >
> > > > > > > -sz
> > > > > > >
> > > > > > > On 2019/06/14 23:59:33, Pedro Larroy <
> > pedro.larroy.lists@gmail.com>
> > > > > wrote:
> > > > > > > > Hi all
> > > > > > > >
> > > > > > > > This is a 5-day vote to act and wrap up an outstanding PR
> that
> > > > > removes
> > > > > > > > linkage with multiple OpenMP from 3rdparty and uses the
> system
> > > > > > > > provided one which might resolve a number of difficult to
> debug
> > > > > issues
> > > > > > > > and possible undefined behaviour.
> > > > > > > >
> > > > > > > > https://github.com/apache/incubator-mxnet/pull/12160
> > > > > > > >
> > > > > > > > See the comments in the thread for more details but in short,
> > linking
> > > > > > > > with multiple openmp versions seems to lead to undefined
> > behaviour,
> > > > > > > > plus it would simplify not having to deal with a custom
> openmp
> > > > > version
> > > > > > > > and rely on the platform provided one.
> > > > > > > >
> > > > > > > > This is expected to simplify builds and address a number of
> > problems.
> > > > > > > > Seems it doesn't cause any performance degradation, (the
> Gluon
> > tests
> > > > > > > > run almost 4x faster in my 64 core machine).
> > > > > > > >
> > > > > > > > There has been in depth study of performance implications by
> > > > > > > > contributors like Stanislav Tsukrov and Anton Chernov.  All
> the
> > > > > > > > concerns and comments from the reviewers have been addressed
> > and we
> > > > > > > > can't keep asking open ended questions to block PRs.
> Reviewers
> > are
> > > > > > > > expected to be proactive and responsive to contributors so we
> > keep
> > > > > > > > encouraging active contributors.
> > > > > > > >
> > > > > > > > please vote to merge this PR accordingly:
> > > > > > > >
> > > > > > > > +1 = approve
> > > > > > > > +0 = no opinion
> > > > > > > > -1 = disapprove (provide reason)
> > > > > > > >
> > > > > > > > If we observe regressions reported by any internal
> performance
> > > > > systems
> > > > > > > > or by contributors the PR can be reverted easily. So it's not
> > a one
> > > > > > > > way door. But it will be useful to try this in master for a
> > while.
> > > > > > > >
> > > > > > > > Thank you.
> > > > > > > >
> > > > > > > > Pedro.
> > > > > > > >
> > > > >
> >
>

Re: [VOTE] Remove conflicting OpenMP from CMake builds

Posted by Pedro Larroy <pe...@gmail.com>.
I tried to have a vote to bring light and clarify a PR which was not
advancing and not handled well in my opinion. As I understand, and
indicated before here by Apache mentors, the mailing list is the main
communication medium and where votes and discussion happen, so my
understanding was that it should be always possible to bring a
discussion from Github to the mailing list so we can build consensus
and have things moving in the right direction we have many PRs open
and thousands of issues open, bringing something to dev@ gives the
opportunity to build consensus and community.

Sheng was quick to call the vote invalid. I think we should be also
quick to call out toxic behaviors and communications which are not
helping having a healthy community. This is also part of what I would
expect from the leaders of this project as an external contributor.

There should be possible to bring discussion from github to the
mailing list, for example in cases where there are hastily merged PRs
which might need revisiting or further discussion. Maybe someone else
with more Apache experience can chime in, but personally I  don't
think a request for changes or closing a PR should constitute a formal
veto. Having a vote in the mailing list doesn't prevent anyone from
casting their vote or their veto with appropriate justification if
they desire.

Pedro.

On Tue, Jun 18, 2019 at 3:20 AM Anton Chernov <me...@gmail.com> wrote:
>
> I would like to cite Apache Glossary [1]:
>
> Veto
> According to the Apache methodology, a change which has been made or
> proposed may be made moot through the exercise of a veto by a committer to
> the codebase in question. If the R-T-C commit policy is in effect, a veto
> prevents the change from being made. In either the R-T-C or C-T-R
> environments, a veto applied to a change that has already been made forces
> it to be reverted. Vetos may not be overridden nor voted down, and only
> cease to apply when the committer who issued the veto withdraws it. All
> vetos must be accompanied by a valid technical justification; a veto
> without such a justification is invalid; in case of doubt, deciding whether
> a technical justification is valid is up to the PMC. Vetos force discussion
> and, if supported, version control rollback or appropriate code changes.
> Vetoed code commits are best reverted by the original committer, unless an
> urgent solution is needed (e.g., build breakers). Vetos only apply to code
> changes; they do not apply to procedural issues such as software releases.
>
> Therefore the vote opened by Pedro has it's right to exist as a voting of
> PMC members on whether the veto of Chris has technical justification. It
> would be great if the committee could cast a vote on this to clarify the
> situation.
>
> Best
> Anton
>
> [1] https://www.apache.org/foundation/glossary.html
>
>
> On Mon, 17 Jun 2019 at 22:01, Pedro Larroy <pe...@gmail.com>
> wrote:
>
> > Regarding your paste of ldd, not sure what's your point. This is the
> > output with the patch from PR applied, the openmp version used is the
> > one provided by MKL:
> >
> >
> > Version of the PR that removes openmp from 3rdparty comes from MKL:
> >
> >         linux-vdso.so.1 (0x00007ffc4c993000)
> >         libmkldnn.so.0 =>
> > /home/piotr/mxnet_openmp/build/3rdparty/mkldnn/src/libmkldnn.so.0
> > (0x00007f7874519000)
> >         libopenblas.so.0 => /usr/lib/x86_64-linux-gnu/libopenblas.so.0
> > (0x00007f7872273000)
> >         librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f787206b000)
> >         libjemalloc.so.1 => /usr/lib/x86_64-linux-gnu/libjemalloc.so.1
> > (0x00007f7871e35000)
> >         libmklml_intel.so =>
> >
> > /home/piotr/mxnet_openmp/build/mklml/mklml_lnx_2019.0.5.20190502/lib/libmklml_intel.so
> > (0x00007f786a3a3000)
> >         libiomp5.so =>
> >
> > /home/piotr/mxnet_openmp/build/mklml/mklml_lnx_2019.0.5.20190502/lib/libiomp5.so
> > (0x00007f7869fae000)
> >
> >
> >
> > In master, ldd shows that openmp is coming from 3rdparty:
> >
> >         libomp.so =>
> > /home/piotr/mxnet_master/build/3rdparty/openmp/runtime/src/libomp.so
> > (0x00007f1ea2353000)
> >
> >
> >
> > Could you please explain your argument on what's the recommended way
> > and what's the divergence?  As explained in the PR there seems to be
> > no performance advantage on adding our own version of openmp.
> >
> > Thanks.
> >
> > On Mon, Jun 17, 2019 at 12:55 PM Pedro Larroy
> > <pe...@gmail.com> wrote:
> > >
> > > Thanks for your answer Chris. I'm not militant in regards to one
> > > option or another nor belong of any club that will accept me as
> > > member, but I think we should drive this to conclusion and understand
> > > why we keep it or if we should remove it and use the platform provided
> > > version. There are open issues linked on that PR that are causing
> > > problems, that remain unaddressed it has been said that linking with
> > > two openmp versions causes undefined behaviour, I had the experience
> > > of having MXNet hang when using OpenMP running the tests in plain
> > > ubuntu CPU, I know we are reentering OpenMP initialization when
> > > creating threads in the engine and getting an assertion, etc, so I
> > > have serious concerns when running MXNet with OpenMP enabled in
> > > production.
> > >
> > > I think you are the expert in OpenMP and HPC and it would be good to
> > > have a documented and explainable outcome of one option or another
> > > either in the mailing list or in the wiki.
> > >
> > > I also feel bad for contributors spending time measuring and
> > > benchmarking without conclusion, I feel they have given up working on
> > > this issue since the PR keeps getting closed and is not moved forward.
> > >
> > > Please help us understand whats the best way forward with this issue
> > > and not just close the PR without explanations.
> > >
> > > Pedro.
> > >
> > > On Mon, Jun 17, 2019 at 11:15 AM Chris Olivier <cj...@gmail.com>
> > wrote:
> > > >
> > > > I am curious why you're being so militant troll about this.  libomp is
> > used
> > > > in every MKL build (download mxnet-mkl yourself and see).  I don't see
> > any
> > > > convincing reason to change it and so far as I can tell, no real issue
> > has
> > > > been proven to be related.  Anyway, I am reluctant to feed trolls any
> > more
> > > > than this, so I don't really have much else to add.
> > > >
> > > > ldd /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so
> > > >         linux-vdso.so.1 (0x00007ffc989cf000)
> > > >         libmklml_intel.so =>
> > > > /usr/local/lib/python3.6/dist-packages/mxnet/libmklml_intel.so
> > > > (0x00007f0afb7c1000)
> > > >        * libiomp5.so =>
> > > > /usr/local/lib/python3.6/dist-packages/mxnet/libiomp5.so
> > > > (0x00007f0afb3e5000)*
> > > >         librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1
> > (0x00007f0afb1dd000)
> > > >         libmkldnn.so.0 =>
> > > > /usr/local/lib/python3.6/dist-packages/mxnet/libmkldnn.so.0
> > > > (0x00007f0afa7ba000)
> > > >         libgfortran.so.3 =>
> > > > /usr/local/lib/python3.6/dist-packages/mxnet/libgfortran.so.3
> > > > (0x00007f0afa493000)
> > > >         libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2
> > (0x00007f0afa28f000)
> > > >         libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> > > > (0x00007f0af9f06000)
> > > >         libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6
> > (0x00007f0af9b68000)
> > > >         libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1
> > > > (0x00007f0af9950000)
> > > >         libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
> > > > (0x00007f0af9731000)
> > > >         libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
> > (0x00007f0af9340000)
> > > >         /lib64/ld-linux-x86-64.so.2 (0x00007f0b073f4000)
> > > >         libquadmath.so.0 =>
> > > > /usr/local/lib/python3.6/dist-packages/mxnet/libquadmath.so.0
> > > > (0x00007f0af9100000)
> > > >
> > > >
> > > > On Mon, Jun 17, 2019 at 10:58 AM Pedro Larroy <
> > pedro.larroy.lists@gmail.com>
> > > > wrote:
> > > >
> > > > > I had read the "Apache Voting Process" guide here:
> > > > > https://www.apache.org/foundation/voting.html  and I thought code
> > > > > changes could be discussed on the mailing list in cases where the PR
> > > > > is stuck or there's no response for a long time, also I understood
> > > > > that -1's have to be justified.  Could you, or someone more familiar
> > > > > which the Apache way enlighten on how to move this issue forward in a
> > > > > constructive way?
> > > > >
> > > > > Thanks a lot.
> > > > >
> > > > > Pedro.
> > > > >
> > > > > On Mon, Jun 17, 2019 at 10:46 AM Pedro Larroy
> > > > > <pe...@gmail.com> wrote:
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > How do we go on advancing this PR then? all the questions have been
> > > > > > answered, performance numbers provided and more. Until how long
> > can a
> > > > > > veto stand? Also without replies to contributors.
> > > > > >
> > > > > > Pedro.
> > > > > >
> > > > > > On Fri, Jun 14, 2019 at 5:44 PM Sheng Zha <zh...@apache.org>
> > wrote:
> > > > > > >
> > > > > > > This vote is invalid as the original PR has been vetoed by a
> > > > > committer. A vote on dev@ won't help you circumvent a veto.
> > > > > > >
> > > > > > > -sz
> > > > > > >
> > > > > > > On 2019/06/14 23:59:33, Pedro Larroy <
> > pedro.larroy.lists@gmail.com>
> > > > > wrote:
> > > > > > > > Hi all
> > > > > > > >
> > > > > > > > This is a 5-day vote to act and wrap up an outstanding PR that
> > > > > removes
> > > > > > > > linkage with multiple OpenMP from 3rdparty and uses the system
> > > > > > > > provided one which might resolve a number of difficult to debug
> > > > > issues
> > > > > > > > and possible undefined behaviour.
> > > > > > > >
> > > > > > > > https://github.com/apache/incubator-mxnet/pull/12160
> > > > > > > >
> > > > > > > > See the comments in the thread for more details but in short,
> > linking
> > > > > > > > with multiple openmp versions seems to lead to undefined
> > behaviour,
> > > > > > > > plus it would simplify not having to deal with a custom openmp
> > > > > version
> > > > > > > > and rely on the platform provided one.
> > > > > > > >
> > > > > > > > This is expected to simplify builds and address a number of
> > problems.
> > > > > > > > Seems it doesn't cause any performance degradation, (the Gluon
> > tests
> > > > > > > > run almost 4x faster in my 64 core machine).
> > > > > > > >
> > > > > > > > There has been in depth study of performance implications by
> > > > > > > > contributors like Stanislav Tsukrov and Anton Chernov.  All the
> > > > > > > > concerns and comments from the reviewers have been addressed
> > and we
> > > > > > > > can't keep asking open ended questions to block PRs. Reviewers
> > are
> > > > > > > > expected to be proactive and responsive to contributors so we
> > keep
> > > > > > > > encouraging active contributors.
> > > > > > > >
> > > > > > > > please vote to merge this PR accordingly:
> > > > > > > >
> > > > > > > > +1 = approve
> > > > > > > > +0 = no opinion
> > > > > > > > -1 = disapprove (provide reason)
> > > > > > > >
> > > > > > > > If we observe regressions reported by any internal performance
> > > > > systems
> > > > > > > > or by contributors the PR can be reverted easily. So it's not
> > a one
> > > > > > > > way door. But it will be useful to try this in master for a
> > while.
> > > > > > > >
> > > > > > > > Thank you.
> > > > > > > >
> > > > > > > > Pedro.
> > > > > > > >
> > > > >
> >

Re: [VOTE] Remove conflicting OpenMP from CMake builds

Posted by Anton Chernov <me...@gmail.com>.
I would like to cite Apache Glossary [1]:

Veto
According to the Apache methodology, a change which has been made or
proposed may be made moot through the exercise of a veto by a committer to
the codebase in question. If the R-T-C commit policy is in effect, a veto
prevents the change from being made. In either the R-T-C or C-T-R
environments, a veto applied to a change that has already been made forces
it to be reverted. Vetos may not be overridden nor voted down, and only
cease to apply when the committer who issued the veto withdraws it. All
vetos must be accompanied by a valid technical justification; a veto
without such a justification is invalid; in case of doubt, deciding whether
a technical justification is valid is up to the PMC. Vetos force discussion
and, if supported, version control rollback or appropriate code changes.
Vetoed code commits are best reverted by the original committer, unless an
urgent solution is needed (e.g., build breakers). Vetos only apply to code
changes; they do not apply to procedural issues such as software releases.

Therefore the vote opened by Pedro has it's right to exist as a voting of
PMC members on whether the veto of Chris has technical justification. It
would be great if the committee could cast a vote on this to clarify the
situation.

Best
Anton

[1] https://www.apache.org/foundation/glossary.html


On Mon, 17 Jun 2019 at 22:01, Pedro Larroy <pe...@gmail.com>
wrote:

> Regarding your paste of ldd, not sure what's your point. This is the
> output with the patch from PR applied, the openmp version used is the
> one provided by MKL:
>
>
> Version of the PR that removes openmp from 3rdparty comes from MKL:
>
>         linux-vdso.so.1 (0x00007ffc4c993000)
>         libmkldnn.so.0 =>
> /home/piotr/mxnet_openmp/build/3rdparty/mkldnn/src/libmkldnn.so.0
> (0x00007f7874519000)
>         libopenblas.so.0 => /usr/lib/x86_64-linux-gnu/libopenblas.so.0
> (0x00007f7872273000)
>         librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f787206b000)
>         libjemalloc.so.1 => /usr/lib/x86_64-linux-gnu/libjemalloc.so.1
> (0x00007f7871e35000)
>         libmklml_intel.so =>
>
> /home/piotr/mxnet_openmp/build/mklml/mklml_lnx_2019.0.5.20190502/lib/libmklml_intel.so
> (0x00007f786a3a3000)
>         libiomp5.so =>
>
> /home/piotr/mxnet_openmp/build/mklml/mklml_lnx_2019.0.5.20190502/lib/libiomp5.so
> (0x00007f7869fae000)
>
>
>
> In master, ldd shows that openmp is coming from 3rdparty:
>
>         libomp.so =>
> /home/piotr/mxnet_master/build/3rdparty/openmp/runtime/src/libomp.so
> (0x00007f1ea2353000)
>
>
>
> Could you please explain your argument on what's the recommended way
> and what's the divergence?  As explained in the PR there seems to be
> no performance advantage on adding our own version of openmp.
>
> Thanks.
>
> On Mon, Jun 17, 2019 at 12:55 PM Pedro Larroy
> <pe...@gmail.com> wrote:
> >
> > Thanks for your answer Chris. I'm not militant in regards to one
> > option or another nor belong of any club that will accept me as
> > member, but I think we should drive this to conclusion and understand
> > why we keep it or if we should remove it and use the platform provided
> > version. There are open issues linked on that PR that are causing
> > problems, that remain unaddressed it has been said that linking with
> > two openmp versions causes undefined behaviour, I had the experience
> > of having MXNet hang when using OpenMP running the tests in plain
> > ubuntu CPU, I know we are reentering OpenMP initialization when
> > creating threads in the engine and getting an assertion, etc, so I
> > have serious concerns when running MXNet with OpenMP enabled in
> > production.
> >
> > I think you are the expert in OpenMP and HPC and it would be good to
> > have a documented and explainable outcome of one option or another
> > either in the mailing list or in the wiki.
> >
> > I also feel bad for contributors spending time measuring and
> > benchmarking without conclusion, I feel they have given up working on
> > this issue since the PR keeps getting closed and is not moved forward.
> >
> > Please help us understand whats the best way forward with this issue
> > and not just close the PR without explanations.
> >
> > Pedro.
> >
> > On Mon, Jun 17, 2019 at 11:15 AM Chris Olivier <cj...@gmail.com>
> wrote:
> > >
> > > I am curious why you're being so militant troll about this.  libomp is
> used
> > > in every MKL build (download mxnet-mkl yourself and see).  I don't see
> any
> > > convincing reason to change it and so far as I can tell, no real issue
> has
> > > been proven to be related.  Anyway, I am reluctant to feed trolls any
> more
> > > than this, so I don't really have much else to add.
> > >
> > > ldd /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so
> > >         linux-vdso.so.1 (0x00007ffc989cf000)
> > >         libmklml_intel.so =>
> > > /usr/local/lib/python3.6/dist-packages/mxnet/libmklml_intel.so
> > > (0x00007f0afb7c1000)
> > >        * libiomp5.so =>
> > > /usr/local/lib/python3.6/dist-packages/mxnet/libiomp5.so
> > > (0x00007f0afb3e5000)*
> > >         librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1
> (0x00007f0afb1dd000)
> > >         libmkldnn.so.0 =>
> > > /usr/local/lib/python3.6/dist-packages/mxnet/libmkldnn.so.0
> > > (0x00007f0afa7ba000)
> > >         libgfortran.so.3 =>
> > > /usr/local/lib/python3.6/dist-packages/mxnet/libgfortran.so.3
> > > (0x00007f0afa493000)
> > >         libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2
> (0x00007f0afa28f000)
> > >         libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> > > (0x00007f0af9f06000)
> > >         libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6
> (0x00007f0af9b68000)
> > >         libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1
> > > (0x00007f0af9950000)
> > >         libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
> > > (0x00007f0af9731000)
> > >         libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
> (0x00007f0af9340000)
> > >         /lib64/ld-linux-x86-64.so.2 (0x00007f0b073f4000)
> > >         libquadmath.so.0 =>
> > > /usr/local/lib/python3.6/dist-packages/mxnet/libquadmath.so.0
> > > (0x00007f0af9100000)
> > >
> > >
> > > On Mon, Jun 17, 2019 at 10:58 AM Pedro Larroy <
> pedro.larroy.lists@gmail.com>
> > > wrote:
> > >
> > > > I had read the "Apache Voting Process" guide here:
> > > > https://www.apache.org/foundation/voting.html  and I thought code
> > > > changes could be discussed on the mailing list in cases where the PR
> > > > is stuck or there's no response for a long time, also I understood
> > > > that -1's have to be justified.  Could you, or someone more familiar
> > > > which the Apache way enlighten on how to move this issue forward in a
> > > > constructive way?
> > > >
> > > > Thanks a lot.
> > > >
> > > > Pedro.
> > > >
> > > > On Mon, Jun 17, 2019 at 10:46 AM Pedro Larroy
> > > > <pe...@gmail.com> wrote:
> > > > >
> > > > > Thanks.
> > > > >
> > > > > How do we go on advancing this PR then? all the questions have been
> > > > > answered, performance numbers provided and more. Until how long
> can a
> > > > > veto stand? Also without replies to contributors.
> > > > >
> > > > > Pedro.
> > > > >
> > > > > On Fri, Jun 14, 2019 at 5:44 PM Sheng Zha <zh...@apache.org>
> wrote:
> > > > > >
> > > > > > This vote is invalid as the original PR has been vetoed by a
> > > > committer. A vote on dev@ won't help you circumvent a veto.
> > > > > >
> > > > > > -sz
> > > > > >
> > > > > > On 2019/06/14 23:59:33, Pedro Larroy <
> pedro.larroy.lists@gmail.com>
> > > > wrote:
> > > > > > > Hi all
> > > > > > >
> > > > > > > This is a 5-day vote to act and wrap up an outstanding PR that
> > > > removes
> > > > > > > linkage with multiple OpenMP from 3rdparty and uses the system
> > > > > > > provided one which might resolve a number of difficult to debug
> > > > issues
> > > > > > > and possible undefined behaviour.
> > > > > > >
> > > > > > > https://github.com/apache/incubator-mxnet/pull/12160
> > > > > > >
> > > > > > > See the comments in the thread for more details but in short,
> linking
> > > > > > > with multiple openmp versions seems to lead to undefined
> behaviour,
> > > > > > > plus it would simplify not having to deal with a custom openmp
> > > > version
> > > > > > > and rely on the platform provided one.
> > > > > > >
> > > > > > > This is expected to simplify builds and address a number of
> problems.
> > > > > > > Seems it doesn't cause any performance degradation, (the Gluon
> tests
> > > > > > > run almost 4x faster in my 64 core machine).
> > > > > > >
> > > > > > > There has been in depth study of performance implications by
> > > > > > > contributors like Stanislav Tsukrov and Anton Chernov.  All the
> > > > > > > concerns and comments from the reviewers have been addressed
> and we
> > > > > > > can't keep asking open ended questions to block PRs. Reviewers
> are
> > > > > > > expected to be proactive and responsive to contributors so we
> keep
> > > > > > > encouraging active contributors.
> > > > > > >
> > > > > > > please vote to merge this PR accordingly:
> > > > > > >
> > > > > > > +1 = approve
> > > > > > > +0 = no opinion
> > > > > > > -1 = disapprove (provide reason)
> > > > > > >
> > > > > > > If we observe regressions reported by any internal performance
> > > > systems
> > > > > > > or by contributors the PR can be reverted easily. So it's not
> a one
> > > > > > > way door. But it will be useful to try this in master for a
> while.
> > > > > > >
> > > > > > > Thank you.
> > > > > > >
> > > > > > > Pedro.
> > > > > > >
> > > >
>

Re: [VOTE] Remove conflicting OpenMP from CMake builds

Posted by Pedro Larroy <pe...@gmail.com>.
Regarding your paste of ldd, not sure what's your point. This is the
output with the patch from PR applied, the openmp version used is the
one provided by MKL:


Version of the PR that removes openmp from 3rdparty comes from MKL:

        linux-vdso.so.1 (0x00007ffc4c993000)
        libmkldnn.so.0 =>
/home/piotr/mxnet_openmp/build/3rdparty/mkldnn/src/libmkldnn.so.0
(0x00007f7874519000)
        libopenblas.so.0 => /usr/lib/x86_64-linux-gnu/libopenblas.so.0
(0x00007f7872273000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f787206b000)
        libjemalloc.so.1 => /usr/lib/x86_64-linux-gnu/libjemalloc.so.1
(0x00007f7871e35000)
        libmklml_intel.so =>
/home/piotr/mxnet_openmp/build/mklml/mklml_lnx_2019.0.5.20190502/lib/libmklml_intel.so
(0x00007f786a3a3000)
        libiomp5.so =>
/home/piotr/mxnet_openmp/build/mklml/mklml_lnx_2019.0.5.20190502/lib/libiomp5.so
(0x00007f7869fae000)



In master, ldd shows that openmp is coming from 3rdparty:

        libomp.so =>
/home/piotr/mxnet_master/build/3rdparty/openmp/runtime/src/libomp.so
(0x00007f1ea2353000)



Could you please explain your argument on what's the recommended way
and what's the divergence?  As explained in the PR there seems to be
no performance advantage on adding our own version of openmp.

Thanks.

On Mon, Jun 17, 2019 at 12:55 PM Pedro Larroy
<pe...@gmail.com> wrote:
>
> Thanks for your answer Chris. I'm not militant in regards to one
> option or another nor belong of any club that will accept me as
> member, but I think we should drive this to conclusion and understand
> why we keep it or if we should remove it and use the platform provided
> version. There are open issues linked on that PR that are causing
> problems, that remain unaddressed it has been said that linking with
> two openmp versions causes undefined behaviour, I had the experience
> of having MXNet hang when using OpenMP running the tests in plain
> ubuntu CPU, I know we are reentering OpenMP initialization when
> creating threads in the engine and getting an assertion, etc, so I
> have serious concerns when running MXNet with OpenMP enabled in
> production.
>
> I think you are the expert in OpenMP and HPC and it would be good to
> have a documented and explainable outcome of one option or another
> either in the mailing list or in the wiki.
>
> I also feel bad for contributors spending time measuring and
> benchmarking without conclusion, I feel they have given up working on
> this issue since the PR keeps getting closed and is not moved forward.
>
> Please help us understand whats the best way forward with this issue
> and not just close the PR without explanations.
>
> Pedro.
>
> On Mon, Jun 17, 2019 at 11:15 AM Chris Olivier <cj...@gmail.com> wrote:
> >
> > I am curious why you're being so militant troll about this.  libomp is used
> > in every MKL build (download mxnet-mkl yourself and see).  I don't see any
> > convincing reason to change it and so far as I can tell, no real issue has
> > been proven to be related.  Anyway, I am reluctant to feed trolls any more
> > than this, so I don't really have much else to add.
> >
> > ldd /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so
> >         linux-vdso.so.1 (0x00007ffc989cf000)
> >         libmklml_intel.so =>
> > /usr/local/lib/python3.6/dist-packages/mxnet/libmklml_intel.so
> > (0x00007f0afb7c1000)
> >        * libiomp5.so =>
> > /usr/local/lib/python3.6/dist-packages/mxnet/libiomp5.so
> > (0x00007f0afb3e5000)*
> >         librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f0afb1dd000)
> >         libmkldnn.so.0 =>
> > /usr/local/lib/python3.6/dist-packages/mxnet/libmkldnn.so.0
> > (0x00007f0afa7ba000)
> >         libgfortran.so.3 =>
> > /usr/local/lib/python3.6/dist-packages/mxnet/libgfortran.so.3
> > (0x00007f0afa493000)
> >         libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f0afa28f000)
> >         libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> > (0x00007f0af9f06000)
> >         libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f0af9b68000)
> >         libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1
> > (0x00007f0af9950000)
> >         libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
> > (0x00007f0af9731000)
> >         libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0af9340000)
> >         /lib64/ld-linux-x86-64.so.2 (0x00007f0b073f4000)
> >         libquadmath.so.0 =>
> > /usr/local/lib/python3.6/dist-packages/mxnet/libquadmath.so.0
> > (0x00007f0af9100000)
> >
> >
> > On Mon, Jun 17, 2019 at 10:58 AM Pedro Larroy <pe...@gmail.com>
> > wrote:
> >
> > > I had read the "Apache Voting Process" guide here:
> > > https://www.apache.org/foundation/voting.html  and I thought code
> > > changes could be discussed on the mailing list in cases where the PR
> > > is stuck or there's no response for a long time, also I understood
> > > that -1's have to be justified.  Could you, or someone more familiar
> > > which the Apache way enlighten on how to move this issue forward in a
> > > constructive way?
> > >
> > > Thanks a lot.
> > >
> > > Pedro.
> > >
> > > On Mon, Jun 17, 2019 at 10:46 AM Pedro Larroy
> > > <pe...@gmail.com> wrote:
> > > >
> > > > Thanks.
> > > >
> > > > How do we go on advancing this PR then? all the questions have been
> > > > answered, performance numbers provided and more. Until how long can a
> > > > veto stand? Also without replies to contributors.
> > > >
> > > > Pedro.
> > > >
> > > > On Fri, Jun 14, 2019 at 5:44 PM Sheng Zha <zh...@apache.org> wrote:
> > > > >
> > > > > This vote is invalid as the original PR has been vetoed by a
> > > committer. A vote on dev@ won't help you circumvent a veto.
> > > > >
> > > > > -sz
> > > > >
> > > > > On 2019/06/14 23:59:33, Pedro Larroy <pe...@gmail.com>
> > > wrote:
> > > > > > Hi all
> > > > > >
> > > > > > This is a 5-day vote to act and wrap up an outstanding PR that
> > > removes
> > > > > > linkage with multiple OpenMP from 3rdparty and uses the system
> > > > > > provided one which might resolve a number of difficult to debug
> > > issues
> > > > > > and possible undefined behaviour.
> > > > > >
> > > > > > https://github.com/apache/incubator-mxnet/pull/12160
> > > > > >
> > > > > > See the comments in the thread for more details but in short, linking
> > > > > > with multiple openmp versions seems to lead to undefined behaviour,
> > > > > > plus it would simplify not having to deal with a custom openmp
> > > version
> > > > > > and rely on the platform provided one.
> > > > > >
> > > > > > This is expected to simplify builds and address a number of problems.
> > > > > > Seems it doesn't cause any performance degradation, (the Gluon tests
> > > > > > run almost 4x faster in my 64 core machine).
> > > > > >
> > > > > > There has been in depth study of performance implications by
> > > > > > contributors like Stanislav Tsukrov and Anton Chernov.  All the
> > > > > > concerns and comments from the reviewers have been addressed and we
> > > > > > can't keep asking open ended questions to block PRs. Reviewers are
> > > > > > expected to be proactive and responsive to contributors so we keep
> > > > > > encouraging active contributors.
> > > > > >
> > > > > > please vote to merge this PR accordingly:
> > > > > >
> > > > > > +1 = approve
> > > > > > +0 = no opinion
> > > > > > -1 = disapprove (provide reason)
> > > > > >
> > > > > > If we observe regressions reported by any internal performance
> > > systems
> > > > > > or by contributors the PR can be reverted easily. So it's not a one
> > > > > > way door. But it will be useful to try this in master for a while.
> > > > > >
> > > > > > Thank you.
> > > > > >
> > > > > > Pedro.
> > > > > >
> > >

Re: [VOTE] Remove conflicting OpenMP from CMake builds

Posted by Pedro Larroy <pe...@gmail.com>.
Thanks for your answer Chris. I'm not militant in regards to one
option or another nor belong of any club that will accept me as
member, but I think we should drive this to conclusion and understand
why we keep it or if we should remove it and use the platform provided
version. There are open issues linked on that PR that are causing
problems, that remain unaddressed it has been said that linking with
two openmp versions causes undefined behaviour, I had the experience
of having MXNet hang when using OpenMP running the tests in plain
ubuntu CPU, I know we are reentering OpenMP initialization when
creating threads in the engine and getting an assertion, etc, so I
have serious concerns when running MXNet with OpenMP enabled in
production.

I think you are the expert in OpenMP and HPC and it would be good to
have a documented and explainable outcome of one option or another
either in the mailing list or in the wiki.

I also feel bad for contributors spending time measuring and
benchmarking without conclusion, I feel they have given up working on
this issue since the PR keeps getting closed and is not moved forward.

Please help us understand whats the best way forward with this issue
and not just close the PR without explanations.

Pedro.

On Mon, Jun 17, 2019 at 11:15 AM Chris Olivier <cj...@gmail.com> wrote:
>
> I am curious why you're being so militant troll about this.  libomp is used
> in every MKL build (download mxnet-mkl yourself and see).  I don't see any
> convincing reason to change it and so far as I can tell, no real issue has
> been proven to be related.  Anyway, I am reluctant to feed trolls any more
> than this, so I don't really have much else to add.
>
> ldd /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so
>         linux-vdso.so.1 (0x00007ffc989cf000)
>         libmklml_intel.so =>
> /usr/local/lib/python3.6/dist-packages/mxnet/libmklml_intel.so
> (0x00007f0afb7c1000)
>        * libiomp5.so =>
> /usr/local/lib/python3.6/dist-packages/mxnet/libiomp5.so
> (0x00007f0afb3e5000)*
>         librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f0afb1dd000)
>         libmkldnn.so.0 =>
> /usr/local/lib/python3.6/dist-packages/mxnet/libmkldnn.so.0
> (0x00007f0afa7ba000)
>         libgfortran.so.3 =>
> /usr/local/lib/python3.6/dist-packages/mxnet/libgfortran.so.3
> (0x00007f0afa493000)
>         libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f0afa28f000)
>         libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> (0x00007f0af9f06000)
>         libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f0af9b68000)
>         libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1
> (0x00007f0af9950000)
>         libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
> (0x00007f0af9731000)
>         libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0af9340000)
>         /lib64/ld-linux-x86-64.so.2 (0x00007f0b073f4000)
>         libquadmath.so.0 =>
> /usr/local/lib/python3.6/dist-packages/mxnet/libquadmath.so.0
> (0x00007f0af9100000)
>
>
> On Mon, Jun 17, 2019 at 10:58 AM Pedro Larroy <pe...@gmail.com>
> wrote:
>
> > I had read the "Apache Voting Process" guide here:
> > https://www.apache.org/foundation/voting.html  and I thought code
> > changes could be discussed on the mailing list in cases where the PR
> > is stuck or there's no response for a long time, also I understood
> > that -1's have to be justified.  Could you, or someone more familiar
> > which the Apache way enlighten on how to move this issue forward in a
> > constructive way?
> >
> > Thanks a lot.
> >
> > Pedro.
> >
> > On Mon, Jun 17, 2019 at 10:46 AM Pedro Larroy
> > <pe...@gmail.com> wrote:
> > >
> > > Thanks.
> > >
> > > How do we go on advancing this PR then? all the questions have been
> > > answered, performance numbers provided and more. Until how long can a
> > > veto stand? Also without replies to contributors.
> > >
> > > Pedro.
> > >
> > > On Fri, Jun 14, 2019 at 5:44 PM Sheng Zha <zh...@apache.org> wrote:
> > > >
> > > > This vote is invalid as the original PR has been vetoed by a
> > committer. A vote on dev@ won't help you circumvent a veto.
> > > >
> > > > -sz
> > > >
> > > > On 2019/06/14 23:59:33, Pedro Larroy <pe...@gmail.com>
> > wrote:
> > > > > Hi all
> > > > >
> > > > > This is a 5-day vote to act and wrap up an outstanding PR that
> > removes
> > > > > linkage with multiple OpenMP from 3rdparty and uses the system
> > > > > provided one which might resolve a number of difficult to debug
> > issues
> > > > > and possible undefined behaviour.
> > > > >
> > > > > https://github.com/apache/incubator-mxnet/pull/12160
> > > > >
> > > > > See the comments in the thread for more details but in short, linking
> > > > > with multiple openmp versions seems to lead to undefined behaviour,
> > > > > plus it would simplify not having to deal with a custom openmp
> > version
> > > > > and rely on the platform provided one.
> > > > >
> > > > > This is expected to simplify builds and address a number of problems.
> > > > > Seems it doesn't cause any performance degradation, (the Gluon tests
> > > > > run almost 4x faster in my 64 core machine).
> > > > >
> > > > > There has been in depth study of performance implications by
> > > > > contributors like Stanislav Tsukrov and Anton Chernov.  All the
> > > > > concerns and comments from the reviewers have been addressed and we
> > > > > can't keep asking open ended questions to block PRs. Reviewers are
> > > > > expected to be proactive and responsive to contributors so we keep
> > > > > encouraging active contributors.
> > > > >
> > > > > please vote to merge this PR accordingly:
> > > > >
> > > > > +1 = approve
> > > > > +0 = no opinion
> > > > > -1 = disapprove (provide reason)
> > > > >
> > > > > If we observe regressions reported by any internal performance
> > systems
> > > > > or by contributors the PR can be reverted easily. So it's not a one
> > > > > way door. But it will be useful to try this in master for a while.
> > > > >
> > > > > Thank you.
> > > > >
> > > > > Pedro.
> > > > >
> >

Re: [VOTE] Remove conflicting OpenMP from CMake builds

Posted by Chris Olivier <cj...@gmail.com>.
I am curious why you're being so militant troll about this.  libomp is used
in every MKL build (download mxnet-mkl yourself and see).  I don't see any
convincing reason to change it and so far as I can tell, no real issue has
been proven to be related.  Anyway, I am reluctant to feed trolls any more
than this, so I don't really have much else to add.

ldd /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so
        linux-vdso.so.1 (0x00007ffc989cf000)
        libmklml_intel.so =>
/usr/local/lib/python3.6/dist-packages/mxnet/libmklml_intel.so
(0x00007f0afb7c1000)
       * libiomp5.so =>
/usr/local/lib/python3.6/dist-packages/mxnet/libiomp5.so
(0x00007f0afb3e5000)*
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f0afb1dd000)
        libmkldnn.so.0 =>
/usr/local/lib/python3.6/dist-packages/mxnet/libmkldnn.so.0
(0x00007f0afa7ba000)
        libgfortran.so.3 =>
/usr/local/lib/python3.6/dist-packages/mxnet/libgfortran.so.3
(0x00007f0afa493000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f0afa28f000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6
(0x00007f0af9f06000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f0af9b68000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1
(0x00007f0af9950000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
(0x00007f0af9731000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0af9340000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f0b073f4000)
        libquadmath.so.0 =>
/usr/local/lib/python3.6/dist-packages/mxnet/libquadmath.so.0
(0x00007f0af9100000)


On Mon, Jun 17, 2019 at 10:58 AM Pedro Larroy <pe...@gmail.com>
wrote:

> I had read the "Apache Voting Process" guide here:
> https://www.apache.org/foundation/voting.html  and I thought code
> changes could be discussed on the mailing list in cases where the PR
> is stuck or there's no response for a long time, also I understood
> that -1's have to be justified.  Could you, or someone more familiar
> which the Apache way enlighten on how to move this issue forward in a
> constructive way?
>
> Thanks a lot.
>
> Pedro.
>
> On Mon, Jun 17, 2019 at 10:46 AM Pedro Larroy
> <pe...@gmail.com> wrote:
> >
> > Thanks.
> >
> > How do we go on advancing this PR then? all the questions have been
> > answered, performance numbers provided and more. Until how long can a
> > veto stand? Also without replies to contributors.
> >
> > Pedro.
> >
> > On Fri, Jun 14, 2019 at 5:44 PM Sheng Zha <zh...@apache.org> wrote:
> > >
> > > This vote is invalid as the original PR has been vetoed by a
> committer. A vote on dev@ won't help you circumvent a veto.
> > >
> > > -sz
> > >
> > > On 2019/06/14 23:59:33, Pedro Larroy <pe...@gmail.com>
> wrote:
> > > > Hi all
> > > >
> > > > This is a 5-day vote to act and wrap up an outstanding PR that
> removes
> > > > linkage with multiple OpenMP from 3rdparty and uses the system
> > > > provided one which might resolve a number of difficult to debug
> issues
> > > > and possible undefined behaviour.
> > > >
> > > > https://github.com/apache/incubator-mxnet/pull/12160
> > > >
> > > > See the comments in the thread for more details but in short, linking
> > > > with multiple openmp versions seems to lead to undefined behaviour,
> > > > plus it would simplify not having to deal with a custom openmp
> version
> > > > and rely on the platform provided one.
> > > >
> > > > This is expected to simplify builds and address a number of problems.
> > > > Seems it doesn't cause any performance degradation, (the Gluon tests
> > > > run almost 4x faster in my 64 core machine).
> > > >
> > > > There has been in depth study of performance implications by
> > > > contributors like Stanislav Tsukrov and Anton Chernov.  All the
> > > > concerns and comments from the reviewers have been addressed and we
> > > > can't keep asking open ended questions to block PRs. Reviewers are
> > > > expected to be proactive and responsive to contributors so we keep
> > > > encouraging active contributors.
> > > >
> > > > please vote to merge this PR accordingly:
> > > >
> > > > +1 = approve
> > > > +0 = no opinion
> > > > -1 = disapprove (provide reason)
> > > >
> > > > If we observe regressions reported by any internal performance
> systems
> > > > or by contributors the PR can be reverted easily. So it's not a one
> > > > way door. But it will be useful to try this in master for a while.
> > > >
> > > > Thank you.
> > > >
> > > > Pedro.
> > > >
>

Re: [VOTE] Remove conflicting OpenMP from CMake builds

Posted by Pedro Larroy <pe...@gmail.com>.
I had read the "Apache Voting Process" guide here:
https://www.apache.org/foundation/voting.html  and I thought code
changes could be discussed on the mailing list in cases where the PR
is stuck or there's no response for a long time, also I understood
that -1's have to be justified.  Could you, or someone more familiar
which the Apache way enlighten on how to move this issue forward in a
constructive way?

Thanks a lot.

Pedro.

On Mon, Jun 17, 2019 at 10:46 AM Pedro Larroy
<pe...@gmail.com> wrote:
>
> Thanks.
>
> How do we go on advancing this PR then? all the questions have been
> answered, performance numbers provided and more. Until how long can a
> veto stand? Also without replies to contributors.
>
> Pedro.
>
> On Fri, Jun 14, 2019 at 5:44 PM Sheng Zha <zh...@apache.org> wrote:
> >
> > This vote is invalid as the original PR has been vetoed by a committer. A vote on dev@ won't help you circumvent a veto.
> >
> > -sz
> >
> > On 2019/06/14 23:59:33, Pedro Larroy <pe...@gmail.com> wrote:
> > > Hi all
> > >
> > > This is a 5-day vote to act and wrap up an outstanding PR that removes
> > > linkage with multiple OpenMP from 3rdparty and uses the system
> > > provided one which might resolve a number of difficult to debug issues
> > > and possible undefined behaviour.
> > >
> > > https://github.com/apache/incubator-mxnet/pull/12160
> > >
> > > See the comments in the thread for more details but in short, linking
> > > with multiple openmp versions seems to lead to undefined behaviour,
> > > plus it would simplify not having to deal with a custom openmp version
> > > and rely on the platform provided one.
> > >
> > > This is expected to simplify builds and address a number of problems.
> > > Seems it doesn't cause any performance degradation, (the Gluon tests
> > > run almost 4x faster in my 64 core machine).
> > >
> > > There has been in depth study of performance implications by
> > > contributors like Stanislav Tsukrov and Anton Chernov.  All the
> > > concerns and comments from the reviewers have been addressed and we
> > > can't keep asking open ended questions to block PRs. Reviewers are
> > > expected to be proactive and responsive to contributors so we keep
> > > encouraging active contributors.
> > >
> > > please vote to merge this PR accordingly:
> > >
> > > +1 = approve
> > > +0 = no opinion
> > > -1 = disapprove (provide reason)
> > >
> > > If we observe regressions reported by any internal performance systems
> > > or by contributors the PR can be reverted easily. So it's not a one
> > > way door. But it will be useful to try this in master for a while.
> > >
> > > Thank you.
> > >
> > > Pedro.
> > >

Re: [VOTE] Remove conflicting OpenMP from CMake builds

Posted by Pedro Larroy <pe...@gmail.com>.
Thanks.

How do we go on advancing this PR then? all the questions have been
answered, performance numbers provided and more. Until how long can a
veto stand? Also without replies to contributors.

Pedro.

On Fri, Jun 14, 2019 at 5:44 PM Sheng Zha <zh...@apache.org> wrote:
>
> This vote is invalid as the original PR has been vetoed by a committer. A vote on dev@ won't help you circumvent a veto.
>
> -sz
>
> On 2019/06/14 23:59:33, Pedro Larroy <pe...@gmail.com> wrote:
> > Hi all
> >
> > This is a 5-day vote to act and wrap up an outstanding PR that removes
> > linkage with multiple OpenMP from 3rdparty and uses the system
> > provided one which might resolve a number of difficult to debug issues
> > and possible undefined behaviour.
> >
> > https://github.com/apache/incubator-mxnet/pull/12160
> >
> > See the comments in the thread for more details but in short, linking
> > with multiple openmp versions seems to lead to undefined behaviour,
> > plus it would simplify not having to deal with a custom openmp version
> > and rely on the platform provided one.
> >
> > This is expected to simplify builds and address a number of problems.
> > Seems it doesn't cause any performance degradation, (the Gluon tests
> > run almost 4x faster in my 64 core machine).
> >
> > There has been in depth study of performance implications by
> > contributors like Stanislav Tsukrov and Anton Chernov.  All the
> > concerns and comments from the reviewers have been addressed and we
> > can't keep asking open ended questions to block PRs. Reviewers are
> > expected to be proactive and responsive to contributors so we keep
> > encouraging active contributors.
> >
> > please vote to merge this PR accordingly:
> >
> > +1 = approve
> > +0 = no opinion
> > -1 = disapprove (provide reason)
> >
> > If we observe regressions reported by any internal performance systems
> > or by contributors the PR can be reverted easily. So it's not a one
> > way door. But it will be useful to try this in master for a while.
> >
> > Thank you.
> >
> > Pedro.
> >

Re: [VOTE] Remove conflicting OpenMP from CMake builds

Posted by Sheng Zha <zh...@apache.org>.
This vote is invalid as the original PR has been vetoed by a committer. A vote on dev@ won't help you circumvent a veto.

-sz

On 2019/06/14 23:59:33, Pedro Larroy <pe...@gmail.com> wrote: 
> Hi all
> 
> This is a 5-day vote to act and wrap up an outstanding PR that removes
> linkage with multiple OpenMP from 3rdparty and uses the system
> provided one which might resolve a number of difficult to debug issues
> and possible undefined behaviour.
> 
> https://github.com/apache/incubator-mxnet/pull/12160
> 
> See the comments in the thread for more details but in short, linking
> with multiple openmp versions seems to lead to undefined behaviour,
> plus it would simplify not having to deal with a custom openmp version
> and rely on the platform provided one.
> 
> This is expected to simplify builds and address a number of problems.
> Seems it doesn't cause any performance degradation, (the Gluon tests
> run almost 4x faster in my 64 core machine).
> 
> There has been in depth study of performance implications by
> contributors like Stanislav Tsukrov and Anton Chernov.  All the
> concerns and comments from the reviewers have been addressed and we
> can't keep asking open ended questions to block PRs. Reviewers are
> expected to be proactive and responsive to contributors so we keep
> encouraging active contributors.
> 
> please vote to merge this PR accordingly:
> 
> +1 = approve
> +0 = no opinion
> -1 = disapprove (provide reason)
> 
> If we observe regressions reported by any internal performance systems
> or by contributors the PR can be reverted easily. So it's not a one
> way door. But it will be useful to try this in master for a while.
> 
> Thank you.
> 
> Pedro.
> 

Re: [VOTE] Remove conflicting OpenMP from CMake builds

Posted by Pedro Larroy <pe...@gmail.com>.
Let's cancel this vote for now and move the OMP conversation to a
discuss thread as suggested here in order not to antagonize anyone.
Let's find the best solution, this is not about one upping anyone but
to gather facts and document what's going on.

Chris, would you be so kind to reopen the PR so we can keep gathering
facts to understand and reproduce your results so we can have a
constructive discussion?  I think this would be a compromise to move
forward in a non confrontational way.

Thank you.

Pedro.

On Tue, Jun 18, 2019 at 1:35 PM Pedro Larroy
<pe...@gmail.com> wrote:
>
> Good suggestion. I thought the issue was discussed at length but I
> agree might be better to softly bring them to the list as a discuss
> first instead of a direct vote. Will do so in the future.
>
> Thanks.
>
>
> On Tue, Jun 18, 2019 at 12:56 PM Tianqi Chen <tq...@cs.washington.edu> wrote:
> >
> > In a situation like this, I would suggest a DISCUSS thread is more proper
> > before the voting one.
> > As the tension can generally be high in a voting thread and it is hard to
> > make a technical decision without previous discussions and pros/cons being
> > listed.
> >
> > Tianqi
> >
> > On Fri, Jun 14, 2019 at 4:59 PM Pedro Larroy <pe...@gmail.com>
> > wrote:
> >
> > > Hi all
> > >
> > > This is a 5-day vote to act and wrap up an outstanding PR that removes
> > > linkage with multiple OpenMP from 3rdparty and uses the system
> > > provided one which might resolve a number of difficult to debug issues
> > > and possible undefined behaviour.
> > >
> > > https://github.com/apache/incubator-mxnet/pull/12160
> > >
> > > See the comments in the thread for more details but in short, linking
> > > with multiple openmp versions seems to lead to undefined behaviour,
> > > plus it would simplify not having to deal with a custom openmp version
> > > and rely on the platform provided one.
> > >
> > > This is expected to simplify builds and address a number of problems.
> > > Seems it doesn't cause any performance degradation, (the Gluon tests
> > > run almost 4x faster in my 64 core machine).
> > >
> > > There has been in depth study of performance implications by
> > > contributors like Stanislav Tsukrov and Anton Chernov.  All the
> > > concerns and comments from the reviewers have been addressed and we
> > > can't keep asking open ended questions to block PRs. Reviewers are
> > > expected to be proactive and responsive to contributors so we keep
> > > encouraging active contributors.
> > >
> > > please vote to merge this PR accordingly:
> > >
> > > +1 = approve
> > > +0 = no opinion
> > > -1 = disapprove (provide reason)
> > >
> > > If we observe regressions reported by any internal performance systems
> > > or by contributors the PR can be reverted easily. So it's not a one
> > > way door. But it will be useful to try this in master for a while.
> > >
> > > Thank you.
> > >
> > > Pedro.
> > >

Re: [VOTE] Remove conflicting OpenMP from CMake builds

Posted by Pedro Larroy <pe...@gmail.com>.
Good suggestion. I thought the issue was discussed at length but I
agree might be better to softly bring them to the list as a discuss
first instead of a direct vote. Will do so in the future.

Thanks.


On Tue, Jun 18, 2019 at 12:56 PM Tianqi Chen <tq...@cs.washington.edu> wrote:
>
> In a situation like this, I would suggest a DISCUSS thread is more proper
> before the voting one.
> As the tension can generally be high in a voting thread and it is hard to
> make a technical decision without previous discussions and pros/cons being
> listed.
>
> Tianqi
>
> On Fri, Jun 14, 2019 at 4:59 PM Pedro Larroy <pe...@gmail.com>
> wrote:
>
> > Hi all
> >
> > This is a 5-day vote to act and wrap up an outstanding PR that removes
> > linkage with multiple OpenMP from 3rdparty and uses the system
> > provided one which might resolve a number of difficult to debug issues
> > and possible undefined behaviour.
> >
> > https://github.com/apache/incubator-mxnet/pull/12160
> >
> > See the comments in the thread for more details but in short, linking
> > with multiple openmp versions seems to lead to undefined behaviour,
> > plus it would simplify not having to deal with a custom openmp version
> > and rely on the platform provided one.
> >
> > This is expected to simplify builds and address a number of problems.
> > Seems it doesn't cause any performance degradation, (the Gluon tests
> > run almost 4x faster in my 64 core machine).
> >
> > There has been in depth study of performance implications by
> > contributors like Stanislav Tsukrov and Anton Chernov.  All the
> > concerns and comments from the reviewers have been addressed and we
> > can't keep asking open ended questions to block PRs. Reviewers are
> > expected to be proactive and responsive to contributors so we keep
> > encouraging active contributors.
> >
> > please vote to merge this PR accordingly:
> >
> > +1 = approve
> > +0 = no opinion
> > -1 = disapprove (provide reason)
> >
> > If we observe regressions reported by any internal performance systems
> > or by contributors the PR can be reverted easily. So it's not a one
> > way door. But it will be useful to try this in master for a while.
> >
> > Thank you.
> >
> > Pedro.
> >

Re: [VOTE] Remove conflicting OpenMP from CMake builds

Posted by Junru Shao <ju...@gmail.com>.
+1 for doing a [discuss] thread.

Listing historic discussion threads/issues/prs on omp would be helpful as
well.

By the way, I suggest that we remain purely technical by discussing pros
and cons. Let’s not include others’ names in technical discussion.

Thanks,
Junru

On Tue, Jun 18, 2019 at 12:56 Tianqi Chen <tq...@cs.washington.edu> wrote:

> In a situation like this, I would suggest a DISCUSS thread is more proper
> before the voting one.
> As the tension can generally be high in a voting thread and it is hard to
> make a technical decision without previous discussions and pros/cons being
> listed.
>
> Tianqi
>
> On Fri, Jun 14, 2019 at 4:59 PM Pedro Larroy <pedro.larroy.lists@gmail.com
> >
> wrote:
>
> > Hi all
> >
> > This is a 5-day vote to act and wrap up an outstanding PR that removes
> > linkage with multiple OpenMP from 3rdparty and uses the system
> > provided one which might resolve a number of difficult to debug issues
> > and possible undefined behaviour.
> >
> > https://github.com/apache/incubator-mxnet/pull/12160
> >
> > See the comments in the thread for more details but in short, linking
> > with multiple openmp versions seems to lead to undefined behaviour,
> > plus it would simplify not having to deal with a custom openmp version
> > and rely on the platform provided one.
> >
> > This is expected to simplify builds and address a number of problems.
> > Seems it doesn't cause any performance degradation, (the Gluon tests
> > run almost 4x faster in my 64 core machine).
> >
> > There has been in depth study of performance implications by
> > contributors like Stanislav Tsukrov and Anton Chernov.  All the
> > concerns and comments from the reviewers have been addressed and we
> > can't keep asking open ended questions to block PRs. Reviewers are
> > expected to be proactive and responsive to contributors so we keep
> > encouraging active contributors.
> >
> > please vote to merge this PR accordingly:
> >
> > +1 = approve
> > +0 = no opinion
> > -1 = disapprove (provide reason)
> >
> > If we observe regressions reported by any internal performance systems
> > or by contributors the PR can be reverted easily. So it's not a one
> > way door. But it will be useful to try this in master for a while.
> >
> > Thank you.
> >
> > Pedro.
> >
>

Re: [VOTE] Remove conflicting OpenMP from CMake builds

Posted by Tianqi Chen <tq...@cs.washington.edu>.
In a situation like this, I would suggest a DISCUSS thread is more proper
before the voting one.
As the tension can generally be high in a voting thread and it is hard to
make a technical decision without previous discussions and pros/cons being
listed.

Tianqi

On Fri, Jun 14, 2019 at 4:59 PM Pedro Larroy <pe...@gmail.com>
wrote:

> Hi all
>
> This is a 5-day vote to act and wrap up an outstanding PR that removes
> linkage with multiple OpenMP from 3rdparty and uses the system
> provided one which might resolve a number of difficult to debug issues
> and possible undefined behaviour.
>
> https://github.com/apache/incubator-mxnet/pull/12160
>
> See the comments in the thread for more details but in short, linking
> with multiple openmp versions seems to lead to undefined behaviour,
> plus it would simplify not having to deal with a custom openmp version
> and rely on the platform provided one.
>
> This is expected to simplify builds and address a number of problems.
> Seems it doesn't cause any performance degradation, (the Gluon tests
> run almost 4x faster in my 64 core machine).
>
> There has been in depth study of performance implications by
> contributors like Stanislav Tsukrov and Anton Chernov.  All the
> concerns and comments from the reviewers have been addressed and we
> can't keep asking open ended questions to block PRs. Reviewers are
> expected to be proactive and responsive to contributors so we keep
> encouraging active contributors.
>
> please vote to merge this PR accordingly:
>
> +1 = approve
> +0 = no opinion
> -1 = disapprove (provide reason)
>
> If we observe regressions reported by any internal performance systems
> or by contributors the PR can be reverted easily. So it's not a one
> way door. But it will be useful to try this in master for a while.
>
> Thank you.
>
> Pedro.
>