Posted to dev@mxnet.apache.org by Anton Chernov <me...@gmail.com> on 2019/02/12 12:37:31 UTC

Benchmarking MXNet with different compilers and different OpenMP implementations (results)

Dear MXNet community,

Because of multiple problems related to OpenMP and a stale proposed change
[1], we have been gathering performance data on the impact of using
different OpenMP implementations with MXNet (many thanks to Stanislav
Tsukrov for the hard work). The results can be found here [2].

In short: the difference between compilers is insignificant, and reasonably
recent native OpenMP implementations perform equally (<5% difference). See
the document for details.
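
To make the "<5% difference" criterion concrete, here is a minimal sketch of
how two timings could be compared. The function names and the sample timings
are hypothetical; they are not taken from the benchmark document [2].

```python
def relative_difference(baseline: float, candidate: float) -> float:
    """Return |candidate - baseline| / baseline as a fraction."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return abs(candidate - baseline) / baseline


def performs_equally(baseline: float, candidate: float, threshold: float = 0.05) -> bool:
    """True when the two timings differ by less than `threshold` (5% by default)."""
    return relative_difference(baseline, candidate) < threshold


# Hypothetical timings in seconds, for illustration only.
print(performs_equally(10.0, 10.3))  # timings 3% apart
print(performs_equally(10.0, 11.0))  # timings 10% apart
```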

Please review the document and share your thoughts on the topic.

Thanks!

Best
Anton

[1] https://lists.apache.org/thread.html/4827f0f742b6e7e070da350ea81226d059401527f3072ce8b33c1fdf@%3Cdev.mxnet.apache.org%3E
[2] https://cwiki.apache.org/confluence/x/2wclBg

Re: Benchmarking MXNet with different compilers and different OpenMP implementations (results)

Posted by Pedro Larroy <pe...@gmail.com>.
Are there any updates on this?

This is still affecting multiprocessing; some tests hang:

rces. For information on submitting this issue, please see
https://bugs.llvm.org/.
[INFO] Setting test np/mx/python random seeds, use
MXNET_TEST_SEED=2124604270 to reproduce.
Assertion failure at kmp_runtime.cpp(6479): __kmp_thread_pool == __null.
OMP: Error #13: Assertion failure at kmp_runtime.cpp(6479).
OMP: Hint: Please submit a bug report with this message, compile and
run commands used, and machine configuration info including native
compiler and operating system versions. Faster response will be
obtained by including all program sources. For information on
submitting this issue, please see https://bugs.llvm.org/.
Assertion failure at kmp_runtime.cpp(6479): __kmp_thread_pool == __null.
OMP: Error #13: Assertion failure at kmp_runtime.cpp(6479).
OMP: Hint: Please submit a bug report with this message, compile and
run commands used, and machine configuration info including native
compiler and operating system versions. Faster response will be
obtained by including all program sources. For information on
submitting this issue, please see https://bugs.llvm.org/.
^CException ignored in: <bound method DataLoader.__del__ of
<mxnet.gluon.data.dataloader.DataLoader object at 0x7f18fb7b6400>>
Traceback (most recent call last):
  File "/home/piotr/mxnet_other/python/mxnet/gluon/data/dataloader.py",
line 595, in __del__
    self._worker_pool.terminate()
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 567, in terminate
    self._terminate()
  File "/usr/lib/python3.6/multiprocessing/util.py", line 186, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 597, in
_terminate_pool
    cls._help_stuff_finish(inqueue, task_handler, len(pool))
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 582, in
_help_stuff_finish
    inqueue._rlock.acquire()
KeyboardInterrupt
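
A hang like the one above has to be interrupted by hand (the ^C in the log).
As a rough sketch, a test command could instead be run under a timeout so the
hang is reported automatically. The helper name and the stand-in command are
assumptions for illustration; only MXNET_TEST_SEED and its value come from
the log.

```python
import os
import subprocess
import sys


def run_with_timeout(cmd, timeout_s, seed=None):
    """Run `cmd`; return its exit code, or None if it exceeded `timeout_s`."""
    env = dict(os.environ)
    if seed is not None:
        # MXNET_TEST_SEED is the variable named in the log above.
        env["MXNET_TEST_SEED"] = str(seed)
    try:
        completed = subprocess.run(cmd, env=env, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills and reaps the direct child before re-raising.
        return None
    return completed.returncode


if __name__ == "__main__":
    # Stand-in command that outlives the timeout; replace it with the real
    # test invocation.
    hanging = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(hanging, timeout_s=1, seed=2124604270))
```

Note that only the direct child is killed on timeout; worker processes that
the child started (such as DataLoader workers) may need separate cleanup.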

Pedro.

On Thu, Feb 14, 2019 at 6:30 AM Tsukrov, Stanislav
<st...@gmail.com> wrote:
>
> Thanks Aaron for the feedback.
>
> > As for your next steps, would you propose that cmake be brought up to parity?
> Yes. sse2 in cmake vs sse3 in make is a minor example without high impact. There are others.
>
> > It seems strange that it causes slowness and if so, it shouldn't be recommended for now.
> There are some issues in the CMake files that should be fixed. Some of them were worked around for the benchmark.
>
> Best Regards
>
> Stas
>

Re: Benchmarking MXNet with different compilers and different OpenMP implementations (results)

Posted by "Tsukrov, Stanislav" <st...@gmail.com>.
Thanks Aaron for the feedback.

> As for your next steps, would you propose that cmake be brought up to parity? 
Yes. sse2 in cmake vs sse3 in make is a minor example without high impact. There are others.
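
One way to find such mismatches is to diff the compiler flags that each build
system ends up using. The following is only a sketch: the flag lines are
hypothetical, and the -msse2 / -msse3 spelling is an assumption based on the
"sse2 in cmake vs sse3 in make" remark above.

```python
def flag_differences(make_flags: str, cmake_flags: str):
    """Return (only_in_make, only_in_cmake) as sorted lists of flags."""
    make_set = set(make_flags.split())
    cmake_set = set(cmake_flags.split())
    return sorted(make_set - cmake_set), sorted(cmake_set - make_set)


# Hypothetical flag lines; only the SSE2 vs SSE3 difference comes from the thread.
make_line = "-O3 -fopenmp -msse3"
cmake_line = "-O3 -fopenmp -msse2"
only_make, only_cmake = flag_differences(make_line, cmake_line)
print("only in make: ", only_make)
print("only in cmake:", only_cmake)
```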

> It seems strange that it causes slowness and if so, it shouldn't be recommended for now.
There are some issues in the CMake files that should be fixed. Some of them were worked around for the benchmark.

Best Regards

Stas

On 14.02.19, 14:09, "Anton Chernov" <me...@gmail.com> wrote:

    Thank you, Aaron, for your interest in the topic.
    
    My main previous proposal still stands: remove the bundled OpenMP submodule
    and use the OpenMP provided by the environment [1]. This might lead to
    performance degradation in some cases where an old OpenMP library is used
    or thread affinity isn't set properly. But that would be a problem of the
    environment, not of MXNet.
    
    I described some alternative solutions in [1] as part of this [2] thread.
    Tricking the linker with symlinks in both cases should make it possible to
    avoid linking multiple OpenMP implementations to MXNet simultaneously.
    Windows questions would still be open.
    
    Best
    Anton
    
    [1] https://github.com/apache/incubator-mxnet/pull/12160
    [2]
    https://lists.apache.org/thread.html/007d8db15a1782e1b20896a4050b62710d4ff0908c67b94af7cb0f8b@%3Cdev.mxnet.apache.org%3E
    [3]
    https://lists.apache.org/thread.html/4827f0f742b6e7e070da350ea81226d059401527f3072ce8b33c1fdf@%3Cdev.mxnet.apache.org%3E
    
    



Re: Benchmarking MXNet with different compilers and different OpenMP implementations (results)

Posted by Anton Chernov <me...@gmail.com>.
Thank you, Aaron, for your interest in the topic.

My main previous proposal still stands: remove the bundled OpenMP submodule
and use the OpenMP provided by the environment [1]. This might lead to
performance degradation in some cases where an old OpenMP library is used
or thread affinity isn't set properly. But that would be a problem of the
environment, not of MXNet.
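
As an illustration of what "setting thread affinity in the environment" can
look like, here is a minimal sketch using the standard OpenMP environment
variables OMP_NUM_THREADS, OMP_PROC_BIND and OMP_PLACES. The helper name and
the default values are assumptions, not a recommendation from the benchmark.

```python
import os


def openmp_env(num_threads: int, bind: str = "close", places: str = "cores") -> dict:
    """Build standard OpenMP environment settings for thread count and affinity."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    return {
        "OMP_NUM_THREADS": str(num_threads),
        "OMP_PROC_BIND": bind,   # e.g. "close", "spread", "true", "false"
        "OMP_PLACES": places,    # e.g. "threads", "cores", "sockets"
    }


# Apply before the OpenMP runtime is loaded, i.e. before `import mxnet`.
# setdefault keeps any value the user has already exported.
for name, value in openmp_env(num_threads=os.cpu_count() or 1).items():
    os.environ.setdefault(name, value)
```

Whether these settings help depends on the OpenMP runtime and the machine, so
they are best treated as a starting point for measurement.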

I described some alternative solutions in [1] as part of this [2] thread.
Tricking the linker with symlinks in both cases should make it possible to
avoid linking multiple OpenMP implementations to MXNet simultaneously.
Windows questions would still be open.
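
To check whether more than one OpenMP implementation ended up in a process,
one can look at the shared objects mapped into it. This is a Linux-only
sketch under stated assumptions: it reads /proc/self/maps and looks for the
common runtime names libgomp (GNU), libomp (LLVM) and libiomp5 (Intel); the
function names are made up for this example.

```python
import re

# Matches paths ending in e.g. libgomp.so.1.0.0, libomp.so or libiomp5.so,
# including hashed names such as libgomp-a34b3233.so.1.
_OPENMP_LIB = re.compile(
    r"/(lib(?:gomp|omp|iomp5))(?:-[^/\s]*)?\.so(?:\.[^/\s]*)?$"
)


def openmp_runtimes(maps_text: str) -> set:
    """Return the OpenMP runtime names found in /proc/<pid>/maps-style text."""
    found = set()
    for line in maps_text.splitlines():
        match = _OPENMP_LIB.search(line.strip())
        if match:
            found.add(match.group(1))
    return found


def loaded_openmp_runtimes() -> set:
    """Inspect the shared objects mapped into the current process (Linux only)."""
    with open("/proc/self/maps") as maps:
        return openmp_runtimes(maps.read())
```

Calling loaded_openmp_runtimes() after importing MXNet and getting more than
one name back would suggest that several OpenMP runtimes are loaded at once.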

Best
Anton

[1] https://github.com/apache/incubator-mxnet/pull/12160
[2]
https://lists.apache.org/thread.html/007d8db15a1782e1b20896a4050b62710d4ff0908c67b94af7cb0f8b@%3Cdev.mxnet.apache.org%3E
[3]
https://lists.apache.org/thread.html/4827f0f742b6e7e070da350ea81226d059401527f3072ce8b33c1fdf@%3Cdev.mxnet.apache.org%3E


Tue, 12 Feb 2019 at 16:39, Aaron Markham <aa...@gmail.com> wrote:

> This is really great research. I've often wondered what the difference
> really is, and why it has to be so complicated. It seems the answer is
> there isn't much difference and it shouldn't be as complex.
> As for your next steps, would you propose that cmake be brought up to
> parity? It seems strange that it causes slowness and if so, it shouldn't be
> recommended for now.
> Also, testing with Windows compilers might be quite important, as install
> stats suggest a significant portion of users are on Windows. Wouldn't this
> nudge the decision of what to use as a rule going forward?
> I ran into this submodule OpenMP issue on Windows myself. How does that get
> fixed? Do we have to repackage all of the submodules to make sure they use
> the recommended implementation, or that they use what the system expects?
>
> Cheers,
> Aaron
>

Re: Benchmarking MXNet with different compilers and different OpenMP implementations (results)

Posted by Aaron Markham <aa...@gmail.com>.
This is really great research. I've often wondered what the difference
really is, and why it has to be so complicated. It seems the answer is
there isn't much difference and it shouldn't be as complex.
As for your next steps, would you propose that cmake be brought up to
parity? It seems strange that it causes slowness and if so, it shouldn't be
recommended for now.
Also, testing with Windows compilers might be quite important, as install
stats suggest a significant portion of users are on Windows. Wouldn't this
nudge the decision of what to use as a rule going forward?
I ran into this submodule OpenMP issue on Windows myself. How does that get
fixed? Do we have to repackage all of the submodules to make sure they use
the recommended implementation, or that they use what the system expects?

Cheers,
Aaron
