You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@arrow.apache.org by Philipp Moritz <pc...@gmail.com> on 2018/12/16 04:43:25 UTC

TensorFlow, PyTorch, and manylinux1

Dear all,

As some of you know, there is a standard in Python called manylinux (
https://www.python.org/dev/peps/pep-0513/) to package binary executables
and libraries into a “wheel” in a way that allows the code to be run on a
wide variety of Linux distributions. This is very convenient for Python
users, since such libraries can be easily installed via pip.

This standard is also important for a second reason: If many different
wheels are used together in a single Python process, adhering to manylinux
ensures that these libraries work together well and don’t trip on each
other’s toes (this could easily happen if different versions of libstdc++
are used for example). Therefore *even if support for only a single
distribution like Ubuntu is desired*, it is important to be manylinux
compatible to make sure everybody’s wheels work together well.

TensorFlow and PyTorch unfortunately don’t produce manylinux compatible
wheels. The challenge is due, at least in part, to the need to use
nvidia-docker to build GPU binaries [10]. This causes various levels of
pain for the rest of the Python community, see for example [1] [2] [3] [4]
[5] [6] [7] [8].

The purpose of the e-mail is to get a discussion started on how we can make
TensorFlow and PyTorch manylinux compliant. There is a new standard in the
works [9] so hopefully we can discuss what would be necessary to make sure
TensorFlow and PyTorch can adhere to this standard in the future.

It would make everybody’s lives just a little bit better! Any ideas are
appreciated.

@soumith: Could you cc the relevant list? I couldn't find a pytorch dev
mailing list.

Best,
Philipp.

[1] https://github.com/tensorflow/tensorflow/issues/5033
[2] https://github.com/tensorflow/tensorflow/issues/8802
[3] https://github.com/primitiv/primitiv-python/issues/28
[4] https://github.com/zarr-developers/numcodecs/issues/70
[5] https://github.com/apache/arrow/pull/3177
[6] https://github.com/tensorflow/tensorflow/issues/13615
[7] https://github.com/pytorch/pytorch/issues/8358
[8] https://github.com/ray-project/ray/issues/2159
[9] https://www.python.org/dev/peps/pep-0571/
[10]
https://github.com/tensorflow/tensorflow/issues/8802#issuecomment-291935940

Re: TensorFlow, PyTorch, and manylinux1

Posted by soumith <so...@gmail.com>.

Hey Travis,

PyTorch and anaconda are actually smooth. There are no issues with
Anaconda, and we officially maintain conda packages (it's also our
recommended and default package manager).

Conda-forge recipes are currently not possible because conda-forge hasn't
finalized their CUDA packaging mechanisms.

This thread is mostly focusing on unscrewing the PyPI situation.
--
S

On Mon, Dec 17, 2018 at 9:54 AM Travis Oliphant <tr...@quansight.com>
wrote:

> Can PyTorch provide and maintain a conda-forge recipe?
>
> This would allow the large and growing conda forge ecosystem to easily
> install PyTorch in a community-supported way.
>
> Are there problems with using conda or another general package manager?
>
> I agree that the machine learning packages are trying to make a language
> specific package manager do more than it was intended and other open source
> solutions already exist.
>
> Thanks,
>
> Travis
>
>
> On Mon, Dec 17, 2018, 12:32 AM soumith <soumith@gmail.com wrote:
>
> > I'm reposting my original reply below the current reply (below a dotted
> > line). It was filtered out because I wasn't subscribed to the relevant
> > mailing lists.
> >
> >  tl;dr: manylinux2010 looks pretty promising, because CUDA supports
> CentOS6
> > (for now).
> >
> > In the meanwhile, I dug into what pyarrow does, and it looks like it
> links
> > with `static-libstdc++` along with a linker version script [1].
> >
> > PyTorch did exactly that until Jan this year [2], except that our linker
> > version script didn't cover the subtleties of statically linking stdc++
> as
> > well as Arrow did. Because we weren't covering all of the stdc++ static
> > linking subtleties, we were facing huge issues that amplified wheel
> > incompatibility (import X; import torch crashing under various X). Hence,
> > we moved since then to linking with system-shipped libstdc++, doing no
> > static stdc++ linking.
> >
> > I'll revisit this in light of manylinux2010, and go down the path of
> static
> > linkage of stdc++ again, though I'm wary of the subtleties around
> handling
> > of weak symbols, std::string destruction across library boundaries [3]
> and
> > std::string's ABI incompatibility issues.
> >
> > I've opened a tracking issue here:
> > https://github.com/pytorch/pytorch/issues/15294
> >
> > I'm looking forward to hearing from the TensorFlow devs if manylinux2010
> is
> > sufficient for them, or what additional constraints they have.
> >
> > As a personal thought, I find multiple libraries in the same process
> > statically linking to stdc++ gross, but without a package manager like
> > Anaconda that actually is willing to deal with the C++-side dependencies,
> > there aren't many options on the table.
> >
> > References:
> >
> > [1]
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/symbols.map
> > [2] https://github.com/pytorch/pytorch/blob/v0.3.1/tools/pytorch.version
> > [3]
> https://github.com/pytorch/pytorch/issues/5400#issuecomment-369428125
> >
> >
> ............................................................................................................................................................
> > Hi Philipp,
> >
> > Thanks a lot for getting a discussion started. I've sunk ~100+ hours over
> > the last 2 years making PyTorch wheels play well with OpenCV, TensorFlow
> > and other wheels, that I'm glad to see this discussion started.
> >
> >
> > On the PyTorch wheels, we have been shipping with the minimum glibc and
> > libstdc++ versions we can possibly work with, while keeping two hard
> > constraints:
> >
> > 1. CUDA support
> > 2. C++11 support
> >
> >
> > 1. CUDA support
> >
> > manylinux1 is not an option, considering CUDA doesn't work out of
> CentOS5.
> > I explored this option [1] to no success.
> >
> > manylinux2010 is an option at the moment wrt CUDA, but it's unclear when
> > NVIDIA will lift support for CentOS6 under us.
> > Additionally, CuDNN 7.0 (if I remember) was compiled against Ubuntu 12.04
> > (meaning the glibc version is newer than CentOS6), and binaries linked
> > against CuDNN refused to run on CentOS6. I requested that this constraint
> > be lifted, and the next dot release fixed it.
> >
> > The reason PyTorch binaries are not manylinux2010 compatible at the
> moment
> > is because of the next constraint: C++11.
> >
> > 2. C++11
> >
> > We picked C++11 as the minimum supported dialect for PyTorch, primarily
> to
> > serve the default compilers of older machines, i.e. Ubuntu 14.04 and
> > CentOS7. The newer options were C++14 / C++17, but we decided to polyfill
> > what we needed to support older distros better.
> >
> > A fully fleshed out C++11 implementation landed in gcc in various stages,
> > with gradual ABI changes [2]. Unfortunately, the libstdc++ that ships
> with
> > centos6 (and hence manylinx2010) isn't sufficient to cover all of C++11.
> > For example, the binaries we built with devtoolset3 (gcc 4.9.2) on
> CentOS6
> > didn't run with the default libstdc++ on CentOS6 either due to ABI
> changes
> > or minimum GLIBCXX version for some of the symbols being unavailable.
> >
> > We tried our best to support our binaries running on CentOS6 and above
> with
> > various ranges of static linking hacks until 0.3.1 (January 2018), but at
> > some point hacks over hacks was only getting more fragile. Hence we moved
> > to a CentOS7-based image in April 2018 [3], and relied only on dynamic
> > linking to the system-shipped libstdc++.
> >
> > As Wes mentions [4], an option is to host a modern C++ standard library
> via
> > PyPI would put manylinux2010 on the table. There are however subtle
> > consequences with this -- if this package gets installed into a conda
> > environment, it'll clobber anaconda-shipped libstdc++, possibly
> corrupting
> > environments for thousands of anaconda users (this is actually similar to
> > the issues with `mkl` shipped via PyPI and Conda clobbering each other).
> >
> >
> > References:
> >
> > [1] https://github.com/NVIDIA/nvidia-docker/issues/348
> > [2] https://gcc.gnu.org/wiki/Cxx11AbiCompatibility
> > [3]
> >
> >
> https://github.com/pytorch/builder/commit/44d9bfa607a7616c66fe6492fadd8f05f3578b93
> > [4] https://github.com/apache/arrow/pull/3177#issuecomment-447515982
> >
> >
> ..............................................................................................................................................................................................
> >
> > On Sun, Dec 16, 2018 at 2:57 PM Wes McKinney <we...@gmail.com>
> wrote:
> >
> > > Reposting since I wasn't subscribed to developers@tensorflow.org. I
> > > also didn't see Soumith's response since it didn't come through to
> > > dev@arrow.apache.org
> > >
> > > In response to the non-conforming ABI in the TF and PyTorch wheels, we
> > > have attempted to hack around the issue with some elaborate
> > > workarounds [1] [2] that have ultimately proved to not work
> > > universally. The bottom line is that this is burdening other projects
> > > in the Python ecosystem and causing confusing application crashes.
> > >
> > > First, to state what should hopefully obvious to many of you, Python
> > > wheels are not a robust way to deploy complex C++ projects, even
> > > setting aside the compiler toolchain issue. If a project has
> > > non-trivial third party dependencies, you either have to statically
> > > link them or bundle shared libraries with the wheel (we do a bit of
> > > both in Apache Arrow). Neither solution is foolproof in all cases.
> > > There are other downsides to wheels when it comes to numerical
> > > computing -- it is difficult to utilize things like the Intel MKL
> > > which may be used by multiple projects. If two projects have the same
> > > third party C++ dependency (e.g. let's use gRPC or libprotobuf as a
> > > straw man example), it's hard to guarantee that versions or ABI will
> > > not conflict with each other.
> > >
> > > In packaging with conda, we pin all dependencies when building
> > > projects that depend on them, then package and deploy the dependencies
> > > as separate shared libraries instead of bundling. To resolve the need
> > > for newer compilers or newer C++ standard library, libstdc++.so and
> > > other system shared libraries are packaged and installed as
> > > dependencies. In manylinux1, the RedHat devtoolset compiler toolchain
> > > is used as it performs selective static linking of symbols to enable
> > > C++11 libraries to be deployed on older Linuxes like RHEL5/6. A conda
> > > environment functions as sort of portable miniature Linux
> > > distribution.
> > >
> > > Given the current state of things, as using the TensorFlow and PyTorch
> > > wheels in the same process as other conforming manylinux1 wheels is
> > > unsafe, it's hard to see how one can continue to recommend pip as a
> > > preferred installation path until the ABI problems are resolved. For
> > > example, "pip" is what is recommended for installing TensorFlow on
> > > Linux [3]. It's unclear that non-compliant wheels should be allowed in
> > > the package manager at all (I'm aware that this was deemed to not be
> > > the responsibility of PyPI to verify policy compliance [4]).
> > >
> > > A couple possible paths forward (there may be others):
> > >
> > > * Collaborate with the Python packaging authority to evolve the
> > > manylinux ABI to be able to produce compliant wheels that support the
> > > build and deployment requirements of these projects
> > > * Create a new ABI tag for CUDA/C++11-enabled Python wheels so that
> > > projects can ship packages that can be guaranteed to work properly
> > > with TF/PyTorch. This might require vendoring libstdc++ in some kind
> > > of "toolchain" wheel that projects using this new ABI can depend on
> > >
> > > Note that these toolchain and deployment issues are absent when
> > > building and deploying with conda packages, since build- and run-time
> > > dependencies can be pinned and shared across all the projects that
> > > depend on them, ensuring ABI cross-compatibility. It's great to have
> > > the convenience of "pip install $PROJECT", but I believe that these
> > > projects have outgrown the intended use for pip and wheel
> > > distributions.
> > >
> > > Until the ABI incompatibilities are resolved, I would encourage more
> > > prominent user documentation about the non-portability and potential
> > > for crashes with these Linux wheels.
> > >
> > > Thanks,
> > > Wes
> > >
> > > [1]:
> > >
> >
> https://github.com/apache/arrow/commit/537e7f7fd503dd920c0b9f0cef8a2de86bc69e3b
> > > [2]:
> > >
> >
> https://github.com/apache/arrow/commit/e7aaf7bf3d3e326b5fe58d20f8fc45b5cec01cac
> > > [3]: https://www.tensorflow.org/install/
> > > [4]: https://www.python.org/dev/peps/pep-0513/#id50
> > > On Sat, Dec 15, 2018 at 11:25 PM Robert Nishihara
> > > <ro...@gmail.com> wrote:
> > > >
> > > > On Sat, Dec 15, 2018 at 8:43 PM Philipp Moritz <pc...@gmail.com>
> > > wrote:
> > > >
> > > > > Dear all,
> > > > >
> > > > > As some of you know, there is a standard in Python called
> manylinux (
> > > > > https://www.python.org/dev/peps/pep-0513/) to package binary
> > > executables
> > > > > and libraries into a “wheel” in a way that allows the code to be
> run
> > > on a
> > > > > wide variety of Linux distributions. This is very convenient for
> > Python
> > > > > users, since such libraries can be easily installed via pip.
> > > > >
> > > > > This standard is also important for a second reason: If many
> > different
> > > > > wheels are used together in a single Python process, adhering to
> > > manylinux
> > > > > ensures that these libraries work together well and don’t trip on
> > each
> > > > > other’s toes (this could easily happen if different versions of
> > > libstdc++
> > > > > are used for example). Therefore *even if support for only a single
> > > > > distribution like Ubuntu is desired*, it is important to be
> manylinux
> > > > > compatible to make sure everybody’s wheels work together well.
> > > > >
> > > > > TensorFlow and PyTorch unfortunately don’t produce manylinux
> > compatible
> > > > > wheels. The challenge is due, at least in part, to the need to use
> > > > > nvidia-docker to build GPU binaries [10]. This causes various
> levels
> > of
> > > > > pain for the rest of the Python community, see for example [1] [2]
> > [3]
> > > [4]
> > > > > [5] [6] [7] [8].
> > > > >
> > > > > The purpose of the e-mail is to get a discussion started on how we
> > can
> > > > > make TensorFlow and PyTorch manylinux compliant. There is a new
> > > standard in
> > > > > the works [9] so hopefully we can discuss what would be necessary
> to
> > > make
> > > > > sure TensorFlow and PyTorch can adhere to this standard in the
> > future.
> > > > >
> > > > > It would make everybody’s lives just a little bit better! Any ideas
> > are
> > > > > appreciated.
> > > > >
> > > > > @soumith: Could you cc the relevant list? I couldn't find a pytorch
> > dev
> > > > > mailing list.
> > > > >
> > > > > Best,
> > > > > Philipp.
> > > > >
> > > > > [1] https://github.com/tensorflow/tensorflow/issues/5033
> > > > > [2] https://github.com/tensorflow/tensorflow/issues/8802
> > > > > [3] https://github.com/primitiv/primitiv-python/issues/28
> > > > > [4] https://github.com/zarr-developers/numcodecs/issues/70
> > > > > [5] https://github.com/apache/arrow/pull/3177
> > > > > [6] https://github.com/tensorflow/tensorflow/issues/13615
> > > > > [7] https://github.com/pytorch/pytorch/issues/8358
> > > > > [8] https://github.com/ray-project/ray/issues/2159
> > > > > [9] https://www.python.org/dev/peps/pep-0571/
> > > > > [10]
> > > > >
> > >
> >
> https://github.com/tensorflow/tensorflow/issues/8802#issuecomment-291935940
> > > > >
> > > > > --
> > > > > You received this message because you are subscribed to the Google
> > > Groups
> > > > > "ray-dev" group.
> > > > > To unsubscribe from this group and stop receiving emails from it,
> > send
> > > an
> > > > > email to ray-dev+unsubscribe@googlegroups.com.
> > > > > To post to this group, send email to ray-dev@googlegroups.com.
> > > > > To view this discussion on the web visit
> > > > >
> > >
> >
> https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com
> > > > > <
> > >
> >
> https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com?utm_medium=email&utm_source=footer
> > > >
> > > > > .
> > > > > For more options, visit https://groups.google.com/d/optout.
> > > > >
> > >
> >
>

Re: TensorFlow, PyTorch, and manylinux1

Posted by Travis Oliphant <tr...@quansight.com>.

Can PyTorch provide and maintain a conda-forge recipe?

This would allow the large and growing conda forge ecosystem to easily
install PyTorch in a community-supported way.

Are there problems with using conda or another general package manager?

I agree that the machine learning packages are trying to make a language
specific package manager do more than it was intended and other open source
solutions already exist.

Thanks,

Travis


On Mon, Dec 17, 2018, 12:32 AM soumith <soumith@gmail.com wrote:

> I'm reposting my original reply below the current reply (below a dotted
> line). It was filtered out because I wasn't subscribed to the relevant
> mailing lists.
>
>  tl;dr: manylinux2010 looks pretty promising, because CUDA supports CentOS6
> (for now).
>
> In the meanwhile, I dug into what pyarrow does, and it looks like it links
> with `static-libstdc++` along with a linker version script [1].
>
> PyTorch did exactly that until Jan this year [2], except that our linker
> version script didn't cover the subtleties of statically linking stdc++ as
> well as Arrow did. Because we weren't covering all of the stdc++ static
> linking subtleties, we were facing huge issues that amplified wheel
> incompatibility (import X; import torch crashing under various X). Hence,
> we moved since then to linking with system-shipped libstdc++, doing no
> static stdc++ linking.
>
> I'll revisit this in light of manylinux2010, and go down the path of static
> linkage of stdc++ again, though I'm wary of the subtleties around handling
> of weak symbols, std::string destruction across library boundaries [3] and
> std::string's ABI incompatibility issues.
>
> I've opened a tracking issue here:
> https://github.com/pytorch/pytorch/issues/15294
>
> I'm looking forward to hearing from the TensorFlow devs if manylinux2010 is
> sufficient for them, or what additional constraints they have.
>
> As a personal thought, I find multiple libraries in the same process
> statically linking to stdc++ gross, but without a package manager like
> Anaconda that actually is willing to deal with the C++-side dependencies,
> there aren't many options on the table.
>
> References:
>
> [1] https://github.com/apache/arrow/blob/master/cpp/src/arrow/symbols.map
> [2] https://github.com/pytorch/pytorch/blob/v0.3.1/tools/pytorch.version
> [3] https://github.com/pytorch/pytorch/issues/5400#issuecomment-369428125
>
> ............................................................................................................................................................
> Hi Philipp,
>
> Thanks a lot for getting a discussion started. I've sunk ~100+ hours over
> the last 2 years making PyTorch wheels play well with OpenCV, TensorFlow
> and other wheels, that I'm glad to see this discussion started.
>
>
> On the PyTorch wheels, we have been shipping with the minimum glibc and
> libstdc++ versions we can possibly work with, while keeping two hard
> constraints:
>
> 1. CUDA support
> 2. C++11 support
>
>
> 1. CUDA support
>
> manylinux1 is not an option, considering CUDA doesn't work out of CentOS5.
> I explored this option [1] to no success.
>
> manylinux2010 is an option at the moment wrt CUDA, but it's unclear when
> NVIDIA will lift support for CentOS6 under us.
> Additionally, CuDNN 7.0 (if I remember) was compiled against Ubuntu 12.04
> (meaning the glibc version is newer than CentOS6), and binaries linked
> against CuDNN refused to run on CentOS6. I requested that this constraint
> be lifted, and the next dot release fixed it.
>
> The reason PyTorch binaries are not manylinux2010 compatible at the moment
> is because of the next constraint: C++11.
>
> 2. C++11
>
> We picked C++11 as the minimum supported dialect for PyTorch, primarily to
> serve the default compilers of older machines, i.e. Ubuntu 14.04 and
> CentOS7. The newer options were C++14 / C++17, but we decided to polyfill
> what we needed to support older distros better.
>
> A fully fleshed out C++11 implementation landed in gcc in various stages,
> with gradual ABI changes [2]. Unfortunately, the libstdc++ that ships with
> centos6 (and hence manylinx2010) isn't sufficient to cover all of C++11.
> For example, the binaries we built with devtoolset3 (gcc 4.9.2) on CentOS6
> didn't run with the default libstdc++ on CentOS6 either due to ABI changes
> or minimum GLIBCXX version for some of the symbols being unavailable.
>
> We tried our best to support our binaries running on CentOS6 and above with
> various ranges of static linking hacks until 0.3.1 (January 2018), but at
> some point hacks over hacks was only getting more fragile. Hence we moved
> to a CentOS7-based image in April 2018 [3], and relied only on dynamic
> linking to the system-shipped libstdc++.
>
> As Wes mentions [4], an option is to host a modern C++ standard library via
> PyPI would put manylinux2010 on the table. There are however subtle
> consequences with this -- if this package gets installed into a conda
> environment, it'll clobber anaconda-shipped libstdc++, possibly corrupting
> environments for thousands of anaconda users (this is actually similar to
> the issues with `mkl` shipped via PyPI and Conda clobbering each other).
>
>
> References:
>
> [1] https://github.com/NVIDIA/nvidia-docker/issues/348
> [2] https://gcc.gnu.org/wiki/Cxx11AbiCompatibility
> [3]
>
> https://github.com/pytorch/builder/commit/44d9bfa607a7616c66fe6492fadd8f05f3578b93
> [4] https://github.com/apache/arrow/pull/3177#issuecomment-447515982
>
> ..............................................................................................................................................................................................
>
> On Sun, Dec 16, 2018 at 2:57 PM Wes McKinney <we...@gmail.com> wrote:
>
> > Reposting since I wasn't subscribed to developers@tensorflow.org. I
> > also didn't see Soumith's response since it didn't come through to
> > dev@arrow.apache.org
> >
> > In response to the non-conforming ABI in the TF and PyTorch wheels, we
> > have attempted to hack around the issue with some elaborate
> > workarounds [1] [2] that have ultimately proved to not work
> > universally. The bottom line is that this is burdening other projects
> > in the Python ecosystem and causing confusing application crashes.
> >
> > First, to state what should hopefully obvious to many of you, Python
> > wheels are not a robust way to deploy complex C++ projects, even
> > setting aside the compiler toolchain issue. If a project has
> > non-trivial third party dependencies, you either have to statically
> > link them or bundle shared libraries with the wheel (we do a bit of
> > both in Apache Arrow). Neither solution is foolproof in all cases.
> > There are other downsides to wheels when it comes to numerical
> > computing -- it is difficult to utilize things like the Intel MKL
> > which may be used by multiple projects. If two projects have the same
> > third party C++ dependency (e.g. let's use gRPC or libprotobuf as a
> > straw man example), it's hard to guarantee that versions or ABI will
> > not conflict with each other.
> >
> > In packaging with conda, we pin all dependencies when building
> > projects that depend on them, then package and deploy the dependencies
> > as separate shared libraries instead of bundling. To resolve the need
> > for newer compilers or newer C++ standard library, libstdc++.so and
> > other system shared libraries are packaged and installed as
> > dependencies. In manylinux1, the RedHat devtoolset compiler toolchain
> > is used as it performs selective static linking of symbols to enable
> > C++11 libraries to be deployed on older Linuxes like RHEL5/6. A conda
> > environment functions as sort of portable miniature Linux
> > distribution.
> >
> > Given the current state of things, as using the TensorFlow and PyTorch
> > wheels in the same process as other conforming manylinux1 wheels is
> > unsafe, it's hard to see how one can continue to recommend pip as a
> > preferred installation path until the ABI problems are resolved. For
> > example, "pip" is what is recommended for installing TensorFlow on
> > Linux [3]. It's unclear that non-compliant wheels should be allowed in
> > the package manager at all (I'm aware that this was deemed to not be
> > the responsibility of PyPI to verify policy compliance [4]).
> >
> > A couple possible paths forward (there may be others):
> >
> > * Collaborate with the Python packaging authority to evolve the
> > manylinux ABI to be able to produce compliant wheels that support the
> > build and deployment requirements of these projects
> > * Create a new ABI tag for CUDA/C++11-enabled Python wheels so that
> > projects can ship packages that can be guaranteed to work properly
> > with TF/PyTorch. This might require vendoring libstdc++ in some kind
> > of "toolchain" wheel that projects using this new ABI can depend on
> >
> > Note that these toolchain and deployment issues are absent when
> > building and deploying with conda packages, since build- and run-time
> > dependencies can be pinned and shared across all the projects that
> > depend on them, ensuring ABI cross-compatibility. It's great to have
> > the convenience of "pip install $PROJECT", but I believe that these
> > projects have outgrown the intended use for pip and wheel
> > distributions.
> >
> > Until the ABI incompatibilities are resolved, I would encourage more
> > prominent user documentation about the non-portability and potential
> > for crashes with these Linux wheels.
> >
> > Thanks,
> > Wes
> >
> > [1]:
> >
> https://github.com/apache/arrow/commit/537e7f7fd503dd920c0b9f0cef8a2de86bc69e3b
> > [2]:
> >
> https://github.com/apache/arrow/commit/e7aaf7bf3d3e326b5fe58d20f8fc45b5cec01cac
> > [3]: https://www.tensorflow.org/install/
> > [4]: https://www.python.org/dev/peps/pep-0513/#id50
> > On Sat, Dec 15, 2018 at 11:25 PM Robert Nishihara
> > <ro...@gmail.com> wrote:
> > >
> > > On Sat, Dec 15, 2018 at 8:43 PM Philipp Moritz <pc...@gmail.com>
> > wrote:
> > >
> > > > Dear all,
> > > >
> > > > As some of you know, there is a standard in Python called manylinux (
> > > > https://www.python.org/dev/peps/pep-0513/) to package binary
> > executables
> > > > and libraries into a “wheel” in a way that allows the code to be run
> > on a
> > > > wide variety of Linux distributions. This is very convenient for
> Python
> > > > users, since such libraries can be easily installed via pip.
> > > >
> > > > This standard is also important for a second reason: If many
> different
> > > > wheels are used together in a single Python process, adhering to
> > manylinux
> > > > ensures that these libraries work together well and don’t trip on
> each
> > > > other’s toes (this could easily happen if different versions of
> > libstdc++
> > > > are used for example). Therefore *even if support for only a single
> > > > distribution like Ubuntu is desired*, it is important to be manylinux
> > > > compatible to make sure everybody’s wheels work together well.
> > > >
> > > > TensorFlow and PyTorch unfortunately don’t produce manylinux
> compatible
> > > > wheels. The challenge is due, at least in part, to the need to use
> > > > nvidia-docker to build GPU binaries [10]. This causes various levels
> of
> > > > pain for the rest of the Python community, see for example [1] [2]
> [3]
> > [4]
> > > > [5] [6] [7] [8].
> > > >
> > > > The purpose of the e-mail is to get a discussion started on how we
> can
> > > > make TensorFlow and PyTorch manylinux compliant. There is a new
> > standard in
> > > > the works [9] so hopefully we can discuss what would be necessary to
> > make
> > > > sure TensorFlow and PyTorch can adhere to this standard in the
> future.
> > > >
> > > > It would make everybody’s lives just a little bit better! Any ideas
> are
> > > > appreciated.
> > > >
> > > > @soumith: Could you cc the relevant list? I couldn't find a pytorch
> dev
> > > > mailing list.
> > > >
> > > > Best,
> > > > Philipp.
> > > >
> > > > [1] https://github.com/tensorflow/tensorflow/issues/5033
> > > > [2] https://github.com/tensorflow/tensorflow/issues/8802
> > > > [3] https://github.com/primitiv/primitiv-python/issues/28
> > > > [4] https://github.com/zarr-developers/numcodecs/issues/70
> > > > [5] https://github.com/apache/arrow/pull/3177
> > > > [6] https://github.com/tensorflow/tensorflow/issues/13615
> > > > [7] https://github.com/pytorch/pytorch/issues/8358
> > > > [8] https://github.com/ray-project/ray/issues/2159
> > > > [9] https://www.python.org/dev/peps/pep-0571/
> > > > [10]
> > > >
> >
> https://github.com/tensorflow/tensorflow/issues/8802#issuecomment-291935940
> > > >
> > > > --
> > > > You received this message because you are subscribed to the Google
> > Groups
> > > > "ray-dev" group.
> > > > To unsubscribe from this group and stop receiving emails from it,
> send
> > an
> > > > email to ray-dev+unsubscribe@googlegroups.com.
> > > > To post to this group, send email to ray-dev@googlegroups.com.
> > > > To view this discussion on the web visit
> > > >
> >
> https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com
> > > > <
> >
> https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com?utm_medium=email&utm_source=footer
> > >
> > > > .
> > > > For more options, visit https://groups.google.com/d/optout.
> > > >
> >
>

Re: TensorFlow, PyTorch, and manylinux1

Posted by Michael Sarahan <ms...@gmail.com>.

> Somehow we need to arrange that the same compiler toolchain (with
consistent minimum glibc, libstdc++ version) is used to build all of the
binaries we are discussing here. Short of that some system configurations
will continue to have problems.

This was exactly the purpose of Anaconda's crosstool-ng-based compiler
toolchains.  We wrote up a bit at
https://www.anaconda.com/blog/developer-blog/utilizing-the-new-compilers-in-anaconda-distribution-5/

It's not free lunch, as it requires shipping the libstdc++ as others have
noted.  The glibc bound has ultimately been determined by other software
for us - many things require features in newer glibc, and we have found it
infeasible to continue support back to glibc 2.5 (centos 5).

It's still painful on Mac because we can't distribute the old SDK's, and
using the new SDKs that people have does not seem to have the backwards
compatibility guarantees that Apple says they do.

On the bright side, Microsoft seems to have done very well with
compatibility between VS 2015 and 2017.  Fingers crossed that that trend
continues.

On Mon, Dec 17, 2018 at 9:31 AM Wes McKinney <we...@gmail.com> wrote:

> hi Soumith,
>
> On Mon, Dec 17, 2018 at 12:32 AM soumith <so...@gmail.com> wrote:
> >
> > I'm reposting my original reply below the current reply (below a dotted
> line). It was filtered out because I wasn't subscribed to the relevant
> mailing lists.
> >
> >  tl;dr: manylinux2010 looks pretty promising, because CUDA supports
> CentOS6 (for now).
> >
> > In the meanwhile, I dug into what pyarrow does, and it looks like it
> links with `static-libstdc++` along with a linker version script [1].
>
> We aren't passing -static-libstdc++. The static linking of certain
> symbols (so that C++11 features work on older systems) is handled
> automatically by devtoolset-2; we are modifying the visibility of some
> of these linked symbols, though
>
> >
> > PyTorch did exactly that until Jan this year [2], except that our linker
> version script didn't cover the subtleties of statically linking stdc++ as
> well as Arrow did. Because we weren't covering all of the stdc++ static
> linking subtleties, we were facing huge issues that amplified wheel
> incompatibility (import X; import torch crashing under various X). Hence,
> we moved since then to linking with system-shipped libstdc++, doing no
> static stdc++ linking.
> >
>
> Unless you were using the devtoolset-2 toolchain, you were doing
> something different :) My understanding is that passing
> -static-libstdc++ with stock gcc or clang is mainly only appropriate
> when building dependency-free binary applications
>
> > I'll revisit this in light of manylinux2010, and go down the path of
> static linkage of stdc++ again, though I'm wary of the subtleties around
> handling of weak symbols, std::string destruction across library boundaries
> [3] and std::string's ABI incompatibility issues.
> >
> > I've opened a tracking issue here:
> https://github.com/pytorch/pytorch/issues/15294
> >
> > I'm looking forward to hearing from the TensorFlow devs if manylinux2010
> is sufficient for them, or what additional constraints they have.
> >
> > As a personal thought, I find multiple libraries in the same process
> statically linking to stdc++ gross, but without a package manager like
> Anaconda that actually is willing to deal with the C++-side dependencies,
> there aren't many options on the table.
>
> IIUC the idea of the devtoolset-* toolchains is that all libraries
> should use the same toolchain then there are no issues. Having
> multiple projects passing -static-libstdc++ when linking would indeed
> be problematic. The problem we are having is that if any library is
> using devtoolset-2, all libraries need to in order to be compatible.
>
> >
> > References:
> >
> > [1]
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/symbols.map
> > [2] https://github.com/pytorch/pytorch/blob/v0.3.1/tools/pytorch.version
> > [3]
> https://github.com/pytorch/pytorch/issues/5400#issuecomment-369428125
> >
> ............................................................................................................................................................
> > Hi Philipp,
> >
> > Thanks a lot for getting a discussion started. I've sunk ~100+ hours
> over the last 2 years making PyTorch wheels play well with OpenCV,
> TensorFlow and other wheels, that I'm glad to see this discussion started.
> >
> >
> > On the PyTorch wheels, we have been shipping with the minimum glibc and
> libstdc++ versions we can possibly work with, while keeping two hard
> constraints:
> >
> > 1. CUDA support
> > 2. C++11 support
> >
> >
> > 1. CUDA support
> >
> > manylinux1 is not an option, considering CUDA doesn't work out of
> CentOS5. I explored this option [1] to no success.
> >
> > manylinux2010 is an option at the moment wrt CUDA, but it's unclear when
> NVIDIA will lift support for CentOS6 under us.
> > Additionally, CuDNN 7.0 (if I remember) was compiled against Ubuntu
> 12.04 (meaning the glibc version is newer than CentOS6), and binaries
> linked against CuDNN refused to run on CentOS6. I requested that this
> constraint be lifted, and the next dot release fixed it.
> >
> > The reason PyTorch binaries are not manylinux2010 compatible at the
> moment is because of the next constraint: C++11.
>
> Do we need to involve NVIDIA in this discussion? Having problematic
> GPU-enabled libraries in PyPI isn't too good for them either.
>
> >
> > 2. C++11
> >
> > We picked C++11 as the minimum supported dialect for PyTorch, primarily
> to serve the default compilers of older machines, i.e. Ubuntu 14.04 and
> CentOS7. The newer options were C++14 / C++17, but we decided to polyfill
> what we needed to support older distros better.
> >
> > A fully fleshed out C++11 implementation landed in gcc in various
> stages, with gradual ABI changes [2]. Unfortunately, the libstdc++ that
> ships with centos6 (and hence manylinx2010) isn't sufficient to cover all
> of C++11. For example, the binaries we built with devtoolset3 (gcc 4.9.2)
> on CentOS6 didn't run with the default libstdc++ on CentOS6 either due to
> ABI changes or minimum GLIBCXX version for some of the symbols being
> unavailable.
> >
>
> Do you have a link to the paper trail about this? I had thought a
> major raison d'etre of the devtoolset compilers is to support C++11 on
> older Linuxes. For example, we are using C++11 in Arrow but we're
> limiting ourselves at present to what's available in gcc 4.8.x; our
> binaries work fine on CentOS5 and 6.
>
> > We tried our best to support our binaries running on CentOS6 and above
> with various ranges of static linking hacks until 0.3.1 (January 2018), but
> at some point hacks over hacks was only getting more fragile. Hence we
> moved to a CentOS7-based image in April 2018 [3], and relied only on
> dynamic linking to the system-shipped libstdc++.
> >
> > As Wes mentions [4], an option is to host a modern C++ standard library
> via PyPI would put manylinux2010 on the table. There are however subtle
> consequences with this -- if this package gets installed into a conda
> environment, it'll clobber anaconda-shipped libstdc++, possibly corrupting
> environments for thousands of anaconda users (this is actually similar to
> the issues with `mkl` shipped via PyPI and Conda clobbering each other).
> >
>
> More evidence that "pip" as a packaging tool may have already outlived
> its usefulness to this community.
>
> Somehow we need to arrange that the same compiler toolchain (with
> consistent minimum glibc, libstdc++ version) is used to build all of
> the binaries we are discussing here. Short of that some system
> configurations will continue to have problems.
>
> - Wes
>
> >
> > References:
> >
> > [1] https://github.com/NVIDIA/nvidia-docker/issues/348
> > [2] https://gcc.gnu.org/wiki/Cxx11AbiCompatibility
> > [3]
> https://github.com/pytorch/builder/commit/44d9bfa607a7616c66fe6492fadd8f05f3578b93
> > [4] https://github.com/apache/arrow/pull/3177#issuecomment-447515982
> >
> ..............................................................................................................................................................................................
> >
> > On Sun, Dec 16, 2018 at 2:57 PM Wes McKinney <we...@gmail.com>
> wrote:
> >>
> >> Reposting since I wasn't subscribed to developers@tensorflow.org. I
> >> also didn't see Soumith's response since it didn't come through to
> >> dev@arrow.apache.org
> >>
> >> In response to the non-conforming ABI in the TF and PyTorch wheels, we
> >> have attempted to hack around the issue with some elaborate
> >> workarounds [1] [2] that have ultimately proved to not work
> >> universally. The bottom line is that this is burdening other projects
> >> in the Python ecosystem and causing confusing application crashes.
> >>
> >> First, to state what should hopefully obvious to many of you, Python
> >> wheels are not a robust way to deploy complex C++ projects, even
> >> setting aside the compiler toolchain issue. If a project has
> >> non-trivial third party dependencies, you either have to statically
> >> link them or bundle shared libraries with the wheel (we do a bit of
> >> both in Apache Arrow). Neither solution is foolproof in all cases.
> >> There are other downsides to wheels when it comes to numerical
> >> computing -- it is difficult to utilize things like the Intel MKL
> >> which may be used by multiple projects. If two projects have the same
> >> third party C++ dependency (e.g. let's use gRPC or libprotobuf as a
> >> straw man example), it's hard to guarantee that versions or ABI will
> >> not conflict with each other.
> >>
> >> In packaging with conda, we pin all dependencies when building
> >> projects that depend on them, then package and deploy the dependencies
> >> as separate shared libraries instead of bundling. To resolve the need
> >> for newer compilers or newer C++ standard library, libstdc++.so and
> >> other system shared libraries are packaged and installed as
> >> dependencies. In manylinux1, the RedHat devtoolset compiler toolchain
> >> is used as it performs selective static linking of symbols to enable
> >> C++11 libraries to be deployed on older Linuxes like RHEL5/6. A conda
> >> environment functions as sort of portable miniature Linux
> >> distribution.
> >>
> >> Given the current state of things, as using the TensorFlow and PyTorch
> >> wheels in the same process as other conforming manylinux1 wheels is
> >> unsafe, it's hard to see how one can continue to recommend pip as a
> >> preferred installation path until the ABI problems are resolved. For
> >> example, "pip" is what is recommended for installing TensorFlow on
> >> Linux [3]. It's unclear that non-compliant wheels should be allowed in
> >> the package manager at all (I'm aware that this was deemed to not be
> >> the responsibility of PyPI to verify policy compliance [4]).
> >>
> >> A couple possible paths forward (there may be others):
> >>
> >> * Collaborate with the Python packaging authority to evolve the
> >> manylinux ABI to be able to produce compliant wheels that support the
> >> build and deployment requirements of these projects
> >> * Create a new ABI tag for CUDA/C++11-enabled Python wheels so that
> >> projects can ship packages that can be guaranteed to work properly
> >> with TF/PyTorch. This might require vendoring libstdc++ in some kind
> >> of "toolchain" wheel that projects using this new ABI can depend on
> >>
> >> Note that these toolchain and deployment issues are absent when
> >> building and deploying with conda packages, since build- and run-time
> >> dependencies can be pinned and shared across all the projects that
> >> depend on them, ensuring ABI cross-compatibility. It's great to have
> >> the convenience of "pip install $PROJECT", but I believe that these
> >> projects have outgrown the intended use for pip and wheel
> >> distributions.
> >>
> >> Until the ABI incompatibilities are resolved, I would encourage more
> >> prominent user documentation about the non-portability and potential
> >> for crashes with these Linux wheels.
> >>
> >> Thanks,
> >> Wes
> >>
> >> [1]:
> https://github.com/apache/arrow/commit/537e7f7fd503dd920c0b9f0cef8a2de86bc69e3b
> >> [2]:
> https://github.com/apache/arrow/commit/e7aaf7bf3d3e326b5fe58d20f8fc45b5cec01cac
> >> [3]: https://www.tensorflow.org/install/
> >> [4]: https://www.python.org/dev/peps/pep-0513/#id50
> >> On Sat, Dec 15, 2018 at 11:25 PM Robert Nishihara
> >> <ro...@gmail.com> wrote:
> >> >
> >> > On Sat, Dec 15, 2018 at 8:43 PM Philipp Moritz <pc...@gmail.com>
> wrote:
> >> >
> >> > > Dear all,
> >> > >
> >> > > As some of you know, there is a standard in Python called manylinux
> (
> >> > > https://www.python.org/dev/peps/pep-0513/) to package binary
> executables
> >> > > and libraries into a “wheel” in a way that allows the code to be
> run on a
> >> > > wide variety of Linux distributions. This is very convenient for
> Python
> >> > > users, since such libraries can be easily installed via pip.
> >> > >
> >> > > This standard is also important for a second reason: If many
> different
> >> > > wheels are used together in a single Python process, adhering to
> manylinux
> >> > > ensures that these libraries work together well and don’t trip on
> each
> >> > > other’s toes (this could easily happen if different versions of
> libstdc++
> >> > > are used for example). Therefore *even if support for only a single
> >> > > distribution like Ubuntu is desired*, it is important to be
> manylinux
> >> > > compatible to make sure everybody’s wheels work together well.
> >> > >
> >> > > TensorFlow and PyTorch unfortunately don’t produce manylinux
> compatible
> >> > > wheels. The challenge is due, at least in part, to the need to use
> >> > > nvidia-docker to build GPU binaries [10]. This causes various
> levels of
> >> > > pain for the rest of the Python community, see for example [1] [2]
> [3] [4]
> >> > > [5] [6] [7] [8].
> >> > >
> >> > > The purpose of the e-mail is to get a discussion started on how we
> can
> >> > > make TensorFlow and PyTorch manylinux compliant. There is a new
> standard in
> >> > > the works [9] so hopefully we can discuss what would be necessary
> to make
> >> > > sure TensorFlow and PyTorch can adhere to this standard in the
> future.
> >> > >
> >> > > It would make everybody’s lives just a little bit better! Any ideas
> are
> >> > > appreciated.
> >> > >
> >> > > @soumith: Could you cc the relevant list? I couldn't find a pytorch
> dev
> >> > > mailing list.
> >> > >
> >> > > Best,
> >> > > Philipp.
> >> > >
> >> > > [1] https://github.com/tensorflow/tensorflow/issues/5033
> >> > > [2] https://github.com/tensorflow/tensorflow/issues/8802
> >> > > [3] https://github.com/primitiv/primitiv-python/issues/28
> >> > > [4] https://github.com/zarr-developers/numcodecs/issues/70
> >> > > [5] https://github.com/apache/arrow/pull/3177
> >> > > [6] https://github.com/tensorflow/tensorflow/issues/13615
> >> > > [7] https://github.com/pytorch/pytorch/issues/8358
> >> > > [8] https://github.com/ray-project/ray/issues/2159
> >> > > [9] https://www.python.org/dev/peps/pep-0571/
> >> > > [10]
> >> > >
> https://github.com/tensorflow/tensorflow/issues/8802#issuecomment-291935940
> >> > >
> >> > > --
> >> > > You received this message because you are subscribed to the Google
> Groups
> >> > > "ray-dev" group.
> >> > > To unsubscribe from this group and stop receiving emails from it,
> send an
> >> > > email to ray-dev+unsubscribe@googlegroups.com.
> >> > > To post to this group, send email to ray-dev@googlegroups.com.
> >> > > To view this discussion on the web visit
> >> > >
> https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com
> >> > > <
> https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com?utm_medium=email&utm_source=footer
> >
> >> > > .
> >> > > For more options, visit https://groups.google.com/d/optout.
> >> > >
>

Re: TensorFlow, PyTorch, and manylinux1

Posted by Wes McKinney <we...@gmail.com>.

hi Soumith,

On Mon, Dec 17, 2018 at 12:32 AM soumith <so...@gmail.com> wrote:
>
> I'm reposting my original reply below the current reply (below a dotted line). It was filtered out because I wasn't subscribed to the relevant mailing lists.
>
>  tl;dr: manylinux2010 looks pretty promising, because CUDA supports CentOS6 (for now).
>
> In the meanwhile, I dug into what pyarrow does, and it looks like it links with `static-libstdc++` along with a linker version script [1].

We aren't passing -static-libstdc++. The static linking of certain
symbols (so that C++11 features work on older systems) is handled
automatically by devtoolset-2; we are modifying the visibility of some
of these linked symbols, though

>
> PyTorch did exactly that until Jan this year [2], except that our linker version script didn't cover the subtleties of statically linking stdc++ as well as Arrow did. Because we weren't covering all of the stdc++ static linking subtleties, we were facing huge issues that amplified wheel incompatibility (import X; import torch crashing under various X). Hence, we moved since then to linking with system-shipped libstdc++, doing no static stdc++ linking.
>

Unless you were using the devtoolset-2 toolchain, you were doing
something different :) My understanding is that passing
-static-libstdc++ with stock gcc or clang is mainly only appropriate
when building dependency-free binary applications

> I'll revisit this in light of manylinux2010, and go down the path of static linkage of stdc++ again, though I'm wary of the subtleties around handling of weak symbols, std::string destruction across library boundaries [3] and std::string's ABI incompatibility issues.
>
> I've opened a tracking issue here: https://github.com/pytorch/pytorch/issues/15294
>
> I'm looking forward to hearing from the TensorFlow devs if manylinux2010 is sufficient for them, or what additional constraints they have.
>
> As a personal thought, I find multiple libraries in the same process statically linking to stdc++ gross, but without a package manager like Anaconda that actually is willing to deal with the C++-side dependencies, there aren't many options on the table.

IIUC the idea of the devtoolset-* toolchains is that all libraries
should use the same toolchain then there are no issues. Having
multiple projects passing -static-libstdc++ when linking would indeed
be problematic. The problem we are having is that if any library is
using devtoolset-2, all libraries need to in order to be compatible.

>
> References:
>
> [1] https://github.com/apache/arrow/blob/master/cpp/src/arrow/symbols.map
> [2] https://github.com/pytorch/pytorch/blob/v0.3.1/tools/pytorch.version
> [3] https://github.com/pytorch/pytorch/issues/5400#issuecomment-369428125
> ............................................................................................................................................................
> Hi Philipp,
>
> Thanks a lot for getting a discussion started. I've sunk ~100+ hours over the last 2 years making PyTorch wheels play well with OpenCV, TensorFlow and other wheels, that I'm glad to see this discussion started.
>
>
> On the PyTorch wheels, we have been shipping with the minimum glibc and libstdc++ versions we can possibly work with, while keeping two hard constraints:
>
> 1. CUDA support
> 2. C++11 support
>
>
> 1. CUDA support
>
> manylinux1 is not an option, considering CUDA doesn't work out of CentOS5. I explored this option [1] to no success.
>
> manylinux2010 is an option at the moment wrt CUDA, but it's unclear when NVIDIA will lift support for CentOS6 under us.
> Additionally, CuDNN 7.0 (if I remember) was compiled against Ubuntu 12.04 (meaning the glibc version is newer than CentOS6), and binaries linked against CuDNN refused to run on CentOS6. I requested that this constraint be lifted, and the next dot release fixed it.
>
> The reason PyTorch binaries are not manylinux2010 compatible at the moment is because of the next constraint: C++11.

Do we need to involve NVIDIA in this discussion? Having problematic
GPU-enabled libraries in PyPI isn't too good for them either.

>
> 2. C++11
>
> We picked C++11 as the minimum supported dialect for PyTorch, primarily to serve the default compilers of older machines, i.e. Ubuntu 14.04 and CentOS7. The newer options were C++14 / C++17, but we decided to polyfill what we needed to support older distros better.
>
> A fully fleshed out C++11 implementation landed in gcc in various stages, with gradual ABI changes [2]. Unfortunately, the libstdc++ that ships with centos6 (and hence manylinx2010) isn't sufficient to cover all of C++11. For example, the binaries we built with devtoolset3 (gcc 4.9.2) on CentOS6 didn't run with the default libstdc++ on CentOS6 either due to ABI changes or minimum GLIBCXX version for some of the symbols being unavailable.
>

Do you have a link to the paper trail about this? I had thought a
major raison d'etre of the devtoolset compilers is to support C++11 on
older Linuxes. For example, we are using C++11 in Arrow but we're
limiting ourselves at present to what's available in gcc 4.8.x; our
binaries work fine on CentOS5 and 6.

> We tried our best to support our binaries running on CentOS6 and above with various ranges of static linking hacks until 0.3.1 (January 2018), but at some point hacks over hacks was only getting more fragile. Hence we moved to a CentOS7-based image in April 2018 [3], and relied only on dynamic linking to the system-shipped libstdc++.
>
> As Wes mentions [4], an option is to host a modern C++ standard library via PyPI would put manylinux2010 on the table. There are however subtle consequences with this -- if this package gets installed into a conda environment, it'll clobber anaconda-shipped libstdc++, possibly corrupting environments for thousands of anaconda users (this is actually similar to the issues with `mkl` shipped via PyPI and Conda clobbering each other).
>

More evidence that "pip" as a packaging tool may have already outlived
its usefulness to this community.

Somehow we need to arrange that the same compiler toolchain (with
consistent minimum glibc, libstdc++ version) is used to build all of
the binaries we are discussing here. Short of that some system
configurations will continue to have problems.

- Wes

>
> References:
>
> [1] https://github.com/NVIDIA/nvidia-docker/issues/348
> [2] https://gcc.gnu.org/wiki/Cxx11AbiCompatibility
> [3] https://github.com/pytorch/builder/commit/44d9bfa607a7616c66fe6492fadd8f05f3578b93
> [4] https://github.com/apache/arrow/pull/3177#issuecomment-447515982
> ..............................................................................................................................................................................................
>
> On Sun, Dec 16, 2018 at 2:57 PM Wes McKinney <we...@gmail.com> wrote:
>>
>> Reposting since I wasn't subscribed to developers@tensorflow.org. I
>> also didn't see Soumith's response since it didn't come through to
>> dev@arrow.apache.org
>>
>> In response to the non-conforming ABI in the TF and PyTorch wheels, we
>> have attempted to hack around the issue with some elaborate
>> workarounds [1] [2] that have ultimately proved to not work
>> universally. The bottom line is that this is burdening other projects
>> in the Python ecosystem and causing confusing application crashes.
>>
>> First, to state what should hopefully obvious to many of you, Python
>> wheels are not a robust way to deploy complex C++ projects, even
>> setting aside the compiler toolchain issue. If a project has
>> non-trivial third party dependencies, you either have to statically
>> link them or bundle shared libraries with the wheel (we do a bit of
>> both in Apache Arrow). Neither solution is foolproof in all cases.
>> There are other downsides to wheels when it comes to numerical
>> computing -- it is difficult to utilize things like the Intel MKL
>> which may be used by multiple projects. If two projects have the same
>> third party C++ dependency (e.g. let's use gRPC or libprotobuf as a
>> straw man example), it's hard to guarantee that versions or ABI will
>> not conflict with each other.
>>
>> In packaging with conda, we pin all dependencies when building
>> projects that depend on them, then package and deploy the dependencies
>> as separate shared libraries instead of bundling. To resolve the need
>> for newer compilers or newer C++ standard library, libstdc++.so and
>> other system shared libraries are packaged and installed as
>> dependencies. In manylinux1, the RedHat devtoolset compiler toolchain
>> is used as it performs selective static linking of symbols to enable
>> C++11 libraries to be deployed on older Linuxes like RHEL5/6. A conda
>> environment functions as sort of portable miniature Linux
>> distribution.
>>
>> Given the current state of things, as using the TensorFlow and PyTorch
>> wheels in the same process as other conforming manylinux1 wheels is
>> unsafe, it's hard to see how one can continue to recommend pip as a
>> preferred installation path until the ABI problems are resolved. For
>> example, "pip" is what is recommended for installing TensorFlow on
>> Linux [3]. It's unclear that non-compliant wheels should be allowed in
>> the package manager at all (I'm aware that this was deemed to not be
>> the responsibility of PyPI to verify policy compliance [4]).
>>
>> A couple possible paths forward (there may be others):
>>
>> * Collaborate with the Python packaging authority to evolve the
>> manylinux ABI to be able to produce compliant wheels that support the
>> build and deployment requirements of these projects
>> * Create a new ABI tag for CUDA/C++11-enabled Python wheels so that
>> projects can ship packages that can be guaranteed to work properly
>> with TF/PyTorch. This might require vendoring libstdc++ in some kind
>> of "toolchain" wheel that projects using this new ABI can depend on
>>
>> Note that these toolchain and deployment issues are absent when
>> building and deploying with conda packages, since build- and run-time
>> dependencies can be pinned and shared across all the projects that
>> depend on them, ensuring ABI cross-compatibility. It's great to have
>> the convenience of "pip install $PROJECT", but I believe that these
>> projects have outgrown the intended use for pip and wheel
>> distributions.
>>
>> Until the ABI incompatibilities are resolved, I would encourage more
>> prominent user documentation about the non-portability and potential
>> for crashes with these Linux wheels.
>>
>> Thanks,
>> Wes
>>
>> [1]: https://github.com/apache/arrow/commit/537e7f7fd503dd920c0b9f0cef8a2de86bc69e3b
>> [2]: https://github.com/apache/arrow/commit/e7aaf7bf3d3e326b5fe58d20f8fc45b5cec01cac
>> [3]: https://www.tensorflow.org/install/
>> [4]: https://www.python.org/dev/peps/pep-0513/#id50
>> On Sat, Dec 15, 2018 at 11:25 PM Robert Nishihara
>> <ro...@gmail.com> wrote:
>> >
>> > On Sat, Dec 15, 2018 at 8:43 PM Philipp Moritz <pc...@gmail.com> wrote:
>> >
>> > > Dear all,
>> > >
>> > > As some of you know, there is a standard in Python called manylinux (
>> > > https://www.python.org/dev/peps/pep-0513/) to package binary executables
>> > > and libraries into a “wheel” in a way that allows the code to be run on a
>> > > wide variety of Linux distributions. This is very convenient for Python
>> > > users, since such libraries can be easily installed via pip.
>> > >
>> > > This standard is also important for a second reason: If many different
>> > > wheels are used together in a single Python process, adhering to manylinux
>> > > ensures that these libraries work together well and don’t trip on each
>> > > other’s toes (this could easily happen if different versions of libstdc++
>> > > are used for example). Therefore *even if support for only a single
>> > > distribution like Ubuntu is desired*, it is important to be manylinux
>> > > compatible to make sure everybody’s wheels work together well.
>> > >
>> > > TensorFlow and PyTorch unfortunately don’t produce manylinux compatible
>> > > wheels. The challenge is due, at least in part, to the need to use
>> > > nvidia-docker to build GPU binaries [10]. This causes various levels of
>> > > pain for the rest of the Python community, see for example [1] [2] [3] [4]
>> > > [5] [6] [7] [8].
>> > >
>> > > The purpose of the e-mail is to get a discussion started on how we can
>> > > make TensorFlow and PyTorch manylinux compliant. There is a new standard in
>> > > the works [9] so hopefully we can discuss what would be necessary to make
>> > > sure TensorFlow and PyTorch can adhere to this standard in the future.
>> > >
>> > > It would make everybody’s lives just a little bit better! Any ideas are
>> > > appreciated.
>> > >
>> > > @soumith: Could you cc the relevant list? I couldn't find a pytorch dev
>> > > mailing list.
>> > >
>> > > Best,
>> > > Philipp.
>> > >
>> > > [1] https://github.com/tensorflow/tensorflow/issues/5033
>> > > [2] https://github.com/tensorflow/tensorflow/issues/8802
>> > > [3] https://github.com/primitiv/primitiv-python/issues/28
>> > > [4] https://github.com/zarr-developers/numcodecs/issues/70
>> > > [5] https://github.com/apache/arrow/pull/3177
>> > > [6] https://github.com/tensorflow/tensorflow/issues/13615
>> > > [7] https://github.com/pytorch/pytorch/issues/8358
>> > > [8] https://github.com/ray-project/ray/issues/2159
>> > > [9] https://www.python.org/dev/peps/pep-0571/
>> > > [10]
>> > > https://github.com/tensorflow/tensorflow/issues/8802#issuecomment-291935940
>> > >
>> > > --
>> > > You received this message because you are subscribed to the Google Groups
>> > > "ray-dev" group.
>> > > To unsubscribe from this group and stop receiving emails from it, send an
>> > > email to ray-dev+unsubscribe@googlegroups.com.
>> > > To post to this group, send email to ray-dev@googlegroups.com.
>> > > To view this discussion on the web visit
>> > > https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com
>> > > <https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com?utm_medium=email&utm_source=footer>
>> > > .
>> > > For more options, visit https://groups.google.com/d/optout.
>> > >

Re: TensorFlow, PyTorch, and manylinux1

Posted by Robert Nishihara <ro...@gmail.com>.

Thanks Soumith and Martin for the detailed thoughts.

Jean-Marc would you be able to chime in or perhaps cc the relevant people? It'd
be really great to hear from someone at NVIDIA, since NVIDIA seems best
positioned to make manlinux2010 work out and will probably need to be part
of a plan for manylinux2014 or some sort of manylinux-rolling.

I didn't realize that manylinux1 doesn't fully support C++11. We've been
using C++11 pretty extensively and compiling on manylinux1 without issues
as far as I know, but maybe we just haven't hit the relevant missing
symbols.

Martin, I agree that meeting up to hammer out a proposal (or perhaps doing
a call if that's easier) would be helpful.

On Mon, Dec 17, 2018 at 3:49 PM 'Martin Wicke' via TensorFlow Developers <
developers@tensorflow.org> wrote:

> I have created a fork of tensorflow/community and added a file:
>
> https://github.com/martinwicke/community/blob/master/sigs/build/manylinux-proposal.md
>
> It's presently empty.
>
> I've invited Soumith, Wes, and Philipp to collaborate on the repo, let's
> work on this there? If anybody else wants to join, just let me know.
>
> On Mon, Dec 17, 2018 at 1:55 PM soumith <so...@gmail.com> wrote:
>
>> > The group on this thread is a good start, maybe we can get together and
>> make a proposal that meets the need of the scientific computing community?
>> I think that would probably involve updating the minimum requirements
>> (possibly to CentOS 7, I heard there was talk of a manylinux2014), carving
>> out NVIDIA libraries, and creating a smoother path for updating these
>> requirements (maybe a manylinux-rolling, which automatically updates
>> maximum versions based on age or support status without requiring new
>> PEPs).
>>
>> Martin, this sounds great. I'm really looking forward to the day where
>> pytorch package binary sizes aren't heavily bloated because we have to ship
>> with all of the CUDA / CuDNN / NCCL bits.
>>
>> Is there a github issue or a private google doc that we can collaborate
>> on, to clear our thoughts and requirements into a proposal? We can propose
>> a manylinux2014 (or realize that manylinux2010 is somehow sufficient), as
>> well as push NVIDIA to address the distribution situation of the CUDA stack.
>>
>> --
>> S
>>
>> On Mon, Dec 17, 2018 at 12:31 PM Martin Wicke <wi...@google.com> wrote:
>>
>>> Thank you Philipp for getting this started. We've been trying to get in
>>> touch and have tried via Nick Coghlan and Nathaniel Smith, but we never got
>>> far.
>>>
>>> I'm a little late to the party, but basically, what Soumith said. We
>>> have the exact same constraints (C++-11, CUDA/cuDNN). These would be
>>> extremely common for any computation-heavy packages, and properly solving
>>> this issue would be a huge boon for the Python community.
>>>
>>> Actual compliance with manylinux1 is out since it cannot fulfill those
>>> constraints. I'll also add that there is no way to build compliant wheels
>>> without using software beyond end-of-life (even beyond security updates).
>>>
>>> manylinux2010 is indeed promising, and I saw that Nick merged support
>>> for it recently, though I don't think there has been a pip release
>>> including the support yet (maybe that has now changed?).
>>>
>>> However, manylinux2010 still has (possible fatal) problems:
>>>
>>> - CUDA10's minimum versions are higher than manylinux2010's maximum
>>> versions: specifically, GCC 4.4.7 > 4.3.0.
>>>
>>> - NVIDIA's license terms for CUDA/cuDNN are not standard and
>>> redistribution can be problematic, and may depend on agreements you may
>>> have with NVIDIA. The libraries are also large, and including them would
>>> make distribution via pypi problematic. It would be much preferable if
>>> there was an approved way to distribute Python packages depending on
>>> external CUDA/cuDNN. I don't think this should be a problem, it is similar
>>> in spirit to the exception made for libGL.
>>>
>>> I've added JM Ludwig to this thread, I think as was mentioned by someone
>>> else, having NVIDIA in the conversation is critical.
>>>
>>> The group on this thread is a good start, maybe we can get together and
>>> make a proposal that meets the need of the scientific computing community?
>>> I think that would probably involve updating the minimum requirements
>>> (possibly to CentOS 7, I heard there was talk of a manylinux2014), carving
>>> out NVIDIA libraries, and creating a smoother path for updating these
>>> requirements (maybe a manylinux-rolling, which automatically updates
>>> maximum versions based on age or support status without requiring new
>>> PEPs).
>>>
>>> I'm very interested in solving this problem, I feel bad for abusing the
>>> manylinux1 tag.
>>>
>>> Martin
>>>
>>> On Sun, Dec 16, 2018 at 10:32 PM soumith <so...@gmail.com> wrote:
>>>
>>>> I'm reposting my original reply below the current reply (below a dotted
>>>> line). It was filtered out because I wasn't subscribed to the relevant
>>>> mailing lists.
>>>>
>>>>  tl;dr: manylinux2010 looks pretty promising, because CUDA supports
>>>> CentOS6 (for now).
>>>>
>>>> In the meanwhile, I dug into what pyarrow does, and it looks like it
>>>> links with `static-libstdc++` along with a linker version script [1].
>>>>
>>>> PyTorch did exactly that until Jan this year [2], except that our
>>>> linker version script didn't cover the subtleties of statically linking
>>>> stdc++ as well as Arrow did. Because we weren't covering all of the stdc++
>>>> static linking subtleties, we were facing huge issues that amplified wheel
>>>> incompatibility (import X; import torch crashing under various X). Hence,
>>>> we moved since then to linking with system-shipped libstdc++, doing no
>>>> static stdc++ linking.
>>>>
>>>> I'll revisit this in light of manylinux2010, and go down the path of
>>>> static linkage of stdc++ again, though I'm wary of the subtleties around
>>>> handling of weak symbols, std::string destruction across library boundaries
>>>> [3] and std::string's ABI incompatibility issues.
>>>>
>>>> I've opened a tracking issue here:
>>>> https://github.com/pytorch/pytorch/issues/15294
>>>>
>>>> I'm looking forward to hearing from the TensorFlow devs if
>>>> manylinux2010 is sufficient for them, or what additional constraints they
>>>> have.
>>>>
>>>> As a personal thought, I find multiple libraries in the same process
>>>> statically linking to stdc++ gross, but without a package manager like
>>>> Anaconda that actually is willing to deal with the C++-side dependencies,
>>>> there aren't many options on the table.
>>>>
>>>> References:
>>>>
>>>> [1]
>>>> https://github.com/apache/arrow/blob/master/cpp/src/arrow/symbols.map
>>>> [2]
>>>> https://github.com/pytorch/pytorch/blob/v0.3.1/tools/pytorch.version
>>>> [3]
>>>> https://github.com/pytorch/pytorch/issues/5400#issuecomment-369428125
>>>>
>>>> ............................................................................................................................................................
>>>> Hi Philipp,
>>>>
>>>> Thanks a lot for getting a discussion started. I've sunk ~100+ hours
>>>> over the last 2 years making PyTorch wheels play well with OpenCV,
>>>> TensorFlow and other wheels, that I'm glad to see this discussion started.
>>>>
>>>>
>>>> On the PyTorch wheels, we have been shipping with the minimum glibc and
>>>> libstdc++ versions we can possibly work with, while keeping two hard
>>>> constraints:
>>>>
>>>> 1. CUDA support
>>>> 2. C++11 support
>>>>
>>>>
>>>> 1. CUDA support
>>>>
>>>> manylinux1 is not an option, considering CUDA doesn't work out of
>>>> CentOS5. I explored this option [1] to no success.
>>>>
>>>> manylinux2010 is an option at the moment wrt CUDA, but it's unclear
>>>> when NVIDIA will lift support for CentOS6 under us.
>>>> Additionally, CuDNN 7.0 (if I remember) was compiled against Ubuntu
>>>> 12.04 (meaning the glibc version is newer than CentOS6), and binaries
>>>> linked against CuDNN refused to run on CentOS6. I requested that this
>>>> constraint be lifted, and the next dot release fixed it.
>>>>
>>>> The reason PyTorch binaries are not manylinux2010 compatible at the
>>>> moment is because of the next constraint: C++11.
>>>>
>>>> 2. C++11
>>>>
>>>> We picked C++11 as the minimum supported dialect for PyTorch, primarily
>>>> to serve the default compilers of older machines, i.e. Ubuntu 14.04 and
>>>> CentOS7. The newer options were C++14 / C++17, but we decided to polyfill
>>>> what we needed to support older distros better.
>>>>
>>>> A fully fleshed out C++11 implementation landed in gcc in various
>>>> stages, with gradual ABI changes [2]. Unfortunately, the libstdc++ that
>>>> ships with centos6 (and hence manylinx2010) isn't sufficient to cover all
>>>> of C++11. For example, the binaries we built with devtoolset3 (gcc 4.9.2)
>>>> on CentOS6 didn't run with the default libstdc++ on CentOS6 either due to
>>>> ABI changes or minimum GLIBCXX version for some of the symbols being
>>>> unavailable.
>>>>
>>>> We tried our best to support our binaries running on CentOS6 and above
>>>> with various ranges of static linking hacks until 0.3.1 (January 2018), but
>>>> at some point hacks over hacks was only getting more fragile. Hence we
>>>> moved to a CentOS7-based image in April 2018 [3], and relied only on
>>>> dynamic linking to the system-shipped libstdc++.
>>>>
>>>> As Wes mentions [4], an option is to host a modern C++ standard library
>>>> via PyPI would put manylinux2010 on the table. There are however subtle
>>>> consequences with this -- if this package gets installed into a conda
>>>> environment, it'll clobber anaconda-shipped libstdc++, possibly corrupting
>>>> environments for thousands of anaconda users (this is actually similar to
>>>> the issues with `mkl` shipped via PyPI and Conda clobbering each other).
>>>>
>>>>
>>>> References:
>>>>
>>>> [1] https://github.com/NVIDIA/nvidia-docker/issues/348
>>>> [2] https://gcc.gnu.org/wiki/Cxx11AbiCompatibility
>>>> [3]
>>>> https://github.com/pytorch/builder/commit/44d9bfa607a7616c66fe6492fadd8f05f3578b93
>>>> [4] https://github.com/apache/arrow/pull/3177#issuecomment-447515982
>>>>
>>>> ..............................................................................................................................................................................................
>>>>
>>>> On Sun, Dec 16, 2018 at 2:57 PM Wes McKinney <we...@gmail.com>
>>>> wrote:
>>>>
>>>>> Reposting since I wasn't subscribed to developers@tensorflow.org. I
>>>>> also didn't see Soumith's response since it didn't come through to
>>>>> dev@arrow.apache.org
>>>>>
>>>>> In response to the non-conforming ABI in the TF and PyTorch wheels, we
>>>>> have attempted to hack around the issue with some elaborate
>>>>> workarounds [1] [2] that have ultimately proved to not work
>>>>> universally. The bottom line is that this is burdening other projects
>>>>> in the Python ecosystem and causing confusing application crashes.
>>>>>
>>>>> First, to state what should hopefully obvious to many of you, Python
>>>>> wheels are not a robust way to deploy complex C++ projects, even
>>>>> setting aside the compiler toolchain issue. If a project has
>>>>> non-trivial third party dependencies, you either have to statically
>>>>> link them or bundle shared libraries with the wheel (we do a bit of
>>>>> both in Apache Arrow). Neither solution is foolproof in all cases.
>>>>> There are other downsides to wheels when it comes to numerical
>>>>> computing -- it is difficult to utilize things like the Intel MKL
>>>>> which may be used by multiple projects. If two projects have the same
>>>>> third party C++ dependency (e.g. let's use gRPC or libprotobuf as a
>>>>> straw man example), it's hard to guarantee that versions or ABI will
>>>>> not conflict with each other.
>>>>>
>>>>> In packaging with conda, we pin all dependencies when building
>>>>> projects that depend on them, then package and deploy the dependencies
>>>>> as separate shared libraries instead of bundling. To resolve the need
>>>>> for newer compilers or newer C++ standard library, libstdc++.so and
>>>>> other system shared libraries are packaged and installed as
>>>>> dependencies. In manylinux1, the RedHat devtoolset compiler toolchain
>>>>> is used as it performs selective static linking of symbols to enable
>>>>> C++11 libraries to be deployed on older Linuxes like RHEL5/6. A conda
>>>>> environment functions as sort of portable miniature Linux
>>>>> distribution.
>>>>>
>>>>> Given the current state of things, as using the TensorFlow and PyTorch
>>>>> wheels in the same process as other conforming manylinux1 wheels is
>>>>> unsafe, it's hard to see how one can continue to recommend pip as a
>>>>> preferred installation path until the ABI problems are resolved. For
>>>>> example, "pip" is what is recommended for installing TensorFlow on
>>>>> Linux [3]. It's unclear that non-compliant wheels should be allowed in
>>>>> the package manager at all (I'm aware that this was deemed to not be
>>>>> the responsibility of PyPI to verify policy compliance [4]).
>>>>>
>>>>> A couple possible paths forward (there may be others):
>>>>>
>>>>> * Collaborate with the Python packaging authority to evolve the
>>>>> manylinux ABI to be able to produce compliant wheels that support the
>>>>> build and deployment requirements of these projects
>>>>> * Create a new ABI tag for CUDA/C++11-enabled Python wheels so that
>>>>> projects can ship packages that can be guaranteed to work properly
>>>>> with TF/PyTorch. This might require vendoring libstdc++ in some kind
>>>>> of "toolchain" wheel that projects using this new ABI can depend on
>>>>>
>>>>> Note that these toolchain and deployment issues are absent when
>>>>> building and deploying with conda packages, since build- and run-time
>>>>> dependencies can be pinned and shared across all the projects that
>>>>> depend on them, ensuring ABI cross-compatibility. It's great to have
>>>>> the convenience of "pip install $PROJECT", but I believe that these
>>>>> projects have outgrown the intended use for pip and wheel
>>>>> distributions.
>>>>>
>>>>> Until the ABI incompatibilities are resolved, I would encourage more
>>>>> prominent user documentation about the non-portability and potential
>>>>> for crashes with these Linux wheels.
>>>>>
>>>>> Thanks,
>>>>> Wes
>>>>>
>>>>> [1]:
>>>>> https://github.com/apache/arrow/commit/537e7f7fd503dd920c0b9f0cef8a2de86bc69e3b
>>>>> [2]:
>>>>> https://github.com/apache/arrow/commit/e7aaf7bf3d3e326b5fe58d20f8fc45b5cec01cac
>>>>> [3]: https://www.tensorflow.org/install/
>>>>> [4]: https://www.python.org/dev/peps/pep-0513/#id50
>>>>> On Sat, Dec 15, 2018 at 11:25 PM Robert Nishihara
>>>>> <ro...@gmail.com> wrote:
>>>>> >
>>>>> > On Sat, Dec 15, 2018 at 8:43 PM Philipp Moritz <pc...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > > Dear all,
>>>>> > >
>>>>> > > As some of you know, there is a standard in Python called
>>>>> manylinux (
>>>>> > > https://www.python.org/dev/peps/pep-0513/) to package binary
>>>>> executables
>>>>> > > and libraries into a “wheel” in a way that allows the code to be
>>>>> run on a
>>>>> > > wide variety of Linux distributions. This is very convenient for
>>>>> Python
>>>>> > > users, since such libraries can be easily installed via pip.
>>>>> > >
>>>>> > > This standard is also important for a second reason: If many
>>>>> different
>>>>> > > wheels are used together in a single Python process, adhering to
>>>>> manylinux
>>>>> > > ensures that these libraries work together well and don’t trip on
>>>>> each
>>>>> > > other’s toes (this could easily happen if different versions of
>>>>> libstdc++
>>>>> > > are used for example). Therefore *even if support for only a single
>>>>> > > distribution like Ubuntu is desired*, it is important to be
>>>>> manylinux
>>>>> > > compatible to make sure everybody’s wheels work together well.
>>>>> > >
>>>>> > > TensorFlow and PyTorch unfortunately don’t produce manylinux
>>>>> compatible
>>>>> > > wheels. The challenge is due, at least in part, to the need to use
>>>>> > > nvidia-docker to build GPU binaries [10]. This causes various
>>>>> levels of
>>>>> > > pain for the rest of the Python community, see for example [1] [2]
>>>>> [3] [4]
>>>>> > > [5] [6] [7] [8].
>>>>> > >
>>>>> > > The purpose of the e-mail is to get a discussion started on how we
>>>>> can
>>>>> > > make TensorFlow and PyTorch manylinux compliant. There is a new
>>>>> standard in
>>>>> > > the works [9] so hopefully we can discuss what would be necessary
>>>>> to make
>>>>> > > sure TensorFlow and PyTorch can adhere to this standard in the
>>>>> future.
>>>>> > >
>>>>> > > It would make everybody’s lives just a little bit better! Any
>>>>> ideas are
>>>>> > > appreciated.
>>>>> > >
>>>>> > > @soumith: Could you cc the relevant list? I couldn't find a
>>>>> pytorch dev
>>>>> > > mailing list.
>>>>> > >
>>>>> > > Best,
>>>>> > > Philipp.
>>>>> > >
>>>>> > > [1] https://github.com/tensorflow/tensorflow/issues/5033
>>>>> > > [2] https://github.com/tensorflow/tensorflow/issues/8802
>>>>> > > [3] https://github.com/primitiv/primitiv-python/issues/28
>>>>> > > [4] https://github.com/zarr-developers/numcodecs/issues/70
>>>>> > > [5] https://github.com/apache/arrow/pull/3177
>>>>> > > [6] https://github.com/tensorflow/tensorflow/issues/13615
>>>>> > > [7] https://github.com/pytorch/pytorch/issues/8358
>>>>> > > [8] https://github.com/ray-project/ray/issues/2159
>>>>> > > [9] https://www.python.org/dev/peps/pep-0571/
>>>>> > > [10]
>>>>> > >
>>>>> https://github.com/tensorflow/tensorflow/issues/8802#issuecomment-291935940
>>>>> > >
>>>>> > > --
>>>>> > > You received this message because you are subscribed to the Google
>>>>> Groups
>>>>> > > "ray-dev" group.
>>>>> > > To unsubscribe from this group and stop receiving emails from it,
>>>>> send an
>>>>> > > email to ray-dev+unsubscribe@googlegroups.com.
>>>>> > > To post to this group, send email to ray-dev@googlegroups.com.
>>>>> > > To view this discussion on the web visit
>>>>> > >
>>>>> https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com
>>>>> > > <
>>>>> https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com?utm_medium=email&utm_source=footer
>>>>> >
>>>>> > > .
>>>>> > > For more options, visit https://groups.google.com/d/optout.
>>>>> > >
>>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "TensorFlow Developers" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to developers+unsubscribe@tensorflow.org.
>>>> Visit this group at
>>>> https://groups.google.com/a/tensorflow.org/group/developers/.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/a/tensorflow.org/d/msgid/developers/CAGZdauXqQ9Gze6eAB0R3%3D2j6X2yWfh7QPbrGj1%3D5xuvQUninpQ%40mail.gmail.com
>>>> <https://groups.google.com/a/tensorflow.org/d/msgid/developers/CAGZdauXqQ9Gze6eAB0R3%3D2j6X2yWfh7QPbrGj1%3D5xuvQUninpQ%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>> --
> You received this message because you are subscribed to the Google Groups
> "TensorFlow Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to developers+unsubscribe@tensorflow.org.
> Visit this group at
> https://groups.google.com/a/tensorflow.org/group/developers/.
> To view this discussion on the web visit
> https://groups.google.com/a/tensorflow.org/d/msgid/developers/CADtzJKMzpDj2SfFRygaxKTgJD3eoKi7kKBUgZExN9cceMN2CyQ%40mail.gmail.com
> <https://groups.google.com/a/tensorflow.org/d/msgid/developers/CADtzJKMzpDj2SfFRygaxKTgJD3eoKi7kKBUgZExN9cceMN2CyQ%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>

Re: TensorFlow, PyTorch, and manylinux1

Posted by Jason Zaman <ja...@perfinion.com>.

Hey all,
Just a quick reminder that we're gonna have the follow-up call tomorrow
(Tuesday) 5pm UTC, 9am PST, noon EST, 1am Wednesday Singapore. (About 23hrs
from this email) so the folks in europe can make the call too.
It'll be a hangouts call same as before and we'll put the link and dial-in
number in the google doc:
https://docs.google.com/document/d/1uYZK2jQtDUPpo3AHe18ZCH1jS9be9s8zR3axLR1SOG0/edit#heading=h.7sjot6x53yvw

Thanks,
Jason

On Thu, 7 Feb 2019 at 02:45, Philipp Moritz <pc...@gmail.com> wrote:

> Would building our manylinux2010 wheels against
> https://github.com/pypa/manylinux/pull/252 solve the C++11 problems? In
> that case we should just do that. Otherwise let's propose a minimally
> modified manylinux2011 that fixes C++11 support so we can move on and don't
> have to wait 9 more months till manylinux2014 or whatever will support
> c++14.
>
> On Wed, Feb 6, 2019 at 9:14 AM Philipp Moritz <pc...@gmail.com> wrote:
>
>> The problems arose if some functionality of C++11 <future> were used. It
>> led to certain symbols being statically linked into the shared library
>> which clashed with other shared libraries that had the same symbols in the
>> same address space, linked against a different version of libstdc++
>> (specifically, tensorflow's). There is some discussion about this in
>> https://github.com/apache/arrow/pull/3177.
>>
>> This might happen in the future again if pre g++ 5 stdlib is mixed with
>> post g++ 5. But with manylinux20xx we will be in a better situation if the
>> major packages (TensorFlow, PyTorch, Ray, Arrow) standardize on g++ >= 5.
>> Older manylinux1 packages from pip might still clash but we can flag them
>> as not manylinux20xx compatible and work towards them being fixed.
>>
>> On Wed, Feb 6, 2019 at 5:37 AM Antoine Pitrou <an...@python.org> wrote:
>>
>>>
>>> Le 06/02/2019 à 14:27, Manuel Klimek a écrit :
>>> > On Wed, Feb 6, 2019 at 12:38 PM Antoine Pitrou <antoine@python.org
>>> > <ma...@python.org>> wrote:
>>> >
>>> >
>>> >     Le 06/02/2019 à 01:06, Philipp Moritz a écrit :
>>> >     > Thanks for the meeting! One question concerning a point that is
>>> still
>>> >     > not super clear to me:
>>> >     >
>>> >     > Say we define a new manylinux standard based on gcc >=5 (with
>>> stable
>>> >     > c++11 support). There will still be a lot of wheels form the
>>> >     manylinux1
>>> >     > days that are built against gcc 4.8 that might use the c++11
>>> features
>>> >     > before they became stable. How do we prevent bugs from that? Is
>>> >     the plan
>>> >     > to convince everybody who uses these c++11 features to use the
>>> new
>>> >     > manylinux standard?
>>> >
>>> >     Yes, that's a bit of a problem.
>>> >
>>> >     This discussion arised from the incompatibility between Tensorflow
>>> >     wheels (compiled with a later toolchain) and other Python wheels
>>> >     (compiled with a manylinux1-compatible toolchain).
>>> >
>>> >
>>> > Do you know where these communicate with std types? (due to ABI tagging
>>> > loading them into the same process should work, right?)
>>>
>>> They don't.  I don't remember the specifics, Philipp Moritz might know
>>> more about this.
>>>
>>> Regards
>>>
>>> Antoine.
>>>
>>

Re: TensorFlow, PyTorch, and manylinux1

Posted by Philipp Moritz <pc...@gmail.com>.

Would building our manylinux2010 wheels against
https://github.com/pypa/manylinux/pull/252 solve the C++11 problems? In
that case we should just do that. Otherwise let's propose a minimally
modified manylinux2011 that fixes C++11 support so we can move on and don't
have to wait 9 more months till manylinux2014 or whatever will support
c++14.

On Wed, Feb 6, 2019 at 9:14 AM Philipp Moritz <pc...@gmail.com> wrote:

> The problems arose if some functionality of C++11 <future> were used. It
> led to certain symbols being statically linked into the shared library
> which clashed with other shared libraries that had the same symbols in the
> same address space, linked against a different version of libstdc++
> (specifically, tensorflow's). There is some discussion about this in
> https://github.com/apache/arrow/pull/3177.
>
> This might happen in the future again if pre g++ 5 stdlib is mixed with
> post g++ 5. But with manylinux20xx we will be in a better situation if the
> major packages (TensorFlow, PyTorch, Ray, Arrow) standardize on g++ >= 5.
> Older manylinux1 packages from pip might still clash but we can flag them
> as not manylinux20xx compatible and work towards them being fixed.
>
> On Wed, Feb 6, 2019 at 5:37 AM Antoine Pitrou <an...@python.org> wrote:
>
>>
>> Le 06/02/2019 à 14:27, Manuel Klimek a écrit :
>> > On Wed, Feb 6, 2019 at 12:38 PM Antoine Pitrou <antoine@python.org
>> > <ma...@python.org>> wrote:
>> >
>> >
>> >     Le 06/02/2019 à 01:06, Philipp Moritz a écrit :
>> >     > Thanks for the meeting! One question concerning a point that is
>> still
>> >     > not super clear to me:
>> >     >
>> >     > Say we define a new manylinux standard based on gcc >=5 (with
>> stable
>> >     > c++11 support). There will still be a lot of wheels form the
>> >     manylinux1
>> >     > days that are built against gcc 4.8 that might use the c++11
>> features
>> >     > before they became stable. How do we prevent bugs from that? Is
>> >     the plan
>> >     > to convince everybody who uses these c++11 features to use the new
>> >     > manylinux standard?
>> >
>> >     Yes, that's a bit of a problem.
>> >
>> >     This discussion arised from the incompatibility between Tensorflow
>> >     wheels (compiled with a later toolchain) and other Python wheels
>> >     (compiled with a manylinux1-compatible toolchain).
>> >
>> >
>> > Do you know where these communicate with std types? (due to ABI tagging
>> > loading them into the same process should work, right?)
>>
>> They don't.  I don't remember the specifics, Philipp Moritz might know
>> more about this.
>>
>> Regards
>>
>> Antoine.
>>
>

Re: TensorFlow, PyTorch, and manylinux1

Posted by Philipp Moritz <pc...@gmail.com>.

The problems arose if some functionality of C++11 <future> were used. It
led to certain symbols being statically linked into the shared library
which clashed with other shared libraries that had the same symbols in the
same address space, linked against a different version of libstdc++
(specifically, tensorflow's). There is some discussion about this in
https://github.com/apache/arrow/pull/3177.

This might happen in the future again if pre g++ 5 stdlib is mixed with
post g++ 5. But with manylinux20xx we will be in a better situation if the
major packages (TensorFlow, PyTorch, Ray, Arrow) standardize on g++ >= 5.
Older manylinux1 packages from pip might still clash but we can flag them
as not manylinux20xx compatible and work towards them being fixed.

On Wed, Feb 6, 2019 at 5:37 AM Antoine Pitrou <an...@python.org> wrote:

>
> Le 06/02/2019 à 14:27, Manuel Klimek a écrit :
> > On Wed, Feb 6, 2019 at 12:38 PM Antoine Pitrou <antoine@python.org
> > <ma...@python.org>> wrote:
> >
> >
> >     Le 06/02/2019 à 01:06, Philipp Moritz a écrit :
> >     > Thanks for the meeting! One question concerning a point that is
> still
> >     > not super clear to me:
> >     >
> >     > Say we define a new manylinux standard based on gcc >=5 (with
> stable
> >     > c++11 support). There will still be a lot of wheels form the
> >     manylinux1
> >     > days that are built against gcc 4.8 that might use the c++11
> features
> >     > before they became stable. How do we prevent bugs from that? Is
> >     the plan
> >     > to convince everybody who uses these c++11 features to use the new
> >     > manylinux standard?
> >
> >     Yes, that's a bit of a problem.
> >
> >     This discussion arised from the incompatibility between Tensorflow
> >     wheels (compiled with a later toolchain) and other Python wheels
> >     (compiled with a manylinux1-compatible toolchain).
> >
> >
> > Do you know where these communicate with std types? (due to ABI tagging
> > loading them into the same process should work, right?)
>
> They don't.  I don't remember the specifics, Philipp Moritz might know
> more about this.
>
> Regards
>
> Antoine.
>

Re: TensorFlow, PyTorch, and manylinux1

Posted by Antoine Pitrou <an...@python.org>.

Le 06/02/2019 à 14:27, Manuel Klimek a écrit :
> On Wed, Feb 6, 2019 at 12:38 PM Antoine Pitrou <antoine@python.org
> <ma...@python.org>> wrote:
> 
> 
>     Le 06/02/2019 à 01:06, Philipp Moritz a écrit :
>     > Thanks for the meeting! One question concerning a point that is still
>     > not super clear to me:
>     >
>     > Say we define a new manylinux standard based on gcc >=5 (with stable
>     > c++11 support). There will still be a lot of wheels form the
>     manylinux1
>     > days that are built against gcc 4.8 that might use the c++11 features
>     > before they became stable. How do we prevent bugs from that? Is
>     the plan
>     > to convince everybody who uses these c++11 features to use the new
>     > manylinux standard?
> 
>     Yes, that's a bit of a problem.
> 
>     This discussion arised from the incompatibility between Tensorflow
>     wheels (compiled with a later toolchain) and other Python wheels
>     (compiled with a manylinux1-compatible toolchain).
> 
> 
> Do you know where these communicate with std types? (due to ABI tagging
> loading them into the same process should work, right?)

They don't.  I don't remember the specifics, Philipp Moritz might know
more about this.

Regards

Antoine.

Re: TensorFlow, PyTorch, and manylinux1

Posted by Antoine Pitrou <an...@python.org>.

Le 06/02/2019 à 01:06, Philipp Moritz a écrit :
> Thanks for the meeting! One question concerning a point that is still
> not super clear to me:
> 
> Say we define a new manylinux standard based on gcc >=5 (with stable
> c++11 support). There will still be a lot of wheels form the manylinux1
> days that are built against gcc 4.8 that might use the c++11 features
> before they became stable. How do we prevent bugs from that? Is the plan
> to convince everybody who uses these c++11 features to use the new
> manylinux standard?

Yes, that's a bit of a problem.

This discussion arised from the incompatibility between Tensorflow
wheels (compiled with a later toolchain) and other Python wheels
(compiled with a manylinux1-compatible toolchain).

Intuitively, by using the new C++ ABI we may prevent such issues when
installing manylinux1 wheels and manylinux20XX wheels side-by-side.  But
it's difficult to say for sure.

Regards

Antoine.



> 
> On Tue, Feb 5, 2019 at 8:14 AM Jonathan Helmus <jhelmus@anaconda.com
> <ma...@anaconda.com>> wrote:
> 
> 
> 
>     On 2/5/19 9:29 AM, 'Manuel Klimek' via TensorFlow Developers wrote:
>>     On Tue, Feb 5, 2019 at 4:28 PM Antoine Pitrou <antoine@python.org
>>     <ma...@python.org>> wrote:
>>
>>
>>
>>         Le 05/02/2019 à 16:22, Manuel Klimek a écrit :
>>         > On Tue, Feb 5, 2019 at 2:01 PM Uwe L. Korn <xhochy@gmail.com
>>         <ma...@gmail.com>
>>         > <mailto:xhochy@gmail.com <ma...@gmail.com>>> wrote:
>>         >
>>         >     Also to reiterate a point raised earlier: C++11 with
>>         manylinux1
>>         >     works smoothly. With gcc 4.8.5, everything we need in Arrow
>>         >     supported. C++14 and more are out of scope and can only
>>         be used
>>         >     starting with manylinux{2010/2014}.
>>         >
>>         > From the requirements side (Martin will correct me if I'm
>>         getting these
>>         > wrong):
>>         > - it seems like from the TF point of view, our users are on
>>         pip, so we
>>         > need to deliver there
>>         > - LLVM is going to require C++14 ~in March as far as I can tell
>>         > - from trying to find info about manylinux2010 / 14, it
>>         seems like these
>>         > have stalled? (but I'm happy to be proven wrong here :)
>>
>>         manylinux2010 hasn't stalled, it's been progressing slowly. 
>>         Apparently
>>         pip 19.0 is out which supports downloading and installing
>>         manylinux2010
>>         packages.  See status page here:
>>         https://github.com/pypa/manylinux/issues/179#issuecomment-457002180
>>
>>
>>     Cool! The problem is that it doesn't solve the C++14 issue, right?
> 
>     Devtoolset-7 can be installed on RHEL6/CentOS 6 which is the
>     reference distribution of manylinux2010.  Devtoolset-7 includes GCC
>     7.3.1 which has full support for C++14.  On RHEL6/CentOS 6 the
>     devtoolset compilers target the older GCC C++ ABI
>     (-D_GLIBCXX_USE_CXX11_ABI=0) and will not emit the newer ABI.  There
>     is a open pull request to the manylinux repository to create a
>     docker image containing this toolset which may be of interest:
> 
>     https://github.com/pypa/manylinux/pull/252
> 
>     Cheers,
> 
>         - Jonathan Helmus
> 
>>
>>         manylinux2014 is an entirely different question.  It needs
>>         interested
>>         parties to gather and devise a spec and then get it accepted
>>         as a new PEP.
>>
>>         Regards
>>
>>         Antoine.
>>
>>     -- 
>>     You received this message because you are subscribed to the Google
>>     Groups "TensorFlow Developers" group.
>>     To unsubscribe from this group and stop receiving emails from it,
>>     send an email to developers+unsubscribe@tensorflow.org
>>     <ma...@tensorflow.org>.
>>     Visit this group at
>>     https://groups.google.com/a/tensorflow.org/group/developers/.
>>     To view this discussion on the web visit
>>     https://groups.google.com/a/tensorflow.org/d/msgid/developers/CAOsfVvnY5jSsB6aF8qw-d1TFF3XX1OKgCpXC8%3DQ9dyfYbGed_w%40mail.gmail.com
>>     <https://groups.google.com/a/tensorflow.org/d/msgid/developers/CAOsfVvnY5jSsB6aF8qw-d1TFF3XX1OKgCpXC8%3DQ9dyfYbGed_w%40mail.gmail.com?utm_medium=email&utm_source=footer>.
>

Re: TensorFlow, PyTorch, and manylinux1

Posted by Antoine Pitrou <an...@python.org>.

Le 05/02/2019 à 16:29, Manuel Klimek a écrit :
> 
>     manylinux2010 hasn't stalled, it's been progressing slowly.  Apparently
>     pip 19.0 is out which supports downloading and installing manylinux2010
>     packages.  See status page here:
>     https://github.com/pypa/manylinux/issues/179#issuecomment-457002180
> 
> Cool! The problem is that it doesn't solve the C++14 issue, right?

I'm not sure.  But apparently this may be the case (due to C++ ABI
issues), if you read this comment and the subsequent ones here:
https://github.com/pypa/manylinux/pull/152#discussion_r167242743

Regards

Antoine.

Re: TensorFlow, PyTorch, and manylinux1

Posted by Jason Zaman <ja...@perfinion.com>.

Thanks everyone for attending the meeting! It was great to have people
from so many different groups so we can figure out how to solve this
best for everyone. :)

A lot was discussed, I split the notes from the wheel part of the
discussion out into a separate doc:
https://docs.google.com/document/d/1uYZK2jQtDUPpo3AHe18ZCH1jS9be9s8zR3axLR1SOG0/edit#
It is set to globally commentable so please add any thing that was
missed / incorrect.
We should definitely have a follow up call later on so the folks in
Europe can make it too. Does 19th Feb (Tuesday) 5pm UTC work for
everyone? (9am PST, noon EST, 1am Wednesday Singapore).

Thanks,
Jason

On Wed, 6 Feb 2019 at 08:06, Philipp Moritz <pc...@gmail.com> wrote:
>
> Thanks for the meeting! One question concerning a point that is still not super clear to me:
>
> Say we define a new manylinux standard based on gcc >=5 (with stable c++11 support). There will still be a lot of wheels form the manylinux1 days that are built against gcc 4.8 that might use the c++11 features before they became stable. How do we prevent bugs from that? Is the plan to convince everybody who uses these c++11 features to use the new manylinux standard?
>
> On Tue, Feb 5, 2019 at 8:14 AM Jonathan Helmus <jh...@anaconda.com> wrote:
>>
>>
>>
>> On 2/5/19 9:29 AM, 'Manuel Klimek' via TensorFlow Developers wrote:
>>
>> On Tue, Feb 5, 2019 at 4:28 PM Antoine Pitrou <an...@python.org> wrote:
>>>
>>>
>>>
>>> Le 05/02/2019 à 16:22, Manuel Klimek a écrit :
>>> > On Tue, Feb 5, 2019 at 2:01 PM Uwe L. Korn <xhochy@gmail.com
>>> > <ma...@gmail.com>> wrote:
>>> >
>>> >     Also to reiterate a point raised earlier: C++11 with manylinux1
>>> >     works smoothly. With gcc 4.8.5, everything we need in Arrow
>>> >     supported. C++14 and more are out of scope and can only be used
>>> >     starting with manylinux{2010/2014}.
>>> >
>>> > From the requirements side (Martin will correct me if I'm getting these
>>> > wrong):
>>> > - it seems like from the TF point of view, our users are on pip, so we
>>> > need to deliver there
>>> > - LLVM is going to require C++14 ~in March as far as I can tell
>>> > - from trying to find info about manylinux2010 / 14, it seems like these
>>> > have stalled? (but I'm happy to be proven wrong here :)
>>>
>>> manylinux2010 hasn't stalled, it's been progressing slowly.  Apparently
>>> pip 19.0 is out which supports downloading and installing manylinux2010
>>> packages.  See status page here:
>>> https://github.com/pypa/manylinux/issues/179#issuecomment-457002180
>>
>>
>> Cool! The problem is that it doesn't solve the C++14 issue, right?
>>
>>
>> Devtoolset-7 can be installed on RHEL6/CentOS 6 which is the reference distribution of manylinux2010.  Devtoolset-7 includes GCC 7.3.1 which has full support for C++14.  On RHEL6/CentOS 6 the devtoolset compilers target the older GCC C++ ABI (-D_GLIBCXX_USE_CXX11_ABI=0) and will not emit the newer ABI.  There is a open pull request to the manylinux repository to create a docker image containing this toolset which may be of interest:
>>
>> https://github.com/pypa/manylinux/pull/252
>>
>> Cheers,
>>
>>     - Jonathan Helmus
>>
>>
>>> manylinux2014 is an entirely different question.  It needs interested
>>> parties to gather and devise a spec and then get it accepted as a new PEP.
>>>
>>> Regards
>>>
>>> Antoine.
>>
>> --
>> You received this message because you are subscribed to the Google Groups "TensorFlow Developers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to developers+unsubscribe@tensorflow.org.
>> Visit this group at https://groups.google.com/a/tensorflow.org/group/developers/.
>> To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/developers/CAOsfVvnY5jSsB6aF8qw-d1TFF3XX1OKgCpXC8%3DQ9dyfYbGed_w%40mail.gmail.com.
>>
>>

Re: TensorFlow, PyTorch, and manylinux1

Posted by Philipp Moritz <pc...@gmail.com>.

Thanks for the meeting! One question concerning a point that is still not
super clear to me:

Say we define a new manylinux standard based on gcc >=5 (with stable c++11
support). There will still be a lot of wheels form the manylinux1 days that
are built against gcc 4.8 that might use the c++11 features before they
became stable. How do we prevent bugs from that? Is the plan to convince
everybody who uses these c++11 features to use the new manylinux standard?

On Tue, Feb 5, 2019 at 8:14 AM Jonathan Helmus <jh...@anaconda.com> wrote:

>
>
> On 2/5/19 9:29 AM, 'Manuel Klimek' via TensorFlow Developers wrote:
>
> On Tue, Feb 5, 2019 at 4:28 PM Antoine Pitrou <an...@python.org> wrote:
>
>>
>>
>> Le 05/02/2019 à 16:22, Manuel Klimek a écrit :
>> > On Tue, Feb 5, 2019 at 2:01 PM Uwe L. Korn <xhochy@gmail.com
>> > <ma...@gmail.com>> wrote:
>> >
>> >     Also to reiterate a point raised earlier: C++11 with manylinux1
>> >     works smoothly. With gcc 4.8.5, everything we need in Arrow
>> >     supported. C++14 and more are out of scope and can only be used
>> >     starting with manylinux{2010/2014}.
>> >
>> > From the requirements side (Martin will correct me if I'm getting these
>> > wrong):
>> > - it seems like from the TF point of view, our users are on pip, so we
>> > need to deliver there
>> > - LLVM is going to require C++14 ~in March as far as I can tell
>> > - from trying to find info about manylinux2010 / 14, it seems like these
>> > have stalled? (but I'm happy to be proven wrong here :)
>>
>> manylinux2010 hasn't stalled, it's been progressing slowly.  Apparently
>> pip 19.0 is out which supports downloading and installing manylinux2010
>> packages.  See status page here:
>> https://github.com/pypa/manylinux/issues/179#issuecomment-457002180
>
>
> Cool! The problem is that it doesn't solve the C++14 issue, right?
>
>
> Devtoolset-7 can be installed on RHEL6/CentOS 6 which is the reference
> distribution of manylinux2010.  Devtoolset-7 includes GCC 7.3.1 which has
> full support for C++14.  On RHEL6/CentOS 6 the devtoolset compilers target
> the older GCC C++ ABI (-D_GLIBCXX_USE_CXX11_ABI=0) and will not emit the
> newer ABI.  There is a open pull request to the manylinux repository to
> create a docker image containing this toolset which may be of interest:
>
> https://github.com/pypa/manylinux/pull/252
>
> Cheers,
>
>     - Jonathan Helmus
>
>
> manylinux2014 is an entirely different question.  It needs interested
>> parties to gather and devise a spec and then get it accepted as a new PEP.
>>
>> Regards
>>
>> Antoine.
>>
> --
> You received this message because you are subscribed to the Google Groups
> "TensorFlow Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to developers+unsubscribe@tensorflow.org.
> Visit this group at
> https://groups.google.com/a/tensorflow.org/group/developers/.
> To view this discussion on the web visit
> https://groups.google.com/a/tensorflow.org/d/msgid/developers/CAOsfVvnY5jSsB6aF8qw-d1TFF3XX1OKgCpXC8%3DQ9dyfYbGed_w%40mail.gmail.com
> <https://groups.google.com/a/tensorflow.org/d/msgid/developers/CAOsfVvnY5jSsB6aF8qw-d1TFF3XX1OKgCpXC8%3DQ9dyfYbGed_w%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>
>
>

Re: TensorFlow, PyTorch, and manylinux1

Posted by Antoine Pitrou <an...@python.org>.

Le 05/02/2019 à 16:22, Manuel Klimek a écrit :
> On Tue, Feb 5, 2019 at 2:01 PM Uwe L. Korn <xhochy@gmail.com
> <ma...@gmail.com>> wrote:
> 
>     Also to reiterate a point raised earlier: C++11 with manylinux1
>     works smoothly. With gcc 4.8.5, everything we need in Arrow
>     supported. C++14 and more are out of scope and can only be used
>     starting with manylinux{2010/2014}.
> 
> From the requirements side (Martin will correct me if I'm getting these
> wrong):
> - it seems like from the TF point of view, our users are on pip, so we
> need to deliver there
> - LLVM is going to require C++14 ~in March as far as I can tell
> - from trying to find info about manylinux2010 / 14, it seems like these
> have stalled? (but I'm happy to be proven wrong here :)

manylinux2010 hasn't stalled, it's been progressing slowly.  Apparently
pip 19.0 is out which supports downloading and installing manylinux2010
packages.  See status page here:
https://github.com/pypa/manylinux/issues/179#issuecomment-457002180

manylinux2014 is an entirely different question.  It needs interested
parties to gather and devise a spec and then get it accepted as a new PEP.

Regards

Antoine.

Re: TensorFlow, PyTorch, and manylinux1

Posted by Robert Nishihara <ro...@gmail.com>.

Replying to the thread because the last two messages got dropped.

On Mon, Feb 4, 2019 at 10:00 AM soumith <so...@gmail.com> wrote:

> > I think trying to package CUDA is the wrong way to think about it.
> Instead, perhaps you should try to make the package compatible with
> system CUDA installs.
>
> I agree in principle.
> The problem fundamentally stems from user expectation.
>
> In my ~6+ years of supporting Torch and PyTorch, installing CUDA on a
> system can take days, with a user mean approximately half a day. It might
> be userland incompetence, or that CUDA is a magical snowflake, but the
> reality is that installing CUDA is never great.
> So, a huge amount of issues reported by userland are side-effects from
> broken CUDA installs.
> It doesn't help that the PyPI user expectations of "my package should just
> work after a pip install".
>
> If we can reliably install an up-to-date CUDA in a standardized way, and
> NVIDIA simply doesn't sidestep the userland issues by saying "user our
> docker", or "our PPA is 100% reliable", we would've been in a better state.
>
> Until then, I think it's best that we find a solution for PyPI users that
> can work out of box with PyPI.
>
> On Mon, Feb 4, 2019 at 12:52 PM Antoine Pitrou <so...@pitrou.net>
> wrote:
>
> > On Tue, 5 Feb 2019 01:45:34 +0800
> > Jason Zaman <ja...@perfinion.com> wrote:
> > > On Tue, 5 Feb 2019 at 01:30, soumith <so...@gmail.com> wrote:
> > > >
> > > > Unfortunately I'll be on a long flight, and cannot make it to the
> > SIGBuild meeting.
> > > > I'm definitely interested in the meeting notes and any follow-up
> > meeting.
> > > >
> > > > > I think we should leave CUDA out of the
> > > > discussion initially and see if we can get the cpu-only wheel working
> > > > correctly. Hopefully cpu-only is viable on manylinux2014 then we can
> > > > tackle CUDA afterwards.
> > > >
> > > > 50% of the complexity is in the CUDA packaging.
> > > > The other 50% is in shipping a more modern libstdc++.so
> > > > I believe we'll make progress if we ignore CUDA, but we'll not
> address
> > half of the issue.
> > >
> > > Yeah, we'll definitely need both to solve it fully. My thinking is
> > > that all packages need at least C++11 but only some need CUDA. Or
> > > might we end up where the libstcc++.so is incompatible with CUDA if we
> > > don't work on everything together?
> >
> > I think trying to package CUDA is the wrong way to think about it.
> > Instead, perhaps you should try to make the package compatible with
> > system CUDA installs.
> >
> > For example, the Numba pip wheel almost works out-of-the-box with a
> > system CUDA install on Ubuntu 18.04.  I say "almost" because I had to
> > set two environment variables:
> > https://github.com/numba/numba/issues/3738
> >
> > Regards
> >
> > Antoine.
> >
> >
> >
>

Re: TensorFlow, PyTorch, and manylinux1

Posted by soumith <so...@gmail.com>.

> I think trying to package CUDA is the wrong way to think about it.
Instead, perhaps you should try to make the package compatible with
system CUDA installs.

I agree in principle.
The problem fundamentally stems from user expectation.

In my ~6+ years of supporting Torch and PyTorch, installing CUDA on a
system can take days, with a user mean approximately half a day. It might
be userland incompetence, or that CUDA is a magical snowflake, but the
reality is that installing CUDA is never great.
So, a huge amount of issues reported by userland are side-effects from
broken CUDA installs.
It doesn't help that the PyPI user expectations of "my package should just
work after a pip install".

If we can reliably install an up-to-date CUDA in a standardized way, and
NVIDIA simply doesn't sidestep the userland issues by saying "user our
docker", or "our PPA is 100% reliable", we would've been in a better state.

Until then, I think it's best that we find a solution for PyPI users that
can work out of box with PyPI.

On Mon, Feb 4, 2019 at 12:52 PM Antoine Pitrou <so...@pitrou.net> wrote:

> On Tue, 5 Feb 2019 01:45:34 +0800
> Jason Zaman <ja...@perfinion.com> wrote:
> > On Tue, 5 Feb 2019 at 01:30, soumith <so...@gmail.com> wrote:
> > >
> > > Unfortunately I'll be on a long flight, and cannot make it to the
> SIGBuild meeting.
> > > I'm definitely interested in the meeting notes and any follow-up
> meeting.
> > >
> > > > I think we should leave CUDA out of the
> > > discussion initially and see if we can get the cpu-only wheel working
> > > correctly. Hopefully cpu-only is viable on manylinux2014 then we can
> > > tackle CUDA afterwards.
> > >
> > > 50% of the complexity is in the CUDA packaging.
> > > The other 50% is in shipping a more modern libstdc++.so
> > > I believe we'll make progress if we ignore CUDA, but we'll not address
> half of the issue.
> >
> > Yeah, we'll definitely need both to solve it fully. My thinking is
> > that all packages need at least C++11 but only some need CUDA. Or
> > might we end up where the libstcc++.so is incompatible with CUDA if we
> > don't work on everything together?
>
> I think trying to package CUDA is the wrong way to think about it.
> Instead, perhaps you should try to make the package compatible with
> system CUDA installs.
>
> For example, the Numba pip wheel almost works out-of-the-box with a
> system CUDA install on Ubuntu 18.04.  I say "almost" because I had to
> set two environment variables:
> https://github.com/numba/numba/issues/3738
>
> Regards
>
> Antoine.
>
>
>

Re: TensorFlow, PyTorch, and manylinux1

Posted by Antoine Pitrou <so...@pitrou.net>.

On Tue, 5 Feb 2019 01:45:34 +0800
Jason Zaman <ja...@perfinion.com> wrote:
> On Tue, 5 Feb 2019 at 01:30, soumith <so...@gmail.com> wrote:
> >
> > Unfortunately I'll be on a long flight, and cannot make it to the SIGBuild meeting.
> > I'm definitely interested in the meeting notes and any follow-up meeting.
> >  
> > > I think we should leave CUDA out of the  
> > discussion initially and see if we can get the cpu-only wheel working
> > correctly. Hopefully cpu-only is viable on manylinux2014 then we can
> > tackle CUDA afterwards.
> >
> > 50% of the complexity is in the CUDA packaging.
> > The other 50% is in shipping a more modern libstdc++.so
> > I believe we'll make progress if we ignore CUDA, but we'll not address half of the issue.  
> 
> Yeah, we'll definitely need both to solve it fully. My thinking is
> that all packages need at least C++11 but only some need CUDA. Or
> might we end up where the libstcc++.so is incompatible with CUDA if we
> don't work on everything together?

I think trying to package CUDA is the wrong way to think about it.
Instead, perhaps you should try to make the package compatible with
system CUDA installs.

For example, the Numba pip wheel almost works out-of-the-box with a
system CUDA install on Ubuntu 18.04.  I say "almost" because I had to
set two environment variables:
https://github.com/numba/numba/issues/3738

Regards

Antoine.

Re: TensorFlow, PyTorch, and manylinux1

Posted by Jason Zaman <ja...@perfinion.com>.

On Tue, 5 Feb 2019 at 01:30, soumith <so...@gmail.com> wrote:
>
> Unfortunately I'll be on a long flight, and cannot make it to the SIGBuild meeting.
> I'm definitely interested in the meeting notes and any follow-up meeting.
>
> > I think we should leave CUDA out of the
> discussion initially and see if we can get the cpu-only wheel working
> correctly. Hopefully cpu-only is viable on manylinux2014 then we can
> tackle CUDA afterwards.
>
> 50% of the complexity is in the CUDA packaging.
> The other 50% is in shipping a more modern libstdc++.so
> I believe we'll make progress if we ignore CUDA, but we'll not address half of the issue.

Yeah, we'll definitely need both to solve it fully. My thinking is
that all packages need at least C++11 but only some need CUDA. Or
might we end up where the libstcc++.so is incompatible with CUDA if we
don't work on everything together?

-- Jason

>
> --
> S
>
> On Mon, Feb 4, 2019 at 12:21 PM Antoine Pitrou <an...@python.org> wrote:
>>
>>
>> Le 04/02/2019 à 17:36, Uwe L. Korn a écrit :
>> > I think that problem is whether this would get merged. Conda was created
>> > after a meeting with Guido van Rossum and other folks at a PyCon quite
>> > some years ago where the final call was that this is not a problem of
>> > the core Python ecosystem and that the scientific Python community has
>> > to roll their own solution.
>> >
>> > @Wes McKinney <ma...@gmail.com> or someone else: Were you at
>> > this meeting and can outline why it was declined back then?
>>
>> I'm not sure anyone in this CC list was at that meeting (I wasn't).  If
>> it's important to have the precise answer, I can try to CC someone.
>>
>> But I think the general answer is that it's a complex and difficult
>> endeavour, and the contribution structures inside the Python packaging
>> ecosystem, where most people are volunteers, didn't allow for it.
>> There's already enough lag maintaining the current software stack (pip
>> et al.).
>>
>> Anaconda then came up and became history, so to speak.
>>
>> Regards
>>
>> Antoine.
>>
>>
>> >
>> > Uwe
>> >
>> > Am Mo., 4. Feb. 2019 um 17:33 Uhr schrieb Manuel Klimek
>> > <klimek@google.com <ma...@google.com>>:
>> >
>> >     On Mon, Feb 4, 2019 at 5:32 PM Uwe L. Korn <xhochy@gmail.com
>> >     <ma...@gmail.com>> wrote:
>> >
>> >         Just as a heads-up: I would like to also join the meeting but am
>> >         also located in Europe.
>> >
>> >         I have spent quite some time with the packaging of wheels for
>> >         pyarrow and turbodbc thus I would like to also give input on
>> >         this. For Apache Arrow, I see newer manylinux2014 standard as a
>> >         possible way to go. I'm not so fond of rolloing lib(std)c++
>> >         packages inside of pip. It's sadly the case that the features of
>> >         pip don't allow a good dependency resolution, also with taking
>> >         CUDA into account, a dependency resolution that differs between
>> >         source and binary builds of a package. For this case, exactly
>> >         conda was developed because it was considered out-of-scope for
>> >         the core Python packaging system. I'm not sure whether we
>> >         actually can fit all the requirements of the packages that take
>> >         part in this mail thread into pip without simply reimplementing
>> >         conda inside of pip.
>> >
>> >
>> >     One question is probably: what would that entail, and why would it
>> >     be bad? :)
>> >
>> >
>> >
>> >         Uwe
>> >
>> >         Am Mo., 4. Feb. 2019 um 16:34 Uhr schrieb Jason Zaman
>> >         <jason@perfinion.com <ma...@perfinion.com>>:
>> >
>> >             yeah that's expected. The timing is complicated with people
>> >             spread all
>> >             over. We will post notes after the meeting on the SIG-Build
>> >             mailing
>> >             list and I'd also be up for organizing a separate call with
>> >             europe
>> >             folks if that would be of interest.
>> >
>> >             On Mon, 4 Feb 2019 at 19:29, 'Manuel Klimek' via SIG Build
>> >             <build@tensorflow.org <ma...@tensorflow.org>> wrote:
>> >             >
>> >             > +Dmitri Gribenko
>> >             >
>> >             > Dmitri has experience with EasyBuild, which seems to be
>> >             used by the HPC community to solve the bootstrap problem and
>> >             could be used to build a toolchain image & pip package.
>> >             >
>> >             > Unfortunately we'll not be able to join the meeting as
>> >             it's at midnight CEST - looking forward to the conclusions
>> >             from the meeting!
>> >             >
>> >             > On Mon, Feb 4, 2019 at 6:00 AM Jason Zaman
>> >             <jason@perfinion.com <ma...@perfinion.com>> wrote:
>> >             >>
>> >             >> Hey all,
>> >             >>
>> >             >> We're having the TensorFlow SIG-Build meeting on 5th Feb
>> >             3pm PST
>> >             >>
>> >             (https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190205T15&p1=224).
>> >             >> Agenda is linked from:
>> >             >>
>> >             https://groups.google.com/a/tensorflow.org/forum/#!topic/build/akyPcGoBIy4
>> >             >>
>> >             >> I'd like to invite everyone from this thread to join the
>> >             call if at
>> >             >> all possible. The agenda for this meeting will spend most
>> >             of the time
>> >             >> focusing on the manylinux issue and hopefully we can get
>> >             together to
>> >             >> flesh out a decent plan on how to tackle this.
>> >             >>
>> >             >> Thanks,
>> >             >> Jason
>> >             >>
>> >             >> On Wed, 30 Jan 2019 at 23:34, 'Manuel Klimek' via SIG Build
>> >             >> <build@tensorflow.org <ma...@tensorflow.org>> wrote:
>> >             >> >
>> >             >> > On Wed, Jan 30, 2019 at 4:21 PM Antoine Pitrou
>> >             <antoine@python.org <ma...@python.org>> wrote:
>> >             >> >>
>> >             >> >>
>> >             >> >> Le 30/01/2019 à 16:09, Manuel Klimek a écrit :
>> >             >> >> >
>> >             >> >> > On Wed, Jan 30, 2019 at 3:51 PM Antoine Pitrou
>> >             <antoine@python.org <ma...@python.org>
>> >             >> >> > <mailto:antoine@python.org
>> >             <ma...@python.org>>> wrote:
>> >             >> >> >
>> >             >> >> >
>> >             >> >> >     Le 30/01/2019 à 15:35, Manuel Klimek a écrit :
>> >             >> >> >     >
>> >             >> >> >     >     Am I reading you wrong, or are you
>> >             actually proposing to
>> >             >> >> >     package another
>> >             >> >> >     >     libstdc++ version as a Python wheel?
>> >             >> >> >     >
>> >             >> >> >     >
>> >             >> >> >     > That would be the idea.
>> >             >> >> >     >
>> >             >> >> >     >
>> >             >> >> >     >     If so, are you going to claim that the
>> >             given wheel is
>> >             >> >> >     >     manylinux-compatible?
>> >             >> >> >     >
>> >             >> >> >     >
>> >             >> >> >     > That is my question :) Why wouldn't it be?
>> >             (I'd link it against
>> >             >> >> >     > manylinux libc and other C-only system libs)
>> >             >> >> >
>> >             >> >> >     The problem is when you are loading two modules
>> >             that link against
>> >             >> >> >     different libstdc++ versions in the same
>> >             process.  Incidentally, it's
>> >             >> >> >     the problem which prompted this discussion.
>> >             >> >> >
>> >             >> >> >
>> >             >> >> > Sure, I'm aware :) I think as long as the
>> >             requirement that all libraries
>> >             >> >> > that want to exchange runtime-ABI-compatible
>> >             versions are compiled with
>> >             >> >> > the same toolchain, we can provide a way to mangle
>> >             the symbols
>> >             >> >> > differently.
>> >             >> >>
>> >             >> >> Ah, I see... Indeed, mangling the symbols may work for
>> >             this.
>> >             >> >>
>> >             >> >> That said, if you're looking to create a de facto
>> >             standard, why can't it
>> >             >> >> be proposed as a manylinux iteration?
>> >             >> >
>> >             >> >
>> >             >> > I'd have thought because it doesn't change the system
>> >             requirements, while manylinux seems to be all about system
>> >             requirements.
>> >             >> > The idea is that that toolchain would still work on any
>> >             manylinux compatible machine.
>> >             >> >
>> >             >> >
>> >             >> >
>> >             >> >>
>> >             >> >>
>> >             >> >> Regards
>> >             >> >>
>> >             >> >> Antoine.
>> >             >> >
>> >             >> > --
>> >             >> > You received this message because you are subscribed to
>> >             the Google Groups "SIG Build" group.
>> >             >> > To unsubscribe from this group and stop receiving
>> >             emails from it, send an email to
>> >             build+unsubscribe@tensorflow.org
>> >             <ma...@tensorflow.org>.
>> >             >> > Visit this group at
>> >             https://groups.google.com/a/tensorflow.org/group/build/.
>> >             >
>> >             > --
>> >             > You received this message because you are subscribed to
>> >             the Google Groups "SIG Build" group.
>> >             > To unsubscribe from this group and stop receiving emails
>> >             from it, send an email to build+unsubscribe@tensorflow.org
>> >             <ma...@tensorflow.org>.
>> >             > Visit this group at
>> >             https://groups.google.com/a/tensorflow.org/group/build/.
>> >

Re: TensorFlow, PyTorch, and manylinux1

Posted by soumith <so...@gmail.com>.

Unfortunately I'll be on a long flight, and cannot make it to the SIGBuild
meeting.
I'm definitely interested in the meeting notes and any follow-up meeting.

> I think we should leave CUDA out of the
discussion initially and see if we can get the cpu-only wheel working
correctly. Hopefully cpu-only is viable on manylinux2014 then we can
tackle CUDA afterwards.

50% of the complexity is in the CUDA packaging.
The other 50% is in shipping a more modern libstdc++.so
I believe we'll make progress if we ignore CUDA, but we'll not address half
of the issue.

--
S

On Mon, Feb 4, 2019 at 12:21 PM Antoine Pitrou <an...@python.org> wrote:

>
> Le 04/02/2019 à 17:36, Uwe L. Korn a écrit :
> > I think that problem is whether this would get merged. Conda was created
> > after a meeting with Guido van Rossum and other folks at a PyCon quite
> > some years ago where the final call was that this is not a problem of
> > the core Python ecosystem and that the scientific Python community has
> > to roll their own solution.
> >
> > @Wes McKinney <ma...@gmail.com> or someone else: Were you at
> > this meeting and can outline why it was declined back then?
>
> I'm not sure anyone in this CC list was at that meeting (I wasn't).  If
> it's important to have the precise answer, I can try to CC someone.
>
> But I think the general answer is that it's a complex and difficult
> endeavour, and the contribution structures inside the Python packaging
> ecosystem, where most people are volunteers, didn't allow for it.
> There's already enough lag maintaining the current software stack (pip
> et al.).
>
> Anaconda then came up and became history, so to speak.
>
> Regards
>
> Antoine.
>
>
> >
> > Uwe
> >
> > Am Mo., 4. Feb. 2019 um 17:33 Uhr schrieb Manuel Klimek
> > <klimek@google.com <ma...@google.com>>:
> >
> >     On Mon, Feb 4, 2019 at 5:32 PM Uwe L. Korn <xhochy@gmail.com
> >     <ma...@gmail.com>> wrote:
> >
> >         Just as a heads-up: I would like to also join the meeting but am
> >         also located in Europe.
> >
> >         I have spent quite some time with the packaging of wheels for
> >         pyarrow and turbodbc thus I would like to also give input on
> >         this. For Apache Arrow, I see newer manylinux2014 standard as a
> >         possible way to go. I'm not so fond of rolloing lib(std)c++
> >         packages inside of pip. It's sadly the case that the features of
> >         pip don't allow a good dependency resolution, also with taking
> >         CUDA into account, a dependency resolution that differs between
> >         source and binary builds of a package. For this case, exactly
> >         conda was developed because it was considered out-of-scope for
> >         the core Python packaging system. I'm not sure whether we
> >         actually can fit all the requirements of the packages that take
> >         part in this mail thread into pip without simply reimplementing
> >         conda inside of pip.
> >
> >
> >     One question is probably: what would that entail, and why would it
> >     be bad? :)
> >
> >
> >
> >         Uwe
> >
> >         Am Mo., 4. Feb. 2019 um 16:34 Uhr schrieb Jason Zaman
> >         <jason@perfinion.com <ma...@perfinion.com>>:
> >
> >             yeah that's expected. The timing is complicated with people
> >             spread all
> >             over. We will post notes after the meeting on the SIG-Build
> >             mailing
> >             list and I'd also be up for organizing a separate call with
> >             europe
> >             folks if that would be of interest.
> >
> >             On Mon, 4 Feb 2019 at 19:29, 'Manuel Klimek' via SIG Build
> >             <build@tensorflow.org <ma...@tensorflow.org>> wrote:
> >             >
> >             > +Dmitri Gribenko
> >             >
> >             > Dmitri has experience with EasyBuild, which seems to be
> >             used by the HPC community to solve the bootstrap problem and
> >             could be used to build a toolchain image & pip package.
> >             >
> >             > Unfortunately we'll not be able to join the meeting as
> >             it's at midnight CEST - looking forward to the conclusions
> >             from the meeting!
> >             >
> >             > On Mon, Feb 4, 2019 at 6:00 AM Jason Zaman
> >             <jason@perfinion.com <ma...@perfinion.com>> wrote:
> >             >>
> >             >> Hey all,
> >             >>
> >             >> We're having the TensorFlow SIG-Build meeting on 5th Feb
> >             3pm PST
> >             >>
> >             (
> https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190205T15&p1=224
> ).
> >             >> Agenda is linked from:
> >             >>
> >
> https://groups.google.com/a/tensorflow.org/forum/#!topic/build/akyPcGoBIy4
> >             >>
> >             >> I'd like to invite everyone from this thread to join the
> >             call if at
> >             >> all possible. The agenda for this meeting will spend most
> >             of the time
> >             >> focusing on the manylinux issue and hopefully we can get
> >             together to
> >             >> flesh out a decent plan on how to tackle this.
> >             >>
> >             >> Thanks,
> >             >> Jason
> >             >>
> >             >> On Wed, 30 Jan 2019 at 23:34, 'Manuel Klimek' via SIG
> Build
> >             >> <build@tensorflow.org <ma...@tensorflow.org>>
> wrote:
> >             >> >
> >             >> > On Wed, Jan 30, 2019 at 4:21 PM Antoine Pitrou
> >             <antoine@python.org <ma...@python.org>> wrote:
> >             >> >>
> >             >> >>
> >             >> >> Le 30/01/2019 à 16:09, Manuel Klimek a écrit :
> >             >> >> >
> >             >> >> > On Wed, Jan 30, 2019 at 3:51 PM Antoine Pitrou
> >             <antoine@python.org <ma...@python.org>
> >             >> >> > <mailto:antoine@python.org
> >             <ma...@python.org>>> wrote:
> >             >> >> >
> >             >> >> >
> >             >> >> >     Le 30/01/2019 à 15:35, Manuel Klimek a écrit :
> >             >> >> >     >
> >             >> >> >     >     Am I reading you wrong, or are you
> >             actually proposing to
> >             >> >> >     package another
> >             >> >> >     >     libstdc++ version as a Python wheel?
> >             >> >> >     >
> >             >> >> >     >
> >             >> >> >     > That would be the idea.
> >             >> >> >     >
> >             >> >> >     >
> >             >> >> >     >     If so, are you going to claim that the
> >             given wheel is
> >             >> >> >     >     manylinux-compatible?
> >             >> >> >     >
> >             >> >> >     >
> >             >> >> >     > That is my question :) Why wouldn't it be?
> >             (I'd link it against
> >             >> >> >     > manylinux libc and other C-only system libs)
> >             >> >> >
> >             >> >> >     The problem is when you are loading two modules
> >             that link against
> >             >> >> >     different libstdc++ versions in the same
> >             process.  Incidentally, it's
> >             >> >> >     the problem which prompted this discussion.
> >             >> >> >
> >             >> >> >
> >             >> >> > Sure, I'm aware :) I think as long as the
> >             requirement that all libraries
> >             >> >> > that want to exchange runtime-ABI-compatible
> >             versions are compiled with
> >             >> >> > the same toolchain, we can provide a way to mangle
> >             the symbols
> >             >> >> > differently.
> >             >> >>
> >             >> >> Ah, I see... Indeed, mangling the symbols may work for
> >             this.
> >             >> >>
> >             >> >> That said, if you're looking to create a de facto
> >             standard, why can't it
> >             >> >> be proposed as a manylinux iteration?
> >             >> >
> >             >> >
> >             >> > I'd have thought because it doesn't change the system
> >             requirements, while manylinux seems to be all about system
> >             requirements.
> >             >> > The idea is that that toolchain would still work on any
> >             manylinux compatible machine.
> >             >> >
> >             >> >
> >             >> >
> >             >> >>
> >             >> >>
> >             >> >> Regards
> >             >> >>
> >             >> >> Antoine.
> >             >> >
> >             >> > --
> >             >> > You received this message because you are subscribed to
> >             the Google Groups "SIG Build" group.
> >             >> > To unsubscribe from this group and stop receiving
> >             emails from it, send an email to
> >             build+unsubscribe@tensorflow.org
> >             <ma...@tensorflow.org>.
> >             >> > Visit this group at
> >             https://groups.google.com/a/tensorflow.org/group/build/.
> >             >
> >             > --
> >             > You received this message because you are subscribed to
> >             the Google Groups "SIG Build" group.
> >             > To unsubscribe from this group and stop receiving emails
> >             from it, send an email to build+unsubscribe@tensorflow.org
> >             <ma...@tensorflow.org>.
> >             > Visit this group at
> >             https://groups.google.com/a/tensorflow.org/group/build/.
> >
>

Re: TensorFlow, PyTorch, and manylinux1

Posted by Antoine Pitrou <an...@python.org>.

Le 04/02/2019 à 17:36, Uwe L. Korn a écrit :
> I think that problem is whether this would get merged. Conda was created
> after a meeting with Guido van Rossum and other folks at a PyCon quite
> some years ago where the final call was that this is not a problem of
> the core Python ecosystem and that the scientific Python community has
> to roll their own solution.
> 
> @Wes McKinney <ma...@gmail.com> or someone else: Were you at
> this meeting and can outline why it was declined back then?

I'm not sure anyone in this CC list was at that meeting (I wasn't).  If
it's important to have the precise answer, I can try to CC someone.

But I think the general answer is that it's a complex and difficult
endeavour, and the contribution structures inside the Python packaging
ecosystem, where most people are volunteers, didn't allow for it.
There's already enough lag maintaining the current software stack (pip
et al.).

Anaconda then came up and became history, so to speak.

Regards

Antoine.


> 
> Uwe
> 
> Am Mo., 4. Feb. 2019 um 17:33 Uhr schrieb Manuel Klimek
> <klimek@google.com <ma...@google.com>>:
> 
>     On Mon, Feb 4, 2019 at 5:32 PM Uwe L. Korn <xhochy@gmail.com
>     <ma...@gmail.com>> wrote:
> 
>         Just as a heads-up: I would like to also join the meeting but am
>         also located in Europe. 
> 
>         I have spent quite some time with the packaging of wheels for
>         pyarrow and turbodbc thus I would like to also give input on
>         this. For Apache Arrow, I see newer manylinux2014 standard as a
>         possible way to go. I'm not so fond of rolloing lib(std)c++
>         packages inside of pip. It's sadly the case that the features of
>         pip don't allow a good dependency resolution, also with taking
>         CUDA into account, a dependency resolution that differs between
>         source and binary builds of a package. For this case, exactly
>         conda was developed because it was considered out-of-scope for
>         the core Python packaging system. I'm not sure whether we
>         actually can fit all the requirements of the packages that take
>         part in this mail thread into pip without simply reimplementing
>         conda inside of pip.
> 
> 
>     One question is probably: what would that entail, and why would it
>     be bad? :)
>      
> 
> 
>         Uwe
> 
>         Am Mo., 4. Feb. 2019 um 16:34 Uhr schrieb Jason Zaman
>         <jason@perfinion.com <ma...@perfinion.com>>:
> 
>             yeah that's expected. The timing is complicated with people
>             spread all
>             over. We will post notes after the meeting on the SIG-Build
>             mailing
>             list and I'd also be up for organizing a separate call with
>             europe
>             folks if that would be of interest.
> 
>             On Mon, 4 Feb 2019 at 19:29, 'Manuel Klimek' via SIG Build
>             <build@tensorflow.org <ma...@tensorflow.org>> wrote:
>             >
>             > +Dmitri Gribenko
>             >
>             > Dmitri has experience with EasyBuild, which seems to be
>             used by the HPC community to solve the bootstrap problem and
>             could be used to build a toolchain image & pip package.
>             >
>             > Unfortunately we'll not be able to join the meeting as
>             it's at midnight CEST - looking forward to the conclusions
>             from the meeting!
>             >
>             > On Mon, Feb 4, 2019 at 6:00 AM Jason Zaman
>             <jason@perfinion.com <ma...@perfinion.com>> wrote:
>             >>
>             >> Hey all,
>             >>
>             >> We're having the TensorFlow SIG-Build meeting on 5th Feb
>             3pm PST
>             >>
>             (https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190205T15&p1=224).
>             >> Agenda is linked from:
>             >>
>             https://groups.google.com/a/tensorflow.org/forum/#!topic/build/akyPcGoBIy4
>             >>
>             >> I'd like to invite everyone from this thread to join the
>             call if at
>             >> all possible. The agenda for this meeting will spend most
>             of the time
>             >> focusing on the manylinux issue and hopefully we can get
>             together to
>             >> flesh out a decent plan on how to tackle this.
>             >>
>             >> Thanks,
>             >> Jason
>             >>
>             >> On Wed, 30 Jan 2019 at 23:34, 'Manuel Klimek' via SIG Build
>             >> <build@tensorflow.org <ma...@tensorflow.org>> wrote:
>             >> >
>             >> > On Wed, Jan 30, 2019 at 4:21 PM Antoine Pitrou
>             <antoine@python.org <ma...@python.org>> wrote:
>             >> >>
>             >> >>
>             >> >> Le 30/01/2019 à 16:09, Manuel Klimek a écrit :
>             >> >> >
>             >> >> > On Wed, Jan 30, 2019 at 3:51 PM Antoine Pitrou
>             <antoine@python.org <ma...@python.org>
>             >> >> > <mailto:antoine@python.org
>             <ma...@python.org>>> wrote:
>             >> >> >
>             >> >> >
>             >> >> >     Le 30/01/2019 à 15:35, Manuel Klimek a écrit :
>             >> >> >     >
>             >> >> >     >     Am I reading you wrong, or are you
>             actually proposing to
>             >> >> >     package another
>             >> >> >     >     libstdc++ version as a Python wheel?
>             >> >> >     >
>             >> >> >     >
>             >> >> >     > That would be the idea.
>             >> >> >     >
>             >> >> >     >
>             >> >> >     >     If so, are you going to claim that the
>             given wheel is
>             >> >> >     >     manylinux-compatible?
>             >> >> >     >
>             >> >> >     >
>             >> >> >     > That is my question :) Why wouldn't it be?
>             (I'd link it against
>             >> >> >     > manylinux libc and other C-only system libs)
>             >> >> >
>             >> >> >     The problem is when you are loading two modules
>             that link against
>             >> >> >     different libstdc++ versions in the same
>             process.  Incidentally, it's
>             >> >> >     the problem which prompted this discussion.
>             >> >> >
>             >> >> >
>             >> >> > Sure, I'm aware :) I think as long as the
>             requirement that all libraries
>             >> >> > that want to exchange runtime-ABI-compatible
>             versions are compiled with
>             >> >> > the same toolchain, we can provide a way to mangle
>             the symbols
>             >> >> > differently.
>             >> >>
>             >> >> Ah, I see... Indeed, mangling the symbols may work for
>             this.
>             >> >>
>             >> >> That said, if you're looking to create a de facto
>             standard, why can't it
>             >> >> be proposed as a manylinux iteration?
>             >> >
>             >> >
>             >> > I'd have thought because it doesn't change the system
>             requirements, while manylinux seems to be all about system
>             requirements.
>             >> > The idea is that that toolchain would still work on any
>             manylinux compatible machine.
>             >> >
>             >> >
>             >> >
>             >> >>
>             >> >>
>             >> >> Regards
>             >> >>
>             >> >> Antoine.
>             >> >
>             >> > --
>             >> > You received this message because you are subscribed to
>             the Google Groups "SIG Build" group.
>             >> > To unsubscribe from this group and stop receiving
>             emails from it, send an email to
>             build+unsubscribe@tensorflow.org
>             <ma...@tensorflow.org>.
>             >> > Visit this group at
>             https://groups.google.com/a/tensorflow.org/group/build/.
>             >
>             > --
>             > You received this message because you are subscribed to
>             the Google Groups "SIG Build" group.
>             > To unsubscribe from this group and stop receiving emails
>             from it, send an email to build+unsubscribe@tensorflow.org
>             <ma...@tensorflow.org>.
>             > Visit this group at
>             https://groups.google.com/a/tensorflow.org/group/build/.
>

Re: TensorFlow, PyTorch, and manylinux1

Posted by Jason Zaman <ja...@perfinion.com>.

Hm, lets have this SIG-Build meeting as scheduled and then have
another follow-up later probably around 9am PST, 6pm Europe, 1am
Singapore. Does that time work for everyone? (Date TBD).

My take on this whole thing is that it sounds a lot like
re-implementing an entire distro complete with package manager inside
pip just because pip is not sufficient for what we need. My longer
term goal is to fix things up so TensorFlow can just be packaged
directly in distro package repos and most users would go that route.
This would definitely not be a universal solution and we'd still need
to have a pip package anyway. I think we should leave CUDA out of the
discussion initially and see if we can get the cpu-only wheel working
correctly. Hopefully cpu-only is viable on manylinux2014 then we can
tackle CUDA afterwards.



On Tue, 5 Feb 2019 at 00:36, Uwe L. Korn <xh...@gmail.com> wrote:
>
> I think that problem is whether this would get merged. Conda was created after a meeting with Guido van Rossum and other folks at a PyCon quite some years ago where the final call was that this is not a problem of the core Python ecosystem and that the scientific Python community has to roll their own solution.
>
> @Wes McKinney or someone else: Were you at this meeting and can outline why it was declined back then?
>
> Uwe
>
> Am Mo., 4. Feb. 2019 um 17:33 Uhr schrieb Manuel Klimek <kl...@google.com>:
>>
>> On Mon, Feb 4, 2019 at 5:32 PM Uwe L. Korn <xh...@gmail.com> wrote:
>>>
>>> Just as a heads-up: I would like to also join the meeting but am also located in Europe.
>>>
>>> I have spent quite some time with the packaging of wheels for pyarrow and turbodbc thus I would like to also give input on this. For Apache Arrow, I see newer manylinux2014 standard as a possible way to go. I'm not so fond of rolloing lib(std)c++ packages inside of pip. It's sadly the case that the features of pip don't allow a good dependency resolution, also with taking CUDA into account, a dependency resolution that differs between source and binary builds of a package. For this case, exactly conda was developed because it was considered out-of-scope for the core Python packaging system. I'm not sure whether we actually can fit all the requirements of the packages that take part in this mail thread into pip without simply reimplementing conda inside of pip.
>>
>>
>> One question is probably: what would that entail, and why would it be bad? :)
>>
>>>
>>>
>>> Uwe
>>>
>>> Am Mo., 4. Feb. 2019 um 16:34 Uhr schrieb Jason Zaman <ja...@perfinion.com>:
>>>>
>>>> yeah that's expected. The timing is complicated with people spread all
>>>> over. We will post notes after the meeting on the SIG-Build mailing
>>>> list and I'd also be up for organizing a separate call with europe
>>>> folks if that would be of interest.
>>>>
>>>> On Mon, 4 Feb 2019 at 19:29, 'Manuel Klimek' via SIG Build
>>>> <bu...@tensorflow.org> wrote:
>>>> >
>>>> > +Dmitri Gribenko
>>>> >
>>>> > Dmitri has experience with EasyBuild, which seems to be used by the HPC community to solve the bootstrap problem and could be used to build a toolchain image & pip package.
>>>> >
>>>> > Unfortunately we'll not be able to join the meeting as it's at midnight CEST - looking forward to the conclusions from the meeting!
>>>> >
>>>> > On Mon, Feb 4, 2019 at 6:00 AM Jason Zaman <ja...@perfinion.com> wrote:
>>>> >>
>>>> >> Hey all,
>>>> >>
>>>> >> We're having the TensorFlow SIG-Build meeting on 5th Feb 3pm PST
>>>> >> (https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190205T15&p1=224).
>>>> >> Agenda is linked from:
>>>> >> https://groups.google.com/a/tensorflow.org/forum/#!topic/build/akyPcGoBIy4
>>>> >>
>>>> >> I'd like to invite everyone from this thread to join the call if at
>>>> >> all possible. The agenda for this meeting will spend most of the time
>>>> >> focusing on the manylinux issue and hopefully we can get together to
>>>> >> flesh out a decent plan on how to tackle this.
>>>> >>
>>>> >> Thanks,
>>>> >> Jason
>>>> >>
>>>> >> On Wed, 30 Jan 2019 at 23:34, 'Manuel Klimek' via SIG Build
>>>> >> <bu...@tensorflow.org> wrote:
>>>> >> >
>>>> >> > On Wed, Jan 30, 2019 at 4:21 PM Antoine Pitrou <an...@python.org> wrote:
>>>> >> >>
>>>> >> >>
>>>> >> >> Le 30/01/2019 à 16:09, Manuel Klimek a écrit :
>>>> >> >> >
>>>> >> >> > On Wed, Jan 30, 2019 at 3:51 PM Antoine Pitrou <antoine@python.org
>>>> >> >> > <ma...@python.org>> wrote:
>>>> >> >> >
>>>> >> >> >
>>>> >> >> >     Le 30/01/2019 à 15:35, Manuel Klimek a écrit :
>>>> >> >> >     >
>>>> >> >> >     >     Am I reading you wrong, or are you actually proposing to
>>>> >> >> >     package another
>>>> >> >> >     >     libstdc++ version as a Python wheel?
>>>> >> >> >     >
>>>> >> >> >     >
>>>> >> >> >     > That would be the idea.
>>>> >> >> >     >
>>>> >> >> >     >
>>>> >> >> >     >     If so, are you going to claim that the given wheel is
>>>> >> >> >     >     manylinux-compatible?
>>>> >> >> >     >
>>>> >> >> >     >
>>>> >> >> >     > That is my question :) Why wouldn't it be? (I'd link it against
>>>> >> >> >     > manylinux libc and other C-only system libs)
>>>> >> >> >
>>>> >> >> >     The problem is when you are loading two modules that link against
>>>> >> >> >     different libstdc++ versions in the same process.  Incidentally, it's
>>>> >> >> >     the problem which prompted this discussion.
>>>> >> >> >
>>>> >> >> >
>>>> >> >> > Sure, I'm aware :) I think as long as the requirement that all libraries
>>>> >> >> > that want to exchange runtime-ABI-compatible versions are compiled with
>>>> >> >> > the same toolchain, we can provide a way to mangle the symbols
>>>> >> >> > differently.
>>>> >> >>
>>>> >> >> Ah, I see... Indeed, mangling the symbols may work for this.
>>>> >> >>
>>>> >> >> That said, if you're looking to create a de facto standard, why can't it
>>>> >> >> be proposed as a manylinux iteration?
>>>> >> >
>>>> >> >
>>>> >> > I'd have thought because it doesn't change the system requirements, while manylinux seems to be all about system requirements.
>>>> >> > The idea is that that toolchain would still work on any manylinux compatible machine.
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> >>
>>>> >> >>
>>>> >> >> Regards
>>>> >> >>
>>>> >> >> Antoine.
>>>> >> >
>>>> >> > --
>>>> >> > You received this message because you are subscribed to the Google Groups "SIG Build" group.
>>>> >> > To unsubscribe from this group and stop receiving emails from it, send an email to build+unsubscribe@tensorflow.org.
>>>> >> > Visit this group at https://groups.google.com/a/tensorflow.org/group/build/.
>>>> >
>>>> > --
>>>> > You received this message because you are subscribed to the Google Groups "SIG Build" group.
>>>> > To unsubscribe from this group and stop receiving emails from it, send an email to build+unsubscribe@tensorflow.org.
>>>> > Visit this group at https://groups.google.com/a/tensorflow.org/group/build/.

Re: TensorFlow, PyTorch, and manylinux1

Posted by Jason Zaman <ja...@perfinion.com>.

yeah that's expected. The timing is complicated with people spread all
over. We will post notes after the meeting on the SIG-Build mailing
list and I'd also be up for organizing a separate call with europe
folks if that would be of interest.

On Mon, 4 Feb 2019 at 19:29, 'Manuel Klimek' via SIG Build
<bu...@tensorflow.org> wrote:
>
> +Dmitri Gribenko
>
> Dmitri has experience with EasyBuild, which seems to be used by the HPC community to solve the bootstrap problem and could be used to build a toolchain image & pip package.
>
> Unfortunately we'll not be able to join the meeting as it's at midnight CEST - looking forward to the conclusions from the meeting!
>
> On Mon, Feb 4, 2019 at 6:00 AM Jason Zaman <ja...@perfinion.com> wrote:
>>
>> Hey all,
>>
>> We're having the TensorFlow SIG-Build meeting on 5th Feb 3pm PST
>> (https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190205T15&p1=224).
>> Agenda is linked from:
>> https://groups.google.com/a/tensorflow.org/forum/#!topic/build/akyPcGoBIy4
>>
>> I'd like to invite everyone from this thread to join the call if at
>> all possible. The agenda for this meeting will spend most of the time
>> focusing on the manylinux issue and hopefully we can get together to
>> flesh out a decent plan on how to tackle this.
>>
>> Thanks,
>> Jason
>>
>> On Wed, 30 Jan 2019 at 23:34, 'Manuel Klimek' via SIG Build
>> <bu...@tensorflow.org> wrote:
>> >
>> > On Wed, Jan 30, 2019 at 4:21 PM Antoine Pitrou <an...@python.org> wrote:
>> >>
>> >>
>> >> Le 30/01/2019 à 16:09, Manuel Klimek a écrit :
>> >> >
>> >> > On Wed, Jan 30, 2019 at 3:51 PM Antoine Pitrou <antoine@python.org
>> >> > <ma...@python.org>> wrote:
>> >> >
>> >> >
>> >> >     Le 30/01/2019 à 15:35, Manuel Klimek a écrit :
>> >> >     >
>> >> >     >     Am I reading you wrong, or are you actually proposing to
>> >> >     package another
>> >> >     >     libstdc++ version as a Python wheel?
>> >> >     >
>> >> >     >
>> >> >     > That would be the idea.
>> >> >     >
>> >> >     >
>> >> >     >     If so, are you going to claim that the given wheel is
>> >> >     >     manylinux-compatible?
>> >> >     >
>> >> >     >
>> >> >     > That is my question :) Why wouldn't it be? (I'd link it against
>> >> >     > manylinux libc and other C-only system libs)
>> >> >
>> >> >     The problem is when you are loading two modules that link against
>> >> >     different libstdc++ versions in the same process.  Incidentally, it's
>> >> >     the problem which prompted this discussion.
>> >> >
>> >> >
>> >> > Sure, I'm aware :) I think as long as the requirement that all libraries
>> >> > that want to exchange runtime-ABI-compatible versions are compiled with
>> >> > the same toolchain, we can provide a way to mangle the symbols
>> >> > differently.
>> >>
>> >> Ah, I see... Indeed, mangling the symbols may work for this.
>> >>
>> >> That said, if you're looking to create a de facto standard, why can't it
>> >> be proposed as a manylinux iteration?
>> >
>> >
>> > I'd have thought because it doesn't change the system requirements, while manylinux seems to be all about system requirements.
>> > The idea is that that toolchain would still work on any manylinux compatible machine.
>> >
>> >
>> >
>> >>
>> >>
>> >> Regards
>> >>
>> >> Antoine.
>> >
>> > --
>> > You received this message because you are subscribed to the Google Groups "SIG Build" group.
>> > To unsubscribe from this group and stop receiving emails from it, send an email to build+unsubscribe@tensorflow.org.
>> > Visit this group at https://groups.google.com/a/tensorflow.org/group/build/.
>
> --
> You received this message because you are subscribed to the Google Groups "SIG Build" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to build+unsubscribe@tensorflow.org.
> Visit this group at https://groups.google.com/a/tensorflow.org/group/build/.

Re: TensorFlow, PyTorch, and manylinux1

Posted by "Uwe L. Korn" <xh...@gmail.com>.

>
> From the requirements side (Martin will correct me if I'm getting these
> wrong):
> - it seems like from the TF point of view, our users are on pip, so we
> need to deliver there
> - LLVM is going to require C++14 ~in March as far as I can tell
> - from trying to find info about manylinux2010 / 14, it seems like these
> have stalled? (but I'm happy to be proven wrong here :)
>

Can we start a shared Google Doc to collect all the requirements and
constraints?


>
>
>>
>> Uwe
>>
>> Am Di., 5. Feb. 2019 um 12:19 Uhr schrieb Dmitri Gribenko <
>> dmitrig@google.com>:
>>
>>> On Mon, Feb 4, 2019 at 12:29 PM Manuel Klimek <kl...@google.com> wrote:
>>>
>>>> +Dmitri Gribenko <dm...@google.com>
>>>>
>>>
>>> Thanks for looping me in, Manuel.
>>>
>>> So I wanted to go back to the requirements and enumerate possible
>>> solutions.
>>>
>>> From soumith's email:
>>> 1. CUDA support
>>> 2. C++11 support
>>>
>>> Neither newest CUDA, nor C++11 work on manylinux1 (CentOS 5.11).
>>>
>>> The original email does not go into detail why CUDA does not work, but I
>>> can imagine it is because of the old userspace libraries (libc, libstdc++,
>>> libpthread etc).  C++11 does not work because of an old libstdc++ and old
>>> GCC.
>>>
>>> *So what can we do about old userspace libraries?*
>>>
>>> *Option "Userspace-1": Pip package uses libraries installed on the
>>> system where the pip package runs.*  (AKA the current manylinux
>>> approach.)
>>>
>>> Advantages:
>>> - Smaller download size.
>>>
>>> Disadvantages:
>>> - Pip packages have to be built against an old version of userspace
>>> libraries to be maximally-compatible.
>>> - No nice upgrade path.  When we need a specific new feature for
>>> something (e.g., today it is modern CUDA and C++11), we have to bump the
>>> requirements for the host system.  We will always be extremely cautious
>>> about not bumping the requirements too much, and therefore we will be
>>> always stuck with oldest possible libraries that can do the job.
>>>
>>> *Option "Userspace-2": When the pip package runs, ignore the system
>>> userspace libraries.  Use libraries from somewhere else.*
>>>
>>> Advantages:
>>> - We control which versions of userspace libraries we use.  We can use
>>> libraries that are newer than system ones.
>>> - Complete isolation from the userspace of the system where the pip
>>> package runs.  The only remaining point of contact with the user's system
>>> is the kernel.
>>>
>>> Disadvantages:
>>> - We need to figure out where to get these libraries from.
>>> - Bigger download size for users.
>>>
>>> So where do we get the userspace libraries from?
>>>
>>> *Option "Userspace-2a": Pip community owns all userspace libraries that
>>> binaries in a pip package can use.*
>>> All userspace components defined by manylinux are packaged into a pip
>>> package.  TensorFlow/PyTorch/... pip packages declare what version of the
>>> userspace pip package they depend on.
>>>
>>> Advantages:
>>> - Pip community owns all userspace components.
>>>
>>> Disadvantages:
>>> - Pip community owns way more stuff than before.
>>>
>>> *Option "Userpace-2b": Pip takes all userspace libraries from an
>>> existing packager.*
>>> Same as "Userspace-2a", but instead of owning the build process for the
>>> userspace libraries, we take them from an existing packager, for example,
>>> Debian, CentOS, Anaconda, Nix package manager, whatever we decide on.
>>>
>>> Advantages:
>>> - Pip community controls userspace components.
>>>
>>> Disadvantages:
>>> - Pip community owns more stuff than before.
>>>
>>> *What can we do about old toolchain?*
>>>
>>> *Option "Toolchain-1": Use a toolchain from a certain old distribution,
>>> so that the output is maximally-compatible.*
>>> This option is compatible with any choice of userspace, as long as the
>>> libraries don't require a new compiler or language features.
>>>
>>> Disadvantages:
>>> - Ancient toolchain that does not support modern C++.
>>>
>>> *Option "Toolchain-2": Make a modern toolchain that produces
>>> maximally-compatible output.*
>>> This option is difficult to implement, since a modern toolchain using a
>>> modern C++ version will require a using a contemporary C++ standard library
>>> (libc++ or libstdc++).
>>>
>>> *Option "Toolchain-3": Make a modern toolchain that requires a modern
>>> C++ library.*
>>> AKA what Manuel is proposing.  Package modern libc++ as a wheel, make a
>>> Docker container with the corresponding Clang for building binary packages
>>> like Tensorflow.
>>>
>>> Thoughts?
>>>
>>> Dmitri
>>>
>>> --
>> You received this message because you are subscribed to the Google Groups
>> "SIG Build" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to build+unsubscribe@tensorflow.org.
>> Visit this group at
>> https://groups.google.com/a/tensorflow.org/group/build/.
>>
>

Re: TensorFlow, PyTorch, and manylinux1

Posted by Jason Zaman <ja...@perfinion.com>.

Hey all,

We're having the TensorFlow SIG-Build meeting on 5th Feb 3pm PST
(https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190205T15&p1=224).
Agenda is linked from:
https://groups.google.com/a/tensorflow.org/forum/#!topic/build/akyPcGoBIy4

I'd like to invite everyone from this thread to join the call if at
all possible. The agenda for this meeting will spend most of the time
focusing on the manylinux issue and hopefully we can get together to
flesh out a decent plan on how to tackle this.

Thanks,
Jason

On Wed, 30 Jan 2019 at 23:34, 'Manuel Klimek' via SIG Build
<bu...@tensorflow.org> wrote:
>
> On Wed, Jan 30, 2019 at 4:21 PM Antoine Pitrou <an...@python.org> wrote:
>>
>>
>> Le 30/01/2019 à 16:09, Manuel Klimek a écrit :
>> >
>> > On Wed, Jan 30, 2019 at 3:51 PM Antoine Pitrou <antoine@python.org
>> > <ma...@python.org>> wrote:
>> >
>> >
>> >     Le 30/01/2019 à 15:35, Manuel Klimek a écrit :
>> >     >
>> >     >     Am I reading you wrong, or are you actually proposing to
>> >     package another
>> >     >     libstdc++ version as a Python wheel?
>> >     >
>> >     >
>> >     > That would be the idea.
>> >     >
>> >     >
>> >     >     If so, are you going to claim that the given wheel is
>> >     >     manylinux-compatible?
>> >     >
>> >     >
>> >     > That is my question :) Why wouldn't it be? (I'd link it against
>> >     > manylinux libc and other C-only system libs)
>> >
>> >     The problem is when you are loading two modules that link against
>> >     different libstdc++ versions in the same process.  Incidentally, it's
>> >     the problem which prompted this discussion.
>> >
>> >
>> > Sure, I'm aware :) I think as long as the requirement that all libraries
>> > that want to exchange runtime-ABI-compatible versions are compiled with
>> > the same toolchain, we can provide a way to mangle the symbols
>> > differently.
>>
>> Ah, I see... Indeed, mangling the symbols may work for this.
>>
>> That said, if you're looking to create a de facto standard, why can't it
>> be proposed as a manylinux iteration?
>
>
> I'd have thought because it doesn't change the system requirements, while manylinux seems to be all about system requirements.
> The idea is that that toolchain would still work on any manylinux compatible machine.
>
>
>
>>
>>
>> Regards
>>
>> Antoine.
>
> --
> You received this message because you are subscribed to the Google Groups "SIG Build" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to build+unsubscribe@tensorflow.org.
> Visit this group at https://groups.google.com/a/tensorflow.org/group/build/.

Re: TensorFlow, PyTorch, and manylinux1

Posted by Antoine Pitrou <an...@python.org>.

Le 30/01/2019 à 16:09, Manuel Klimek a écrit :
> 
> On Wed, Jan 30, 2019 at 3:51 PM Antoine Pitrou <antoine@python.org
> <ma...@python.org>> wrote:
> 
> 
>     Le 30/01/2019 à 15:35, Manuel Klimek a écrit :
>     >
>     >     Am I reading you wrong, or are you actually proposing to
>     package another
>     >     libstdc++ version as a Python wheel?
>     >
>     >
>     > That would be the idea.
>     >  
>     >
>     >     If so, are you going to claim that the given wheel is
>     >     manylinux-compatible?
>     >
>     >
>     > That is my question :) Why wouldn't it be? (I'd link it against
>     > manylinux libc and other C-only system libs)
> 
>     The problem is when you are loading two modules that link against
>     different libstdc++ versions in the same process.  Incidentally, it's
>     the problem which prompted this discussion.
> 
> 
> Sure, I'm aware :) I think as long as the requirement that all libraries
> that want to exchange runtime-ABI-compatible versions are compiled with
> the same toolchain, we can provide a way to mangle the symbols
> differently.

Ah, I see... Indeed, mangling the symbols may work for this.

That said, if you're looking to create a de facto standard, why can't it
be proposed as a manylinux iteration?

Regards

Antoine.

Re: TensorFlow, PyTorch, and manylinux1

Posted by Antoine Pitrou <an...@python.org>.

Le 30/01/2019 à 15:35, Manuel Klimek a écrit :
> 
>     Am I reading you wrong, or are you actually proposing to package another
>     libstdc++ version as a Python wheel?
> 
> 
> That would be the idea.
>  
> 
>     If so, are you going to claim that the given wheel is
>     manylinux-compatible?
> 
> 
> That is my question :) Why wouldn't it be? (I'd link it against
> manylinux libc and other C-only system libs)

The problem is when you are loading two modules that link against
different libstdc++ versions in the same process.  Incidentally, it's
the problem which prompted this discussion.

Regards

Antoine.

Re: TensorFlow, PyTorch, and manylinux1

Posted by Antoine Pitrou <an...@python.org>.

Le 30/01/2019 à 14:30, Manuel Klimek a écrit :
>     >
>     > What would the requirements for such a toolchain wheel be for it
>     to have a chance to be widely used? (note that I come from a C++
>     background and don't have a lot of experience with Python outside of
>     modules using C++ under the hood :)
> 
>     In principle I would think that the requirement would be that we
>     demonstrate that wheels built with the newer compiler toolchain and
>     libstdc++ dependency can coexist with manylinux1 / manylinux2010
>     packages. This is supposed to be the promise of devtoolset-produced
>     libraries anyhow. A potential problem might be projects that need to
>     pass std::* objects between shared libraries in their C++ API. For
>     example, the "turbodbc" package uses the "pyarrow" package's C++ API.
>     This would just mean that any wheel that needs to depend on a wheel in
>     the "TF/PyTorch-compatible toolchain" ecosystem would necessarily need
>     to use the alternative build toolchain instead of manylinux*
> 
> Fundamentally, the C++ dependency chain seems to be solvable with pip
> package deps down to the libstdc++/libc++ level.
> I think we'd basically need to provide:
> a) a toolchain pip package to depend on
> b) a manylinux docker image with those libraries and a compiler
> toolchain targeting them installed so packagers have an easy way to
> build these packages

Am I reading you wrong, or are you actually proposing to package another
libstdc++ version as a Python wheel?

If so, are you going to claim that the given wheel is manylinux-compatible?

Regards

Antoine.

Re: TensorFlow, PyTorch, and manylinux1

Posted by Wes McKinney <we...@gmail.com>.

hi Manuel,

Adding a couple more folks from Apache Arrow to the thread to make
sure they see this discussion.

On Tue, Jan 22, 2019 at 3:48 AM Manuel Klimek <kl...@google.com> wrote:
>
> Sorry if I'm missing something fundamental, but it seems like a new manylinux standard would come with the same problem of basically being static and growing outdated.
>
> I'd be interested in helping to provide a toolchain wheel, as mentioned in the initial post, at least for libc++ (potentially libstdc++) - it seems like that could be updated on an ongoing basis, use standard dependency management and if necessary be bootstrapped with a statically linked compiler.
>
> What would the requirements for such a toolchain wheel be for it to have a chance to be widely used? (note that I come from a C++ background and don't have a lot of experience with Python outside of modules using C++ under the hood :)

In principle I would think that the requirement would be that we
demonstrate that wheels built with the newer compiler toolchain and
libstdc++ dependency can coexist with manylinux1 / manylinux2010
packages. This is supposed to be the promise of devtoolset-produced
libraries anyhow. A potential problem might be projects that need to
pass std::* objects between shared libraries in their C++ API. For
example, the "turbodbc" package uses the "pyarrow" package's C++ API.
This would just mean that any wheel that needs to depend on a wheel in
the "TF/PyTorch-compatible toolchain" ecosystem would necessarily need
to use the alternative build toolchain instead of manylinux*

If I'm reading the room right, it seems that manylinux2010 is
effectively DOA for TensorFlow and PyTorch, is that right? If that's
the case then we shouldn't spend another year or more wringing our
hands in hopes that the PyPA solves the problem in that way that we
need. We've got to get busy shipping software and move on with our
lives

- Wes

>
> Similarly, what would the downsides of such a toolchain wheel be?
>
> On Wednesday, December 19, 2018 at 5:07:49 AM UTC+1, Jay Furmanek wrote:
>>
>> Hi Martin,
>>
>> If the goal here is to propose a new manylinux standard, I'd love to be involved as well. Currently the existing standards excludes alternative (non-Intel) CPU architectures and specify certain levels that predate the existence of ppc64le and arm64 as Linux architectures. I could lend some insight to make the proposal a little more acceptable to those arches.
>>
>>
>> Jason M. Furmanek
>> Power Systems and Open Power Innovation and Solutions
>> IBM Systems & Technology Group
>> Mobile: 1-512-638-9692
>> E-mail: furm...@us.ibm.com
>>
>>
>>
>> ----- Original message -----
>> From: "'Martin Wicke' via TensorFlow Developers" <de...@tensorflow.org>
>> To: soumith <so...@gmail.com>
>> Cc: Jean-Marc Ludwig <JL...@nvidia.com>, bu...@tensorflow.org, Wes McKinney <we...@gmail.com>, d...@arrow.apache.org, Philipp Moritz <pc...@gmail.com>, TensorFlow Developers <de...@tensorflow.org>, ray...@googlegroups.com, yi...@yifeifeng.com, Edd Wilder-James <e....@google.com>
>> Subject: Re: TensorFlow, PyTorch, and manylinux1
>> Date: Mon, Dec 17, 2018 5:49 PM
>>
>> I have created a fork of tensorflow/community and added a file:
>> https://github.com/martinwicke/community/blob/master/sigs/build/manylinux-proposal.md
>>
>> It's presently empty.
>>
>> I've invited Soumith, Wes, and Philipp to collaborate on the repo, let's work on this there? If anybody else wants to join, just let me know.
>>
>> On Mon, Dec 17, 2018 at 1:55 PM soumith <so...@gmail.com> wrote:
>>
>> > The group on this thread is a good start, maybe we can get together and make a proposal that meets the need of the scientific computing community? I think that would probably involve updating the minimum requirements (possibly to CentOS 7, I heard there was talk of a manylinux2014), carving out NVIDIA libraries, and creating a smoother path for updating these requirements (maybe a manylinux-rolling, which automatically updates maximum versions based on age or support status without requiring new PEPs).
>>
>> Martin, this sounds great. I'm really looking forward to the day where pytorch package binary sizes aren't heavily bloated because we have to ship with all of the CUDA / CuDNN / NCCL bits.
>>
>> Is there a github issue or a private google doc that we can collaborate on, to clear our thoughts and requirements into a proposal? We can propose a manylinux2014 (or realize that manylinux2010 is somehow sufficient), as well as push NVIDIA to address the distribution situation of the CUDA stack.
>>
>> --
>> S
>>
>> On Mon, Dec 17, 2018 at 12:31 PM Martin Wicke <wi...@google.com> wrote:
>>
>> Thank you Philipp for getting this started. We've been trying to get in touch and have tried via Nick Coghlan and Nathaniel Smith, but we never got far.
>>
>> I'm a little late to the party, but basically, what Soumith said. We have the exact same constraints (C++-11, CUDA/cuDNN). These would be extremely common for any computation-heavy packages, and properly solving this issue would be a huge boon for the Python community.
>>
>> Actual compliance with manylinux1 is out since it cannot fulfill those constraints. I'll also add that there is no way to build compliant wheels without using software beyond end-of-life (even beyond security updates).
>>
>> manylinux2010 is indeed promising, and I saw that Nick merged support for it recently, though I don't think there has been a pip release including the support yet (maybe that has now changed?).
>>
>> However, manylinux2010 still has (possible fatal) problems:
>>
>> - CUDA10's minimum versions are higher than manylinux2010's maximum versions: specifically, GCC 4.4.7 > 4.3.0.
>>
>> - NVIDIA's license terms for CUDA/cuDNN are not standard and redistribution can be problematic, and may depend on agreements you may have with NVIDIA. The libraries are also large, and including them would make distribution via pypi problematic. It would be much preferable if there was an approved way to distribute Python packages depending on external CUDA/cuDNN. I don't think this should be a problem, it is similar in spirit to the exception made for libGL.
>>
>> I've added JM Ludwig to this thread, I think as was mentioned by someone else, having NVIDIA in the conversation is critical.
>>
>> The group on this thread is a good start, maybe we can get together and make a proposal that meets the need of the scientific computing community? I think that would probably involve updating the minimum requirements (possibly to CentOS 7, I heard there was talk of a manylinux2014), carving out NVIDIA libraries, and creating a smoother path for updating these requirements (maybe a manylinux-rolling, which automatically updates maximum versions based on age or support status without requiring new PEPs).
>>
>> I'm very interested in solving this problem, I feel bad for abusing the manylinux1 tag.
>>
>> Martin
>>
>> On Sun, Dec 16, 2018 at 10:32 PM soumith <so...@gmail.com> wrote:
>>
>> I'm reposting my original reply below the current reply (below a dotted line). It was filtered out because I wasn't subscribed to the relevant mailing lists.
>>
>>  tl;dr: manylinux2010 looks pretty promising, because CUDA supports CentOS6 (for now).
>>
>> In the meanwhile, I dug into what pyarrow does, and it looks like it links with `static-libstdc++` along with a linker version script [1].
>>
>> PyTorch did exactly that until Jan this year [2], except that our linker version script didn't cover the subtleties of statically linking stdc++ as well as Arrow did. Because we weren't covering all of the stdc++ static linking subtleties, we were facing huge issues that amplified wheel incompatibility (import X; import torch crashing under various X). Hence, we moved since then to linking with system-shipped libstdc++, doing no static stdc++ linking.
>>
>> I'll revisit this in light of manylinux2010, and go down the path of static linkage of stdc++ again, though I'm wary of the subtleties around handling of weak symbols, std::string destruction across library boundaries [3] and std::string's ABI incompatibility issues.
>>
>> I've opened a tracking issue here: https://github.com/pytorch/pytorch/issues/15294
>>
>> I'm looking forward to hearing from the TensorFlow devs if manylinux2010 is sufficient for them, or what additional constraints they have.
>>
>> As a personal thought, I find multiple libraries in the same process statically linking to stdc++ gross, but without a package manager like Anaconda that actually is willing to deal with the C++-side dependencies, there aren't many options on the table.
>>
>> References:
>>
>> [1] https://github.com/apache/arrow/blob/master/cpp/src/arrow/symbols.map
>> [2] https://github.com/pytorch/pytorch/blob/v0.3.1/tools/pytorch.version
>> [3] https://github.com/pytorch/pytorch/issues/5400#issuecomment-369428125
>> ............................................................................................................................................................
>> Hi Philipp,
>>
>> Thanks a lot for getting a discussion started. I've sunk ~100+ hours over the last 2 years making PyTorch wheels play well with OpenCV, TensorFlow and other wheels, that I'm glad to see this discussion started.
>>
>>
>> On the PyTorch wheels, we have been shipping with the minimum glibc and libstdc++ versions we can possibly work with, while keeping two hard constraints:
>>
>> 1. CUDA support
>> 2. C++11 support
>>
>>
>> 1. CUDA support
>>
>> manylinux1 is not an option, considering CUDA doesn't work out of CentOS5. I explored this option [1] to no success.
>>
>> manylinux2010 is an option at the moment wrt CUDA, but it's unclear when NVIDIA will lift support for CentOS6 under us.
>> Additionally, CuDNN 7.0 (if I remember) was compiled against Ubuntu 12.04 (meaning the glibc version is newer than CentOS6), and binaries linked against CuDNN refused to run on CentOS6. I requested that this constraint be lifted, and the next dot release fixed it.
>>
>> The reason PyTorch binaries are not manylinux2010 compatible at the moment is because of the next constraint: C++11.
>>
>> 2. C++11
>>
>> We picked C++11 as the minimum supported dialect for PyTorch, primarily to serve the default compilers of older machines, i.e. Ubuntu 14.04 and CentOS7. The newer options were C++14 / C++17, but we decided to polyfill what we needed to support older distros better.
>>
>> A fully fleshed out C++11 implementation landed in gcc in various stages, with gradual ABI changes [2]. Unfortunately, the libstdc++ that ships with centos6 (and hence manylinx2010) isn't sufficient to cover all of C++11. For example, the binaries we built with devtoolset3 (gcc 4.9.2) on CentOS6 didn't run with the default libstdc++ on CentOS6 either due to ABI changes or minimum GLIBCXX version for some of the symbols being unavailable.
>>
>> We tried our best to support our binaries running on CentOS6 and above with various ranges of static linking hacks until 0.3.1 (January 2018), but at some point hacks over hacks was only getting more fragile. Hence we moved to a CentOS7-based image in April 2018 [3], and relied only on dynamic linking to the system-shipped libstdc++.
>>
>> As Wes mentions [4], an option is to host a modern C++ standard library via PyPI would put manylinux2010 on the table. There are however subtle consequences with this -- if this package gets installed into a conda environment, it'll clobber anaconda-shipped libstdc++, possibly corrupting environments for thousands of anaconda users (this is actually similar to the issues with `mkl` shipped via PyPI and Conda clobbering each other).
>>
>>
>> References:
>>
>> [1] https://github.com/NVIDIA/nvidia-docker/issues/348
>> [2] https://gcc.gnu.org/wiki/Cxx11AbiCompatibility
>> [3] https://github.com/pytorch/builder/commit/44d9bfa607a7616c66fe6492fadd8f05f3578b93
>> [4] https://github.com/apache/arrow/pull/3177#issuecomment-447515982
>> ..............................................................................................................................................................................................
>>
>> On Sun, Dec 16, 2018 at 2:57 PM Wes McKinney <we...@gmail.com> wrote:
>>
>> Reposting since I wasn't subscribed to devel...@tensorflow.org. I
>> also didn't see Soumith's response since it didn't come through to
>> d...@arrow.apache.org
>>
>> In response to the non-conforming ABI in the TF and PyTorch wheels, we
>> have attempted to hack around the issue with some elaborate
>> workarounds [1] [2] that have ultimately proved to not work
>> universally. The bottom line is that this is burdening other projects
>> in the Python ecosystem and causing confusing application crashes.
>>
>> First, to state what should hopefully obvious to many of you, Python
>> wheels are not a robust way to deploy complex C++ projects, even
>> setting aside the compiler toolchain issue. If a project has
>> non-trivial third party dependencies, you either have to statically
>> link them or bundle shared libraries with the wheel (we do a bit of
>> both in Apache Arrow). Neither solution is foolproof in all cases.
>> There are other downsides to wheels when it comes to numerical
>> computing -- it is difficult to utilize things like the Intel MKL
>> which may be used by multiple projects. If two projects have the same
>> third party C++ dependency (e.g. let's use gRPC or libprotobuf as a
>> straw man example), it's hard to guarantee that versions or ABI will
>> not conflict with each other.
>>
>> In packaging with conda, we pin all dependencies when building
>> projects that depend on them, then package and deploy the dependencies
>> as separate shared libraries instead of bundling. To resolve the need
>> for newer compilers or newer C++ standard library, libstdc++.so and
>> other system shared libraries are packaged and installed as
>> dependencies. In manylinux1, the RedHat devtoolset compiler toolchain
>> is used as it performs selective static linking of symbols to enable
>> C++11 libraries to be deployed on older Linuxes like RHEL5/6. A conda
>> environment functions as sort of portable miniature Linux
>> distribution.
>>
>> Given the current state of things, as using the TensorFlow and PyTorch
>> wheels in the same process as other conforming manylinux1 wheels is
>> unsafe, it's hard to see how one can continue to recommend pip as a
>> preferred installation path until the ABI problems are resolved. For
>> example, "pip" is what is recommended for installing TensorFlow on
>> Linux [3]. It's unclear that non-compliant wheels should be allowed in
>> the package manager at all (I'm aware that this was deemed to not be
>> the responsibility of PyPI to verify policy compliance [4]).
>>
>> A couple possible paths forward (there may be others):
>>
>> * Collaborate with the Python packaging authority to evolve the
>> manylinux ABI to be able to produce compliant wheels that support the
>> build and deployment requirements of these projects
>> * Create a new ABI tag for CUDA/C++11-enabled Python wheels so that
>> projects can ship packages that can be guaranteed to work properly
>> with TF/PyTorch. This might require vendoring libstdc++ in some kind
>> of "toolchain" wheel that projects using this new ABI can depend on
>>
>> Note that these toolchain and deployment issues are absent when
>> building and deploying with conda packages, since build- and run-time
>> dependencies can be pinned and shared across all the projects that
>> depend on them, ensuring ABI cross-compatibility. It's great to have
>> the convenience of "pip install $PROJECT", but I believe that these
>> projects have outgrown the intended use for pip and wheel
>> distributions.
>>
>> Until the ABI incompatibilities are resolved, I would encourage more
>> prominent user documentation about the non-portability and potential
>> for crashes with these Linux wheels.
>>
>> Thanks,
>> Wes
>>
>> [1]: https://github.com/apache/arrow/commit/537e7f7fd503dd920c0b9f0cef8a2de86bc69e3b
>> [2]: https://github.com/apache/arrow/commit/e7aaf7bf3d3e326b5fe58d20f8fc45b5cec01cac
>> [3]: https://www.tensorflow.org/install/
>> [4]: https://www.python.org/dev/peps/pep-0513/#id50
>> On Sat, Dec 15, 2018 at 11:25 PM Robert Nishihara
>> <ro...@gmail.com> wrote:
>> >
>> > On Sat, Dec 15, 2018 at 8:43 PM Philipp Moritz <pc...@gmail.com> wrote:
>> >
>> > > Dear all,
>> > >
>> > > As some of you know, there is a standard in Python called manylinux (
>> > > https://www.python.org/dev/peps/pep-0513/) to package binary executables
>> > > and libraries into a “wheel” in a way that allows the code to be run on a
>> > > wide variety of Linux distributions. This is very convenient for Python
>> > > users, since such libraries can be easily installed via pip.
>> > >
>> > > This standard is also important for a second reason: If many different
>> > > wheels are used together in a single Python process, adhering to manylinux
>> > > ensures that these libraries work together well and don’t trip on each
>> > > other’s toes (this could easily happen if different versions of libstdc++
>> > > are used for example). Therefore *even if support for only a single
>> > > distribution like Ubuntu is desired*, it is important to be manylinux
>> > > compatible to make sure everybody’s wheels work together well.
>> > >
>> > > TensorFlow and PyTorch unfortunately don’t produce manylinux compatible
>> > > wheels. The challenge is due, at least in part, to the need to use
>> > > nvidia-docker to build GPU binaries [10]. This causes various levels of
>> > > pain for the rest of the Python community, see for example [1] [2] [3] [4]
>> > > [5] [6] [7] [8].
>> > >
>> > > The purpose of the e-mail is to get a discussion started on how we can
>> > > make TensorFlow and PyTorch manylinux compliant. There is a new standard in
>> > > the works [9] so hopefully we can discuss what would be necessary to make
>> > > sure TensorFlow and PyTorch can adhere to this standard in the future.
>> > >
>> > > It would make everybody’s lives just a little bit better! Any ideas are
>> > > appreciated.
>> > >
>> > > @soumith: Could you cc the relevant list? I couldn't find a pytorch dev
>> > > mailing list.
>> > >
>> > > Best,
>> > > Philipp.
>> > >
>> > > [1] https://github.com/tensorflow/tensorflow/issues/5033
>> > > [2] https://github.com/tensorflow/tensorflow/issues/8802
>> > > [3] https://github.com/primitiv/primitiv-python/issues/28
>> > > [4] https://github.com/zarr-developers/numcodecs/issues/70
>> > > [5] https://github.com/apache/arrow/pull/3177
>> > > [6] https://github.com/tensorflow/tensorflow/issues/13615
>> > > [7] https://github.com/pytorch/pytorch/issues/8358
>> > > [8] https://github.com/ray-project/ray/issues/2159
>> > > [9] https://www.python.org/dev/peps/pep-0571/
>> > > [10]
>> > > https://github.com/tensorflow/tensorflow/issues/8802#issuecomment-291935940
>> > >
>> > > --
>> > > You received this message because you are subscribed to the Google Groups
>> > > "ray-dev" group.
>> > > To unsubscribe from this group and stop receiving emails from it, send an
>> > > email to ray-dev+u...@googlegroups.com.
>> > > To post to this group, send email to ray...@googlegroups.com.
>> > > To view this discussion on the web visit
>> > > https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com
>> > > <https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com?utm_medium=email&utm_source=footer>
>> > > .
>> > > For more options, visit https://groups.google.com/d/optout.
>> > >
>>
>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups "TensorFlow Developers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to developers+...@tensorflow.org.
>> Visit this group at https://groups.google.com/a/tensorflow.org/group/developers/.
>> To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/developers/CAGZdauXqQ9Gze6eAB0R3%3D2j6X2yWfh7QPbrGj1%3D5xuvQUninpQ%40mail.gmail.com.
>>
>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups "TensorFlow Developers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to developers+...@tensorflow.org.
>> Visit this group at https://groups.google.com/a/tensorflow.org/group/developers/.
>> To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/developers/CADtzJKMzpDj2SfFRygaxKTgJD3eoKi7kKBUgZExN9cceMN2CyQ%40mail.gmail.com.
>>
>>
>>

Re: TensorFlow, PyTorch, and manylinux1

Posted by soumith <so...@gmail.com>.

> The group on this thread is a good start, maybe we can get together and
make a proposal that meets the need of the scientific computing community?
I think that would probably involve updating the minimum requirements
(possibly to CentOS 7, I heard there was talk of a manylinux2014), carving
out NVIDIA libraries, and creating a smoother path for updating these
requirements (maybe a manylinux-rolling, which automatically updates
maximum versions based on age or support status without requiring new
PEPs).

Martin, this sounds great. I'm really looking forward to the day where
pytorch package binary sizes aren't heavily bloated because we have to ship
with all of the CUDA / CuDNN / NCCL bits.

Is there a github issue or a private google doc that we can collaborate on,
to clear our thoughts and requirements into a proposal? We can propose a
manylinux2014 (or realize that manylinux2010 is somehow sufficient), as
well as push NVIDIA to address the distribution situation of the CUDA stack.

--
S

On Mon, Dec 17, 2018 at 12:31 PM Martin Wicke <wi...@google.com> wrote:

> Thank you Philipp for getting this started. We've been trying to get in
> touch and have tried via Nick Coghlan and Nathaniel Smith, but we never got
> far.
>
> I'm a little late to the party, but basically, what Soumith said. We have
> the exact same constraints (C++-11, CUDA/cuDNN). These would be extremely
> common for any computation-heavy packages, and properly solving this issue
> would be a huge boon for the Python community.
>
> Actual compliance with manylinux1 is out since it cannot fulfill those
> constraints. I'll also add that there is no way to build compliant wheels
> without using software beyond end-of-life (even beyond security updates).
>
> manylinux2010 is indeed promising, and I saw that Nick merged support for
> it recently, though I don't think there has been a pip release including
> the support yet (maybe that has now changed?).
>
> However, manylinux2010 still has (possible fatal) problems:
>
> - CUDA10's minimum versions are higher than manylinux2010's maximum
> versions: specifically, GCC 4.4.7 > 4.3.0.
>
> - NVIDIA's license terms for CUDA/cuDNN are not standard and
> redistribution can be problematic, and may depend on agreements you may
> have with NVIDIA. The libraries are also large, and including them would
> make distribution via pypi problematic. It would be much preferable if
> there was an approved way to distribute Python packages depending on
> external CUDA/cuDNN. I don't think this should be a problem, it is similar
> in spirit to the exception made for libGL.
>
> I've added JM Ludwig to this thread, I think as was mentioned by someone
> else, having NVIDIA in the conversation is critical.
>
> The group on this thread is a good start, maybe we can get together and
> make a proposal that meets the need of the scientific computing community?
> I think that would probably involve updating the minimum requirements
> (possibly to CentOS 7, I heard there was talk of a manylinux2014), carving
> out NVIDIA libraries, and creating a smoother path for updating these
> requirements (maybe a manylinux-rolling, which automatically updates
> maximum versions based on age or support status without requiring new
> PEPs).
>
> I'm very interested in solving this problem, I feel bad for abusing the
> manylinux1 tag.
>
> Martin
>
> On Sun, Dec 16, 2018 at 10:32 PM soumith <so...@gmail.com> wrote:
>
>> I'm reposting my original reply below the current reply (below a dotted
>> line). It was filtered out because I wasn't subscribed to the relevant
>> mailing lists.
>>
>>  tl;dr: manylinux2010 looks pretty promising, because CUDA supports
>> CentOS6 (for now).
>>
>> In the meanwhile, I dug into what pyarrow does, and it looks like it
>> links with `static-libstdc++` along with a linker version script [1].
>>
>> PyTorch did exactly that until Jan this year [2], except that our linker
>> version script didn't cover the subtleties of statically linking stdc++ as
>> well as Arrow did. Because we weren't covering all of the stdc++ static
>> linking subtleties, we were facing huge issues that amplified wheel
>> incompatibility (import X; import torch crashing under various X). Hence,
>> we moved since then to linking with system-shipped libstdc++, doing no
>> static stdc++ linking.
>>
>> I'll revisit this in light of manylinux2010, and go down the path of
>> static linkage of stdc++ again, though I'm wary of the subtleties around
>> handling of weak symbols, std::string destruction across library boundaries
>> [3] and std::string's ABI incompatibility issues.
>>
>> I've opened a tracking issue here:
>> https://github.com/pytorch/pytorch/issues/15294
>>
>> I'm looking forward to hearing from the TensorFlow devs if manylinux2010
>> is sufficient for them, or what additional constraints they have.
>>
>> As a personal thought, I find multiple libraries in the same process
>> statically linking to stdc++ gross, but without a package manager like
>> Anaconda that actually is willing to deal with the C++-side dependencies,
>> there aren't many options on the table.
>>
>> References:
>>
>> [1] https://github.com/apache/arrow/blob/master/cpp/src/arrow/symbols.map
>> [2] https://github.com/pytorch/pytorch/blob/v0.3.1/tools/pytorch.version
>> [3] https://github.com/pytorch/pytorch/issues/5400#issuecomment-369428125
>>
>> ............................................................................................................................................................
>> Hi Philipp,
>>
>> Thanks a lot for getting a discussion started. I've sunk ~100+ hours over
>> the last 2 years making PyTorch wheels play well with OpenCV, TensorFlow
>> and other wheels, that I'm glad to see this discussion started.
>>
>>
>> On the PyTorch wheels, we have been shipping with the minimum glibc and
>> libstdc++ versions we can possibly work with, while keeping two hard
>> constraints:
>>
>> 1. CUDA support
>> 2. C++11 support
>>
>>
>> 1. CUDA support
>>
>> manylinux1 is not an option, considering CUDA doesn't work out of
>> CentOS5. I explored this option [1] to no success.
>>
>> manylinux2010 is an option at the moment wrt CUDA, but it's unclear when
>> NVIDIA will lift support for CentOS6 under us.
>> Additionally, CuDNN 7.0 (if I remember) was compiled against Ubuntu 12.04
>> (meaning the glibc version is newer than CentOS6), and binaries linked
>> against CuDNN refused to run on CentOS6. I requested that this constraint
>> be lifted, and the next dot release fixed it.
>>
>> The reason PyTorch binaries are not manylinux2010 compatible at the
>> moment is because of the next constraint: C++11.
>>
>> 2. C++11
>>
>> We picked C++11 as the minimum supported dialect for PyTorch, primarily
>> to serve the default compilers of older machines, i.e. Ubuntu 14.04 and
>> CentOS7. The newer options were C++14 / C++17, but we decided to polyfill
>> what we needed to support older distros better.
>>
>> A fully fleshed out C++11 implementation landed in gcc in various stages,
>> with gradual ABI changes [2]. Unfortunately, the libstdc++ that ships with
>> centos6 (and hence manylinx2010) isn't sufficient to cover all of C++11.
>> For example, the binaries we built with devtoolset3 (gcc 4.9.2) on CentOS6
>> didn't run with the default libstdc++ on CentOS6 either due to ABI changes
>> or minimum GLIBCXX version for some of the symbols being unavailable.
>>
>> We tried our best to support our binaries running on CentOS6 and above
>> with various ranges of static linking hacks until 0.3.1 (January 2018), but
>> at some point hacks over hacks was only getting more fragile. Hence we
>> moved to a CentOS7-based image in April 2018 [3], and relied only on
>> dynamic linking to the system-shipped libstdc++.
>>
>> As Wes mentions [4], an option is to host a modern C++ standard library
>> via PyPI would put manylinux2010 on the table. There are however subtle
>> consequences with this -- if this package gets installed into a conda
>> environment, it'll clobber anaconda-shipped libstdc++, possibly corrupting
>> environments for thousands of anaconda users (this is actually similar to
>> the issues with `mkl` shipped via PyPI and Conda clobbering each other).
>>
>>
>> References:
>>
>> [1] https://github.com/NVIDIA/nvidia-docker/issues/348
>> [2] https://gcc.gnu.org/wiki/Cxx11AbiCompatibility
>> [3]
>> https://github.com/pytorch/builder/commit/44d9bfa607a7616c66fe6492fadd8f05f3578b93
>> [4] https://github.com/apache/arrow/pull/3177#issuecomment-447515982
>>
>> ..............................................................................................................................................................................................
>>
>> On Sun, Dec 16, 2018 at 2:57 PM Wes McKinney <we...@gmail.com> wrote:
>>
>>> Reposting since I wasn't subscribed to developers@tensorflow.org. I
>>> also didn't see Soumith's response since it didn't come through to
>>> dev@arrow.apache.org
>>>
>>> In response to the non-conforming ABI in the TF and PyTorch wheels, we
>>> have attempted to hack around the issue with some elaborate
>>> workarounds [1] [2] that have ultimately proved to not work
>>> universally. The bottom line is that this is burdening other projects
>>> in the Python ecosystem and causing confusing application crashes.
>>>
>>> First, to state what should hopefully obvious to many of you, Python
>>> wheels are not a robust way to deploy complex C++ projects, even
>>> setting aside the compiler toolchain issue. If a project has
>>> non-trivial third party dependencies, you either have to statically
>>> link them or bundle shared libraries with the wheel (we do a bit of
>>> both in Apache Arrow). Neither solution is foolproof in all cases.
>>> There are other downsides to wheels when it comes to numerical
>>> computing -- it is difficult to utilize things like the Intel MKL
>>> which may be used by multiple projects. If two projects have the same
>>> third party C++ dependency (e.g. let's use gRPC or libprotobuf as a
>>> straw man example), it's hard to guarantee that versions or ABI will
>>> not conflict with each other.
>>>
>>> In packaging with conda, we pin all dependencies when building
>>> projects that depend on them, then package and deploy the dependencies
>>> as separate shared libraries instead of bundling. To resolve the need
>>> for newer compilers or newer C++ standard library, libstdc++.so and
>>> other system shared libraries are packaged and installed as
>>> dependencies. In manylinux1, the RedHat devtoolset compiler toolchain
>>> is used as it performs selective static linking of symbols to enable
>>> C++11 libraries to be deployed on older Linuxes like RHEL5/6. A conda
>>> environment functions as sort of portable miniature Linux
>>> distribution.
>>>
>>> Given the current state of things, as using the TensorFlow and PyTorch
>>> wheels in the same process as other conforming manylinux1 wheels is
>>> unsafe, it's hard to see how one can continue to recommend pip as a
>>> preferred installation path until the ABI problems are resolved. For
>>> example, "pip" is what is recommended for installing TensorFlow on
>>> Linux [3]. It's unclear that non-compliant wheels should be allowed in
>>> the package manager at all (I'm aware that this was deemed to not be
>>> the responsibility of PyPI to verify policy compliance [4]).
>>>
>>> A couple possible paths forward (there may be others):
>>>
>>> * Collaborate with the Python packaging authority to evolve the
>>> manylinux ABI to be able to produce compliant wheels that support the
>>> build and deployment requirements of these projects
>>> * Create a new ABI tag for CUDA/C++11-enabled Python wheels so that
>>> projects can ship packages that can be guaranteed to work properly
>>> with TF/PyTorch. This might require vendoring libstdc++ in some kind
>>> of "toolchain" wheel that projects using this new ABI can depend on
>>>
>>> Note that these toolchain and deployment issues are absent when
>>> building and deploying with conda packages, since build- and run-time
>>> dependencies can be pinned and shared across all the projects that
>>> depend on them, ensuring ABI cross-compatibility. It's great to have
>>> the convenience of "pip install $PROJECT", but I believe that these
>>> projects have outgrown the intended use for pip and wheel
>>> distributions.
>>>
>>> Until the ABI incompatibilities are resolved, I would encourage more
>>> prominent user documentation about the non-portability and potential
>>> for crashes with these Linux wheels.
>>>
>>> Thanks,
>>> Wes
>>>
>>> [1]:
>>> https://github.com/apache/arrow/commit/537e7f7fd503dd920c0b9f0cef8a2de86bc69e3b
>>> [2]:
>>> https://github.com/apache/arrow/commit/e7aaf7bf3d3e326b5fe58d20f8fc45b5cec01cac
>>> [3]: https://www.tensorflow.org/install/
>>> [4]: https://www.python.org/dev/peps/pep-0513/#id50
>>> On Sat, Dec 15, 2018 at 11:25 PM Robert Nishihara
>>> <ro...@gmail.com> wrote:
>>> >
>>> > On Sat, Dec 15, 2018 at 8:43 PM Philipp Moritz <pc...@gmail.com>
>>> wrote:
>>> >
>>> > > Dear all,
>>> > >
>>> > > As some of you know, there is a standard in Python called manylinux (
>>> > > https://www.python.org/dev/peps/pep-0513/) to package binary
>>> executables
>>> > > and libraries into a “wheel” in a way that allows the code to be run
>>> on a
>>> > > wide variety of Linux distributions. This is very convenient for
>>> Python
>>> > > users, since such libraries can be easily installed via pip.
>>> > >
>>> > > This standard is also important for a second reason: If many
>>> different
>>> > > wheels are used together in a single Python process, adhering to
>>> manylinux
>>> > > ensures that these libraries work together well and don’t trip on
>>> each
>>> > > other’s toes (this could easily happen if different versions of
>>> libstdc++
>>> > > are used for example). Therefore *even if support for only a single
>>> > > distribution like Ubuntu is desired*, it is important to be manylinux
>>> > > compatible to make sure everybody’s wheels work together well.
>>> > >
>>> > > TensorFlow and PyTorch unfortunately don’t produce manylinux
>>> compatible
>>> > > wheels. The challenge is due, at least in part, to the need to use
>>> > > nvidia-docker to build GPU binaries [10]. This causes various levels
>>> of
>>> > > pain for the rest of the Python community, see for example [1] [2]
>>> [3] [4]
>>> > > [5] [6] [7] [8].
>>> > >
>>> > > The purpose of the e-mail is to get a discussion started on how we
>>> can
>>> > > make TensorFlow and PyTorch manylinux compliant. There is a new
>>> standard in
>>> > > the works [9] so hopefully we can discuss what would be necessary to
>>> make
>>> > > sure TensorFlow and PyTorch can adhere to this standard in the
>>> future.
>>> > >
>>> > > It would make everybody’s lives just a little bit better! Any ideas
>>> are
>>> > > appreciated.
>>> > >
>>> > > @soumith: Could you cc the relevant list? I couldn't find a pytorch
>>> dev
>>> > > mailing list.
>>> > >
>>> > > Best,
>>> > > Philipp.
>>> > >
>>> > > [1] https://github.com/tensorflow/tensorflow/issues/5033
>>> > > [2] https://github.com/tensorflow/tensorflow/issues/8802
>>> > > [3] https://github.com/primitiv/primitiv-python/issues/28
>>> > > [4] https://github.com/zarr-developers/numcodecs/issues/70
>>> > > [5] https://github.com/apache/arrow/pull/3177
>>> > > [6] https://github.com/tensorflow/tensorflow/issues/13615
>>> > > [7] https://github.com/pytorch/pytorch/issues/8358
>>> > > [8] https://github.com/ray-project/ray/issues/2159
>>> > > [9] https://www.python.org/dev/peps/pep-0571/
>>> > > [10]
>>> > >
>>> https://github.com/tensorflow/tensorflow/issues/8802#issuecomment-291935940
>>> > >
>>> > > --
>>> > > You received this message because you are subscribed to the Google
>>> Groups
>>> > > "ray-dev" group.
>>> > > To unsubscribe from this group and stop receiving emails from it,
>>> send an
>>> > > email to ray-dev+unsubscribe@googlegroups.com.
>>> > > To post to this group, send email to ray-dev@googlegroups.com.
>>> > > To view this discussion on the web visit
>>> > >
>>> https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com
>>> > > <
>>> https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com?utm_medium=email&utm_source=footer
>>> >
>>> > > .
>>> > > For more options, visit https://groups.google.com/d/optout.
>>> > >
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "TensorFlow Developers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to developers+unsubscribe@tensorflow.org.
>> Visit this group at
>> https://groups.google.com/a/tensorflow.org/group/developers/.
>> To view this discussion on the web visit
>> https://groups.google.com/a/tensorflow.org/d/msgid/developers/CAGZdauXqQ9Gze6eAB0R3%3D2j6X2yWfh7QPbrGj1%3D5xuvQUninpQ%40mail.gmail.com
>> <https://groups.google.com/a/tensorflow.org/d/msgid/developers/CAGZdauXqQ9Gze6eAB0R3%3D2j6X2yWfh7QPbrGj1%3D5xuvQUninpQ%40mail.gmail.com?utm_medium=email&utm_source=footer>
>> .
>>
>

Re: TensorFlow, PyTorch, and manylinux1

Posted by soumith <so...@gmail.com>.

I'm reposting my original reply below the current reply (below a dotted
line). It was filtered out because I wasn't subscribed to the relevant
mailing lists.

 tl;dr: manylinux2010 looks pretty promising, because CUDA supports CentOS6
(for now).

In the meanwhile, I dug into what pyarrow does, and it looks like it links
with `static-libstdc++` along with a linker version script [1].

PyTorch did exactly that until Jan this year [2], except that our linker
version script didn't cover the subtleties of statically linking stdc++ as
well as Arrow did. Because we weren't covering all of the stdc++ static
linking subtleties, we were facing huge issues that amplified wheel
incompatibility (import X; import torch crashing under various X). Hence,
we moved since then to linking with system-shipped libstdc++, doing no
static stdc++ linking.

I'll revisit this in light of manylinux2010, and go down the path of static
linkage of stdc++ again, though I'm wary of the subtleties around handling
of weak symbols, std::string destruction across library boundaries [3] and
std::string's ABI incompatibility issues.

I've opened a tracking issue here:
https://github.com/pytorch/pytorch/issues/15294

I'm looking forward to hearing from the TensorFlow devs if manylinux2010 is
sufficient for them, or what additional constraints they have.

As a personal thought, I find multiple libraries in the same process
statically linking to stdc++ gross, but without a package manager like
Anaconda that actually is willing to deal with the C++-side dependencies,
there aren't many options on the table.

References:

[1] https://github.com/apache/arrow/blob/master/cpp/src/arrow/symbols.map
[2] https://github.com/pytorch/pytorch/blob/v0.3.1/tools/pytorch.version
[3] https://github.com/pytorch/pytorch/issues/5400#issuecomment-369428125
............................................................................................................................................................
Hi Philipp,

Thanks a lot for getting a discussion started. I've sunk ~100+ hours over
the last 2 years making PyTorch wheels play well with OpenCV, TensorFlow
and other wheels, that I'm glad to see this discussion started.

On the PyTorch wheels, we have been shipping with the minimum glibc and
libstdc++ versions we can possibly work with, while keeping two hard
constraints:

1. CUDA support
2. C++11 support

1. CUDA support

manylinux1 is not an option, considering CUDA doesn't work out of CentOS5.
I explored this option [1] to no success.

manylinux2010 is an option at the moment wrt CUDA, but it's unclear when
NVIDIA will lift support for CentOS6 under us.
Additionally, CuDNN 7.0 (if I remember) was compiled against Ubuntu 12.04
(meaning the glibc version is newer than CentOS6), and binaries linked
against CuDNN refused to run on CentOS6. I requested that this constraint
be lifted, and the next dot release fixed it.

The reason PyTorch binaries are not manylinux2010 compatible at the moment
is because of the next constraint: C++11.

2. C++11

We picked C++11 as the minimum supported dialect for PyTorch, primarily to
serve the default compilers of older machines, i.e. Ubuntu 14.04 and
CentOS7. The newer options were C++14 / C++17, but we decided to polyfill
what we needed to support older distros better.

A fully fleshed out C++11 implementation landed in gcc in various stages,
with gradual ABI changes [2]. Unfortunately, the libstdc++ that ships with
centos6 (and hence manylinx2010) isn't sufficient to cover all of C++11.
For example, the binaries we built with devtoolset3 (gcc 4.9.2) on CentOS6
didn't run with the default libstdc++ on CentOS6 either due to ABI changes
or minimum GLIBCXX version for some of the symbols being unavailable.

We tried our best to support our binaries running on CentOS6 and above with
various ranges of static linking hacks until 0.3.1 (January 2018), but at
some point hacks over hacks was only getting more fragile. Hence we moved
to a CentOS7-based image in April 2018 [3], and relied only on dynamic
linking to the system-shipped libstdc++.

As Wes mentions [4], an option is to host a modern C++ standard library via
PyPI would put manylinux2010 on the table. There are however subtle
consequences with this -- if this package gets installed into a conda
environment, it'll clobber anaconda-shipped libstdc++, possibly corrupting
environments for thousands of anaconda users (this is actually similar to
the issues with `mkl` shipped via PyPI and Conda clobbering each other).

References:

[1] https://github.com/NVIDIA/nvidia-docker/issues/348
[2] https://gcc.gnu.org/wiki/Cxx11AbiCompatibility
[3]
https://github.com/pytorch/builder/commit/44d9bfa607a7616c66fe6492fadd8f05f3578b93
[4] https://github.com/apache/arrow/pull/3177#issuecomment-447515982
..............................................................................................................................................................................................

On Sun, Dec 16, 2018 at 2:57 PM Wes McKinney <we...@gmail.com> wrote:

> Reposting since I wasn't subscribed to developers@tensorflow.org. I
> also didn't see Soumith's response since it didn't come through to
> dev@arrow.apache.org
>
> In response to the non-conforming ABI in the TF and PyTorch wheels, we
> have attempted to hack around the issue with some elaborate
> workarounds [1] [2] that have ultimately proved to not work
> universally. The bottom line is that this is burdening other projects
> in the Python ecosystem and causing confusing application crashes.
>
> First, to state what should hopefully obvious to many of you, Python
> wheels are not a robust way to deploy complex C++ projects, even
> setting aside the compiler toolchain issue. If a project has
> non-trivial third party dependencies, you either have to statically
> link them or bundle shared libraries with the wheel (we do a bit of
> both in Apache Arrow). Neither solution is foolproof in all cases.
> There are other downsides to wheels when it comes to numerical
> computing -- it is difficult to utilize things like the Intel MKL
> which may be used by multiple projects. If two projects have the same
> third party C++ dependency (e.g. let's use gRPC or libprotobuf as a
> straw man example), it's hard to guarantee that versions or ABI will
> not conflict with each other.
>
> In packaging with conda, we pin all dependencies when building
> projects that depend on them, then package and deploy the dependencies
> as separate shared libraries instead of bundling. To resolve the need
> for newer compilers or newer C++ standard library, libstdc++.so and
> other system shared libraries are packaged and installed as
> dependencies. In manylinux1, the RedHat devtoolset compiler toolchain
> is used as it performs selective static linking of symbols to enable
> C++11 libraries to be deployed on older Linuxes like RHEL5/6. A conda
> environment functions as sort of portable miniature Linux
> distribution.
>
> Given the current state of things, as using the TensorFlow and PyTorch
> wheels in the same process as other conforming manylinux1 wheels is
> unsafe, it's hard to see how one can continue to recommend pip as a
> preferred installation path until the ABI problems are resolved. For
> example, "pip" is what is recommended for installing TensorFlow on
> Linux [3]. It's unclear that non-compliant wheels should be allowed in
> the package manager at all (I'm aware that this was deemed to not be
> the responsibility of PyPI to verify policy compliance [4]).
>
> A couple possible paths forward (there may be others):
>
> * Collaborate with the Python packaging authority to evolve the
> manylinux ABI to be able to produce compliant wheels that support the
> build and deployment requirements of these projects
> * Create a new ABI tag for CUDA/C++11-enabled Python wheels so that
> projects can ship packages that can be guaranteed to work properly
> with TF/PyTorch. This might require vendoring libstdc++ in some kind
> of "toolchain" wheel that projects using this new ABI can depend on
>
> Note that these toolchain and deployment issues are absent when
> building and deploying with conda packages, since build- and run-time
> dependencies can be pinned and shared across all the projects that
> depend on them, ensuring ABI cross-compatibility. It's great to have
> the convenience of "pip install $PROJECT", but I believe that these
> projects have outgrown the intended use for pip and wheel
> distributions.
>
> Until the ABI incompatibilities are resolved, I would encourage more
> prominent user documentation about the non-portability and potential
> for crashes with these Linux wheels.
>
> Thanks,
> Wes
>
> [1]:
> https://github.com/apache/arrow/commit/537e7f7fd503dd920c0b9f0cef8a2de86bc69e3b
> [2]:
> https://github.com/apache/arrow/commit/e7aaf7bf3d3e326b5fe58d20f8fc45b5cec01cac
> [3]: https://www.tensorflow.org/install/
> [4]: https://www.python.org/dev/peps/pep-0513/#id50
> On Sat, Dec 15, 2018 at 11:25 PM Robert Nishihara
> <ro...@gmail.com> wrote:
> >
> > On Sat, Dec 15, 2018 at 8:43 PM Philipp Moritz <pc...@gmail.com>
> wrote:
> >
> > > Dear all,
> > >
> > > As some of you know, there is a standard in Python called manylinux (
> > > https://www.python.org/dev/peps/pep-0513/) to package binary
> executables
> > > and libraries into a “wheel” in a way that allows the code to be run
> on a
> > > wide variety of Linux distributions. This is very convenient for Python
> > > users, since such libraries can be easily installed via pip.
> > >
> > > This standard is also important for a second reason: If many different
> > > wheels are used together in a single Python process, adhering to
> manylinux
> > > ensures that these libraries work together well and don’t trip on each
> > > other’s toes (this could easily happen if different versions of
> libstdc++
> > > are used for example). Therefore *even if support for only a single
> > > distribution like Ubuntu is desired*, it is important to be manylinux
> > > compatible to make sure everybody’s wheels work together well.
> > >
> > > TensorFlow and PyTorch unfortunately don’t produce manylinux compatible
> > > wheels. The challenge is due, at least in part, to the need to use
> > > nvidia-docker to build GPU binaries [10]. This causes various levels of
> > > pain for the rest of the Python community, see for example [1] [2] [3]
> [4]
> > > [5] [6] [7] [8].
> > >
> > > The purpose of the e-mail is to get a discussion started on how we can
> > > make TensorFlow and PyTorch manylinux compliant. There is a new
> standard in
> > > the works [9] so hopefully we can discuss what would be necessary to
> make
> > > sure TensorFlow and PyTorch can adhere to this standard in the future.
> > >
> > > It would make everybody’s lives just a little bit better! Any ideas are
> > > appreciated.
> > >
> > > @soumith: Could you cc the relevant list? I couldn't find a pytorch dev
> > > mailing list.
> > >
> > > Best,
> > > Philipp.
> > >
> > > [1] https://github.com/tensorflow/tensorflow/issues/5033
> > > [2] https://github.com/tensorflow/tensorflow/issues/8802
> > > [3] https://github.com/primitiv/primitiv-python/issues/28
> > > [4] https://github.com/zarr-developers/numcodecs/issues/70
> > > [5] https://github.com/apache/arrow/pull/3177
> > > [6] https://github.com/tensorflow/tensorflow/issues/13615
> > > [7] https://github.com/pytorch/pytorch/issues/8358
> > > [8] https://github.com/ray-project/ray/issues/2159
> > > [9] https://www.python.org/dev/peps/pep-0571/
> > > [10]
> > >
> https://github.com/tensorflow/tensorflow/issues/8802#issuecomment-291935940
> > >
> > > --
> > > You received this message because you are subscribed to the Google
> Groups
> > > "ray-dev" group.
> > > To unsubscribe from this group and stop receiving emails from it, send
> an
> > > email to ray-dev+unsubscribe@googlegroups.com.
> > > To post to this group, send email to ray-dev@googlegroups.com.
> > > To view this discussion on the web visit
> > >
> https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com
> > > <
> https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com?utm_medium=email&utm_source=footer
> >
> > > .
> > > For more options, visit https://groups.google.com/d/optout.
> > >
>

Re: TensorFlow, PyTorch, and manylinux1

Posted by Wes McKinney <we...@gmail.com>.

Reposting since I wasn't subscribed to developers@tensorflow.org. I
also didn't see Soumith's response since it didn't come through to
dev@arrow.apache.org

In response to the non-conforming ABI in the TF and PyTorch wheels, we
have attempted to hack around the issue with some elaborate
workarounds [1] [2] that have ultimately proved to not work
universally. The bottom line is that this is burdening other projects
in the Python ecosystem and causing confusing application crashes.

First, to state what should hopefully obvious to many of you, Python
wheels are not a robust way to deploy complex C++ projects, even
setting aside the compiler toolchain issue. If a project has
non-trivial third party dependencies, you either have to statically
link them or bundle shared libraries with the wheel (we do a bit of
both in Apache Arrow). Neither solution is foolproof in all cases.
There are other downsides to wheels when it comes to numerical
computing -- it is difficult to utilize things like the Intel MKL
which may be used by multiple projects. If two projects have the same
third party C++ dependency (e.g. let's use gRPC or libprotobuf as a
straw man example), it's hard to guarantee that versions or ABI will
not conflict with each other.

In packaging with conda, we pin all dependencies when building
projects that depend on them, then package and deploy the dependencies
as separate shared libraries instead of bundling. To resolve the need
for newer compilers or newer C++ standard library, libstdc++.so and
other system shared libraries are packaged and installed as
dependencies. In manylinux1, the RedHat devtoolset compiler toolchain
is used as it performs selective static linking of symbols to enable
C++11 libraries to be deployed on older Linuxes like RHEL5/6. A conda
environment functions as sort of portable miniature Linux
distribution.

Given the current state of things, as using the TensorFlow and PyTorch
wheels in the same process as other conforming manylinux1 wheels is
unsafe, it's hard to see how one can continue to recommend pip as a
preferred installation path until the ABI problems are resolved. For
example, "pip" is what is recommended for installing TensorFlow on
Linux [3]. It's unclear that non-compliant wheels should be allowed in
the package manager at all (I'm aware that this was deemed to not be
the responsibility of PyPI to verify policy compliance [4]).

A couple possible paths forward (there may be others):

* Collaborate with the Python packaging authority to evolve the
manylinux ABI to be able to produce compliant wheels that support the
build and deployment requirements of these projects
* Create a new ABI tag for CUDA/C++11-enabled Python wheels so that
projects can ship packages that can be guaranteed to work properly
with TF/PyTorch. This might require vendoring libstdc++ in some kind
of "toolchain" wheel that projects using this new ABI can depend on

Note that these toolchain and deployment issues are absent when
building and deploying with conda packages, since build- and run-time
dependencies can be pinned and shared across all the projects that
depend on them, ensuring ABI cross-compatibility. It's great to have
the convenience of "pip install $PROJECT", but I believe that these
projects have outgrown the intended use for pip and wheel
distributions.

Until the ABI incompatibilities are resolved, I would encourage more
prominent user documentation about the non-portability and potential
for crashes with these Linux wheels.

Thanks,
Wes

[1]: https://github.com/apache/arrow/commit/537e7f7fd503dd920c0b9f0cef8a2de86bc69e3b
[2]: https://github.com/apache/arrow/commit/e7aaf7bf3d3e326b5fe58d20f8fc45b5cec01cac
[3]: https://www.tensorflow.org/install/
[4]: https://www.python.org/dev/peps/pep-0513/#id50
On Sat, Dec 15, 2018 at 11:25 PM Robert Nishihara
<ro...@gmail.com> wrote:
>
> On Sat, Dec 15, 2018 at 8:43 PM Philipp Moritz <pc...@gmail.com> wrote:
>
> > Dear all,
> >
> > As some of you know, there is a standard in Python called manylinux (
> > https://www.python.org/dev/peps/pep-0513/) to package binary executables
> > and libraries into a “wheel” in a way that allows the code to be run on a
> > wide variety of Linux distributions. This is very convenient for Python
> > users, since such libraries can be easily installed via pip.
> >
> > This standard is also important for a second reason: If many different
> > wheels are used together in a single Python process, adhering to manylinux
> > ensures that these libraries work together well and don’t trip on each
> > other’s toes (this could easily happen if different versions of libstdc++
> > are used for example). Therefore *even if support for only a single
> > distribution like Ubuntu is desired*, it is important to be manylinux
> > compatible to make sure everybody’s wheels work together well.
> >
> > TensorFlow and PyTorch unfortunately don’t produce manylinux compatible
> > wheels. The challenge is due, at least in part, to the need to use
> > nvidia-docker to build GPU binaries [10]. This causes various levels of
> > pain for the rest of the Python community, see for example [1] [2] [3] [4]
> > [5] [6] [7] [8].
> >
> > The purpose of the e-mail is to get a discussion started on how we can
> > make TensorFlow and PyTorch manylinux compliant. There is a new standard in
> > the works [9] so hopefully we can discuss what would be necessary to make
> > sure TensorFlow and PyTorch can adhere to this standard in the future.
> >
> > It would make everybody’s lives just a little bit better! Any ideas are
> > appreciated.
> >
> > @soumith: Could you cc the relevant list? I couldn't find a pytorch dev
> > mailing list.
> >
> > Best,
> > Philipp.
> >
> > [1] https://github.com/tensorflow/tensorflow/issues/5033
> > [2] https://github.com/tensorflow/tensorflow/issues/8802
> > [3] https://github.com/primitiv/primitiv-python/issues/28
> > [4] https://github.com/zarr-developers/numcodecs/issues/70
> > [5] https://github.com/apache/arrow/pull/3177
> > [6] https://github.com/tensorflow/tensorflow/issues/13615
> > [7] https://github.com/pytorch/pytorch/issues/8358
> > [8] https://github.com/ray-project/ray/issues/2159
> > [9] https://www.python.org/dev/peps/pep-0571/
> > [10]
> > https://github.com/tensorflow/tensorflow/issues/8802#issuecomment-291935940
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "ray-dev" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to ray-dev+unsubscribe@googlegroups.com.
> > To post to this group, send email to ray-dev@googlegroups.com.
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com
> > <https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com?utm_medium=email&utm_source=footer>
> > .
> > For more options, visit https://groups.google.com/d/optout.
> >

Re: TensorFlow, PyTorch, and manylinux1

Posted by Wes McKinney <we...@gmail.com>.

In response to the non-conforming ABI in the TF and PyTorch wheels, we
have attempted to hack around the issue with some elaborate
workarounds [1] [2] that have ultimately proved to not work
universally. The bottom line is that this is burdening other projects
in the Python ecosystem and causing confusing application crashes.

First, to state what should hopefully obvious to many of you, Python
wheels are not a robust way to deploy complex C++ projects, even
setting aside the compiler toolchain issue. If a project has
non-trivial third party dependencies, you either have to statically
link them or bundle shared libraries with the wheel (we do a bit of
both in Apache Arrow). Neither solution is foolproof in all cases.
There are other downsides to wheels when it comes to numerical
computing -- it is difficult to utilize things like the Intel MKL
which may be used by multiple projects. If two projects have the same
third party C++ dependency (e.g. let's use gRPC or libprotobuf as a
straw man example), it's hard to guarantee that versions or ABI will
not conflict with each other.

In packaging with conda, we pin all dependencies when building
projects that depend on them, then package and deploy the dependencies
as separate shared libraries instead of bundling. To resolve the need
for newer compilers or newer C++ standard library, libstdc++.so and
other system shared libraries are packaged and installed as
dependencies. In manylinux1, the RedHat devtoolset compiler toolchain
is used as it performs selective static linking of symbols to enable
C++11 libraries to be deployed on older Linuxes like RHEL5/6. A conda
environment functions as sort of portable miniature Linux
distribution.

Given the current state of things, as using the TensorFlow and PyTorch
wheels in the same process as other conforming manylinux1 wheels is
unsafe, it's hard to see how one can continue to recommend pip as a
preferred installation path until the ABI problems are resolved. For
example, "pip" is what is recommended for installing TensorFlow on
Linux [3]. It's unclear that non-compliant wheels should be allowed in
the package manager at all (I'm aware that this was deemed to not be
the responsibility of PyPI to verify policy compliance [4]).

A couple possible paths forward (there may be others):

* Collaborate with the Python packaging authority to evolve the
manylinux ABI to be able to produce compliant wheels that support the
build and deployment requirements of these projects
* Create a new ABI tag for CUDA/C++11-enabled Python wheels so that
projects can ship packages that can be guaranteed to work properly
with TF/PyTorch. This might require vendoring libstdc++ in some kind
of "toolchain" wheel that projects using this new ABI can depend on

Note that these toolchain and deployment issues are absent when
building and deploying with conda packages, since build- and run-time
dependencies can be pinned and shared across all the projects that
depend on them, ensuring ABI cross-compatibility. It's great to have
the convenience of "pip install $PROJECT", but I believe that these
projects have outgrown the intended use for pip and wheel
distributions.

Until the ABI incompatibilities are resolved, I would encourage more
prominent user documentation about the non-portability and potential
for crashes with these Linux wheels.

Thanks,
Wes

[1]: https://github.com/apache/arrow/commit/537e7f7fd503dd920c0b9f0cef8a2de86bc69e3b
[2]: https://github.com/apache/arrow/commit/e7aaf7bf3d3e326b5fe58d20f8fc45b5cec01cac
[3]: https://www.tensorflow.org/install/
[4]: https://www.python.org/dev/peps/pep-0513/#id50
On Sat, Dec 15, 2018 at 11:25 PM Robert Nishihara
<ro...@gmail.com> wrote:
>
> On Sat, Dec 15, 2018 at 8:43 PM Philipp Moritz <pc...@gmail.com> wrote:
>
> > Dear all,
> >
> > As some of you know, there is a standard in Python called manylinux (
> > https://www.python.org/dev/peps/pep-0513/) to package binary executables
> > and libraries into a “wheel” in a way that allows the code to be run on a
> > wide variety of Linux distributions. This is very convenient for Python
> > users, since such libraries can be easily installed via pip.
> >
> > This standard is also important for a second reason: If many different
> > wheels are used together in a single Python process, adhering to manylinux
> > ensures that these libraries work together well and don’t trip on each
> > other’s toes (this could easily happen if different versions of libstdc++
> > are used for example). Therefore *even if support for only a single
> > distribution like Ubuntu is desired*, it is important to be manylinux
> > compatible to make sure everybody’s wheels work together well.
> >
> > TensorFlow and PyTorch unfortunately don’t produce manylinux compatible
> > wheels. The challenge is due, at least in part, to the need to use
> > nvidia-docker to build GPU binaries [10]. This causes various levels of
> > pain for the rest of the Python community, see for example [1] [2] [3] [4]
> > [5] [6] [7] [8].
> >
> > The purpose of the e-mail is to get a discussion started on how we can
> > make TensorFlow and PyTorch manylinux compliant. There is a new standard in
> > the works [9] so hopefully we can discuss what would be necessary to make
> > sure TensorFlow and PyTorch can adhere to this standard in the future.
> >
> > It would make everybody’s lives just a little bit better! Any ideas are
> > appreciated.
> >
> > @soumith: Could you cc the relevant list? I couldn't find a pytorch dev
> > mailing list.
> >
> > Best,
> > Philipp.
> >
> > [1] https://github.com/tensorflow/tensorflow/issues/5033
> > [2] https://github.com/tensorflow/tensorflow/issues/8802
> > [3] https://github.com/primitiv/primitiv-python/issues/28
> > [4] https://github.com/zarr-developers/numcodecs/issues/70
> > [5] https://github.com/apache/arrow/pull/3177
> > [6] https://github.com/tensorflow/tensorflow/issues/13615
> > [7] https://github.com/pytorch/pytorch/issues/8358
> > [8] https://github.com/ray-project/ray/issues/2159
> > [9] https://www.python.org/dev/peps/pep-0571/
> > [10]
> > https://github.com/tensorflow/tensorflow/issues/8802#issuecomment-291935940
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "ray-dev" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to ray-dev+unsubscribe@googlegroups.com.
> > To post to this group, send email to ray-dev@googlegroups.com.
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com
> > <https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com?utm_medium=email&utm_source=footer>
> > .
> > For more options, visit https://groups.google.com/d/optout.
> >

Re: TensorFlow, PyTorch, and manylinux1

Posted by Robert Nishihara <ro...@gmail.com>.

On Sat, Dec 15, 2018 at 8:43 PM Philipp Moritz <pc...@gmail.com> wrote:

> Dear all,
>
> As some of you know, there is a standard in Python called manylinux (
> https://www.python.org/dev/peps/pep-0513/) to package binary executables
> and libraries into a “wheel” in a way that allows the code to be run on a
> wide variety of Linux distributions. This is very convenient for Python
> users, since such libraries can be easily installed via pip.
>
> This standard is also important for a second reason: If many different
> wheels are used together in a single Python process, adhering to manylinux
> ensures that these libraries work together well and don’t trip on each
> other’s toes (this could easily happen if different versions of libstdc++
> are used for example). Therefore *even if support for only a single
> distribution like Ubuntu is desired*, it is important to be manylinux
> compatible to make sure everybody’s wheels work together well.
>
> TensorFlow and PyTorch unfortunately don’t produce manylinux compatible
> wheels. The challenge is due, at least in part, to the need to use
> nvidia-docker to build GPU binaries [10]. This causes various levels of
> pain for the rest of the Python community, see for example [1] [2] [3] [4]
> [5] [6] [7] [8].
>
> The purpose of the e-mail is to get a discussion started on how we can
> make TensorFlow and PyTorch manylinux compliant. There is a new standard in
> the works [9] so hopefully we can discuss what would be necessary to make
> sure TensorFlow and PyTorch can adhere to this standard in the future.
>
> It would make everybody’s lives just a little bit better! Any ideas are
> appreciated.
>
> @soumith: Could you cc the relevant list? I couldn't find a pytorch dev
> mailing list.
>
> Best,
> Philipp.
>
> [1] https://github.com/tensorflow/tensorflow/issues/5033
> [2] https://github.com/tensorflow/tensorflow/issues/8802
> [3] https://github.com/primitiv/primitiv-python/issues/28
> [4] https://github.com/zarr-developers/numcodecs/issues/70
> [5] https://github.com/apache/arrow/pull/3177
> [6] https://github.com/tensorflow/tensorflow/issues/13615
> [7] https://github.com/pytorch/pytorch/issues/8358
> [8] https://github.com/ray-project/ray/issues/2159
> [9] https://www.python.org/dev/peps/pep-0571/
> [10]
> https://github.com/tensorflow/tensorflow/issues/8802#issuecomment-291935940
>
> --
> You received this message because you are subscribed to the Google Groups
> "ray-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to ray-dev+unsubscribe@googlegroups.com.
> To post to this group, send email to ray-dev@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com
> <https://groups.google.com/d/msgid/ray-dev/CAFs1FxUBAag6AThj34twiAB6KY3t5sJSJF3g70K3SvF-%2BzGGgw%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>