Posted to dev@arrow.apache.org by Hatem Helal <hh...@mathworks.com> on 2019/08/16 15:11:47 UTC
[DISCUSS] Apache Arrow manylinux1 support
Hi all,
I ran into a surprising (to me) limitation when working on an issue [1]. To summarize, supporting the manylinux1 standard ties Arrow development to gcc 4.8.x, which is technically not C++11 complete. This raised a few questions for me:
* What are the pre-conditions for dropping manylinux1 / gcc 4.8.x? I found an open task to remove support altogether [2].
* What is needed to move to C++ 14?
* Would either of these changes normally require a PMC-driven vote?
I realize these are broad questions but I'm curious to hear thoughts or additional background requirements that I failed to find easily.
Many thanks,
Hatem
[1] https://issues.apache.org/jira/browse/ARROW-6096
[2] https://issues.apache.org/jira/browse/ARROW-5756
Re: [DISCUSS] Apache Arrow manylinux1 support
Posted by Wes McKinney <we...@gmail.com>.
On Mon, Aug 19, 2019 at 8:55 AM Antoine Pitrou <so...@pitrou.net> wrote:
>
> On Mon, 19 Aug 2019 08:44:26 -0500
> Wes McKinney <we...@gmail.com> wrote:
> > Will publishing only manylinux2010 wheels have any consequences (for
> > example, a relatively new version of setuptools may be required)?
>
> A relatively new version of pip is required. But upgrading pip is
> straightforward, at least in a virtual environment or private Python
> install.
>
OK. So people building Dockerfiles on Linux will, in a lot of cases,
have to upgrade the pip that's available in their package manager. It
doesn't seem like a big deal, but it would need to be appropriately
documented, since there are likely to be a lot of folks out there who
are running `pip install pyarrow` without much additional thought
about this detail.
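
A minimal sketch of the kind of check such documentation could include, assuming that pip 19.0 is the first release able to install manylinux2010 wheels. That threshold and the helper names below are assumptions of this sketch and were not stated in the thread.

```python
import re

# Assumption of this sketch: pip 19.0 is the first release that
# recognises the manylinux2010 platform tag.
MIN_PIP_FOR_MANYLINUX2010 = (19, 0)


def parse_major_minor(version):
    """Return (major, minor) from a pip version string such as '19.2.3'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % (version,))
    return int(match.group(1)), int(match.group(2) or 0)


def pip_supports_manylinux2010(version):
    """True if this pip version is at or above the assumed threshold."""
    return parse_major_minor(version) >= MIN_PIP_FOR_MANYLINUX2010


if __name__ == "__main__":
    # Versions chosen only for illustration.
    for candidate in ("9.0.1", "18.1", "19.0", "19.2.3"):
        print(candidate, pip_supports_manylinux2010(candidate))
```

In a Dockerfile the practical fix is usually to run `python -m pip install --upgrade pip` before `pip install pyarrow`, rather than relying on the pip shipped by the distribution.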
Re: [DISCUSS] Apache Arrow manylinux1 support
Posted by Antoine Pitrou <so...@pitrou.net>.
On Mon, 19 Aug 2019 08:44:26 -0500
Wes McKinney <we...@gmail.com> wrote:
> Will publishing only manylinux2010 wheels have any consequences (for
> example, a relatively new version of setuptools may be required)?
A relatively new version of pip is required. But upgrading pip is
straightforward, at least in a virtual environment or private Python
install.
Regards
Antoine.
Re: [DISCUSS] Apache Arrow manylinux1 support
Posted by Wes McKinney <we...@gmail.com>.
Will publishing only manylinux2010 wheels have any consequences (for
example, a relatively new version of setuptools may be required)?
Re: [DISCUSS] Apache Arrow manylinux1 support
Posted by Neal Richardson <ne...@gmail.com>.
For R's official support for various C++ versions, see
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Using-C_002b_002b11-code
and below. Empirically, anything newer than C++11 is barely used: only
6 packages on CRAN declare C++14 as a requirement, none of those are
widely used, and no package declares C++17.
$ R
> df <- tools::CRAN_package_db()
> table(grepl("C++11", df$SystemRequirements, fixed=TRUE))

FALSE  TRUE
14502   275
> table(grepl("C++14", df$SystemRequirements, fixed=TRUE))

FALSE  TRUE
14771     6
> df[grepl("C++14", df$SystemRequirements, fixed=TRUE), c("Package", "Reverse depends", "Reverse imports", "Reverse suggests")]
        Package Reverse depends Reverse imports Reverse suggests
6071   IsoSpecR            <NA>            <NA>             <NA>
8004   multinet            <NA>            <NA>             <NA>
8200     ndjson         streamR            <NA>             <NA>
10487 RcppAlgos            <NA>         STraTUS  bigIntegerAlgos
11115    rmdcev            <NA>            <NA>             <NA>
14391    walker            <NA>            <NA>             <NA>
> table(grepl("C++17", df$SystemRequirements, fixed=TRUE))

FALSE
14777
Re: [DISCUSS] Apache Arrow manylinux1 support
Posted by Antoine Pitrou <an...@python.org>.
Le 16/08/2019 à 17:11, Hatem Helal a écrit :
> Hi all,
>
> I ran into a surprising (to me) limitation when working on an issue [1]. To summarize, supporting the manylinux1 standard ties Arrow development to gcc 4.8.x, which is technically not C++11 complete. This raised a few questions for me:
>
> * What are the pre-conditions for dropping manylinux1 / gcc 4.8.x? I found an open task to remove support altogether [2].
Not much IMHO. 1) The people who have been producing Python wheels up
to now have decided to stop spending valuable time on hairy binary
compatibility and distribution issues. 2) Last I tried, manylinux2010
works and someone who's interested in reviving Python Linux wheels can
probably produce such wheels instead of manylinux1.
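
To make the manylinux1 versus manylinux2010 distinction concrete: the policy a wheel targets is recorded in the platform tag at the end of its filename. The sketch below reads that tag, assuming the standard wheel filename layout and the legacy `manylinux1` / `manylinux2010` tag style. The helper names and the example filenames are illustrative only and do not refer to actual Arrow release artifacts.

```python
def wheel_platform_tags(filename):
    """Return the platform tags of a wheel filename.

    Assumes the standard layout
    name-version[-build]-python-abi-platform.whl, where the platform
    field may hold several tags joined by '.'.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % (filename,))
    fields = filename[: -len(".whl")].split("-")
    if len(fields) not in (5, 6):
        raise ValueError("unexpected wheel filename: %r" % (filename,))
    return fields[-1].split(".")


def manylinux_policies(filename):
    """Return the legacy manylinux policies named in the platform tags."""
    policies = []
    for tag in wheel_platform_tags(filename):
        if tag.startswith("manylinux"):
            # A tag looks like '<policy>_<arch>', e.g. 'manylinux2010_x86_64';
            # the architecture itself may contain an underscore.
            policies.append(tag.split("_", 1)[0])
    return policies


if __name__ == "__main__":
    # Illustrative filenames, not real release artifacts.
    print(manylinux_policies("example-1.0-cp37-cp37m-manylinux1_x86_64.whl"))
    print(manylinux_policies("example-1.0-cp37-cp37m-manylinux2010_x86_64.whl"))
```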
So IMHO we can drop manylinux1 support right now. However:
> * What is needed to move to C++ 14?
Make sure that all important toolchains support it. Unfortunately, I
don't think that's the case for the MinGW version that's used to build R
packages on Windows. It's using gcc 4.9.3.
See e.g.
https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/26742063/job/7k57qamlpb5cchfh?fullLog=true#L666
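
One rough way to check whether a given toolchain accepts C++14 is to compile a tiny probe with it. The sketch below does that from Python. The function name, the probe source, and the default compiler name `g++` are assumptions of this sketch. Passing the probe only shows that the compiler accepts `-std=c++14` and these two features; it does not prove complete C++14 support, and the sketch makes no claim about how gcc 4.9.3 behaves.

```python
import os
import shutil
import subprocess
import tempfile

# Small translation unit using one C++14 library feature and one
# C++14 language feature.
CXX14_PROBE = """
#include <memory>
int main() {
    auto p = std::make_unique<int>(42);                // C++14 library
    auto add = [](auto a, auto b) { return a + b; };   // generic lambda
    return add(*p, -42);
}
"""


def supports_cxx14(compiler="g++"):
    """Return True if `compiler` compiles the probe with -std=c++14.

    Returns False when the compiler is not on PATH or compilation fails.
    """
    path = shutil.which(compiler)
    if path is None:
        return False
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.cpp")
        obj = os.path.join(tmp, "probe.o")
        with open(src, "w") as handle:
            handle.write(CXX14_PROBE)
        result = subprocess.run(
            [path, "-std=c++14", "-c", src, "-o", obj],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0


if __name__ == "__main__":
    print("g++ accepts the C++14 probe:", supports_cxx14("g++"))
```

Running the same probe against each toolchain that matters (including the MinGW build used for R on Windows) gives a quick first signal before any baseline change is proposed.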
> * Would either of these changes normally require a PMC-driven vote?
I don't think dropping manylinux1 needs a PMC vote. It's simply a case
of a high-cost recurring activity that no longer has a volunteer.
The PMC can't simply declare that we continue supporting manylinux1 if
there's nobody around to do the actual work.
As for switching the baseline to C++14, it would probably require a vote
indeed. And I expect a -1 if the R Windows build can't be migrated to a
newer compiler.
Regards
Antoine.