Posted to dev@arrow.apache.org by Hatem Helal <hh...@mathworks.com> on 2019/08/16 15:11:47 UTC

[DISCUSS] Apache Arrow manylinux1 support

Hi all,

I ran into a surprising (to me) limitation when working on an issue [1].  To summarize, supporting the manylinux1 standard ties Arrow development to gcc 4.8.x, which is technically not C++11 complete.  This raised a few questions for me:

* What are the preconditions for dropping manylinux1 / gcc 4.8.x?  I found an open task to remove support altogether [2].
* What is needed to move to C++14?
* Would either of these changes normally require a PMC-driven vote?

I realize these are broad questions, but I'm curious to hear thoughts or additional background requirements that I failed to find easily.

Many thanks,

Hatem

[1] https://issues.apache.org/jira/browse/ARROW-6096
[2] https://issues.apache.org/jira/browse/ARROW-5756


Re: [DISCUSS] Apache Arrow manylinux1 support

Posted by Wes McKinney <we...@gmail.com>.
On Mon, Aug 19, 2019 at 8:55 AM Antoine Pitrou <so...@pitrou.net> wrote:
>
> On Mon, 19 Aug 2019 08:44:26 -0500
> Wes McKinney <we...@gmail.com> wrote:
> > Will publishing only manylinux2010 wheels have any consequences (for
> > example, a relatively new version of setuptools may be required)?
>
> A relatively new version of pip is required.  But upgrading pip is
> straightforward, at least in a virtual environment or private Python
> install.
>

OK. So people building Dockerfiles on Linux will, in a lot of cases,
have to upgrade the pip that's available in their package manager. It
doesn't seem like a big deal, but it would need to be appropriately
documented, since there are likely to be a lot of folks out there who
are running `pip install pyarrow` without much additional thought
about this detail.
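
A minimal sketch of the kind of check such documentation could include.
It assumes that pip 19.0 is the first release that recognizes
manylinux2010 wheels; the function names are illustrative and are not
part of pip or pyarrow.

```python
import re

# Assumption: pip 19.0 is the first release that recognizes manylinux2010 wheels.
MIN_PIP_FOR_MANYLINUX2010 = (19, 0)


def parse_version(text):
    """Return the leading (major, minor) numbers of a version string."""
    match = re.match(r"(\d+)(?:\.(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major = int(match.group(1))
    minor = int(match.group(2) or 0)  # a bare "19" is treated as 19.0
    return (major, minor)


def supports_manylinux2010(pip_version):
    """Report whether the given pip version meets the assumed minimum."""
    return parse_version(pip_version) >= MIN_PIP_FOR_MANYLINUX2010
```

With pip importable, `supports_manylinux2010(pip.__version__)` would
report whether the running pip meets that threshold; if it does not,
`pip install --upgrade pip` inside a virtual environment is the usual
remedy.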


Re: [DISCUSS] Apache Arrow manylinux1 support

Posted by Antoine Pitrou <so...@pitrou.net>.
On Mon, 19 Aug 2019 08:44:26 -0500
Wes McKinney <we...@gmail.com> wrote:
> Will publishing only manylinux2010 wheels have any consequences (for
> example, a relatively new version of setuptools may be required)?

A relatively new version of pip is required.  But upgrading pip is
straightforward, at least in a virtual environment or private Python
install.

Regards

Antoine.



Re: [DISCUSS] Apache Arrow manylinux1 support

Posted by Wes McKinney <we...@gmail.com>.
Will publishing only manylinux2010 wheels have any consequences (for
example, a relatively new version of setuptools may be required)?

On Fri, Aug 16, 2019 at 11:58 AM Neal Richardson
<ne...@gmail.com> wrote:
>
> For R's official support for various C++ versions, see
> https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Using-C_002b_002b11-code
> and below. Empirically, C++ > 11 is not really used: there are only 6
> packages on CRAN that declare it as a requirement, and none of those
> are widely used.
>
> $ R
> > df <- tools::CRAN_package_db()
> > table(grepl("C++11", df$SystemRequirements, fixed=TRUE))
>
> FALSE  TRUE
> 14502   275
> > table(grepl("C++14", df$SystemRequirements, fixed=TRUE))
>
> FALSE  TRUE
> 14771     6
> > df[grepl("C++14", df$SystemRequirements, fixed=TRUE), c("Package", "Reverse depends", "Reverse imports", "Reverse suggests")]
>         Package Reverse depends Reverse imports Reverse suggests
> 6071   IsoSpecR            <NA>            <NA>             <NA>
> 8004   multinet            <NA>            <NA>             <NA>
> 8200     ndjson         streamR            <NA>             <NA>
> 10487 RcppAlgos            <NA>         STraTUS  bigIntegerAlgos
> 11115    rmdcev            <NA>            <NA>             <NA>
> 14391    walker            <NA>            <NA>             <NA>
> > table(grepl("C++17", df$SystemRequirements, fixed=TRUE))
>
> FALSE
> 14777
>
> On Fri, Aug 16, 2019 at 8:32 AM Antoine Pitrou <an...@python.org> wrote:
> >
> >
> > Le 16/08/2019 à 17:11, Hatem Helal a écrit :
> > > Hi all,
> > >
> > > I ran into a surprising (to me) limitation when working on an issue [1].  To summarize, supporting the manylinux1 standard ties Arrow development to gcc 4.8.x, which is technically not C++11 complete.  This raised a few questions for me:
> > >
> > > * What are the preconditions for dropping manylinux1 / gcc 4.8.x?  I found an open task to remove support altogether [2].
> >
> > Not much IMHO.   1) The people who have been producing Python wheels up
> > to now have decided to stop spending valuable time on hairy binary
> > compatibility and distribution issues.  2) Last I tried, manylinux2010
> > works and someone who's interested in reviving Python Linux wheels can
> > probably produce such wheels instead of manylinux1.
> >
> > So IMHO we can drop manylinux1 support right now.  However:
> >
> > > * What is needed to move to C++14?
> >
> > Make sure that all important toolchains support it.  Unfortunately, I
> > don't think that's the case for the MinGW version that's used to build R
> > packages on Windows.  It's using gcc 4.9.3.
> >
> > See e.g.
> > https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/26742063/job/7k57qamlpb5cchfh?fullLog=true#L666
> >
> > > * Would either of these changes normally require a PMC-driven vote?
> >
> > I don't think dropping manylinux1 needs a PMC vote.  It's simply a case
> > of a high-cost recurring activity that doesn't find a volunteer anymore.
> >  The PMC can't simply claim that we continue supporting manylinux1 if
> > there's nobody around to do the actual work.
> >
> > As for switching the baseline to C++14, it would probably require a vote
> > indeed.  And I expect a -1 if the R Windows build can't be migrated to a
> > newer compiler.
> >
> > Regards
> >
> > Antoine.

Re: [DISCUSS] Apache Arrow manylinux1 support

Posted by Neal Richardson <ne...@gmail.com>.
For R's official support for various C++ versions, see
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Using-C_002b_002b11-code
and below. Empirically, standards newer than C++11 are rarely used:
only 6 packages on CRAN declare C++14 as a requirement, none declare
C++17, and none of those 6 are widely used.

$ R
> df <- tools::CRAN_package_db()
> table(grepl("C++11", df$SystemRequirements, fixed=TRUE))

FALSE  TRUE
14502   275
> table(grepl("C++14", df$SystemRequirements, fixed=TRUE))

FALSE  TRUE
14771     6
> df[grepl("C++14", df$SystemRequirements, fixed=TRUE), c("Package", "Reverse depends", "Reverse imports", "Reverse suggests")]
        Package Reverse depends Reverse imports Reverse suggests
6071   IsoSpecR            <NA>            <NA>             <NA>
8004   multinet            <NA>            <NA>             <NA>
8200     ndjson         streamR            <NA>             <NA>
10487 RcppAlgos            <NA>         STraTUS  bigIntegerAlgos
11115    rmdcev            <NA>            <NA>             <NA>
14391    walker            <NA>            <NA>             <NA>
> table(grepl("C++17", df$SystemRequirements, fixed=TRUE))

FALSE
14777

On Fri, Aug 16, 2019 at 8:32 AM Antoine Pitrou <an...@python.org> wrote:
>
>
> Le 16/08/2019 à 17:11, Hatem Helal a écrit :
> > Hi all,
> >
> > I ran into a surprising (to me) limitation when working on an issue [1].  To summarize, supporting the manylinux1 standard ties Arrow development to gcc 4.8.x, which is technically not C++11 complete.  This raised a few questions for me:
> >
> > * What are the preconditions for dropping manylinux1 / gcc 4.8.x?  I found an open task to remove support altogether [2].
>
> Not much IMHO.   1) The people who have been producing Python wheels up
> to now have decided to stop spending valuable time on hairy binary
> compatibility and distribution issues.  2) Last I tried, manylinux2010
> works and someone who's interested in reviving Python Linux wheels can
> probably produce such wheels instead of manylinux1.
>
> So IMHO we can drop manylinux1 support right now.  However:
>
> > * What is needed to move to C++14?
>
> Make sure that all important toolchains support it.  Unfortunately, I
> don't think that's the case for the MinGW version that's used to build R
> packages on Windows.  It's using gcc 4.9.3.
>
> See e.g.
> https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/26742063/job/7k57qamlpb5cchfh?fullLog=true#L666
>
> > * Would either of these changes normally require a PMC-driven vote?
>
> I don't think dropping manylinux1 needs a PMC vote.  It's simply a case
> of a high-cost recurring activity that doesn't find a volunteer anymore.
>  The PMC can't simply claim that we continue supporting manylinux1 if
> there's nobody around to do the actual work.
>
> As for switching the baseline to C++14, it would probably require a vote
> indeed.  And I expect a -1 if the R Windows build can't be migrated to a
> newer compiler.
>
> Regards
>
> Antoine.

Re: [DISCUSS] Apache Arrow manylinux1 support

Posted by Antoine Pitrou <an...@python.org>.
Le 16/08/2019 à 17:11, Hatem Helal a écrit :
> Hi all,
> 
> I ran into a surprising (to me) limitation when working on an issue [1].  To summarize, supporting the manylinux1 standard ties Arrow development to gcc 4.8.x, which is technically not C++11 complete.  This raised a few questions for me:
> 
> * What are the preconditions for dropping manylinux1 / gcc 4.8.x?  I found an open task to remove support altogether [2].

Not much, IMHO.  1) The people who have been producing Python wheels up
to now have decided to stop spending valuable time on hairy binary
compatibility and distribution issues.  2) Last I tried, manylinux2010
works, and anyone interested in reviving Python Linux wheels can
probably produce manylinux2010 wheels instead of manylinux1 ones.

So IMHO we can drop manylinux1 support right now.  However:

> * What is needed to move to C++14?

Make sure that all important toolchains support it.  Unfortunately, I
don't think that's the case for the MinGW version that's used to build R
packages on Windows.  It's using gcc 4.9.3.

See e.g.
https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/26742063/job/7k57qamlpb5cchfh?fullLog=true#L666

> * Would either of these changes normally require a PMC-driven vote?

I don't think dropping manylinux1 needs a PMC vote.  It's simply a case
of a high-cost recurring activity that no longer has a volunteer.
The PMC can't simply declare that we continue supporting manylinux1 if
there's nobody around to do the actual work.

As for switching the baseline to C++14, it would probably require a vote
indeed.  And I expect a -1 if the R Windows build can't be migrated to a
newer compiler.

Regards

Antoine.
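
The requirement above, that every important toolchain must support
C++14 before the baseline can move, can be sketched as a simple version
check.  This is only an illustration: it assumes GCC 5 as the first
release with complete C++14 language support, uses 4.8.0 as a stand-in
for the "gcc 4.8.x" mentioned in the thread, and the names below are
hypothetical rather than anything from the Arrow build system.

```python
# Assumption: GCC 5 is treated as the first release with complete C++14
# language support; a project relying on specific library features may
# need a different threshold.
MIN_GCC_FOR_CXX14 = (5, 0, 0)

# Toolchains named in this thread. The labels are illustrative, and
# (4, 8, 0) is a placeholder for "gcc 4.8.x".
TOOLCHAINS = {
    "manylinux1 (gcc 4.8.x)": (4, 8, 0),
    "R on Windows, MinGW (gcc 4.9.3)": (4, 9, 3),
}


def blocking_toolchains(toolchains, minimum):
    """Return the sorted names of toolchains whose gcc is older than `minimum`."""
    return sorted(
        name for name, version in toolchains.items() if version < minimum
    )
```

Under these assumptions, `blocking_toolchains(TOOLCHAINS,
MIN_GCC_FOR_CXX14)` lists both toolchains; dropping manylinux1 would
remove the first entry, leaving the R Windows build as the remaining
blocker.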