Posted to dev@arrow.apache.org by Antoine Pitrou <an...@python.org> on 2021/06/08 13:17:31 UTC

Re: [C++][Discuss] Switch to C++17

Hello,

Note the change in the message topic :-)
We now have a draft PR up to switch the C++ standard level to C++17.
This allows very nice simplifications in the code, especially the use
of elegant constructs that can replace some cumbersome uses of
std::enable_if, SFINAE and other pain points.

https://github.com/apache/arrow/pull/10414

It seems we were finally able to overcome the main platform
compatibility (CI) hurdles, though some effort will probably be
necessary to squash all regressions in that area.

I haven't seen any opposition previously in this thread, so if you are
really concerned by this, it would be better to speak up quickly, as
otherwise we may decide to move forward with the change.

Best regards

Antoine.


On Thu, 27 May 2021 10:03:03 +0200
Antoine Pitrou <an...@python.org> wrote:
> Hello,
> 
> It seems the only two platforms that constrained us to C++11 will not be 
> supported anymore (those platforms are RTools 3.5 for R packages, and 
> manylinux1 for Python packages).
> 
> It would be beneficial to bump our C++ requirement to C++14.  There is 
> an issue open listing benefits:
> https://issues.apache.org/jira/browse/ARROW-12816
> 
> An additional benefit is that some useful third-party libraries for us 
> may or will require C++14, including in their headers.
> 
> Is anyone opposed to doing the switch?  Please speak up.
> 
> Best regards
> 
> Antoine.
> 




Re: [C++][Discuss] Switch to C++17

Posted by Antoine Pitrou <an...@python.org>.
On Tue, 8 Jun 2021 14:39:27 -0700
Neal Richardson <ne...@gmail.com> wrote:
> I'm guessing there hasn't been opposition on this thread because the users
> that this might affect aren't following this mailing list.
> 
> I'd be interested to see which other major C++ projects out there have
> bumped their requirement to C++17, and how that experience was for
> everyone--the user community as well as the developers. Do you know of good
> examples?

I'm not sure where to find major C++ projects.  Looking through
https://en.cppreference.com/w/cpp/links/libs, a bunch of them are
C++17-only, but they don't look "major" (but since many of them are
domain-specific, I may be wrong).

If you're talking about heavyweights such as LLVM, my impression is that
they tend to be quite conservative.

> I just checked on CRAN today, and of the 17,694 R packages there,
> only 3 require C++17 (none of which have wide adoption) and only 20 require
> C++14.

How many require C++ at all, though?

Another question: would it make a difference to relax the requirement
to C++14? Or would it be the same problem for the legacy platforms
you're talking about?

Regards

Antoine.



Re: [C++][Discuss] Switch to C++17

Posted by Jonathan Keane <jk...@gmail.com>.
Tying this together, it sounds like we have a few competing priorities
and it would be good to come up with more formal criteria about what
platforms + versions we support.

We have a list of operating systems (and versions) that we support
[1], and some of our client languages list versions that are
compatible [2], but as far as I can tell we don't have criteria for
those lists of compatibility and when we would remove one (I've
searched through the docs + some in the archives, but please let me
know if there's something I'm missing here and we have prior art on
this).

For the operating systems, the maintainers of them set support dates
and those are (generally) dropped when (public) maintenance is ended.
But I don't know if we have a formal set of criteria for determining
support for our client languages in the same way.

Like Neal mentions, and using the R ecosystem as an example, there are
multiple levels of requirements to take into account with our
compatibility. The first two are the absolute minimum to work + be on
CRAN:
1. What R supports (right now: 4.1 supports C++17, and versions back to
3.4 have some support on some OSes)
2. What CRAN requires (right now: 4.0, 4.1, devel)

If these two were our only criteria we would be OK to move to
C++17, but as Neal mentions there are other concerns to think about
before drawing the line at exactly the minimal criteria above: we
should also consider user experience and what other projects do.
In Neal's #4 it's noted that many major R packages support the last
4-5 releases. I would advocate for using that as our criterion (for R
at least) to match what others are doing and make sure that arrow
doesn't stand out as an exception to the folks who are trying to
install it.

Since R has yearly releases, supporting 4-5 versions back for R
matches Python's 5 year end of life schedule. (I'm less familiar with
other language ecosystems and their version constraints, please chime
in with others!)

With that criterion (with the notable exception of Windows R users), we
should support back to R 3.3 (which we already test in our CI, in
fact!), and with the next release (in ~1 year) we would bump that to R
3.4 which is the first version to support C++ >11 [3].


[1] - https://arrow.apache.org/install/#c-and-glib-c-packages-for-debian-gnulinux-ubuntu-and-centos
[2] - https://arrow.apache.org/docs/python/install.html#python-compatibility
[3] - Note that due to the build chain issues discussed above, for R
users on Windows, this would actually be cutting them off sooner since
they only gained support as of R 4.0. However, this would give them 2
years between when a version of R was current and when Arrow no longer
supported it. That is short, but longer than the just over 1 year it
would be now.

-Jon



On Wed, Jun 9, 2021 at 1:03 PM Neal Richardson
<ne...@gmail.com> wrote:
>
> Responding to Antoine's specific questions:
>
> * 429 R packages on CRAN list C++11 as a SystemRequirement. These numbers
> may be a slight undercount because the SystemRequirements field is not
> machine-read. Some packages (e.g.
> https://github.com/eddelbuettel/rcppsimdjson/) appear to actually require
> C++17 but don't declare it in SystemRequirements. That said, while there
> are a number of widely used and depended-on packages that require C++11,
> none of the ones that require C++14 and higher have broad adoption.
> * According to the official guide [1], C++14 support is partial on the GCC
> 4.9 that RTools 35 uses. So it would depend on what features we were using
> as to whether it was an issue or not.
> * Binary packages in R: it essentially comes down to what CRAN builds and
> hosts. We provide a source package to CRAN, and they build binaries for
> macOS and Windows for the current R release and the previous release (minor
> releases, done annually). Windows users don't typically install from
> source, so that's not the issue--but we don't get to decide the toolchain
> used to compile the binary because we don't own that.
>
> Some other points on the R ecosystem. There are several unrelated concerns
> here that we should keep distinct in our minds:
>
> 1. What R supports. Per [1], R 3.4 and above have some support for C++14
> and 17, and C++14 is even the default C++ standard for the current R
> release (4.1). We're all good here.
> 2. What CRAN requires. Packages must build on macOS, Windows, and Linux and
> are checked on the previous release, current release, and development
> branch of R. Linux machines use a variety of compilers and toolchains.
> Windows, as we've said, always uses RTools, and as of last month, only
> RTools 40 (gcc 8.3). As noted on the PR, CRAN uses an old macOS (10.13) to
> build mac binary packages, and this has partial C++17 support. Unlike the
> RTools upgrade associated with R 4.0, this is not tied to the R version. So
> we would need to make sure we compile on the same xcode version they use
> (or wait for them to eventually upgrade their machines).
> 3. What users can install on their systems. In the enterprise context,
> users don't always get to upgrade R freely, nor can they always install
> newer compilers. I acknowledge that raising this is FUD, but we just don't
> know how significant this is.
> 4. What other R packages require. Because of #3, maintainers of major R
> packages in the ecosystem generally try to support the last 4-5 releases so
> that users who are stuck unable to upgrade R are not left behind. This
> means 3 versions of R (and, given yearly releases, a 3 year lag) beyond
> what CRAN requires. This is not to say that we have to do the same, just
> that if we don't, then that limits the chances that one of those
> maintainers would view arrow as something they can depend on. (That said, I
> don't think there's high likelihood that these packages would take a hard
> dependency on arrow; optional dependency ("Suggests", in R-speak) is more
> likely, regardless of C++ standard, due to other reasons (size, FUD, etc.).)
>
> Neal
>
> [1]:
> https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Using-C_002b_002b-code
>
> On Wed, Jun 9, 2021 at 10:26 AM Eduardo Ponce <ed...@gmail.com> wrote:
>
> > After the discussion in today's Arrow sync call, I do think it would be
> > beneficial to come up with a formal process for deciding when is a "right
> > time" for upgrading Arrow to a newer C++ standard. I suggest we could
> > consider a set of general metrics/criteria that try to summarize the
> > benefits and drawbacks of such change. Some metrics will be measurable but
> > others will be qualitative. For the latter, we can use a consensus-based
> > scale rating (1-5 with a meaning attached to each value). I am curious what
> > approach other major C++ projects have used to resolve decisions on
> > selecting a C++ standard (aside from critically required
> > features)?
> >
> > The criteria used to evaluate newer C++ standards need to fairly consider
> > people with different roles with regards to the Arrow project, such as
> > developers, contributors, C++ users, other language users (R, Python), and
> > maintainers.
> > Here is a possible (and likely incomplete) set of metrics:
> >
> > Measurable metrics:
> > * code size (source and binary) - measured in bytes
> > * compilation time (consider each major Arrow component)
> > * runtime - what are the performance changes? (consider each major Arrow
> > component)
> > * systems/OS/tools supported and deprecated
> > * ...
> >
> > Qualitative metrics:
> > * code structure/maintainability - how would it improve development?
> > * code readability - ease of understanding details for new/current
> > contributors?
> > * ...
> >
> > I do think this approach will give us a better standpoint for deciding on
> > when to upgrade to a newer C++ standard.
> > Nevertheless, there are complexities for implementing such an approach:
> > * selecting the "correct" metrics
> > * designing the scale rating
> > * How do we get the community to provide their opinion for the qualitative
> > metrics? What is a "good enough" coverage?
> > * How do we summarize the results into a binary decision: upgrade vs not
> > upgrade?
> > * ...
> >
> > In the end, it might not be worthwhile to go through all this work, I am
> > simply expressing an idea.
> >
> > ~Eduardo
> >
> >
> > On Wed, Jun 9, 2021 at 9:40 AM Antoine Pitrou <an...@python.org> wrote:
> >
> > > On Tue, 8 Jun 2021 17:37:30 -0500
> > > Jonathan Keane <jk...@gmail.com> wrote:
> > > > I've been digging a bit to try and put numbers on those users that Neal
> > > > mentions. Specifically, we know that requiring C++17 will mean that R
> > > > users on windows using versions of R before 4.0.0 will not be able to
> > > > compile/install arrow. Although R version 3.6 is no longer supported
> > > > by CRAN [1], many people hang on to older versions for an extended
> > > > period of time.
> > > >
> > > > We are still working on getting more solid numbers about how many
> > > > people might still be on these old versions, but here is what I have
> > > > so far:
> > > >
> > > > Using Rstudio's cran mirror logs of package installations [2] (and
> > > > with the help of Arrow datasets to process/filter these files 🎉) for
> > > > the period from 2020-05-18 [3] to today, for the installations that
> > > > have an r version reported approximately 27% of the windows package
> > > > installs are on versions before 4.0.0 (and therefore would be unable
> > > > to install arrow if we require C++17 right now).
> > >
> > > Is this because binary packages are forbidden in R-land?  Do Windows
> > > users of R really install Arrow from source?  Or is it really
> > > impossible to use a modern compiler when building R packages for R
> > > versions older than 4.0 ?
> > >
> > > Note the requirement we're proposing to bump is for *building* Arrow.
> > > Using binaries should not be affected, especially on Windows (on Linux,
> > > you must be a bit more careful, but normally the CentOS devtoolset
> > > should take care of that).
> > >
> > > Regards
> > >
> > > Antoine.
> > >
> > >
> > >
> >

Re: [C++][Discuss] Switch to C++17

Posted by Neal Richardson <ne...@gmail.com>.
Responding to Antoine's specific questions:

* 429 R packages on CRAN list C++11 as a SystemRequirement. These numbers
may be a slight undercount because the SystemRequirements field is not
machine-read. Some packages (e.g.
https://github.com/eddelbuettel/rcppsimdjson/) appear to actually require
C++17 but don't declare it in SystemRequirements. That said, while there
are a number of widely used and depended-on packages that require C++11,
none of the ones that require C++14 and higher have broad adoption.
* According to the official guide [1], C++14 support is partial on the GCC
4.9 that RTools 35 uses. So it would depend on what features we were using
as to whether it was an issue or not.
* Binary packages in R: it essentially comes down to what CRAN builds and
hosts. We provide a source package to CRAN, and they build binaries for
macOS and Windows for the current R release and the previous release (minor
releases, done annually). Windows users don't typically install from
source, so that's not the issue--but we don't get to decide the toolchain
used to compile the binary because we don't own that.

Some other points on the R ecosystem. There are several unrelated concerns
here that we should keep distinct in our minds:

1. What R supports. Per [1], R 3.4 and above have some support for C++14
and 17, and C++14 is even the default C++ standard for the current R
release (4.1). We're all good here.
2. What CRAN requires. Packages must build on macOS, Windows, and Linux and
are checked on the previous release, current release, and development
branch of R. Linux machines use a variety of compilers and toolchains.
Windows, as we've said, always uses RTools, and as of last month, only
RTools 40 (gcc 8.3). As noted on the PR, CRAN uses an old macOS (10.13) to
build macOS binary packages, and this has partial C++17 support. Unlike the
RTools upgrade associated with R 4.0, this is not tied to the R version. So
we would need to make sure we compile on the same Xcode version they use
(or wait for them to eventually upgrade their machines).
3. What users can install on their systems. In the enterprise context,
users don't always get to upgrade R freely, nor can they always install
newer compilers. I acknowledge that raising this is FUD, but we just don't
know how significant this is.
4. What other R packages require. Because of #3, maintainers of major R
packages in the ecosystem generally try to support the last 4-5 releases so
that users who are stuck unable to upgrade R are not left behind. This
means 3 versions of R (and, given yearly releases, a 3 year lag) beyond
what CRAN requires. This is not to say that we have to do the same, just
that if we don't, then that limits the chances that one of those
maintainers would view arrow as something they can depend on. (That said, I
don't think there's high likelihood that these packages would take a hard
dependency on arrow; optional dependency ("Suggests", in R-speak) is more
likely, regardless of C++ standard, due to other reasons (size, FUD, etc.).)

Neal

[1]:
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Using-C_002b_002b-code

On Wed, Jun 9, 2021 at 10:26 AM Eduardo Ponce <ed...@gmail.com> wrote:

> After the discussion in today's Arrow sync call, I do think it would be
> beneficial to come up with a formal process for deciding when is a "right
> time" for upgrading Arrow to a newer C++ standard. I suggest we could
> consider a set of general metrics/criteria that try to summarize the
> benefits and drawbacks of such change. Some metrics will be measurable but
> others will be qualitative. For the latter, we can use a consensus-based
> scale rating (1-5 with a meaning attached to each value). I am curious what
> approach other major C++ projects have used to resolve decisions on
> selecting a C++ standard (aside from critically required
> features)?
>
> The criteria used to evaluate newer C++ standards need to fairly consider
> people with different roles with regards to the Arrow project, such as
> developers, contributors, C++ users, other language users (R, Python), and
> maintainers.
> Here is a possible (and likely incomplete) set of metrics:
>
> Measurable metrics:
> * code size (source and binary) - measured in bytes
> * compilation time (consider each major Arrow component)
> * runtime - what are the performance changes? (consider each major Arrow
> component)
> * systems/OS/tools supported and deprecated
> * ...
>
> Qualitative metrics:
> * code structure/maintainability - how would it improve development?
> * code readability - ease of understanding details for new/current
> contributors?
> * ...
>
> I do think this approach will give us a better standpoint for deciding on
> when to upgrade to a newer C++ standard.
> Nevertheless, there are complexities for implementing such an approach:
> * selecting the "correct" metrics
> * designing the scale rating
> * How do we get the community to provide their opinion for the qualitative
> metrics? What is a "good enough" coverage?
> * How do we summarize the results into a binary decision: upgrade vs not
> upgrade?
> * ...
>
> In the end, it might not be worthwhile to go through all this work, I am
> simply expressing an idea.
>
> ~Eduardo
>
>
> On Wed, Jun 9, 2021 at 9:40 AM Antoine Pitrou <an...@python.org> wrote:
>
> > On Tue, 8 Jun 2021 17:37:30 -0500
> > Jonathan Keane <jk...@gmail.com> wrote:
> > > I've been digging a bit to try and put numbers on those users that Neal
> > > mentions. Specifically, we know that requiring C++17 will mean that R
> > > users on windows using versions of R before 4.0.0 will not be able to
> > > compile/install arrow. Although R version 3.6 is no longer supported
> > > by CRAN [1], many people hang on to older versions for an extended
> > > period of time.
> > >
> > > We are still working on getting more solid numbers about how many
> > > people might still be on these old versions, but here is what I have
> > > so far:
> > >
> > > Using Rstudio's cran mirror logs of package installations [2] (and
> > > with the help of Arrow datasets to process/filter these files 🎉) for
> > > the period from 2020-05-18 [3] to today, for the installations that
> > > have an r version reported approximately 27% of the windows package
> > > installs are on versions before 4.0.0 (and therefore would be unable
> > > to install arrow if we require C++17 right now).
> >
> > Is this because binary packages are forbidden in R-land?  Do Windows
> > users of R really install Arrow from source?  Or is it really
> > impossible to use a modern compiler when building R packages for R
> > versions older than 4.0 ?
> >
> > Note the requirement we're proposing to bump is for *building* Arrow.
> > Using binaries should not be affected, especially on Windows (on Linux,
> > you must be a bit more careful, but normally the CentOS devtoolset
> > should take care of that).
> >
> > Regards
> >
> > Antoine.
> >
> >
> >
>

Re: [C++][Discuss] Switch to C++17

Posted by Benjamin Kietzman <be...@gmail.com>.
One improvement in read/writability which might be my favorite is the
removal of SFINAE-controlled template instantiation in favor of compile
time branching with `if constexpr`. Here's an example of that in the draft
PR:

https://github.com/apache/arrow/pull/10414/files#diff-058e32693ee8820a3d8967404e05c76b37ef2f646245f794fa4acb6d26703668R241

This one at least is pretty easy to quantify ("how many instances of SFINAE
could we potentially replace?"):

$ rg enable_if | wc
612
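
To illustrate the kind of rewrite meant here, below is a minimal sketch
that is not taken from the PR. The function names DescribeSfinae and
DescribeConstexpr are hypothetical and exist only for this example. The
first pair of overloads uses std::enable_if to select an implementation
per type category; the second version expresses the same dispatch in a
single template with `if constexpr`.

```cpp
#include <string>
#include <type_traits>

// C++11 style: one overload per type category, selected through SFINAE
// on the return type. Adding a category means adding another overload.
template <typename T>
typename std::enable_if<std::is_integral<T>::value, std::string>::type
DescribeSfinae(T) {
  return "integral";
}

template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, std::string>::type
DescribeSfinae(T) {
  return "floating";
}

// C++17 style: a single template whose branches are resolved at compile
// time. Only the selected branch is instantiated for a given T.
template <typename T>
std::string DescribeConstexpr(T) {
  if constexpr (std::is_integral_v<T>) {
    return "integral";
  } else if constexpr (std::is_floating_point_v<T>) {
    return "floating";
  } else {
    return "other";
  }
}
```

With these definitions, DescribeSfinae(42) and DescribeConstexpr(42) both
select the integral branch. The `if constexpr` version also gives a natural
place for a fallback branch, which the SFINAE version would need a third
overload (with a negated condition) to express.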

On Wed, Jun 9, 2021 at 1:36 PM Antoine Pitrou <an...@python.org> wrote:

>
> Le 09/06/2021 à 19:25, Eduardo Ponce a écrit :
> >
> > Measurable metrics:
> > * code size (source and binary) - measured in bytes
> [...]
> >
> > Qualitative metrics:
> > * code structure/maintainability - how would it improve development?
> > * code readability - ease of understanding details for new/current
> > contributors?
>
> These are a bit difficult to evaluate, because a pervasive upgrade to
> more modern C++ idioms would probably bring significant maintenance and
> readability improvements, but it's unlikely to be achieved *before* the
> upgrade is decided (as it would be quite a bit of work).  In the
> submitted PR, we've upgraded a couple places as a validation that C++17
> does improve ease of writing and reading code, but it is unknown how
> many other places could benefit.
>
> Regards
>
> Antoine.
>

Re: [C++][Discuss] Switch to C++17

Posted by Antoine Pitrou <an...@python.org>.
Le 09/06/2021 à 19:25, Eduardo Ponce a écrit :
> 
> Measurable metrics:
> * code size (source and binary) - measured in bytes
[...]
> 
> Qualitative metrics:
> * code structure/maintainability - how would it improve development?
> * code readability - ease of understanding details for new/current
> contributors?

These are a bit difficult to evaluate, because a pervasive upgrade to 
more modern C++ idioms would probably bring significant maintenance and 
readability improvements, but it's unlikely to be achieved *before* the 
upgrade is decided (as it would be quite a bit of work).  In the 
submitted PR, we've upgraded a couple places as a validation that C++17 
does improve ease of writing and reading code, but it is unknown how 
many other places could benefit.

Regards

Antoine.

Re: [C++][Discuss] Switch to C++17

Posted by Eduardo Ponce <ed...@gmail.com>.
After the discussion in today's Arrow sync call, I do think it would be
beneficial to come up with a formal process for deciding when is a "right
time" for upgrading Arrow to a newer C++ standard. I suggest we could
consider a set of general metrics/criteria that try to summarize the
benefits and drawbacks of such change. Some metrics will be measurable but
others will be qualitative. For the latter, we can use a consensus-based
scale rating (1-5 with a meaning attached to each value). I am curious what
approach other major C++ projects have used to resolve decisions on
selecting a C++ standard (aside from critically required
features)?

The criteria used to evaluate newer C++ standards need to fairly consider
people with different roles with regards to the Arrow project, such as
developers, contributors, C++ users, other language users (R, Python), and
maintainers.
Here is a possible (and likely incomplete) set of metrics:

Measurable metrics:
* code size (source and binary) - measured in bytes
* compilation time (consider each major Arrow component)
* runtime - what are the performance changes? (consider each major Arrow
component)
* systems/OS/tools supported and deprecated
* ...

Qualitative metrics:
* code structure/maintainability - how would it improve development?
* code readability - ease of understanding details for new/current
contributors?
* ...

I do think this approach will give us a better standpoint for deciding on
when to upgrade to a newer C++ standard.
Nevertheless, there are complexities for implementing such an approach:
* selecting the "correct" metrics
* designing the scale rating
* How do we get the community to provide their opinion for the qualitative
metrics? What is a "good enough" coverage?
* How do we summarize the results into a binary decision: upgrade vs not
upgrade?
* ...

In the end, it might not be worthwhile to go through all this work, I am
simply expressing an idea.

~Eduardo


On Wed, Jun 9, 2021 at 9:40 AM Antoine Pitrou <an...@python.org> wrote:

> On Tue, 8 Jun 2021 17:37:30 -0500
> Jonathan Keane <jk...@gmail.com> wrote:
> > I've been digging a bit to try and put numbers on those users that Neal
> > mentions. Specifically, we know that requiring C++17 will mean that R
> > users on windows using versions of R before 4.0.0 will not be able to
> > compile/install arrow. Although R version 3.6 is no longer supported
> > by CRAN [1], many people hang on to older versions for an extended
> > period of time.
> >
> > We are still working on getting more solid numbers about how many
> > people might still be on these old versions, but here is what I have
> > so far:
> >
> > Using Rstudio's cran mirror logs of package installations [2] (and
> > with the help of Arrow datasets to process/filter these files 🎉) for
> > the period from 2020-05-18 [3] to today, for the installations that
> > have an r version reported approximately 27% of the windows package
> > installs are on versions before 4.0.0 (and therefore would be unable
> > to install arrow if we require C++17 right now).
>
> Is this because binary packages are forbidden in R-land?  Do Windows
> users of R really install Arrow from source?  Or is it really
> impossible to use a modern compiler when building R packages for R
> versions older than 4.0 ?
>
> Note the requirement we're proposing to bump is for *building* Arrow.
> Using binaries should not be affected, especially on Windows (on Linux,
> you must be a bit more careful, but normally the CentOS devtoolset
> should take care of that).
>
> Regards
>
> Antoine.
>
>
>

Re: [C++][Discuss] Switch to C++17

Posted by Antoine Pitrou <an...@python.org>.
On Tue, 8 Jun 2021 17:37:30 -0500
Jonathan Keane <jk...@gmail.com> wrote:
> I've been digging a bit to try and put numbers on those users that Neal
> mentions. Specifically, we know that requiring C++17 will mean that R
> users on windows using versions of R before 4.0.0 will not be able to
> compile/install arrow. Although R version 3.6 is no longer supported
> by CRAN [1], many people hang on to older versions for an extended
> period of time.
> 
> We are still working on getting more solid numbers about how many
> people might still be on these old versions, but here is what I have
> so far:
> 
> Using Rstudio's cran mirror logs of package installations [2] (and
> with the help of Arrow datasets to process/filter these files 🎉) for
> the period from 2020-05-18 [3] to today, for the installations that
> have an r version reported approximately 27% of the windows package
> installs are on versions before 4.0.0 (and therefore would be unable
> to install arrow if we require C++17 right now).

Is this because binary packages are forbidden in R-land?  Do Windows
users of R really install Arrow from source?  Or is it really
impossible to use a modern compiler when building R packages for R
versions older than 4.0 ?

Note the requirement we're proposing to bump is for *building* Arrow.
Using binaries should not be affected, especially on Windows (on Linux,
you must be a bit more careful, but normally the CentOS devtoolset
should take care of that).

Regards

Antoine.



Re: [C++][Discuss] Switch to C++17

Posted by Jonathan Keane <jk...@gmail.com>.
I've been digging a bit to try and put numbers on those users that Neal
mentions. Specifically, we know that requiring C++17 will mean that R
users on Windows using versions of R before 4.0.0 will not be able to
compile/install arrow. Although R version 3.6 is no longer supported
by CRAN [1], many people hang on to older versions for an extended
period of time.

We are still working on getting more solid numbers about how many
people might still be on these old versions, but here is what I have
so far:

Using RStudio's CRAN mirror logs of package installations [2] (and
with the help of Arrow datasets to process/filter these files 🎉) for
the period from 2021-05-18 [3] to today, for the installations that
have an R version reported, approximately 27% of the Windows package
installs are on versions before 4.0.0 (and therefore would be unable
to install arrow if we require C++17 right now).

There are a number of caveats about this data, however:
* the "that have an R version reported" caveat is very important: only ~17%
of the installations provide an R version. It's possible (and very
likely) that the installations that don't include this information are
not distributed like those that do. This is the biggest problem with
this dataset/analysis and we're trying to see if others have better
information here.
* This is limited to one of many CRAN repositories. There's no
indication that folks using this repository are more likely to be
using older versions (if anything it is probably the opposite), but we
don't have that information directly.
* There isn't a way to filter out CI and other automated installations
that aren't representative of real-world use cases.

If we get a more reliable dataset for this I will update these
numbers. I'm not sure what the threshold is for deciding that this
impacts too many people (or whether these numbers are above it), but I
wanted to get this information out here for us to think about.
Additionally, it
might be useful to think about how quickly we cut off support for
client languages: if we release on our typical schedule (in July),
people who installed R 1.25 years ago (on Windows) would be required
to upgrade R in order to install arrow. That might be long enough, or
the benefits of C++17 might outweigh this, but as Neal mentions, the
people likely to run into this are probably not on this list.


[1] - the last release in the 3.6 line (3.6.3) was released on
2020-02-29, and was superseded by 4.0.0 on 2020-04-24
[2] - http://cran-logs.rstudio.com
[3] - this is the day that R 4.1.0 was released and 3.6.0 stopped
being supported by CRAN

-Jon

On Tue, Jun 8, 2021 at 4:39 PM Neal Richardson
<ne...@gmail.com> wrote:
>
> I'm guessing there hasn't been opposition on this thread because the users
> that this might affect aren't following this mailing list.
>
> I'd be interested to see which other major C++ projects out there have
> bumped their requirement to C++17, and how that experience was for
> everyone--the user community as well as the developers. Do you know of good
> examples? I just checked on CRAN today, and of the 17,694 R packages there,
> only 3 require C++17 (none of which have wide adoption) and only 20 require
> C++14.
>
> Neal
>
> On Tue, Jun 8, 2021 at 6:17 AM Antoine Pitrou <an...@python.org> wrote:
>
> >
> > Hello,
> >
> > Note the change in the message topic :-)
> > We now have a draft PR up to switch the C++ standard level to C++17.
> > This allows very nice simplifications in the code, especially the use
> > of elegant constructs that can replace some cumbersome uses of
> > std::enable_if, SFINAE and other pain points.
> >
> > https://github.com/apache/arrow/pull/10414
> >
> > It seems we were finally able to overcome the main platform
> > compatibility (CI) hurdles, though some effort will probably be
> > necessary to squash all regressions in that area.
> >
> > I haven't seen any opposition previously in this thread, so if you are
> > really concerned by this, it would be better to speak up quickly, as
> > otherwise we may decide to move forward with the change.
> >
> > Best regards
> >
> > Antoine.
> >
> >
> > On Thu, 27 May 2021 10:03:03 +0200
> > Antoine Pitrou <an...@python.org> wrote:
> > > Hello,
> > >
> > > It seems the only two platforms that constrained us to C++11 will not be
> > > supported anymore (those platforms are RTools 3.5 for R packages, and
> > > manylinux1 for Python packages).
> > >
> > > It would be beneficial to bump our C++ requirement to C++14.  There is
> > > an issue open listing benefits:
> > > https://issues.apache.org/jira/browse/ARROW-12816
> > >
> > > An additional benefit is that some useful third-party libraries for us
> > > may or will require C++14, including in their headers.
> > >
> > > Is anyone opposed to doing the switch?  Please speak up.
> > >
> > > Best regards
> > >
> > > Antoine.
> > >
> >
> >
> >
> >

Re: [C++][Discuss] Switch to C++17

Posted by Neal Richardson <ne...@gmail.com>.
I'm guessing there hasn't been opposition on this thread because the users
that this might affect aren't following this mailing list.

I'd be interested to see which other major C++ projects out there have
bumped their requirement to C++17, and how that experience was for
everyone--the user community as well as the developers. Do you know of good
examples? I just checked on CRAN today, and of the 17,694 R packages there,
only 3 require C++17 (none of which have wide adoption) and only 20 require
C++14.

Neal

On Tue, Jun 8, 2021 at 6:17 AM Antoine Pitrou <an...@python.org> wrote:

>
> Hello,
>
> Note the change in the message topic :-)
> We now have a draft PR up to switch the C++ standard level to C++17.
> This allows very nice simplifications in the code, especially the use
> of elegant constructs that can replace some cumbersome uses of
> std::enable_if, SFINAE and other pain points.
>
> https://github.com/apache/arrow/pull/10414
>
> It seems we were finally able to overcome the main platform
> compatibility (CI) hurdles, though some effort will probably be
> necessary to squash all regressions in that area.
>
> I haven't seen any opposition previously in this thread, so if you are
> really concerned by this, it would be better to speak up quickly, as
> otherwise we may decide to move forward with the change.
>
> Best regards
>
> Antoine.
>
>
> On Thu, 27 May 2021 10:03:03 +0200
> Antoine Pitrou <an...@python.org> wrote:
> > Hello,
> >
> > It seems the only two platforms that constrained us to C++11 will not be
> > supported anymore (those platforms are RTools 3.5 for R packages, and
> > manylinux1 for Python packages).
> >
> > It would be beneficial to bump our C++ requirement to C++14.  There is
> > an issue open listing benefits:
> > https://issues.apache.org/jira/browse/ARROW-12816
> >
> > An additional benefit is that some useful third-party libraries for us
> > may or will require C++14, including in their headers.
> >
> > Is anyone opposed to doing the switch?  Please speak up.
> >
> > Best regards
> >
> > Antoine.
> >
>
>
>
>