Posted to dev@arrow.apache.org by Antoine Pitrou <an...@python.org> on 2021/02/09 14:08:34 UTC

Re: [C++] adopting an SIMD library - xsimd

On 09/02/2021 at 10:36, Antoine Pitrou wrote:
> 
> Note that we need to decouple the SIMD level available at compile-time
> from the SIMD level available at runtime.  That is, we typically build
> optional AVX512 accelerations at compile-time, but only enable them at
> runtime if the CPU supports AVX512 (and if the environment variable
> ARROW_USER_SIMD_LEVEL wasn't forced to a lower value).
> 
> From a quick glance, it's not obvious that xsimd supports that level of
> control.  Though it may just be undocumented.  I will check with the
> authors, since I happen to know them.

Ok, there shouldn't be any problem on that front.  We just need to
compile with the right compiler flags to select the desired SIMD level,
like we already do currently when compiling multiple versions of a function.
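
To make the compile-time/runtime split concrete, here is a minimal sketch
of the dispatch side.  It is not Arrow's actual code: the SimdLevel enum,
the accepted spellings of the user cap, and the Sum* stand-ins are all
assumptions for illustration.  In a real build each variant would live in
its own translation unit compiled with its own flags.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical SIMD levels, ordered from least to most capable.
enum class SimdLevel { kNone = 0, kSse42 = 1, kAvx2 = 2, kAvx512 = 3 };

// Parse a user-supplied cap (the accepted spellings are assumptions).
// A missing or unrecognized value means "no cap".
inline SimdLevel ParseUserCap(const char* value) {
  if (value == nullptr) return SimdLevel::kAvx512;
  if (std::strcmp(value, "NONE") == 0) return SimdLevel::kNone;
  if (std::strcmp(value, "SSE4_2") == 0) return SimdLevel::kSse42;
  if (std::strcmp(value, "AVX2") == 0) return SimdLevel::kAvx2;
  return SimdLevel::kAvx512;
}

// The level used at runtime is the lower of what the CPU supports and
// what the user allows.
inline SimdLevel EffectiveLevel(SimdLevel cpu, const char* user_cap) {
  return std::min(cpu, ParseUserCap(user_cap));
}

// One entry per variant that was built at compile time.
struct Variant {
  SimdLevel level;
  long long (*sum)(const int*, std::size_t);
};

inline long long SumScalar(const int* v, std::size_t n) {
  long long total = 0;
  for (std::size_t i = 0; i < n; ++i) total += v[i];
  return total;
}

// Stand-ins for variants that would be compiled with -mavx2 / -mavx512f.
inline long long SumAvx2(const int* v, std::size_t n) { return SumScalar(v, n); }
inline long long SumAvx512(const int* v, std::size_t n) { return SumScalar(v, n); }

// Pick the most capable built variant that does not exceed `allowed`.
inline Variant SelectVariant(SimdLevel allowed) {
  static const Variant kVariants[] = {
      {SimdLevel::kAvx512, SumAvx512},
      {SimdLevel::kAvx2, SumAvx2},
      {SimdLevel::kNone, SumScalar},
  };
  for (const Variant& v : kVariants) {
    if (v.level <= allowed) return v;
  }
  return kVariants[2];  // not reached: the scalar entry always matches
}
```

At startup one would pass the detected CPU level (detection is not shown
here) together with std::getenv("ARROW_USER_SIMD_LEVEL") to
EffectiveLevel, then call SelectVariant once and cache the result.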

I'll note that xsimd isn't very complete.  For example, it seems to lack
the functions required for byte stream split encoding and decoding.
Those functions are exported by libsimdpp under the names "zip_lo" and
"zip_hi".
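
For readers unfamiliar with these operations, here is a scalar sketch of
what they compute.  It uses no SIMD library at all; the function names are
made up for illustration, and the zip semantics shown (interleaving the
low or high halves of two 16-byte vectors) reflect my understanding of the
operation rather than any library's exact API.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Bytes16 = std::array<std::uint8_t, 16>;

// Interleave the low halves: a0 b0 a1 b1 ... a7 b7.
inline Bytes16 ZipLo(const Bytes16& a, const Bytes16& b) {
  Bytes16 r{};
  for (std::size_t i = 0; i < 8; ++i) {
    r[2 * i] = a[i];
    r[2 * i + 1] = b[i];
  }
  return r;
}

// Interleave the high halves: a8 b8 a9 b9 ... a15 b15.
inline Bytes16 ZipHi(const Bytes16& a, const Bytes16& b) {
  Bytes16 r{};
  for (std::size_t i = 0; i < 8; ++i) {
    r[2 * i] = a[8 + i];
    r[2 * i + 1] = b[8 + i];
  }
  return r;
}

// Byte stream split encoding: stream k holds byte k of every value.
inline std::vector<std::uint8_t> ByteStreamSplitEncode(
    const std::vector<float>& values) {
  constexpr std::size_t kWidth = sizeof(float);
  const std::size_t n = values.size();
  std::vector<std::uint8_t> out(n * kWidth);
  for (std::size_t i = 0; i < n; ++i) {
    std::uint8_t bytes[kWidth];
    std::memcpy(bytes, &values[i], kWidth);
    for (std::size_t k = 0; k < kWidth; ++k) out[k * n + i] = bytes[k];
  }
  return out;
}

// Decoding gathers byte k of value i back from stream k.
inline std::vector<float> ByteStreamSplitDecode(
    const std::vector<std::uint8_t>& encoded) {
  constexpr std::size_t kWidth = sizeof(float);
  const std::size_t n = encoded.size() / kWidth;
  std::vector<float> values(n);
  for (std::size_t i = 0; i < n; ++i) {
    std::uint8_t bytes[kWidth];
    for (std::size_t k = 0; k < kWidth; ++k) bytes[k] = encoded[k * n + i];
    std::memcpy(&values[i], bytes, kWidth);
  }
  return values;
}
```

The link between the two halves of the sketch: with four byte streams A,
B, C and D, ZipLo(ZipLo(A, C), ZipLo(B, D)) yields a0 b0 c0 d0 a1 b1 c1
d1 ..., which is the decoded byte order for the first four values.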

libsimdpp, on the other hand, seems to lack maintenance.  It hasn't had
a commit in one year, and issues and PRs seem to be unanswered.  So
perhaps xsimd is a better course, provided we want to contribute the
missing functions.

Regards

Antoine.

Re: [C++] adopting an SIMD library - xsimd

Posted by Antoine Pitrou <an...@python.org>.
Hi Eduardo,

Thanks for chiming in.  Just one clarification:

On Tue, 02 Mar 2021 06:41:54 -0000
Eduardo Ponce <ed...@gmail.com> wrote:
> In my experience there is no single SIMD library that wraps all possible sets of vector instructions across the most common architectures and at the same time provides support for all popular compilers while supporting C and C++11/14. (I mention C because there is an issue for Arrow support in C, https://issues.apache.org/jira/browse/ARROW-1851).

"Arrow support in C" is not really on the roadmap, but it would in any
case be a separate library.

Currently, we recommend the Arrow C data interface for ABI-stable data
exchange between C-minded runtimes:
https://arrow.apache.org/docs/format/CDataInterface.html

Also, we may be able to require C++14 in a couple of months (when the R
ecosystem sunsets old compiler versions).

Regards

Antoine.



Re: [C++] adopting an SIMD library - xsimd

Posted by Eduardo Ponce <ed...@gmail.com>.
In my experience, there is no single SIMD library that wraps all possible sets of vector instructions across the most common architectures while also supporting all popular compilers as well as C and C++11/14. (I mention C because there is an issue for Arrow support in C, https://issues.apache.org/jira/browse/ARROW-1851.) At one point I even wanted to solve this problem with a multi-layer library, but found that it would require more time and access to a variety of computing resources; see (or not) https://github.com/edponce/libsimdcpp. The main reason is that SIMD libraries are commonly created on an as-needed basis and thus provide a project-specific subset of vector ISAs.

In terms of performance, most libraries wrap the common vector instructions very similarly (often identically), so performance should be comparable across SIMD libraries, assuming the same compiler and architecture. Because SIMD functions have very simple bodies, they are usually inlined during compilation, and SIMD libraries tend to be header-only. I suggest focusing on the performance of the specific SIMD operations of interest that will sit in hotspots. Operations whose implementations tend to differ between SIMD libraries include reductions, operations on operands of different types or sizes, arithmetic shortcuts for modulus and division, conditional masks, and custom operations.

Aspects that I consider important for selecting a SIMD library are:
1) Coverage of vector ISA, architectures supported, and compilers supported
    a) These requirements are project-dependent
2) Library APIs: Does it support a function-based paradigm? An object-oriented one? Operator overloading for common arithmetic/logical operators? What conditional predicates does it support?
3) Modularity: how easy is it to extend?
4) Maintainer's support
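
To illustrate the API question in point 2, here is a small sketch of the
same operations exposed both as free functions and through operator
overloading, plus a mask-based conditional.  Vec4f, Mask4, add and select
are hypothetical names backed by plain arrays; they do not come from
xsimd or any other library.

```cpp
#include <array>
#include <cstddef>

// Hypothetical 4-lane float vector and lane mask, backed by plain arrays.
struct Vec4f { std::array<float, 4> lane; };
struct Mask4 { std::array<bool, 4> lane; };

// Function-based style: a free function per operation.
inline Vec4f add(const Vec4f& a, const Vec4f& b) {
  Vec4f r{};
  for (std::size_t i = 0; i < 4; ++i) r.lane[i] = a.lane[i] + b.lane[i];
  return r;
}

// Operator-overloading style: the same operation spelled as a + b.
inline Vec4f operator+(const Vec4f& a, const Vec4f& b) { return add(a, b); }

// A conditional predicate produces a per-lane mask...
inline Mask4 operator<(const Vec4f& a, const Vec4f& b) {
  Mask4 m{};
  for (std::size_t i = 0; i < 4; ++i) m.lane[i] = a.lane[i] < b.lane[i];
  return m;
}

// ...which then picks lanes from one operand or the other.
inline Vec4f select(const Mask4& m, const Vec4f& a, const Vec4f& b) {
  Vec4f r{};
  for (std::size_t i = 0; i < 4; ++i) r.lane[i] = m.lane[i] ? a.lane[i] : b.lane[i];
  return r;
}
```

With these definitions, select(a < b, a, b) computes a lane-wise minimum,
which is the kind of branch-free pattern a candidate library's API should
make easy to write.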

I personally do not have a "favorite" candidate, but I did browse xsimd, and I agree that we can fill in or improve the gaps needed for Arrow.

~Eduardo

Re: [C++] adopting an SIMD library - xsimd

Posted by Antoine Pitrou <an...@python.org>.
On Fri, 12 Feb 2021 20:47:21 -0800
Micah Kornfield <em...@gmail.com> wrote:
> That is unfortunate, like I said if the consensus is xsimd, let's move
> forward with that.

I would say it's a soft consensus for now, and I would welcome more
viewpoints on the matter.

Regards

Antoine.







Re: [C++] adopting an SIMD library - xsimd

Posted by Micah Kornfield <em...@gmail.com>.
That is unfortunate, like I said if the consensus is xsimd, let's move
forward with that.

On Fri, Feb 12, 2021 at 2:45 AM Antoine Pitrou <an...@python.org> wrote:

>
> There is an std::simd being envisioned.
> https://en.cppreference.com/w/cpp/experimental/simd/simd
>
> The problem is that we need an implementation that's C++11- or
> C++14-compliant, that works on major compilers, and that provides
> accelerations for common instruction sets.  It doesn't seem to be the
> case currently.
> https://github.com/VcDevel/std-simd
>
> Regards
>
> Antoine.

Re: [C++] adopting an SIMD library - xsimd

Posted by Antoine Pitrou <an...@python.org>.
There is an std::simd being envisioned.
https://en.cppreference.com/w/cpp/experimental/simd/simd

The problem is that we need an implementation that's C++11- or
C++14-compliant, that works on major compilers, and that provides
accelerations for common instruction sets.  It doesn't seem to be the
case currently.
https://github.com/VcDevel/std-simd

Regards

Antoine.


On 12/02/2021 at 05:18, Micah Kornfield wrote:
> I'm open to x-simd if others think it is the best option.  I think the last
> time this came up I expressed this opinion, but if possible it would be
> nice to use something that is on its way to become a standard to avoid
> abandonment issues but I don't know enough about the space to understand if
> this is a real concern.

Re: [C++] adopting an SIMD library - xsimd

Posted by Micah Kornfield <em...@gmail.com>.
I'm open to xsimd if others think it is the best option.  I think I
expressed this opinion the last time this came up, but if possible it
would be nice to use something that is on its way to becoming a standard,
to avoid abandonment issues.  That said, I don't know enough about the
space to understand whether this is a real concern.

On Tue, Feb 9, 2021 at 7:00 PM Yuqi Gu <yu...@linaro.org> wrote:

> Thanks for comments on the SIMD related PR:
> https://github.com/apache/arrow/pull/9424.
> Agree to adopt *xsimd* as the SIMD wrapper library for Arrow to avoid a
> large maintenance burden. It makes sense.
>
> It seems *xsimd* is designed for mathematical computation and lacks
> functions for bit/byte shuffling, byte stream split encoding, ARM SVE
> support, etc.
> I'm absolutely willing to contribute the missing functions to *xsimd*.
>
> BRs,
> Yuqi

Re: [C++] adopting an SIMD library - xsimd

Posted by Yuqi Gu <yu...@linaro.org>.
Thanks for the comments on the SIMD-related PR:
https://github.com/apache/arrow/pull/9424.
I agree with adopting *xsimd* as the SIMD wrapper library for Arrow to
avoid a large maintenance burden. It makes sense.

It seems *xsimd* is designed for mathematical computation and lacks
functions for bit/byte shuffling, byte stream split encoding, ARM SVE
support, etc.
I'm absolutely willing to contribute the missing functions to *xsimd*.

BRs,
Yuqi