Posted to issues@arrow.apache.org by "pitrou (via GitHub)" <gi...@apache.org> on 2023/11/02 14:51:07 UTC
[I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]
pitrou opened a new issue, #38560:
URL: https://github.com/apache/arrow/issues/38560
### Describe the enhancement requested
Currently, there are BYTE_STREAM_SPLIT optimizations using hand-written x86 intrinsics (for SSE4.2, AVX2 and AVX512), selected at compile-time.
We should rewrite those using the xsimd library so as to provide support for non-x86 ISA extensions such as Arm Neon (most importantly) and SVE.
More precisely:
* rewrite the SSE4.2 acceleration for generic 128-bit SIMD
* rewrite the AVX2 acceleration for generic 256-bit SIMD
* either rewrite the AVX512 acceleration, leave it alone, or remove it (the benefits are probably minor)
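For reference, the transformation that these SIMD paths accelerate can be sketched in scalar form. BYTE_STREAM_SPLIT scatters byte `j` of every fixed-width value into stream `j` and stores the streams back to back. The function names `SplitBytesScalar` and `MergeBytesScalar` below are hypothetical, chosen for illustration only; this is a minimal sketch of the encoding, not the code in the Arrow tree.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar reference (illustration only, not Arrow's implementation).
// `raw` holds num_values values of `width` bytes each, stored contiguously.
// Byte j of value i is written to position j * num_values + i.
std::vector<uint8_t> SplitBytesScalar(const std::vector<uint8_t>& raw,
                                      std::size_t width) {
  assert(width > 0 && raw.size() % width == 0);
  const std::size_t num_values = raw.size() / width;
  std::vector<uint8_t> out(raw.size());
  for (std::size_t i = 0; i < num_values; ++i) {
    for (std::size_t j = 0; j < width; ++j) {
      out[j * num_values + i] = raw[i * width + j];
    }
  }
  return out;
}

// Inverse transformation: gather byte j of value i back from stream j.
std::vector<uint8_t> MergeBytesScalar(const std::vector<uint8_t>& encoded,
                                      std::size_t width) {
  assert(width > 0 && encoded.size() % width == 0);
  const std::size_t num_values = encoded.size() / width;
  std::vector<uint8_t> out(encoded.size());
  for (std::size_t i = 0; i < num_values; ++i) {
    for (std::size_t j = 0; j < width; ++j) {
      out[i * width + j] = encoded[j * num_values + i];
    }
  }
  return out;
}
```

A SIMD version performs the same scatter and gather with byte interleaving and shuffle instructions on whole registers, which is the part that depends on the ISA and that a portable rewrite would need to express generically.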
### Component(s)
C++, Parquet
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]
Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1967307859
Or perhaps @cyb70289 wants to take it up :-)
Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]
Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1791033852
Ok, but 1) those are relatively old CPUs 2) the performance loss is not caused by _mixing_ AVX2 and AVX512, but simply by using AVX512 ;-)
Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]
Posted by "cyb70289 (via GitHub)" <gi...@apache.org>.
cyb70289 commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1967968773
I may not have the bandwidth in the near future. I believe @mapleFU can do it well. Ping me if you need help.
Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]
Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1966666606
After grepping through the xsimd include files, it seems that:
* the 128 bit (currently SSE4.2) variants can probably be migrated to arch-agnostic xsimd code
* the 256 bit (currently AVX2) variants use `_mm256_unpack{lo,hi}_epi8` and permutations in a non-trivial way that may be difficult to reproduce using xsimd
This means we could at least migrate the 128 bit paths to xsimd, which may get us NEON acceleration.
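To illustrate why the 256 bit variants are harder to port, here is a scalar model of the byte interleave involved. `_mm_unpacklo_epi8` interleaves the low 8 bytes of its two inputs, while `_mm256_unpacklo_epi8` does the same thing independently inside each 128-bit lane rather than across the full register. The helper names `UnpackLo128` and `UnpackLo256` are hypothetical and only model the byte movement; they are not xsimd or Arrow functions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec128 = std::array<uint8_t, 16>;
using Vec256 = std::array<uint8_t, 32>;

// Scalar model of _mm_unpacklo_epi8: a0, b0, a1, b1, ..., a7, b7.
Vec128 UnpackLo128(const Vec128& a, const Vec128& b) {
  Vec128 r{};
  for (std::size_t i = 0; i < 8; ++i) {
    r[2 * i] = a[i];
    r[2 * i + 1] = b[i];
  }
  return r;
}

// Scalar model of _mm256_unpacklo_epi8: the 128-bit interleave is applied
// separately to each 16-byte lane, so output byte 16 comes from a[16], not a[8].
Vec256 UnpackLo256(const Vec256& a, const Vec256& b) {
  Vec256 r{};
  for (std::size_t lane = 0; lane < 2; ++lane) {
    const std::size_t base = 16 * lane;
    for (std::size_t i = 0; i < 8; ++i) {
      r[base + 2 * i] = a[base + i];
      r[base + 2 * i + 1] = b[base + i];
    }
  }
  return r;
}
```

Because of this per-lane behavior, the AVX2 code needs extra permutations to put bytes in their final order, and a generic 256 bit abstraction would have to reproduce both the lane-local interleave and those fix-ups.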
Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]
Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1790884892
cc @cyb70289 @mapleFU
Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]
Posted by "mapleFU (via GitHub)" <gi...@apache.org>.
mapleFU commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1790970321
IMO, Parquet itself has a lot of hand-written AVX2 code (mostly in levels handling, some in decoding, etc.). So, for Parquet, mixing AVX512 and AVX2 may cause performance loss. But if a user just wants to use this encoder, AVX512 might be useful (also, AVX10 is coming now...)
Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]
Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1790987182
> mixing AVX512 and AVX2 may cause performance loss
This sounds like an urban legend at this point.
Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]
Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1966620394
FTR, AVX512 variants were removed in https://github.com/apache/arrow/pull/40127
Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]
Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-2003640816
Issue resolved by pull request 40335
https://github.com/apache/arrow/pull/40335
Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]
Posted by "mapleFU (via GitHub)" <gi...@apache.org>.
mapleFU commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1967268252
Nice analysis. I can try migrating this, but I'm a SIMD newbie, so some help is needed.
Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]
Posted by "mapleFU (via GitHub)" <gi...@apache.org>.
mapleFU commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1791029916
> This sounds like an urban legend at this point.
Before the Ice Lake optimizations [1] [2], using AVX512 could cause CPU frequency throttling [3].
[1] https://www.hc32.hotchips.org/assets/program/conference/day1/HotChips2020_Server_Processors_Intel_Irma_ICX-CPU-final3.pdf
[2] https://travisdowns.github.io/blog/2020/08/19/icl-avx512-freq.html
[3] https://lemire.me/blog/2018/08/15/the-dangers-of-avx-512-throttling-a-3-impact/
Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]
Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou closed issue #38560: [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd
URL: https://github.com/apache/arrow/issues/38560