Posted to issues@arrow.apache.org by "pitrou (via GitHub)" <gi...@apache.org> on 2023/11/02 14:51:07 UTC

[I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]

pitrou opened a new issue, #38560:
URL: https://github.com/apache/arrow/issues/38560

   ### Describe the enhancement requested
   
   Currently, there are BYTE_STREAM_SPLIT optimizations using hand-written x86 intrinsics (for SSE4.2, AVX2 and AVX512), selected at compile-time.
   
   We should rewrite those using the xsimd library so as to provide support for non-x86 ISA extensions such as Arm Neon (most importantly) and SVE.
   
   More precisely:
   * rewrite the SSE4.2 acceleration for generic 128-bit SIMD
   * rewrite the AVX2 acceleration for generic 256-bit SIMD
   * either rewrite the AVX512 acceleration, leave it alone, or remove it (the benefits are probably minor)
   
   
   ### Component(s)
   
   C++, Parquet
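
For readers unfamiliar with the encoding these kernels accelerate: BYTE_STREAM_SPLIT scatters byte k of every value into stream k and concatenates the streams. The scalar sketch below shows only that layout; the function names are hypothetical and this is not the Arrow implementation, merely a baseline that any SIMD variant would have to match.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar reference. `width` is the value size in bytes
// (4 for FLOAT, 8 for DOUBLE). Byte k of value i lands in stream k at
// position i; the streams are stored back to back.
std::vector<uint8_t> ByteStreamSplitEncodeScalar(const std::vector<uint8_t>& raw,
                                                 size_t width) {
  assert(width > 0 && raw.size() % width == 0);
  const size_t num_values = raw.size() / width;
  std::vector<uint8_t> out(raw.size());
  for (size_t i = 0; i < num_values; ++i) {
    for (size_t k = 0; k < width; ++k) {
      out[k * num_values + i] = raw[i * width + k];
    }
  }
  return out;
}

// Inverse transform: gather byte k of value i back from stream k.
std::vector<uint8_t> ByteStreamSplitDecodeScalar(const std::vector<uint8_t>& encoded,
                                                 size_t width) {
  assert(width > 0 && encoded.size() % width == 0);
  const size_t num_values = encoded.size() / width;
  std::vector<uint8_t> out(encoded.size());
  for (size_t i = 0; i < num_values; ++i) {
    for (size_t k = 0; k < width; ++k) {
      out[i * width + k] = encoded[k * num_values + i];
    }
  }
  return out;
}
```

The SIMD variants discussed in this thread compute the same byte permutation, only several values at a time.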


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1967307859

   Or perhaps @cyb70289 wants to take it up :-)


Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1791033852

   OK, but 1) those are relatively old CPUs, and 2) the performance loss is not caused by _mixing_ AVX2 and AVX512, but simply by using AVX512 ;-)


Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]

Posted by "cyb70289 (via GitHub)" <gi...@apache.org>.
cyb70289 commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1967968773

   I may not have the bandwidth for this right now. I believe @mapleFU can do it well. Ping me if you need help.


Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1966666606

   After grepping through the xsimd include files, it seems that:
   
   * the 128 bit (currently SSE4.2) variants can probably be migrated to arch-agnostic xsimd code
   * the 256 bit (currently AVX2) variants use `_mm256_unpack{lo,hi}_epi8` and permutations in a non-trivial way that may be difficult to reproduce using xsimd
   
   This means we could at least migrate the 128 bit paths to xsimd, which may get us NEON acceleration.
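
To illustrate why the 128-bit path is a plausible candidate for arch-agnostic code: decoding can be expressed purely as rounds of byte interleaving over full 16-byte registers, an operation that SSE (`_mm_unpacklo_epi8` / `_mm_unpackhi_epi8`) and NEON both provide, and for which xsimd is believed to offer comparable `zip_lo` / `zip_hi` operations. The sketch below deliberately emulates the registers with plain arrays so that it assumes no particular xsimd signature; `Block16`, `ZipLo`, `ZipHi` and `DecodeBlockOf16Floats` are hypothetical names, and this is not the code from the Arrow repository.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for one 128-bit SIMD register (16 bytes).
using Block16 = std::array<uint8_t, 16>;

// Interleave the low halves of a and b: a0 b0 a1 b1 ... a7 b7.
Block16 ZipLo(const Block16& a, const Block16& b) {
  Block16 r{};
  for (size_t i = 0; i < 8; ++i) {
    r[2 * i] = a[i];
    r[2 * i + 1] = b[i];
  }
  return r;
}

// Interleave the high halves of a and b: a8 b8 a9 b9 ... a15 b15.
Block16 ZipHi(const Block16& a, const Block16& b) {
  Block16 r{};
  for (size_t i = 0; i < 8; ++i) {
    r[2 * i] = a[8 + i];
    r[2 * i + 1] = b[8 + i];
  }
  return r;
}

// Decode 16 four-byte values. Stream k starts at `encoded + k * stride`;
// the 64 decoded bytes are written to `out` in value order.
void DecodeBlockOf16Floats(const uint8_t* encoded, size_t stride, uint8_t* out) {
  assert(stride >= 16);
  constexpr size_t kNumStreams = 4;
  std::array<Block16, kNumStreams> cur{}, next{};
  for (size_t k = 0; k < kNumStreams; ++k) {
    std::memcpy(cur[k].data(), encoded + k * stride, 16);
  }
  // log2(4) = 2 interleave rounds, each pairing register j with j + 2.
  for (int round = 0; round < 2; ++round) {
    for (size_t j = 0; j < kNumStreams / 2; ++j) {
      next[2 * j] = ZipLo(cur[j], cur[kNumStreams / 2 + j]);
      next[2 * j + 1] = ZipHi(cur[j], cur[kNumStreams / 2 + j]);
    }
    cur = next;
  }
  for (size_t k = 0; k < kNumStreams; ++k) {
    std::memcpy(out + 16 * k, cur[k].data(), 16);
  }
}
```

In a real port each `Block16` would be a SIMD batch and the two helpers would map to the library's interleave primitives. The 256-bit case is harder because the AVX2 unpack instructions interleave within each 128-bit lane rather than across the whole register, which is where the extra permutations come from.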
   


Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1790884892

   cc @cyb70289 @mapleFU 


Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]

Posted by "mapleFU (via GitHub)" <gi...@apache.org>.
mapleFU commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1790970321

   IMO, Parquet itself has a lot of hand-written AVX2 code (mostly in levels handling, some in decoding, etc.). So, for Parquet, mixing AVX512 and AVX2 may cause a performance loss. But if a user just wants to use this encoder, AVX512 might be useful. (Also, AVX10 is coming...)


Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1790987182

   > mixing AVX512 and AVX2 may cause a performance loss
   
   This sounds like an urban legend at this point.


Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1966620394

   FTR, AVX512 variants were removed in https://github.com/apache/arrow/pull/40127


Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-2003640816

   Issue resolved by pull request 40335
   https://github.com/apache/arrow/pull/40335


Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]

Posted by "mapleFU (via GitHub)" <gi...@apache.org>.
mapleFU commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1967268252

   Nice analysis. I can try migrating this, but I'm a SIMD newbie, so some help is needed.


Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]

Posted by "mapleFU (via GitHub)" <gi...@apache.org>.
mapleFU commented on issue #38560:
URL: https://github.com/apache/arrow/issues/38560#issuecomment-1791029916

   > This sounds like an urban legend at this point.
   
   Before the Ice Lake optimizations [1] [2], using AVX512 could cause the CPU to lower its clock frequency [3].
   
   [1] https://www.hc32.hotchips.org/assets/program/conference/day1/HotChips2020_Server_Processors_Intel_Irma_ICX-CPU-final3.pdf
   [2] https://travisdowns.github.io/blog/2020/08/19/icl-avx512-freq.html
   [3] https://lemire.me/blog/2018/08/15/the-dangers-of-avx-512-throttling-a-3-impact/


Re: [I] [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd [arrow]

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou closed issue #38560: [C++][Parquet] Rewrite BYTE_STREAM_SPLIT optimizations using xsimd
URL: https://github.com/apache/arrow/issues/38560

