Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/06/02 13:12:44 UTC

[GitHub] [arrow] pitrou commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

pitrou commented on pull request #6985:
URL: https://github.com/apache/arrow/pull/6985#issuecomment-637533180


   Here are some benchmarks on my machine, with gcc 7.5:
   * `ARROW_SIMD_LEVEL=AVX2`:
   ```
   BM_DefinitionLevelsToBitmapRepeatedAllMissing         934 ns          934 ns      3004804 bytes_per_second=2.04274G/s
   BM_DefinitionLevelsToBitmapRepeatedAllPresent        1327 ns         1327 ns      1908147 bytes_per_second=1.4377G/s
   BM_DefinitionLevelsToBitmapRepeatedMostPresent       1725 ns         1725 ns      1649108 bytes_per_second=1.10569G/s
   ```
   * `ARROW_SIMD_LEVEL=SSE4_2`:
   ```
   BM_DefinitionLevelsToBitmapRepeatedAllMissing        1384 ns         1384 ns      2029778 bytes_per_second=1.37806G/s
   BM_DefinitionLevelsToBitmapRepeatedAllPresent        2054 ns         2053 ns      1247469 bytes_per_second=951.163M/s
   BM_DefinitionLevelsToBitmapRepeatedMostPresent       2124 ns         2124 ns      1303578 bytes_per_second=919.503M/s
   ```
   * `ARROW_SIMD_LEVEL=NONE`:
   ```
   BM_DefinitionLevelsToBitmapRepeatedAllMissing         925 ns          925 ns      3025393 bytes_per_second=2.06245G/s
   BM_DefinitionLevelsToBitmapRepeatedAllPresent        1505 ns         1504 ns      1881938 bytes_per_second=1.26785G/s
   BM_DefinitionLevelsToBitmapRepeatedMostPresent       1725 ns         1725 ns      1598820 bytes_per_second=1.10599G/s
   ```
   
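   So the SSE4_2 build is noticeably slower than both the AVX2 and the NONE builds here (e.g. 1384 ns vs. 934 ns and 925 ns on AllMissing). It seems that gcc's SSE4.2 auto-vectorization may lead to suboptimal code, but that's not a problem for this PR.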
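
For context on what these benchmarks exercise, here is a minimal scalar sketch of turning definition levels into a validity bitmap. It is not the code from this PR: the function name `DefLevelsToBitmapSketch` is hypothetical, and it assumes the simplified rule that a slot is present when its level is at least `max_def_level`. Bits are packed LSB-first within each byte, as in Arrow validity bitmaps.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an Arrow or Parquet API).
// Assumption: a slot is "present" when def_level >= max_def_level.
// Returns a bitmap with one bit per level, LSB-first within each byte,
// and writes the number of missing slots to *null_count.
std::vector<uint8_t> DefLevelsToBitmapSketch(const std::vector<int16_t>& def_levels,
                                             int16_t max_def_level,
                                             int64_t* null_count) {
  assert(null_count != nullptr);
  std::vector<uint8_t> bitmap((def_levels.size() + 7) / 8, 0);
  int64_t nulls = 0;
  for (std::size_t i = 0; i < def_levels.size(); ++i) {
    const bool present = def_levels[i] >= max_def_level;
    // Set bit (i % 8) of byte (i / 8) when the slot is present.
    bitmap[i / 8] = static_cast<uint8_t>(bitmap[i / 8] | (present << (i % 8)));
    nulls += present ? 0 : 1;
  }
  *null_count = nulls;
  return bitmap;
}
```

A loop of this shape is the kind of code a compiler may try to auto-vectorize, which is why the SIMD level chosen at build time can change the timings even without hand-written intrinsics.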

