Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/06/02 13:12:44 UTC
[GitHub] [arrow] pitrou commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column
pitrou commented on pull request #6985:
URL: https://github.com/apache/arrow/pull/6985#issuecomment-637533180
Here are some benchmarks on my machine, with gcc 7.5:
* `ARROW_SIMD_LEVEL=AVX2`:
```
BM_DefinitionLevelsToBitmapRepeatedAllMissing 934 ns 934 ns 3004804 bytes_per_second=2.04274G/s
BM_DefinitionLevelsToBitmapRepeatedAllPresent 1327 ns 1327 ns 1908147 bytes_per_second=1.4377G/s
BM_DefinitionLevelsToBitmapRepeatedMostPresent 1725 ns 1725 ns 1649108 bytes_per_second=1.10569G/s
```
* `ARROW_SIMD_LEVEL=SSE4_2`:
```
BM_DefinitionLevelsToBitmapRepeatedAllMissing 1384 ns 1384 ns 2029778 bytes_per_second=1.37806G/s
BM_DefinitionLevelsToBitmapRepeatedAllPresent 2054 ns 2053 ns 1247469 bytes_per_second=951.163M/s
BM_DefinitionLevelsToBitmapRepeatedMostPresent 2124 ns 2124 ns 1303578 bytes_per_second=919.503M/s
```
* `ARROW_SIMD_LEVEL=NONE`:
```
BM_DefinitionLevelsToBitmapRepeatedAllMissing 925 ns 925 ns 3025393 bytes_per_second=2.06245G/s
BM_DefinitionLevelsToBitmapRepeatedAllPresent 1505 ns 1504 ns 1881938 bytes_per_second=1.26785G/s
BM_DefinitionLevelsToBitmapRepeatedMostPresent 1725 ns 1725 ns 1598820 bytes_per_second=1.10599G/s
```
So it seems that gcc's SSE4.2 auto-vectorization may lead to suboptimal code (the `SSE4_2` build is slower than both the `AVX2` and `NONE` builds on all three benchmarks), but that's not a problem for this PR.
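
For context, here is a minimal scalar sketch of the kind of conversion these benchmarks exercise: turning a run of definition levels into a validity bitmap. It is not the code from this PR. The function name `DefLevelsToBitmapSketch` and its signature are hypothetical, and it assumes a flat column where a value is present exactly when its definition level equals the maximum, so it ignores the repetition handling that the `Repeated` benchmarks cover. A simple loop of this shape is the sort of thing a compiler may try to auto-vectorize differently at each SIMD level.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only (not the Arrow/Parquet API).
// Assumption: flat column, so a value is present iff its definition level
// equals max_def_level. Bits are packed least-significant bit first.
std::vector<uint8_t> DefLevelsToBitmapSketch(const std::vector<int16_t>& def_levels,
                                             int16_t max_def_level) {
  // One bit per level, rounded up to whole bytes, all initially "missing".
  std::vector<uint8_t> bitmap((def_levels.size() + 7) / 8, 0);
  for (std::size_t i = 0; i < def_levels.size(); ++i) {
    if (def_levels[i] == max_def_level) {
      // Mark value i as present.
      bitmap[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
    }
  }
  return bitmap;
}
```

With levels `{1, 0, 1, 1, 0, 0, 0, 1, 1}` and a maximum level of 1, positions 0, 2, 3, 7 and 8 are marked present, so the result is two bytes.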