Posted to jira@arrow.apache.org by "Antoine Pitrou (Jira)" <ji...@apache.org> on 2020/10/05 15:51:00 UTC
[jira] [Assigned] (ARROW-10058) [C++] Investigate performance of LevelsToBitmap without BMI2
[ https://issues.apache.org/jira/browse/ARROW-10058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Antoine Pitrou reassigned ARROW-10058:
--------------------------------------
Assignee: Antoine Pitrou
> [C++] Investigate performance of LevelsToBitmap without BMI2
> ------------------------------------------------------------
>
> Key: ARROW-10058
> URL: https://issues.apache.org/jira/browse/ARROW-10058
> Project: Apache Arrow
> Issue Type: Sub-task
> Components: C++
> Reporter: Antoine Pitrou
> Assignee: Antoine Pitrou
> Priority: Major
> Labels: pull-request-available
> Attachments: opt-level-conv.diff
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Currently, when Parquet nested data involves repetition levels, converting the levels to a bitmap goes through a slow scalar path unless the BMI2 instruction set is available and efficient. The BMI2 path uses the PEXT instruction to process 16 levels at once.
> It may be possible to emulate PEXT for 5- or 6-bit masks by using a lookup table, which would allow 5-6 levels to be processed at once.
> (Also, it would be good to add nested reading benchmarks for non-trivial nesting; currently we only benchmark one-level struct and one-level list.)
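
To illustrate the lookup-table idea from the description above, here is a minimal sketch of a PEXT emulation restricted to 6-bit masks. It is not the code from the attached patch or from Arrow itself: the names PextScalar6, PextTable6 and PextEmulated are hypothetical, and the choice of a 64x64 byte table (4 KiB) plus a 64-entry popcount table is an assumption made for illustration. A wider mask is handled 6 bits at a time, with the popcount of each mask chunk deciding where the next extracted bits land.

```cpp
#include <cassert>
#include <cstdint>

// Reference bit-by-bit PEXT on 6-bit inputs: gathers the bits of `value`
// selected by `mask` and packs them into the low bits of the result.
inline uint8_t PextScalar6(unsigned value, unsigned mask) {
  assert(value < 64 && mask < 64);
  uint8_t out = 0;
  int out_bit = 0;
  for (int bit = 0; bit < 6; ++bit) {
    if (mask & (1u << bit)) {
      if (value & (1u << bit)) out |= static_cast<uint8_t>(1u << out_bit);
      ++out_bit;
    }
  }
  return out;
}

// Hypothetical lookup table: one precomputed result per (mask, value) pair.
class PextTable6 {
 public:
  PextTable6() {
    for (unsigned mask = 0; mask < 64; ++mask) {
      // PEXT of an all-ones mask with itself yields popcount(mask) set bits.
      uint8_t ones = PextScalar6(mask, mask);
      uint8_t count = 0;
      while (ones != 0) {
        ++count;
        ones >>= 1;
      }
      popcount_[mask] = count;
      for (unsigned value = 0; value < 64; ++value) {
        table_[mask][value] = PextScalar6(value, mask);
      }
    }
  }

  uint8_t Extract(unsigned value, unsigned mask) const {
    return table_[mask & 63][value & 63];
  }
  uint8_t PopCount(unsigned mask) const { return popcount_[mask & 63]; }

 private:
  uint8_t table_[64][64];
  uint8_t popcount_[64];
};

// Emulates a 64-bit PEXT by consuming the mask in 6-bit chunks.
inline uint64_t PextEmulated(uint64_t value, uint64_t mask) {
  static const PextTable6 table;
  uint64_t out = 0;
  int out_pos = 0;  // number of output bits produced so far
  for (int shift = 0; shift < 64; shift += 6) {
    const unsigned m = static_cast<unsigned>((mask >> shift) & 63);
    const unsigned v = static_cast<unsigned>((value >> shift) & 63);
    out |= static_cast<uint64_t>(table.Extract(v, m)) << out_pos;
    out_pos += table.PopCount(m);
  }
  return out;
}
```

For example, PextEmulated(0xABCD, 0x0F0F) selects the two nibbles 0xD and 0xB and packs them into 0xBD. Whether this beats the scalar path in practice would need to be measured; the sketch only shows the table-driven mechanics.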
--
This message was sent by Atlassian Jira
(v8.3.4#803005)