Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/05/11 04:46:56 UTC

[GitHub] [arrow] emkornfield edited a comment on pull request #7143: ARROW-8504: [C++] [wip]Add BitRunReader and use it in parquet

emkornfield edited a comment on pull request #7143:
URL: https://github.com/apache/arrow/pull/7143#issuecomment-626464623


   @wesm interesting data point: I updated the performance benchmarks to generate random values/nullability (and kept the deterministic one).  It seems like the bad regression really only occurs with deterministically alternating nullability.  Randomly generated nullability (at 50%) shows improvements.  I still have some cleanup to do, but I'd be curious about people's thoughts on whether we need to detect/special-case deterministic patterns in nullability (it's possible that I just chose a bad seed?):
   ![image](https://user-images.githubusercontent.com/17869838/81524493-40889080-9306-11ea-9023-bc64611ef596.png)
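
To make the two benchmark shapes concrete, here is a minimal sketch of how the validity patterns differ from the point of view of a run-based reader. It is not the benchmark code from this PR: `AlternatingValidity`, `RandomValidity`, and `CountRuns` are hypothetical helpers, and the idea that run count explains the regression is an assumption, not something measured here. Strictly alternating nullability makes every run exactly one bit long, while independent 50% nullability gives runs that average about two bits, so roughly half as many runs.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: valid, null, valid, null, ... (deterministic pattern).
std::vector<bool> AlternatingValidity(std::size_t n) {
  std::vector<bool> valid(n);
  for (std::size_t i = 0; i < n; ++i) valid[i] = (i % 2 == 0);
  return valid;
}

// Hypothetical helper: each slot is null independently with probability
// `null_probability`; `seed` makes the pattern reproducible on one platform.
std::vector<bool> RandomValidity(std::size_t n, double null_probability,
                                 std::uint32_t seed) {
  std::mt19937 gen(seed);
  std::bernoulli_distribution is_valid(1.0 - null_probability);
  std::vector<bool> valid(n);
  for (std::size_t i = 0; i < n; ++i) valid[i] = is_valid(gen);
  return valid;
}

// Counts maximal runs of identical bits, i.e. how many runs a run-based
// reader would have to hand back for this bitmap.
std::size_t CountRuns(const std::vector<bool>& bits) {
  if (bits.empty()) return 0;
  std::size_t runs = 1;
  for (std::size_t i = 1; i < bits.size(); ++i) {
    if (bits[i] != bits[i - 1]) ++runs;
  }
  return runs;
}
```

Comparing `CountRuns(AlternatingValidity(n))` with `CountRuns(RandomValidity(n, 0.5, seed))` over a few seeds is a cheap way to check whether a particular seed happens to produce an unusually favorable pattern.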
   
   Also, https://github.com/apache/impala/blob/19a4d8fe794c9b17e69d6c65473f9a68084916bb/be/src/kudu/util/rle-encoding.h is the file I found in Impala?  Maybe we did all the additions for GetBatchSpaced etc. ourselves?  Was there another source you were thinking of?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org