Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/11/29 16:01:42 UTC

[GitHub] [arrow] Dandandan commented on pull request #8664: ARROW-10588: [Rust] Safe and parallel bit operations for Arrow

Dandandan commented on pull request #8664:
URL: https://github.com/apache/arrow/pull/8664#issuecomment-735416076


   I think there are some very interesting things in this PR:
   
   * Usage of bitvec / new structure for null buffer. I think it makes sense to use this library here rather than reinvent it.
   * For the benchmarks, it makes sense to have some bigger / more realistic ones as well. 2^20 may be a bit big. We also have some benchmarks in the datafusion / benchmarks directory, which can be extended to cover more realistic scenarios.
   * For parallelism, I am also not convinced that it's a good idea to introduce rayon without being able to turn it off / control it. For big arrays it can be a good idea, but for smaller arrays, and for projects like datafusion and libraries, it can actually slow things down and/or use more resources overall. I think it would be nice to revisit this sometime and see if we can make it an optional dependency (it's pretty big), so you could opt in to use it for some kernels / ops?
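
To make the opt-in idea concrete, here is a minimal sketch of what I mean. It is not code from this PR: `and_buffers`, `and_sequential` and `PARALLEL_THRESHOLD` are hypothetical names, the cutoff value is an untuned guess, and I use `std::thread::scope` from the standard library as a stand-in for rayon so the snippet has no dependencies. In practice the parallel branch could sit behind an optional Cargo feature.

```rust
use std::thread;

/// Hypothetical cutoff (in bytes) below which spawning threads is assumed
/// not to pay off. The value is an untuned placeholder, not a measurement.
const PARALLEL_THRESHOLD: usize = 1 << 16;

/// Bitwise AND of two equally sized byte buffers, always sequential.
fn and_sequential(left: &[u8], right: &[u8], out: &mut [u8]) {
    for ((o, l), r) in out.iter_mut().zip(left).zip(right) {
        *o = l & r;
    }
}

/// Bitwise AND that only goes parallel when the caller opts in
/// (`parallel == true`) and the input is at least PARALLEL_THRESHOLD bytes.
fn and_buffers(left: &[u8], right: &[u8], parallel: bool, threads: usize) -> Vec<u8> {
    assert_eq!(left.len(), right.len(), "buffers must have equal length");
    let mut out = vec![0u8; left.len()];

    // Small inputs, or callers that did not opt in, stay on the current thread.
    if !parallel || threads <= 1 || left.len() < PARALLEL_THRESHOLD {
        and_sequential(left, right, &mut out);
        return out;
    }

    // Split into at most `threads` contiguous chunks (rounding up).
    let chunk = (left.len() + threads - 1) / threads;
    thread::scope(|s| {
        let pieces = out
            .chunks_mut(chunk)
            .zip(left.chunks(chunk))
            .zip(right.chunks(chunk));
        for ((o, l), r) in pieces {
            // Each thread writes to its own disjoint slice of `out`.
            s.spawn(move || and_sequential(l, r, o));
        }
    });
    out
}
```

A caller like datafusion, which already manages its own parallelism, would pass `parallel = false` (or build without the feature) and get the plain loop, while someone working with a few very large arrays could turn it on per kernel.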
   
   I would love to see this PR continued, maybe in a slimmed-down form?

