You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/10 18:00:40 UTC

[GitHub] [arrow-rs] Dandandan opened a new issue #279: Performance improvements for take by specializing on 32 /64 bit

Dandandan opened a new issue #279:
URL: https://github.com/apache/arrow-rs/issues/279


   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   Currently `maybe_usize` and `try_from_trusted_len_iter` are used for the `take` kernel. It seems we could do better for `u32`, `i32`, and `u64`/ `i64` (for >64 bit architecture) as we can cast to `usize` directly.
   
   **Describe the solution you'd like**
   Specialize on aforementioned types to avoid overhead of the slow path.
   
   **Describe alternatives you've considered**
   Implementating/exposing a specialized function for the types. 
   
   **Additional context**
    
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] Dandandan commented on issue #279: Performance improvements for take by specializing on 32 / 64 bit integer indices

Posted by GitBox <gi...@apache.org>.
Dandandan commented on issue #279:
URL: https://github.com/apache/arrow-rs/issues/279#issuecomment-837304142


   Some first experimenting didn't reveal a lot of options.
   What did improve performance however, was removing the bound check (-25%) but it doesn't seem like we can get rid of it without checking bounds for all values manual (which seems like it should be slower - even after speeding it up by 2x here https://github.com/apache/arrow-rs/pull/281 )


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] ritchie46 commented on issue #279: Performance improvements for take by specializing on 32 / 64 bit integer indices

Posted by GitBox <gi...@apache.org>.
ritchie46 commented on issue #279:
URL: https://github.com/apache/arrow-rs/issues/279#issuecomment-838020696


   > It seems further speedups might come only from somehow removing bound checks and SIMD gather.
   
   SIMD gather is not being used atm right? I remember looking for this operation in `packed_simd`, but not yielding any results.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] nevi-me commented on issue #279: Performance improvements for take by specializing on 32 / 64 bit integer indices

Posted by GitBox <gi...@apache.org>.
nevi-me commented on issue #279:
URL: https://github.com/apache/arrow-rs/issues/279#issuecomment-837697996


   > Currently `maybe_usize` and `try_from_trusted_len_iter` are used for the `take` kernel. It seems we could do better for indices with type `u32`, `i32`, and `u64`/ `i64` (for >64 bit architecture) as we can cast to `usize` directly and use the non-optional variants which is likely faster.
   
   @Dandandan I once checked the assembly output of what the `num` crate does, as I was worried that it might be introducing overhead.
   There appeared to be no overhead. For *safe* casts like going from `u32` to `usize`, the compiler optimises out the `Option` check, and returns the result directly.
   
   Here's some example assembly output: https://godbolt.org/z/1nsq4reT7
   
   That could explain why you didn't see perf changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] Dandandan commented on issue #279: Performance improvements for take by specializing on 32 / 64 bit integer indices

Posted by GitBox <gi...@apache.org>.
Dandandan commented on issue #279:
URL: https://github.com/apache/arrow-rs/issues/279#issuecomment-837870031


   > > Currently `maybe_usize` and `try_from_trusted_len_iter` are used for the `take` kernel. It seems we could do better for indices with type `u32`, `i32`, and `u64`/ `i64` (for >64 bit architecture) as we can cast to `usize` directly and use the non-optional variants which is likely faster.
   > 
   > @Dandandan I once checked the assembly output of what the `num` crate does, as I was worried that it might be introducing overhead.
   > There appeared to be no overhead. For _safe_ casts like going from `u32` to `usize`, the compiler optimises out the `Option` check, and returns the result directly.
   > 
   > Here's some example assembly output: https://godbolt.org/z/1nsq4reT7
   > 
   > That could explain why you didn't see perf changes.
   
   Thanks! That makes sense.
   
   It seems further speedups might come only from somehow removing bound checks and SIMD gather. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] nevi-me commented on issue #279: Performance improvements for take by specializing on 32 / 64 bit integer indices

Posted by GitBox <gi...@apache.org>.
nevi-me commented on issue #279:
URL: https://github.com/apache/arrow-rs/issues/279#issuecomment-838276957


   These are good opportunities to use `stdsimd`, as like all things, it'll eventually become stable


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] Dandandan commented on issue #279: Performance improvements for take by specializing on 32 / 64 bit integer indices

Posted by GitBox <gi...@apache.org>.
Dandandan commented on issue #279:
URL: https://github.com/apache/arrow-rs/issues/279#issuecomment-838025166


   Yeah I also did some googling - seems not supported in `packed_simd` so I guess at this point it has to be written by hand.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org