Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/07 02:22:35 UTC

[GitHub] [arrow-rs] liukun4515 opened a new issue #1010: SIMD for decimal data type

liukun4515 opened a new issue #1010:
URL: https://github.com/apache/arrow-rs/issues/1010


   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   
   Following this [datafusion issue](https://github.com/apache/arrow-datafusion/issues/122), we are adding the decimal data type 
   to datafusion.
   However, the basic operations we have implemented, such as `+` and `-`, do not use SIMD.
   
   To speed up these calculations, we need to add SIMD support for the decimal data type.
   
   I have not yet figured out how to implement it.
   
   **Describe the solution you'd like**
   TODO
   
   **Describe alternatives you've considered**
   TODO
   
   **Additional context**
   TODO
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] chadbrewbaker commented on issue #1010: SIMD for decimal data type

Posted by GitBox <gi...@apache.org>.
chadbrewbaker commented on issue #1010:
URL: https://github.com/apache/arrow-rs/issues/1010#issuecomment-993903741


   Before mucking with SIMD, it is probably a good idea to have a repo that, like Rust coreutils, leverages existing regression tests for the SQL engine, especially for perf regressions.
   
   https://github.com/postgres/postgres/tree/master/src/test/regress
   
   https://github.com/s3team/Squirrel
   
   





[GitHub] [arrow-rs] alamb commented on issue #1010: SIMD for decimal data type

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1010:
URL: https://github.com/apache/arrow-rs/issues/1010#issuecomment-991632685


   FWIW I believe https://rust.godbolt.org/ is a popular tool for such work





[GitHub] [arrow-rs] liukun4515 commented on issue #1010: SIMD for decimal data type

Posted by GitBox <gi...@apache.org>.
liukun4515 commented on issue #1010:
URL: https://github.com/apache/arrow-rs/issues/1010#issuecomment-987510221


   Please assign this to me.





[GitHub] [arrow-rs] alamb commented on issue #1010: SIMD for decimal data type

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1010:
URL: https://github.com/apache/arrow-rs/issues/1010#issuecomment-989898243


   FYI @liukun4515  -- before implementing explicit SIMD versions of these kernels, it may be worth doing some profiling / disassembly of what `rustc` creates for the current kernels. 
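
As a starting point for that kind of inspection, here is a minimal sketch of an element-wise decimal addition loop that could be pasted into a compiler explorer. It assumes decimal values are held as unscaled `i128` integers that share one scale; the function name `add_decimal128` is hypothetical and is not part of the arrow API. Whether `rustc` vectorizes the loop is precisely what the disassembly would show.

```rust
/// Element-wise addition of two decimal columns stored as unscaled `i128`
/// values. Hypothetical helper for inspection only, not an arrow API.
///
/// Assumes both inputs use the same scale. `wrapping_add` is used so the
/// loop body contains no overflow-panic branch; a real kernel would need
/// to decide how overflow and precision limits are handled.
pub fn add_decimal128(left: &[i128], right: &[i128]) -> Vec<i128> {
    assert_eq!(left.len(), right.len(), "inputs must have equal length");
    left.iter()
        .zip(right.iter())
        .map(|(l, r)| l.wrapping_add(*r))
        .collect()
}
```

With a scale of 2, `add_decimal128(&[150, 225], &[50, 100])` corresponds to 1.50 + 0.50 and 2.25 + 1.00. Compiling with optimizations enabled and comparing the output against the same loop over `i64` is one way to see what the 128-bit width costs.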
   
   Perhaps a good start would be to add Decimal support to the existing kernels in https://docs.rs/arrow/6.3.0/arrow/compute/kernels/aggregate/index.html if it doesn't already exist
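
For illustration only, a decimal aggregate might look roughly like the sketch below. It is written under several assumptions: values are unscaled `i128` integers with a shared scale, nulls are modeled as `Option<i128>` rather than with arrow's array and null-bitmap types, and the name `sum_decimal128` is hypothetical rather than an existing kernel.

```rust
/// Sum of a nullable decimal column stored as unscaled `i128` values.
/// Hypothetical sketch, not an arrow API.
///
/// Returns `None` when there are no non-null values. For brevity this
/// sketch also returns `None` on overflow; a real kernel would likely
/// report overflow as a distinct error.
pub fn sum_decimal128(values: &[Option<i128>]) -> Option<i128> {
    let mut acc: Option<i128> = None;
    // `flatten` skips the `None` (null) entries.
    for v in values.iter().flatten() {
        acc = Some(acc.unwrap_or(0).checked_add(*v)?);
    }
    acc
}
```

The explicit loop keeps the null handling visible; a version that separates the validity check from the arithmetic may be easier for the compiler to optimize, which is again something profiling would have to confirm.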
   
   I have not reviewed the code in the datafusion aggregate functions for a while, so I am not familiar with how much they do / don't use the arrow compute kernels. 





[GitHub] [arrow-rs] chadbrewbaker commented on issue #1010: SIMD for decimal data type

Posted by GitBox <gi...@apache.org>.
chadbrewbaker commented on issue #1010:
URL: https://github.com/apache/arrow-rs/issues/1010#issuecomment-994206447


   Here is the Google SIMD assembler https://github.com/google/zetasql/blob/master/zetasql/base/mathutil.h 





[GitHub] [arrow-rs] chadbrewbaker commented on issue #1010: SIMD for decimal data type

Posted by GitBox <gi...@apache.org>.
chadbrewbaker commented on issue #1010:
URL: https://github.com/apache/arrow-rs/issues/1010#issuecomment-993890957


   Tools such as https://ispc.github.io are probably the way to go. For any expensive query you will want to shell out to LLVM and custom-compile the worker binaries before the run, and probably also blast the query plan with an SMT solver to reduce the expense or runtime, depending on your constraints. I would avoid dynamic linking like the plague; be like the IBM Blue Gene/L.





[GitHub] [arrow-rs] liukun4515 commented on issue #1010: SIMD for decimal data type

Posted by GitBox <gi...@apache.org>.
liukun4515 commented on issue #1010:
URL: https://github.com/apache/arrow-rs/issues/1010#issuecomment-994229886


   @chadbrewbaker  thanks.

