You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/29 19:56:25 UTC

[GitHub] [arrow-rs] tustvold opened a new issue #1108: Add native comparison kernel support for BinaryArray

tustvold opened a new issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108


   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   
   Currently there is no support for comparing `BinaryArray`
   
   **Describe the solution you'd like**
   
   It should be possible to call `arrow::compute::eq_dyn`, and friends with `BinaryArray` as arguments and have them not return an error. It may be possible to share code with the existing utf8 implementation.
   
   **Additional context**
   
   Encountered whilst implementing #1053 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] tustvold commented on issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1014309178


   Yes it should be largely just a case of adding `_binary` versions of the `_utf8` kernels in that module. 
   
   Typically there are three variants of each operator:
   
   * One with fully qualified array types e.g. `lt_eq_utf8`
   * One with a fully quality array type and a scalar e.g. `lt_eq_utf8_scalar`
   * One with a dynamic array type and qualified scalar e.g. `eq_dyn_utf8_scalar`
   
   Finally there are then two dynamic dispatch functions for each operator
   
   * One that takes two dynamic array types - e.g. `eq_dyn` (the dispatch logic is macro-ified)
   * One that takes a dynamic array and a scalar - e.g. `eq_dyn_scalar` (this doesn't support non-primitive types AFAICT)
   
   I'd perhaps recommend starting with the fully qualified kernels and build up from there with successive PRs. The scalar variants are also optimisations and so could definitely be ignored for an initial cut, they were only added in the last month :grinning: 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] HaoYang670 commented on issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1019849720


   BTW, could you please assign this issue to me?
   I have no right to self-assigned.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] HaoYang670 commented on issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1014329838


   Thank you, Raphael!  I'd like to try if no one else has been doing it. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] HaoYang670 commented on issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1014561953


   Following your advice, I will file 2 PRs to match this feature. The first PR will add support for fully qualified binary array. The second PR will support comparison for dynamic binary array.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] tustvold edited a comment on issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
tustvold edited a comment on issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1014309178


   Yes it should be largely just a case of adding `_binary` versions of the `_utf8` kernels in that module. 
   
   Typically there are three variants of each operator:
   
   * One with fully qualified array types e.g. `lt_eq_utf8`
   * One with a fully quality array type and a scalar e.g. `lt_eq_utf8_scalar`
   * One with a dynamic array type and qualified scalar e.g. `lt_eq_dyn_utf8_scalar`
   
   Finally there are then two dynamic dispatch functions for each operator
   
   * One that takes two dynamic array types - e.g. `lt_eq_dyn` (the dispatch logic is macro-ified)
   * One that takes a dynamic array and a scalar - e.g. `lt_eq_dyn_scalar` (this doesn't support non-primitive types AFAICT)
   
   I'd perhaps recommend starting with the fully qualified kernels and build up from there with successive PRs. The scalar variants are also optimisations and so could definitely be ignored for an initial cut, they were only added in the last month :grinning: 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] HaoYang670 commented on issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1019771413


   Is there something like `BinaryDictionaryBuilder` in the code, which is needed when testing the `eq_dyn_binary_scalar_with_dict`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1020041844


   > Is there something like BinaryDictionaryBuilder in the code, which is needed when testing the eq_dyn_binary_scalar_with_dict?
   
   
   I do not know of any such thing, sadly


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] HaoYang670 removed a comment on issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
HaoYang670 removed a comment on issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1020828316


   In other words, do we support dictionary array with `BinaryType` values?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] HaoYang670 commented on issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1020828316






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] HaoYang670 commented on issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1020828316


   In other words, do we support dictionary array with `BinaryType` values?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] HaoYang670 commented on issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1014215046


   Hi Raphael, I am newcomer and want to contribute to this project.
   Is it reasonable to add a function called `eq_dyn_binary_scalar` and tests in the file `comparison.rs`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1012415514


   This is a pretty good ticket for anyone who wants to mess with the compute kernels -- basically it should be add a small amount of code and a a test


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] HaoYang670 edited a comment on issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
HaoYang670 edited a comment on issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1019849720


   BTW, could you please assign this issue to me?
   I have no right to self-assign.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] HaoYang670 commented on issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1020994550


   I do not include `DictionaryArray` in my PR, because I do not find a straight forward way to build a binary dictionary. If it is needed, I will include it later in my PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] HaoYang670 removed a comment on issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
HaoYang670 removed a comment on issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1020828316


   In other words, do we support dictionary array with `BinaryType` values?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb closed issue #1108: Add native comparison kernel support for BinaryArray

Posted by GitBox <gi...@apache.org>.
alamb closed issue #1108:
URL: https://github.com/apache/arrow-rs/issues/1108


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org