You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "izveigor (via GitHub)" <gi...@apache.org> on 2023/03/04 16:30:18 UTC

[GitHub] [arrow-datafusion] izveigor opened a new pull request, #5476: feat: add uint types for bitwise operations

izveigor opened a new pull request, #5476:
URL: https://github.com/apache/arrow-datafusion/pull/5476

   # Which issue does this PR close?
   
   <!--
   We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123.
   -->
   
   Closes #5475


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] ursabot commented on pull request #5476: feat: Support bitwise operations for unsigned integer types

Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #5476:
URL: https://github.com/apache/arrow-datafusion/pull/5476#issuecomment-1456748444

   Benchmark runs are scheduled for baseline = 99ef989312292b04efdf06b178617e9008b46c84 and contender = a6f4a7719e4cf449fe3f85a68cdd06a7f9bece85. a6f4a7719e4cf449fe3f85a68cdd06a7f9bece85 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/072cd37026f6430ebfda46a72db31028...2b3817bbc9014be285cf0dc91a6caa96/)
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] [test-mac-arm](https://conbench.ursa.dev/compare/runs/3f9e87cbeea54f5a9d43f9c7fa5d6633...97a559971153495784d4fcab5270e951/)
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/3abdf4afa7a44c0f96a0b76a2c0d840f...3604cdff101847d49235c64d7a8a9a76/)
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/52bb9b4996f44d30b314c5f85145a0a3...0cfdca50e3084274859f28d1162eb3f6/)
   Buildkite builds:
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] izveigor commented on a diff in pull request #5476: feat: Support bitwise operations for unsigned integer types

Posted by "izveigor (via GitHub)" <gi...@apache.org>.
izveigor commented on code in PR #5476:
URL: https://github.com/apache/arrow-datafusion/pull/5476#discussion_r1125707089


##########
datafusion/physical-expr/src/expressions/binary/kernels.rs:
##########
@@ -129,6 +141,38 @@ pub(crate) fn bitwise_shift_right(left: ArrayRef, right: ArrayRef) -> Result<Arr
                 Int64Array
             )
         }
+        DataType::UInt8 => {
+            binary_bitwise_array_op!(
+                left,
+                right,
+                |a: u8, b: u8| a.wrapping_shr(b as u32),
+                UInt8Array
+            )
+        }
+        DataType::UInt16 => {
+            binary_bitwise_array_op!(
+                left,
+                right,
+                |a: u16, b: u16| a.wrapping_shr(b as u32),
+                UInt16Array
+            )
+        }
+        DataType::UInt32 => {
+            binary_bitwise_array_op!(
+                left,
+                right,
+                |a: u32, b: u32| a.wrapping_shr(b),
+                UInt32Array
+            )
+        }
+        DataType::UInt64 => {
+            binary_bitwise_array_op!(
+                left,
+                right,
+                |a: u64, b: u64| a.wrapping_shr(b.try_into().unwrap()),

Review Comment:
   Maybe, I don't know some aspects of programming in Rust, but I think it is impossible to return normal error through macros. Anyway, by the behavior of this function: https://github.com/apache/arrow-datafusion/blob/main/datafusion/physical-expr/src/expressions/binary/kernels.rs#L60, I consider that unwrap() in this situation is not a big problem.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb merged pull request #5476: feat: Support bitwise operations for unsigned integer types

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb merged PR #5476:
URL: https://github.com/apache/arrow-datafusion/pull/5476


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5476: feat: Support bitwise operations for unsigned integer types

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5476:
URL: https://github.com/apache/arrow-datafusion/pull/5476#discussion_r1125523908


##########
datafusion/expr/src/type_coercion/binary.rs:
##########
@@ -173,12 +173,27 @@ fn bitwise_coercion(left_type: &DataType, right_type: &DataType) -> Option<DataT
         return Some(left_type.clone());
     }
 
-    // TODO support other data type
     match (left_type, right_type) {
-        (Int64, _) | (_, Int64) => Some(Int64),
-        (Int32, _) | (_, Int32) => Some(Int32),
-        (Int16, _) | (_, Int16) => Some(Int16),
+        (UInt64, _) | (_, UInt64) => Some(UInt64),

Review Comment:
   👍  these rules look reasonable to me.



##########
datafusion/physical-expr/src/expressions/binary/kernels.rs:
##########
@@ -125,10 +140,42 @@ pub(crate) fn bitwise_shift_right(left: ArrayRef, right: ArrayRef) -> Result<Arr
             binary_bitwise_array_op!(
                 left,
                 right,
-                |a: i64, b: i64| a.wrapping_shr(b as u32),
+                |a: i64, b: i64| a.wrapping_shr((b as u64).try_into().unwrap()),

Review Comment:
   Since this function returns a result, I think it would be better if this function returned an error on overflow, rather than `panic`ing (aka can we avoid calling the `unwrap`?)



##########
datafusion/physical-expr/src/expressions/binary/kernels.rs:
##########
@@ -125,10 +140,42 @@ pub(crate) fn bitwise_shift_right(left: ArrayRef, right: ArrayRef) -> Result<Arr
             binary_bitwise_array_op!(
                 left,
                 right,
-                |a: i64, b: i64| a.wrapping_shr(b as u32),
+                |a: i64, b: i64| a.wrapping_shr((b as u64).try_into().unwrap()),
                 Int64Array
             )
         }
+        DataType::UInt8 => {
+            binary_bitwise_array_op!(
+                left,
+                right,
+                |a: u8, b: u8| a.wrapping_shr(b as u32),

Review Comment:
   TIL that wrapping_shr for `i8` does in fact take a `u32 input 🤷  It looked strange to me but I double checked https://doc.rust-lang.org/std/primitive.i8.html#method.wrapping_shr ✅ 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5476: feat: Support bitwise operations for unsigned integer types

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5476:
URL: https://github.com/apache/arrow-datafusion/pull/5476#discussion_r1126863158


##########
datafusion/physical-expr/src/expressions/binary/kernels.rs:
##########
@@ -129,6 +141,38 @@ pub(crate) fn bitwise_shift_right(left: ArrayRef, right: ArrayRef) -> Result<Arr
                 Int64Array
             )
         }
+        DataType::UInt8 => {
+            binary_bitwise_array_op!(
+                left,
+                right,
+                |a: u8, b: u8| a.wrapping_shr(b as u32),
+                UInt8Array
+            )
+        }
+        DataType::UInt16 => {
+            binary_bitwise_array_op!(
+                left,
+                right,
+                |a: u16, b: u16| a.wrapping_shr(b as u32),
+                UInt16Array
+            )
+        }
+        DataType::UInt32 => {
+            binary_bitwise_array_op!(
+                left,
+                right,
+                |a: u32, b: u32| a.wrapping_shr(b),
+                UInt32Array
+            )
+        }
+        DataType::UInt64 => {
+            binary_bitwise_array_op!(
+                left,
+                right,
+                |a: u64, b: u64| a.wrapping_shr(b.try_into().unwrap()),

Review Comment:
   > but I think it is impossible to return normal error through macros.
   
   I think it is possible (the function the macro is used in has to return a result)
   
   >  Anyway, by the behavior of this function: https://github.com/apache/arrow-datafusion/blob/main/datafusion/physical-expr/src/expressions/binary/kernels.rs#L60, I consider that unwrap() in this situation is not a big problem.
   
   I agree the pre-existing pattern is a reasonable justification for using unwrap.
   
   Thank you



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org