You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/18 05:50:41 UTC

[GitHub] [arrow-rs] gangliao opened a new pull request #317: Add (simd) modulus op

gangliao opened a new pull request #317:
URL: https://github.com/apache/arrow-rs/pull/317


   # Which issue does this PR close?
   
   https://github.com/apache/arrow-rs/issues/98
   https://github.com/apache/arrow-datafusion/issues/99
   
    # Rationale for this change
    
    <!---
    Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
    Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.
   -->
   
   # What changes are included in this PR?
   
   support modulus operations.
   
   <!---
   There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR.
   -->
   
   # Are there any user-facing changes?
   
   no.
   
   <!---
   If there are user-facing changes then we may require documentation to be updated before approving the PR.
   -->
   
   <!---
   If there are any breaking changes to public APIs, please add the `breaking change` label.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb merged pull request #317: Add (simd) modulus op

Posted by GitBox <gi...@apache.org>.
alamb merged pull request #317:
URL: https://github.com/apache/arrow-rs/pull/317


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on pull request #317: Add (simd) modulus op

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #317:
URL: https://github.com/apache/arrow-rs/pull/317#issuecomment-854806859


   Sorry @gangliao !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] gangliao commented on pull request #317: Add (simd) modulus op

Posted by GitBox <gi...@apache.org>.
gangliao commented on pull request #317:
URL: https://github.com/apache/arrow-rs/pull/317#issuecomment-854802474


   Can anyone merge this PR so that I can proceed with other PR in arrow-datafusion? @alamb @jorgecarleitao 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #317: Add (simd) modulus op

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #317:
URL: https://github.com/apache/arrow-rs/pull/317#issuecomment-843739831


   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/317?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#317](https://codecov.io/gh/apache/arrow-rs/pull/317?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2d55a0d) into [master](https://codecov.io/gh/apache/arrow-rs/commit/c863a2c44bffa5c092a49e07910d5e9225483193?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c863a2c) will **increase** coverage by `0.06%`.
   > The diff coverage is `78.69%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-rs/pull/317/graphs/tree.svg?width=650&height=150&src=pr&token=pq9V9qWZ1N&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-rs/pull/317?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #317      +/-   ##
   ==========================================
   + Coverage   82.52%   82.59%   +0.06%     
   ==========================================
     Files         162      162              
     Lines       44007    44344     +337     
   ==========================================
   + Hits        36316    36624     +308     
   - Misses       7691     7720      +29     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/317?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [arrow/src/compute/kernels/arithmetic.rs](https://codecov.io/gh/apache/arrow-rs/pull/317/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2NvbXB1dGUva2VybmVscy9hcml0aG1ldGljLnJz) | `84.14% <78.69%> (-3.23%)` | :arrow_down: |
   | [parquet/src/encodings/decoding.rs](https://codecov.io/gh/apache/arrow-rs/pull/317/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvZW5jb2RpbmdzL2RlY29kaW5nLnJz) | `92.47% <0.00%> (-0.02%)` | :arrow_down: |
   | [arrow/src/error.rs](https://codecov.io/gh/apache/arrow-rs/pull/317/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2Vycm9yLnJz) | `8.88% <0.00%> (ø)` | |
   | [arrow/src/csv/writer.rs](https://codecov.io/gh/apache/arrow-rs/pull/317/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2Nzdi93cml0ZXIucnM=) | `82.64% <0.00%> (ø)` | |
   | [parquet/src/column/writer.rs](https://codecov.io/gh/apache/arrow-rs/pull/317/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvY29sdW1uL3dyaXRlci5ycw==) | `93.29% <0.00%> (ø)` | |
   | [arrow/src/array/raw\_pointer.rs](https://codecov.io/gh/apache/arrow-rs/pull/317/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L3Jhd19wb2ludGVyLnJz) | `100.00% <0.00%> (ø)` | |
   | [arrow/src/array/array\_string.rs](https://codecov.io/gh/apache/arrow-rs/pull/317/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X3N0cmluZy5ycw==) | `96.05% <0.00%> (ø)` | |
   | [arrow/src/util/string\_writer.rs](https://codecov.io/gh/apache/arrow-rs/pull/317/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL3V0aWwvc3RyaW5nX3dyaXRlci5ycw==) | `73.33% <0.00%> (ø)` | |
   | [arrow/src/array/array\_boolean.rs](https://codecov.io/gh/apache/arrow-rs/pull/317/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X2Jvb2xlYW4ucnM=) | `90.90% <0.00%> (ø)` | |
   | [arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow-rs/pull/317/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X3ByaW1pdGl2ZS5ycw==) | `92.93% <0.00%> (ø)` | |
   | ... and [13 more](https://codecov.io/gh/apache/arrow-rs/pull/317/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/317?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/317?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [c863a2c...2d55a0d](https://codecov.io/gh/apache/arrow-rs/pull/317?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] gangliao commented on a change in pull request #317: Add (simd) modulus op

Posted by GitBox <gi...@apache.org>.
gangliao commented on a change in pull request #317:
URL: https://github.com/apache/arrow-rs/pull/317#discussion_r638066603



##########
File path: arrow/src/compute/kernels/arithmetic.rs
##########
@@ -348,6 +444,40 @@ where
     Ok(PrimitiveArray::<T>::from(data))
 }
 
+/// SIMD vectorized implementation of `left % right`.
+/// If any of the lanes marked as valid in `valid_mask` are `0` then an `ArrowError::ModulusByZero`
+/// is returned. The contents of no-valid lanes are undefined.
+#[cfg(simd)]

Review comment:
       done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] nevi-me commented on a change in pull request #317: Add (simd) modulus op

Posted by GitBox <gi...@apache.org>.
nevi-me commented on a change in pull request #317:
URL: https://github.com/apache/arrow-rs/pull/317#discussion_r634832945



##########
File path: arrow/src/compute/kernels/arithmetic.rs
##########
@@ -189,6 +189,74 @@ where
     Ok(PrimitiveArray::<T>::from(data))
 }
 
+/// Helper function to modulus two arrays.
+///
+/// # Errors
+///
+/// This function errors if:
+/// * the arrays have different lengths
+/// * a division by zero is found
+fn math_modulus<T>(
+    left: &PrimitiveArray<T>,
+    right: &PrimitiveArray<T>,
+) -> Result<PrimitiveArray<T>>
+where
+    T: ArrowNumericType,
+    T::Native: Rem<Output = T::Native> + Zero,
+{
+    if left.len() != right.len() {
+        return Err(ArrowError::ComputeError(
+            "Cannot perform math operation on arrays of different length".to_string(),
+        ));
+    }
+
+    let null_bit_buffer =
+        combine_option_bitmap(left.data_ref(), right.data_ref(), left.len())?;
+
+    let buffer = if let Some(b) = &null_bit_buffer {

Review comment:
       It might be better to check if the null count > 0, as there could be a buffer even though all values are non-null




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on a change in pull request #317: Add (simd) modulus op

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #317:
URL: https://github.com/apache/arrow-rs/pull/317#discussion_r641924693



##########
File path: arrow/src/compute/kernels/arithmetic.rs
##########
@@ -835,6 +1242,18 @@ mod tests {
         assert_eq!(9, c.value(4));
     }
 
+    #[test]
+    fn test_primitive_array_modulus() {
+        let a = Int32Array::from(vec![15, 15, 8, 1, 9]);
+        let b = Int32Array::from(vec![5, 6, 8, 9, 1]);
+        let c = modulus(&a, &b).unwrap();
+        assert_eq!(0, c.value(0));

Review comment:
       It is possible to compare the `c` in fewer lines using something like `assert_eq!(c, Int32Array:from(vec![3. 0, 2, 3, 4])`, which also has the advantage if ensuring that `c.len() == 5`. This way is fine too

##########
File path: arrow/src/compute/kernels/arithmetic.rs
##########
@@ -875,6 +1321,19 @@ mod tests {
         assert_eq!(true, c.is_null(5));
     }
 
+    #[test]
+    fn test_primitive_array_modulus_with_nulls() {

Review comment:
       🤗 👍 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] gangliao commented on pull request #317: Add (simd) modulus op

Posted by GitBox <gi...@apache.org>.
gangliao commented on pull request #317:
URL: https://github.com/apache/arrow-rs/pull/317#issuecomment-843541106


   @jorgecarleitao  First-time contributors need a maintainer to approve running workflows.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] gangliao commented on pull request #317: Add (simd) modulus op

Posted by GitBox <gi...@apache.org>.
gangliao commented on pull request #317:
URL: https://github.com/apache/arrow-rs/pull/317#issuecomment-842863554


   After this PR is merged, I will submit a new PR to `arrow-datafusion` in order to support the modulus clause in SQL.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] gangliao commented on a change in pull request #317: Add (simd) modulus op

Posted by GitBox <gi...@apache.org>.
gangliao commented on a change in pull request #317:
URL: https://github.com/apache/arrow-rs/pull/317#discussion_r639720608



##########
File path: arrow/src/error.rs
##########
@@ -110,6 +111,7 @@ impl Display for ArrowError {
             ArrowError::SchemaError(desc) => write!(f, "Schema error: {}", desc),
             ArrowError::ComputeError(desc) => write!(f, "Compute error: {}", desc),
             ArrowError::DivideByZero => write!(f, "Divide by zero error"),
+            ArrowError::ModulusByZero => write!(f, "Modulus by zero error"),

Review comment:
       done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] codecov-commenter commented on pull request #317: Add (simd) modulus op

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #317:
URL: https://github.com/apache/arrow-rs/pull/317#issuecomment-843739831


   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/317?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#317](https://codecov.io/gh/apache/arrow-rs/pull/317?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (03b5fa7) into [master](https://codecov.io/gh/apache/arrow-rs/commit/c863a2c44bffa5c092a49e07910d5e9225483193?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c863a2c) will **decrease** coverage by `0.01%`.
   > The diff coverage is `78.23%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-rs/pull/317/graphs/tree.svg?width=650&height=150&src=pr&token=pq9V9qWZ1N&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-rs/pull/317?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #317      +/-   ##
   ==========================================
   - Coverage   82.52%   82.50%   -0.02%     
   ==========================================
     Files         162      162              
     Lines       44007    44177     +170     
   ==========================================
   + Hits        36316    36448     +132     
   - Misses       7691     7729      +38     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/317?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [arrow/src/error.rs](https://codecov.io/gh/apache/arrow-rs/pull/317/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2Vycm9yLnJz) | `8.69% <0.00%> (-0.20%)` | :arrow_down: |
   | [arrow/src/compute/kernels/arithmetic.rs](https://codecov.io/gh/apache/arrow-rs/pull/317/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2NvbXB1dGUva2VybmVscy9hcml0aG1ldGljLnJz) | `84.14% <78.69%> (-3.23%)` | :arrow_down: |
   | [parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow-rs/pull/317/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvZW5jb2RpbmdzL2VuY29kaW5nLnJz) | `94.85% <0.00%> (-0.20%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/317?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/317?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [c863a2c...03b5fa7](https://codecov.io/gh/apache/arrow-rs/pull/317?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] jorgecarleitao commented on a change in pull request #317: Add (simd) modulus op

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #317:
URL: https://github.com/apache/arrow-rs/pull/317#discussion_r638437150



##########
File path: arrow/src/error.rs
##########
@@ -110,6 +111,7 @@ impl Display for ArrowError {
             ArrowError::SchemaError(desc) => write!(f, "Schema error: {}", desc),
             ArrowError::ComputeError(desc) => write!(f, "Compute error: {}", desc),
             ArrowError::DivideByZero => write!(f, "Divide by zero error"),
+            ArrowError::ModulusByZero => write!(f, "Modulus by zero error"),

Review comment:
       I would not add a new error for this; can't we use `ComputeError` or `DivideByZero`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] roee88 commented on a change in pull request #317: Add (simd) modulus op

Posted by GitBox <gi...@apache.org>.
roee88 commented on a change in pull request #317:
URL: https://github.com/apache/arrow-rs/pull/317#discussion_r637759010



##########
File path: arrow/src/compute/kernels/arithmetic.rs
##########
@@ -348,6 +444,40 @@ where
     Ok(PrimitiveArray::<T>::from(data))
 }
 
+/// SIMD vectorized implementation of `left % right`.
+/// If any of the lanes marked as valid in `valid_mask` are `0` then an `ArrowError::ModulusByZero`
+/// is returned. The contents of no-valid lanes are undefined.
+#[cfg(simd)]

Review comment:
       Just a comment. You will need to rebase and change `simd` to `feature = "simd"` in cfg attributes. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org