You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/16 19:04:11 UTC

[GitHub] [arrow-rs] jhorstmann opened a new pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

jhorstmann opened a new pull request #1048:
URL: https://github.com/apache/arrow-rs/pull/1048


   # Which issue does this PR close?
   
   <!---
   We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123.
   -->
   
   Closes #980.
   
   # Rationale for this change
   
   We plan to use this new `assume_sorted_dictionaries` flag in our implementation of window functions where the data does not really need to be sorted by the PARTITION BY expressions. 
   
   # What changes are included in this PR?
   
   I still plan to add a microbenchmark showing the performance improvement.
   
   I'm also looking for a better name of the flag as `assume_sorted_dictionaries` doesn't describe the functionality exactly.
   
   # Are there any user-facing changes?
   
   There should be no API changes in this PR, passing `SortOptions` around happens mostly internally. The `build_compare` function was kept and it delegates to a new function `build_compare_with_options`.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] jhorstmann commented on pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

Posted by GitBox <gi...@apache.org>.
jhorstmann commented on pull request #1048:
URL: https://github.com/apache/arrow-rs/pull/1048#issuecomment-998310199


   Thanks for the review @alamb, I will go through the individual comments later. I was still doing some profiling because I expected a bigger speedup. It seems the `DynComparator` and `LexicographicalComparator` take longer than the difference between comparing strings and integers. I was also experimenting with trying to inline `is_valid` calls on another branch to see if that is the bottleneck.
   
   The reason for extending the `SortOptions` is that this functionality helps for two separate usecases. The name of the flag is a bit misleading and another reason why the PR is still in draft. First usecase is of course for faster sorting if the dictionaries are already sorted, for that the `is_ordered` flag on `DictionaryArray` would work fine. The other usecase is for partitioning, as used for example for window functions. In an expression like `SUM(a) OVER (PARTITION BY b ORDER BY c)` Datafusion will sort by `b, c`, with this change we could track that `b` is only used for partitioning in the logical plan and then set the new flag in `SortOptions`, regardless of whether the dictionary is actually sorted.
   
   Now that I'm thinking about, for that usecase we could also pretend that the dictionary is sorted by creating a new DictionaryArray, pointing to the same data, but with the `is_ordered` flag set, and sort by that array.
   
   Renaming the flag to `sort_for_partitioning` or `sort_by_dictionary_keys` could be an option to make the purpose clearer. And the `is_ordered` flag should also be taken into account, regardless of the `SortOption` flag.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] jhorstmann commented on pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

Posted by GitBox <gi...@apache.org>.
jhorstmann commented on pull request #1048:
URL: https://github.com/apache/arrow-rs/pull/1048#issuecomment-996163747


   Benchmark results, not as big an improvement as I'd hoped, about 40% faster. I guess the `DynComparator` and `is_valid` checks already have some overhead. Also a bit unexpected that the high cardinalities are faster.
   
   ```
   lexicographical_partition_ranges(dictionary_values low cardinality) 4096                                                                             
                           time:   [1.1289 us 1.1320 us 1.1377 us]
   
   lexicographical_partition_ranges(dictionary_values high cardinality) 4096                                                                             
                           time:   [809.36 ns 810.00 ns 810.59 ns]
   
   lexicographical_partition_ranges(dictionary_keys low cardinality) 4096                                                                             
                           time:   [686.69 ns 687.22 ns 687.74 ns]
   
   lexicographical_partition_ranges(dictionary_keys high cardinality) 4096                                                                            
                           time:   [487.44 ns 487.90 ns 488.57 ns]
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] jhorstmann commented on pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

Posted by GitBox <gi...@apache.org>.
jhorstmann commented on pull request #1048:
URL: https://github.com/apache/arrow-rs/pull/1048#issuecomment-1011009045


   I changed to using the `is_ordered` flag as initially proposed and a `fn as_ordered(&self, is_ordered: bool) -> Self` that would return a DictionaryArray with the flag set (without actually sorting or checking the ordering). Now testing this in our engine I realized that instead of `as_ordered` for partitioning I could also just get the keys array directly and partition on that, `as_ordered` doesn't really simplify that part of the code.
   
   So this code would only become useful when we also add a way to *efficiently* transform a DictionaryArray to being sorted.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on a change in pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #1048:
URL: https://github.com/apache/arrow-rs/pull/1048#discussion_r772629985



##########
File path: arrow/src/array/ord.rs
##########
@@ -303,6 +362,23 @@ pub mod tests {
         Ok(())
     }
 
+    #[test]
+    fn test_dict_keys() -> Result<()> {
+        let data = vec!["a", "b", "c", "a", "a", "c", "c"];
+        let array = data.into_iter().collect::<DictionaryArray<Int16Type>>();

Review comment:
       Are we sure that this line produces a `DictionaryArray` with a sorted dictionary? I think it uses a `StringDictionaryBuilder` which uses a `HashMap` for key / values (and thus I don't think gives any guarantee about the ordering). 

##########
File path: arrow/src/compute/kernels/sort.rs
##########
@@ -403,6 +439,10 @@ pub struct SortOptions {
     pub descending: bool,
     /// Whether to sort nulls first
     pub nulls_first: bool,
+    /// Whether dictionary arrays can be sorted by their keys instead of values.

Review comment:
       There are already some hooks in the arrow codebase for `is_ordered` -- like `DictionaryArray::is_ordered`
   
   What do you think about using those hooks rather than a new `assume_sorted_dictionaries` option on `SortOptions` -- that would make it harder to pick the wrong option
   
   https://sourcegraph.com/search?q=context:global+repo:%5Egithub%5C.com/apache/arrow-rs%24+is_ordered&patternType=literal
   
   Perhaps we could add a function like
   
   ```rust
   impl DictionaryArray {
   
   fn sorted(self) -> Self { 
     // check if dictionary is already sorted, 
     // otherwise sort it
     Self { 
       is_ordered: true
       self,
     }
   }
   ``` 
   
   With an unsafe variant to skip the validation

##########
File path: arrow/src/compute/kernels/sort.rs
##########
@@ -526,12 +567,35 @@ where
             .map(|index| (index, values.value(index as usize)))
             .collect::<Vec<(u32, T::Native)>>()
     };
-    sort_primitive_inner(values, null_indices, cmp, options, limit, valids)
+    sort_primitive_inner(null_indices, cmp, options, limit, valids)
+}
+
+/// Sort a dictionary array by its keys

Review comment:
       ```suggestion
   /// Sort a dictionary array by the numeric value of its keys. 
   /// If the dictionary is sorted, this will produce the same result 
   /// as sorting by the unpacked values. 
   ```

##########
File path: arrow/src/compute/kernels/sort.rs
##########
@@ -403,6 +439,10 @@ pub struct SortOptions {
     pub descending: bool,
     /// Whether to sort nulls first
     pub nulls_first: bool,
+    /// Whether dictionary arrays can be sorted by their keys instead of values.

Review comment:
       Avoiding a new field in `SortOptions` would also likely reduce the size of this PR in terms of number of lines changed as well as keep the change API compatible. 

##########
File path: arrow/src/array/ord.rs
##########
@@ -119,9 +138,32 @@ where
 /// # Ok(())
 /// # }
 /// ```
+/// The returned comparator should only be called for non-null elements as the result is undefined otherwise.
+/// The caller should check the validity of both indices and then decide whether NULLs should come first or last.
 // This is a factory of comparisons.
-// The lifetime 'a enforces that we cannot use the closure beyond any of the array's lifetime.

Review comment:
       πŸ‘ 🧹 

##########
File path: arrow/src/array/ord.rs
##########
@@ -119,9 +138,32 @@ where
 /// # Ok(())
 /// # }
 /// ```
+/// The returned comparator should only be called for non-null elements as the result is undefined otherwise.
+/// The caller should check the validity of both indices and then decide whether NULLs should come first or last.
 // This is a factory of comparisons.
-// The lifetime 'a enforces that we cannot use the closure beyond any of the array's lifetime.
 pub fn build_compare(left: &dyn Array, right: &dyn Array) -> Result<DynComparator> {
+    build_compare_with_options(left, right, &SortOptions::default())
+}
+
+/// returns a comparison function that compares two values at two different positions
+/// between the two arrays.
+/// The arrays' types must be equal.
+///
+/// Only the `assume_sorted_dictionaries` flag of [`SortOptions`] is currently used.
+/// If this flag is set and the arrays are dictionary encoded the comparator will use the keys
+/// instead of values for comparing. This is more efficient if the dictionaries are sorted or
+/// if the sorting is done for partitioning purposes. Setting this flag will give unexpected results
+/// when comparing two arrays with different dictionaries.
+///
+/// Other flags need to be handled when calling the comparator.

Review comment:
       Here is another place where using the `is_ordered` flag directly on the `DictionaryArray` might make the interface (much) cleaner. 

##########
File path: arrow/src/array/ord.rs
##########
@@ -101,6 +102,24 @@ where
     })
 }
 
+fn compare_dict_key<T>(left: &dyn Array, right: &dyn Array) -> DynComparator
+where
+    T: ArrowDictionaryKeyType,
+    T::Native: Ord,
+{
+    let left = left.as_any().downcast_ref::<DictionaryArray<T>>().unwrap();
+    let right = right.as_any().downcast_ref::<DictionaryArray<T>>().unwrap();
+
+    let left_keys: PrimitiveArray<T> = PrimitiveArray::from(left.keys().data().clone());
+    let right_keys: PrimitiveArray<T> = PrimitiveArray::from(right.keys().data().clone());
+
+    Box::new(move |i: usize, j: usize| {
+        let key_left = left_keys.value(i);
+        let key_right = right_keys.value(j);
+        <T::Native as Ord>::cmp(&key_left, &key_right)
+    })

Review comment:
       I don't understand why we need to replicate the logic from `build_compare` here.
   
   I think it is possible to call `build_compare` directly:
   
   ```suggestion
       build_compare(left.keys(), right.keys())
   ```
   
   
   I had to make a few other changes, but then it worked πŸ‘ 
   ```diff
   diff --git a/arrow/src/array/ord.rs b/arrow/src/array/ord.rs
   index 8497a4167d..5573bd67e6 100644
   --- a/arrow/src/array/ord.rs
   +++ b/arrow/src/array/ord.rs
   @@ -102,22 +102,14 @@ where
        })
    }
    
   -fn compare_dict_key<T>(left: &dyn Array, right: &dyn Array) -> DynComparator
   +fn compare_dict_key<T>(left: &dyn Array, right: &dyn Array) -> Result<DynComparator>
    where
        T: ArrowDictionaryKeyType,
        T::Native: Ord,
    {
        let left = left.as_any().downcast_ref::<DictionaryArray<T>>().unwrap();
        let right = right.as_any().downcast_ref::<DictionaryArray<T>>().unwrap();
   -
   -    let left_keys: PrimitiveArray<T> = PrimitiveArray::from(left.keys().data().clone());
   -    let right_keys: PrimitiveArray<T> = PrimitiveArray::from(right.keys().data().clone());
   -
   -    Box::new(move |i: usize, j: usize| {
   -        let key_left = left_keys.value(i);
   -        let key_right = right_keys.value(j);
   -        <T::Native as Ord>::cmp(&key_left, &key_right)
   -    })
   +    build_compare(left.keys(), right.keys())
    }
    
    /// returns a comparison function that compares two values at two different positions
   @@ -242,14 +234,14 @@ pub fn build_compare_with_options(
                    ));
                } else if options.assume_sorted_dictionaries {
                    match (key_type_lhs.as_ref(), key_type_rhs.as_ref()) {
   -                    (UInt8, UInt8) => compare_dict_key::<UInt8Type>(left, right),
   -                    (UInt16, UInt16) => compare_dict_key::<UInt16Type>(left, right),
   -                    (UInt32, UInt32) => compare_dict_key::<UInt32Type>(left, right),
   -                    (UInt64, UInt64) => compare_dict_key::<UInt64Type>(left, right),
   -                    (Int8, Int8) => compare_dict_key::<Int8Type>(left, right),
   -                    (Int16, Int16) => compare_dict_key::<Int16Type>(left, right),
   -                    (Int32, Int32) => compare_dict_key::<Int32Type>(left, right),
   -                    (Int64, Int64) => compare_dict_key::<Int64Type>(left, right),
   +                    (UInt8, UInt8) => compare_dict_key::<UInt8Type>(left, right)?,
   +                    (UInt16, UInt16) => compare_dict_key::<UInt16Type>(left, right)?,
   +                    (UInt32, UInt32) => compare_dict_key::<UInt32Type>(left, right)?,
   +                    (UInt64, UInt64) => compare_dict_key::<UInt64Type>(left, right)?,
   +                    (Int8, Int8) => compare_dict_key::<Int8Type>(left, right)?,
   +                    (Int16, Int16) => compare_dict_key::<Int16Type>(left, right)?,
   +                    (Int32, Int32) => compare_dict_key::<Int32Type>(left, right)?,
   +                    (Int64, Int64) => compare_dict_key::<Int64Type>(left, right)?,
                        (lhs, _) => {
                            return Err(ArrowError::InvalidArgumentError(format!(
                                "Dictionaries do not support keys of type {:?}",
   ```

##########
File path: arrow/benches/partition_kernels.rs
##########
@@ -39,6 +43,38 @@ where
     Arc::new(array)
 }
 
+fn create_sorted_dictionary_array<T: ArrowDictionaryKeyType>(
+    size: usize,
+    num_distinct_keys: usize,
+) -> ArrayRef
+where
+    Standard: Distribution<T::Native>,
+{
+    let mut rng = seedable_rng();
+
+    let key_array: ArrayRef =

Review comment:
       It would be cool if we could add an option to `StringDictionaryBuilder` to ensure the resulting dictionary was sorted

##########
File path: arrow/src/compute/kernels/sort.rs
##########
@@ -2357,12 +2495,116 @@ mod tests {
             Some(SortOptions {
                 descending: false,
                 nulls_first: false,
+                assume_sorted_dictionaries: false,
             }),
             Some(2),
             vec![Some("def"), None],
         );
     }
 
+    #[test]
+    fn test_sort_dicts_by_key() {
+        test_sort_string_dict_arrays::<Int8Type>(
+            vec![None, Some("B"), Some("A"), None, Some("C"), Some("A")],
+            Some(SortOptions {
+                assume_sorted_dictionaries: true,
+                ..Default::default()
+            }),
+            None,
+            vec![None, None, Some("B"), Some("A"), Some("A"), Some("C")],

Review comment:
       I must be missing something here -- shouldn't this output be sorted? Namely `None, None, Some("A"), Some("A"), Some("B"), Some("C")`?
   
   I also don't understand how this code ensures that the `DictionaryArrays` actually have sorted dictionaries as I don't see `test_sort_string_dict_arrays` doing anything special




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] codecov-commenter commented on pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #1048:
URL: https://github.com/apache/arrow-rs/pull/1048#issuecomment-996105420


   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#1048](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (38dfd18) into [master](https://codecov.io/gh/apache/arrow-rs/commit/fc343e7c02a05cd7d2a1d401f4e5feeb8d6de179?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (fc343e7) will **increase** coverage by `0.00%`.
   > The diff coverage is `75.45%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-rs/pull/1048/graphs/tree.svg?width=650&height=150&src=pr&token=pq9V9qWZ1N&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #1048   +/-   ##
   =======================================
     Coverage   82.30%   82.31%           
   =======================================
     Files         168      168           
     Lines       49046    49136   +90     
   =======================================
   + Hits        40368    40445   +77     
   - Misses       8678     8691   +13     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Ξ” | |
   |---|---|---|
   | [arrow/src/compute/kernels/partition.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2NvbXB1dGUva2VybmVscy9wYXJ0aXRpb24ucnM=) | `97.45% <ΓΈ> (ΓΈ)` | |
   | [arrow/src/array/ord.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L29yZC5ycw==) | `68.26% <57.44%> (+0.61%)` | :arrow_up: |
   | [arrow/src/compute/kernels/sort.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2NvbXB1dGUva2VybmVscy9zb3J0LnJz) | `94.67% <88.88%> (-0.28%)` | :arrow_down: |
   | [arrow/src/datatypes/field.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2RhdGF0eXBlcy9maWVsZC5ycw==) | `53.37% <0.00%> (-0.31%)` | :arrow_down: |
   | [arrow/src/array/transform/mod.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L3RyYW5zZm9ybS9tb2QucnM=) | `85.10% <0.00%> (+0.13%)` | :arrow_up: |
   | [arrow/src/datatypes/datatype.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2RhdGF0eXBlcy9kYXRhdHlwZS5ycw==) | `66.38% <0.00%> (+0.42%)` | :arrow_up: |
   | [parquet\_derive/src/parquet\_field.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldF9kZXJpdmUvc3JjL3BhcnF1ZXRfZmllbGQucnM=) | `66.43% <0.00%> (+0.45%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Ξ” = absolute <relative> (impact)`, `ΓΈ = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [fc343e7...38dfd18](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #1048:
URL: https://github.com/apache/arrow-rs/pull/1048#issuecomment-996105420


   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#1048](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (60eca77) into [master](https://codecov.io/gh/apache/arrow-rs/commit/fc343e7c02a05cd7d2a1d401f4e5feeb8d6de179?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (fc343e7) will **increase** coverage by `0.00%`.
   > The diff coverage is `75.45%`.
   
   > :exclamation: Current head 60eca77 differs from pull request most recent head cf2aa11. Consider uploading reports for the commit cf2aa11 to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-rs/pull/1048/graphs/tree.svg?width=650&height=150&src=pr&token=pq9V9qWZ1N&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #1048   +/-   ##
   =======================================
     Coverage   82.30%   82.31%           
   =======================================
     Files         168      168           
     Lines       49046    49136   +90     
   =======================================
   + Hits        40368    40444   +76     
   - Misses       8678     8692   +14     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Ξ” | |
   |---|---|---|
   | [arrow/src/compute/kernels/partition.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2NvbXB1dGUva2VybmVscy9wYXJ0aXRpb24ucnM=) | `97.45% <ΓΈ> (ΓΈ)` | |
   | [arrow/src/array/ord.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L29yZC5ycw==) | `68.26% <57.44%> (+0.61%)` | :arrow_up: |
   | [arrow/src/compute/kernels/sort.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2NvbXB1dGUva2VybmVscy9zb3J0LnJz) | `94.67% <88.88%> (-0.28%)` | :arrow_down: |
   | [parquet\_derive/src/parquet\_field.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldF9kZXJpdmUvc3JjL3BhcnF1ZXRfZmllbGQucnM=) | `65.98% <0.00%> (ΓΈ)` | |
   | [parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvZW5jb2RpbmdzL2VuY29kaW5nLnJz) | `93.71% <0.00%> (+0.19%)` | :arrow_up: |
   | [arrow/src/datatypes/datatype.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2RhdGF0eXBlcy9kYXRhdHlwZS5ycw==) | `66.38% <0.00%> (+0.42%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Ξ” = absolute <relative> (impact)`, `ΓΈ = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [fc343e7...cf2aa11](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #1048:
URL: https://github.com/apache/arrow-rs/pull/1048#issuecomment-998149129


   The integration test failure https://github.com/apache/arrow-rs/runs/4552743883?check_suite_focus=true looked to be related to some insfrastructure problem. I restarted the tests and hopefuly they will pass this time.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] jhorstmann edited a comment on pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

Posted by GitBox <gi...@apache.org>.
jhorstmann edited a comment on pull request #1048:
URL: https://github.com/apache/arrow-rs/pull/1048#issuecomment-998310199


   Thanks for the review @alamb, I will go through the individual comments later. I was still doing some profiling because I expected a bigger speedup. It seems the `DynComparator` and `LexicographicalComparator` take longer than the difference between comparing strings and integers. I was also experimenting with trying to inline `is_valid` calls on another branch to see if that is the bottleneck.
   
   The reason for extending the `SortOptions` is that this functionality helps for two separate usecases. The name of the flag is a bit misleading and another reason why the PR is still in draft. First usecase is of course for faster sorting if the dictionaries are already sorted, for that the `is_ordered` flag on `DictionaryArray` would work fine. The other usecase is for partitioning, as used for example for window functions. In an expression like `SUM(a) OVER (PARTITION BY b ORDER BY c)` Datafusion will sort by `b, c`, with this change we could track that `b` is only used for partitioning in the logical plan and then set the new flag in `SortOptions`, regardless of whether the dictionary is actually sorted.
   
   Now that I'm thinking about, for that usecase we could also pretend that the dictionary is sorted by creating a new DictionaryArray, pointing to the same data, but with the `is_ordered` flag set, and sort by that array. And that is probably what you meant by the first comment.
   
   Renaming the flag to `sort_for_partitioning` or `sort_by_dictionary_keys` could be an option to make the purpose clearer. And the `is_ordered` flag should also be taken into account, regardless of the `SortOption` flag.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #1048:
URL: https://github.com/apache/arrow-rs/pull/1048#issuecomment-998844448


   > The other usecase is for partitioning, as used for example for window functions. In an expression like SUM(a) OVER (PARTITION BY b ORDER BY c) Datafusion will sort by b, c, with this change we could track that b is only used for partitioning in the logical plan and then set the new flag in SortOptions, regardless of whether the dictionary is actually sorted.
   
   πŸ‘  Makes sense
   
   Make sense. Maybe the sort options flag could be named something like "sort_dictionary_by_key_value" to make it clear that the request is to sort the data such that the same values that are contiguous but not necessarily sorted by value. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #1048:
URL: https://github.com/apache/arrow-rs/pull/1048#issuecomment-998844865


   then we could have a follow on PR that also sorts the dictionary by keys if the `is_ordered` flag is set. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] jhorstmann commented on a change in pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

Posted by GitBox <gi...@apache.org>.
jhorstmann commented on a change in pull request #1048:
URL: https://github.com/apache/arrow-rs/pull/1048#discussion_r772704703



##########
File path: arrow/src/array/ord.rs
##########
@@ -303,6 +362,23 @@ pub mod tests {
         Ok(())
     }
 
+    #[test]
+    fn test_dict_keys() -> Result<()> {
+        let data = vec!["a", "b", "c", "a", "a", "c", "c"];
+        let array = data.into_iter().collect::<DictionaryArray<Int16Type>>();

Review comment:
       For this specific input the dictionary would be sorted because the keys are handed out in the order of the values in the input. So "a" would be key 0, "b" => 1 and "c" => 2. Agree that this is misleading, even more so in the test for the sort kernel where the input values are not already in alphabetical order. I'll try to make that clearer.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #1048:
URL: https://github.com/apache/arrow-rs/pull/1048#issuecomment-996105420


   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#1048](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b1cee14) into [master](https://codecov.io/gh/apache/arrow-rs/commit/fc343e7c02a05cd7d2a1d401f4e5feeb8d6de179?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (fc343e7) will **increase** coverage by `0.28%`.
   > The diff coverage is `89.19%`.
   
   > :exclamation: Current head b1cee14 differs from pull request most recent head d84fbb3. Consider uploading reports for the commit d84fbb3 to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-rs/pull/1048/graphs/tree.svg?width=650&height=150&src=pr&token=pq9V9qWZ1N&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #1048      +/-   ##
   ==========================================
   + Coverage   82.30%   82.59%   +0.28%     
   ==========================================
     Files         168      173       +5     
     Lines       49046    50770    +1724     
   ==========================================
   + Hits        40368    41932    +1564     
   - Misses       8678     8838     +160     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Ξ” | |
   |---|---|---|
   | [arrow/src/array/equal/utils.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2VxdWFsL3V0aWxzLnJz) | `74.00% <ΓΈ> (ΓΈ)` | |
   | [arrow/src/array/transform/boolean.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L3RyYW5zZm9ybS9ib29sZWFuLnJz) | `76.92% <ΓΈ> (ΓΈ)` | |
   | [arrow/src/array/transform/utils.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L3RyYW5zZm9ybS91dGlscy5ycw==) | `100.00% <ΓΈ> (ΓΈ)` | |
   | [arrow/src/buffer/ops.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2J1ZmZlci9vcHMucnM=) | `96.77% <ΓΈ> (ΓΈ)` | |
   | [arrow/src/compute/kernels/length.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2NvbXB1dGUva2VybmVscy9sZW5ndGgucnM=) | `100.00% <ΓΈ> (ΓΈ)` | |
   | [arrow/src/datatypes/types.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2RhdGF0eXBlcy90eXBlcy5ycw==) | `88.88% <ΓΈ> (ΓΈ)` | |
   | [arrow/src/ipc/gen/Schema.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2lwYy9nZW4vU2NoZW1hLnJz) | `43.72% <ΓΈ> (+2.11%)` | :arrow_up: |
   | [arrow/src/pyarrow.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL3B5YXJyb3cucnM=) | `0.00% <ΓΈ> (ΓΈ)` | |
   | [arrow/src/util/display.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL3V0aWwvZGlzcGxheS5ycw==) | `19.62% <0.00%> (-0.38%)` | :arrow_down: |
   | [integration-testing/src/lib.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aW50ZWdyYXRpb24tdGVzdGluZy9zcmMvbGliLnJz) | `0.00% <0.00%> (ΓΈ)` | |
   | ... and [64 more](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Ξ” = absolute <relative> (impact)`, `ΓΈ = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [fc343e7...d84fbb3](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #1048:
URL: https://github.com/apache/arrow-rs/pull/1048#issuecomment-996105420


   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#1048](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (cf2aa11) into [master](https://codecov.io/gh/apache/arrow-rs/commit/fc343e7c02a05cd7d2a1d401f4e5feeb8d6de179?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (fc343e7) will **increase** coverage by `0.00%`.
   > The diff coverage is `75.45%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-rs/pull/1048/graphs/tree.svg?width=650&height=150&src=pr&token=pq9V9qWZ1N&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #1048   +/-   ##
   =======================================
     Coverage   82.30%   82.31%           
   =======================================
     Files         168      168           
     Lines       49046    49136   +90     
   =======================================
   + Hits        40368    40446   +78     
   - Misses       8678     8690   +12     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Ξ” | |
   |---|---|---|
   | [arrow/src/compute/kernels/partition.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2NvbXB1dGUva2VybmVscy9wYXJ0aXRpb24ucnM=) | `97.45% <ΓΈ> (ΓΈ)` | |
   | [arrow/src/array/ord.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L29yZC5ycw==) | `68.26% <57.44%> (+0.61%)` | :arrow_up: |
   | [arrow/src/compute/kernels/sort.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2NvbXB1dGUva2VybmVscy9zb3J0LnJz) | `94.67% <88.88%> (-0.28%)` | :arrow_down: |
   | [arrow/src/array/transform/mod.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L3RyYW5zZm9ybS9tb2QucnM=) | `85.10% <0.00%> (+0.13%)` | :arrow_up: |
   | [parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvZW5jb2RpbmdzL2VuY29kaW5nLnJz) | `93.71% <0.00%> (+0.19%)` | :arrow_up: |
   | [parquet\_derive/src/parquet\_field.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldF9kZXJpdmUvc3JjL3BhcnF1ZXRfZmllbGQucnM=) | `66.21% <0.00%> (+0.22%)` | :arrow_up: |
   | [arrow/src/datatypes/datatype.rs](https://codecov.io/gh/apache/arrow-rs/pull/1048/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2RhdGF0eXBlcy9kYXRhdHlwZS5ycw==) | `66.38% <0.00%> (+0.42%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Ξ” = absolute <relative> (impact)`, `ΓΈ = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [fc343e7...cf2aa11](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org