Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/19 12:25:44 UTC

[GitHub] [arrow-rs] alamb opened a new pull request, #1912: Add `DictionaryArray::key` function

alamb opened a new pull request, #1912:
URL: https://github.com/apache/arrow-rs/pull/1912

   # Which issue does this PR close?
   Closes https://github.com/apache/arrow-rs/issues/1911
   
   # Rationale for this change
   See https://github.com/apache/arrow-rs/issues/1911
    
   # What changes are included in this PR?
   
   1. Add `DictionaryArray::key` function
   2. Tests
   
   # Are there any user-facing changes?
   
   Yes: a new public function, `DictionaryArray::key`.
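
A minimal sketch of the behaviour described above, assuming a toy stand-in rather than the real `DictionaryArray`. The type `ToyDictionary` and its fields are hypothetical, keys are modelled as `Vec<Option<i32>>`, and `usize::try_from` stands in for the `to_usize()` call used in the PR:

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a dictionary-encoded array; not the arrow-rs type.
struct ToyDictionary {
    /// One key per row; `None` models a NULL slot.
    keys: Vec<Option<i32>>,
    /// The distinct values that the keys index into.
    values: Vec<&'static str>,
}

impl ToyDictionary {
    /// Returns the key at row `i` as `usize`, or `None` if the row is NULL.
    /// Panics if `i` is out of bounds or if the key is negative.
    fn key(&self, i: usize) -> Option<usize> {
        self.keys[i].map(|k| usize::try_from(k).expect("Dictionary index not usize"))
    }
}

fn main() {
    let dict = ToyDictionary {
        keys: vec![Some(1), None, Some(0)],
        values: vec!["a", "b"],
    };
    for i in 0..dict.keys.len() {
        // Resolve each non-NULL key to the value it points at.
        match dict.key(i) {
            Some(k) => println!("row {}: key {} -> {}", i, k, dict.values[k]),
            None => println!("row {}: NULL", i),
        }
    }
}
```

The point of a per-index accessor like this is that a caller can look up one row's key without iterating over all keys.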
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] alamb commented on a diff in pull request #1912: Add `DictionaryArray::key` function

Posted by GitBox <gi...@apache.org>.
alamb commented on code in PR #1912:
URL: https://github.com/apache/arrow-rs/pull/1912#discussion_r902570390


##########
arrow/src/array/array_dictionary.rs:
##########
@@ -169,6 +169,17 @@ impl<'a, K: ArrowPrimitiveType> DictionaryArray<K> {
             .iter()
             .map(|key| key.map(|k| k.to_usize().expect("Dictionary index not usize")))
     }
+
+    /// Return the value of `keys` (the dictionary key) at index `i`,
+    /// cast to `usize`, `None` if the value at `i` is `NULL`.
+    pub fn key(&self, i: usize) -> Option<usize> {
+        self.keys.is_valid(i).then(|| {
+            self.keys
+                .value(i)
+                .to_usize()
+                .expect("Dictionary index not usize")

Review Comment:
   Filed https://github.com/apache/arrow-rs/issues/1918 to track





[GitHub] [arrow-rs] tustvold commented on a diff in pull request #1912: Add `DictionaryArray::key` function

Posted by GitBox <gi...@apache.org>.
tustvold commented on code in PR #1912:
URL: https://github.com/apache/arrow-rs/pull/1912#discussion_r901284237


##########
arrow/src/array/array_dictionary.rs:
##########
@@ -169,6 +169,17 @@ impl<'a, K: ArrowPrimitiveType> DictionaryArray<K> {
             .iter()
             .map(|key| key.map(|k| k.to_usize().expect("Dictionary index not usize")))
     }
+
+    /// Return the value of `keys` (the dictionary key) at index `i`,
+    /// cast to `usize`, `None` if the value at `i` is `NULL`.
+    pub fn key(&self, i: usize) -> Option<usize> {
+        self.keys.is_valid(i).then(|| {
+            self.keys
+                .value(i)
+                .to_usize()
+                .expect("Dictionary index not usize")

Review Comment:
   Panicking should be fine: it is a validation failure if the array contains negative indexes. TBH, I keep meaning to change all these checked conversions to numeric casts (i.e. `as`); I wouldn't be surprised if that led to non-trivial performance benefits.
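
To illustrate the trade-off between a checked conversion and a numeric cast, here is a small sketch on plain `i32` keys. The helper names `checked_key` and `cast_key` are hypothetical, and `usize::try_from` is used as a standard-library stand-in for the `to_usize()` call in the diff:

```rust
use std::convert::TryFrom;

/// Checked conversion: yields `None` for a negative key.
fn checked_key(k: i32) -> Option<usize> {
    usize::try_from(k).ok()
}

/// Plain numeric cast: never fails, but a negative key wraps around
/// to a very large index instead of being rejected.
fn cast_key(k: i32) -> usize {
    k as usize
}

fn main() {
    // For valid (non-negative) keys the two agree.
    assert_eq!(checked_key(3), Some(3));
    assert_eq!(cast_key(3), 3);

    // For an invalid key only the checked version notices.
    assert_eq!(checked_key(-1), None);
    assert_eq!(cast_key(-1), usize::MAX);
}
```

The cast is only safe to rely on if negative keys have already been ruled out when the array was validated, which is the premise of the comment above. Whether it is measurably faster is not shown here.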





[GitHub] [arrow-rs] waynexia commented on a diff in pull request #1912: Add `DictionaryArray::key` function

Posted by GitBox <gi...@apache.org>.
waynexia commented on code in PR #1912:
URL: https://github.com/apache/arrow-rs/pull/1912#discussion_r901117043


##########
arrow/src/array/array_dictionary.rs:
##########
@@ -169,6 +169,17 @@ impl<'a, K: ArrowPrimitiveType> DictionaryArray<K> {
             .iter()
             .map(|key| key.map(|k| k.to_usize().expect("Dictionary index not usize")))
     }
+
+    /// Return the value of `keys` (the dictionary key) at index `i`,
+    /// cast to `usize`, `None` if the value at `i` is `NULL`.
+    pub fn key(&self, i: usize) -> Option<usize> {
+        self.keys.is_valid(i).then(|| {
+            self.keys
+                .value(i)
+                .to_usize()
+                .expect("Dictionary index not usize")

Review Comment:
   In most cases this `expect` won't panic, but do we need to keep the same behavior as https://github.com/apache/arrow-datafusion/blob/080c32400ddfa2d45b5bebb820184eac8fd5a03a/datafusion/common/src/scalar.rs#L342-L358 ? If so, the return type could be either `Result<Option<_>>` or `Option<_>`. I'm OK with both (hard to choose...).
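
For comparison, a sketch of what the fallible `Result<Option<_>>` shape could look like. This is an illustration only: the free function `try_key`, the `String` error type, and the slice-of-`Option<i32>` key model are all assumptions, not the arrow-rs or DataFusion API:

```rust
use std::convert::TryFrom;

/// Hypothetical fallible variant of `key`:
/// - `Ok(Some(k))` for a valid key,
/// - `Ok(None)` for a NULL slot,
/// - `Err(..)` for an out-of-bounds row or a key that does not fit in `usize`.
fn try_key(keys: &[Option<i32>], i: usize) -> Result<Option<usize>, String> {
    match keys.get(i) {
        None => Err(format!("row {} is out of bounds", i)),
        Some(None) => Ok(None),
        Some(Some(k)) => usize::try_from(*k)
            .map(Some)
            .map_err(|_| format!("key {} at row {} is not a valid usize", k, i)),
    }
}

fn main() {
    let keys = [Some(2), None, Some(-1)];
    println!("{:?}", try_key(&keys, 0));
    println!("{:?}", try_key(&keys, 1));
    println!("{:?}", try_key(&keys, 2));
}
```

The `Option<usize>` signature in the diff keeps call sites simpler and treats a negative key as a broken invariant, while this shape pushes the decision to the caller.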





[GitHub] [arrow-rs] codecov-commenter commented on pull request #1912: Add `DictionaryArray::key` function

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #1912:
URL: https://github.com/apache/arrow-rs/pull/1912#issuecomment-1159716391

   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1912?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#1912](https://codecov.io/gh/apache/arrow-rs/pull/1912?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (dcdbc50) into [master](https://codecov.io/gh/apache/arrow-rs/commit/ded63168de4dce7e4e92753bd39d60d20bfb683e?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ded6316) will **decrease** coverage by `0.00%`.
   > The diff coverage is `84.61%`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #1912      +/-   ##
   ==========================================
   - Coverage   83.41%   83.41%   -0.01%     
   ==========================================
     Files         214      214              
     Lines       56991    57004      +13     
   ==========================================
   + Hits        47541    47550       +9     
   - Misses       9450     9454       +4     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/1912?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [arrow/src/array/array\_dictionary.rs](https://codecov.io/gh/apache/arrow-rs/pull/1912/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X2RpY3Rpb25hcnkucnM=) | `91.53% <84.61%> (-0.39%)` | :arrow_down: |
   | [parquet\_derive/src/parquet\_field.rs](https://codecov.io/gh/apache/arrow-rs/pull/1912/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldF9kZXJpdmUvc3JjL3BhcnF1ZXRfZmllbGQucnM=) | `65.75% <0.00%> (-0.23%)` | :arrow_down: |
   | [parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow-rs/pull/1912/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvZW5jb2RpbmdzL2VuY29kaW5nLnJz) | `93.43% <0.00%> (-0.20%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1912?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1912?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [ded6316...dcdbc50](https://codecov.io/gh/apache/arrow-rs/pull/1912?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   




[GitHub] [arrow-rs] tustvold merged pull request #1912: Add `DictionaryArray::key` function

Posted by GitBox <gi...@apache.org>.
tustvold merged PR #1912:
URL: https://github.com/apache/arrow-rs/pull/1912

