Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/12 14:49:38 UTC

[GitHub] [arrow-rs] dispanser opened a new pull request, #1550: Fix reading dictionaries from nested structs in ipc `StreamReader`

dispanser opened a new pull request, #1550:
URL: https://github.com/apache/arrow-rs/pull/1550

   # Which issue does this PR close?
   
   Closes #1549.
   
   # Rationale for this change
    
   Fixes a bug in the IPC `StreamReader`, aligning it with the `FileReader` implementation.
   
   # What changes are included in this PR?
   
   The actual bugfix is 4 characters, but this PR also includes a test to reproduce the bug.
   
   # Are there any user-facing changes?
   
   No documentation changes needed.
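
The thread does not show the four-character fix itself, so the following is only a toy sketch of the kind of bug the title describes. It assumes that a lookup which inspects only top-level schema fields will miss a dictionary-encoded field nested inside a struct, whereas a lookup that walks the whole field tree finds it. `ToyField`, `top_level_dict_fields`, `all_dict_fields`, and `example_schema` are hypothetical names invented for illustration and are not part of the arrow-rs API. The nested field names follow the ones suggested in the review comment; `f1` is made up.

```rust
// Toy model only: `ToyField` is a hypothetical stand-in, not arrow-rs's `Field`.
#[derive(Debug, Clone)]
pub struct ToyField {
    pub name: String,
    /// `Some(id)` when the field is dictionary-encoded, `None` otherwise.
    pub dict_id: Option<i64>,
    /// Child fields; non-empty only for struct-like fields.
    pub children: Vec<ToyField>,
}

impl ToyField {
    /// Builds a field without children.
    pub fn leaf(name: &str, dict_id: Option<i64>) -> Self {
        ToyField { name: name.to_string(), dict_id, children: Vec::new() }
    }

    /// Builds a struct-like field that only holds children.
    pub fn nested(name: &str, children: Vec<ToyField>) -> Self {
        ToyField { name: name.to_string(), dict_id: None, children }
    }
}

/// Inspects only the top-level fields, so a dictionary inside a struct is missed.
pub fn top_level_dict_fields(fields: &[ToyField], id: i64) -> Vec<&ToyField> {
    fields.iter().filter(|f| f.dict_id == Some(id)).collect()
}

/// Walks the whole field tree depth-first, so nested dictionaries are found too.
pub fn all_dict_fields(fields: &[ToyField], id: i64) -> Vec<&ToyField> {
    let mut found = Vec::new();
    // Reverse so that fields are popped in declaration order.
    let mut stack: Vec<&ToyField> = fields.iter().rev().collect();
    while let Some(field) = stack.pop() {
        if field.dict_id == Some(id) {
            found.push(field);
        }
        stack.extend(field.children.iter().rev());
    }
    found
}

/// A schema shaped like the one in the PR's test: a dictionary nested in a struct.
pub fn example_schema() -> Vec<ToyField> {
    vec![
        ToyField::leaf("f1", None),
        ToyField::nested(
            "f2_struct",
            vec![
                ToyField::leaf("f2.1", None),
                ToyField::leaf("f2.2_struct", Some(0)),
            ],
        ),
    ]
}
```

With this schema, the top-level lookup for dictionary id `0` finds no field, while the recursive lookup finds the nested `f2.2_struct` field. A reader that resolves dictionary batches through the first kind of lookup would have nothing to attach the dictionary to.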


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] alamb commented on pull request #1550: Fix reading dictionaries from nested structs in ipc `StreamReader`

Posted by GitBox <gi...@apache.org>.
alamb commented on PR #1550:
URL: https://github.com/apache/arrow-rs/pull/1550#issuecomment-1096868995

   Thank you @dispanser! @viirya, I wonder if you might have a chance to review this PR?




[GitHub] [arrow-rs] dispanser commented on a diff in pull request #1550: Fix reading dictionaries from nested structs in ipc `StreamReader`

Posted by GitBox <gi...@apache.org>.
dispanser commented on code in PR #1550:
URL: https://github.com/apache/arrow-rs/pull/1550#discussion_r848656146


##########
arrow/src/ipc/reader.rs:
##########
@@ -1394,4 +1407,36 @@ mod tests {
         let arrow_json: ArrowJson = serde_json::from_str(&s).unwrap();
         arrow_json
     }
+
+    #[test]
+    fn test_roundtrip_stream_nested_dict() {
+        let xs = vec!["AA", "BB", "AA", "CC", "BB"];
+        let dict = Arc::new(
+            xs.clone()
+                .into_iter()
+                .collect::<DictionaryArray<datatypes::Int8Type>>(),
+        );
+        let string_array: ArrayRef = Arc::new(StringArray::from(xs.clone()));
+        let struct_array = StructArray::from(vec![
+            (Field::new("f1.1", DataType::Utf8, false), string_array),
+            (
+                Field::new("f1.2_struct", dict.data_type().clone(), false),
+                dict.clone() as ArrayRef,
+            ),

Review Comment:
   You're right! Fixed.





[GitHub] [arrow-rs] viirya commented on pull request #1550: Fix reading dictionaries from nested structs in ipc `StreamReader`

Posted by GitBox <gi...@apache.org>.
viirya commented on PR #1550:
URL: https://github.com/apache/arrow-rs/pull/1550#issuecomment-1096937941

   Thanks @dispanser @alamb. Yeah, I will review this today.




[GitHub] [arrow-rs] codecov-commenter commented on pull request #1550: Fix reading dictionaries from nested structs in ipc `StreamReader`

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #1550:
URL: https://github.com/apache/arrow-rs/pull/1550#issuecomment-1096967862

   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1550?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#1550](https://codecov.io/gh/apache/arrow-rs/pull/1550?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (cd838e5) into [master](https://codecov.io/gh/apache/arrow-rs/commit/68038f595b62202906d9a6235575b3a236c09546?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (68038f5) will **increase** coverage by `0.01%`.
   > The diff coverage is `100.00%`.
   
   > :exclamation: Current head cd838e5 differs from pull request most recent head 30f7b30. Consider uploading reports for the commit 30f7b30 to get more accurate results
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #1550      +/-   ##
   ==========================================
   + Coverage   82.82%   82.84%   +0.01%     
   ==========================================
     Files         190      190              
     Lines       54941    54966      +25     
   ==========================================
   + Hits        45507    45535      +28     
   + Misses       9434     9431       -3     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/1550?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [arrow/src/ipc/reader.rs](https://codecov.io/gh/apache/arrow-rs/pull/1550/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2lwYy9yZWFkZXIucnM=) | `88.28% <100.00%> (+0.48%)` | :arrow_up: |
   | [arrow/src/array/transform/mod.rs](https://codecov.io/gh/apache/arrow-rs/pull/1550/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L3RyYW5zZm9ybS9tb2QucnM=) | `86.46% <0.00%> (+0.11%)` | :arrow_up: |
   | [parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow-rs/pull/1550/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvZW5jb2RpbmdzL2VuY29kaW5nLnJz) | `93.56% <0.00%> (+0.18%)` | :arrow_up: |
   | [parquet\_derive/src/parquet\_field.rs](https://codecov.io/gh/apache/arrow-rs/pull/1550/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldF9kZXJpdmUvc3JjL3BhcnF1ZXRfZmllbGQucnM=) | `66.43% <0.00%> (+0.22%)` | :arrow_up: |
   
   ------
   
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1550?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [68038f5...30f7b30](https://codecov.io/gh/apache/arrow-rs/pull/1550?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   




[GitHub] [arrow-rs] tustvold merged pull request #1550: Fix reading dictionaries from nested structs in ipc `StreamReader`

Posted by GitBox <gi...@apache.org>.
tustvold merged PR #1550:
URL: https://github.com/apache/arrow-rs/pull/1550




[GitHub] [arrow-rs] viirya commented on a diff in pull request #1550: Fix reading dictionaries from nested structs in ipc `StreamReader`

Posted by GitBox <gi...@apache.org>.
viirya commented on code in PR #1550:
URL: https://github.com/apache/arrow-rs/pull/1550#discussion_r848648354


##########
arrow/src/ipc/reader.rs:
##########
@@ -1394,4 +1407,36 @@ mod tests {
         let arrow_json: ArrowJson = serde_json::from_str(&s).unwrap();
         arrow_json
     }
+
+    #[test]
+    fn test_roundtrip_stream_nested_dict() {
+        let xs = vec!["AA", "BB", "AA", "CC", "BB"];
+        let dict = Arc::new(
+            xs.clone()
+                .into_iter()
+                .collect::<DictionaryArray<datatypes::Int8Type>>(),
+        );
+        let string_array: ArrayRef = Arc::new(StringArray::from(xs.clone()));
+        let struct_array = StructArray::from(vec![
+            (Field::new("f1.1", DataType::Utf8, false), string_array),
+            (
+                Field::new("f1.2_struct", dict.data_type().clone(), false),
+                dict.clone() as ArrayRef,
+            ),

Review Comment:
   nit: as they are put under `f2_struct`, I guess `f2.1` and `f2.2_struct` look more appropriate?


