Posted to github@arrow.apache.org by "davidsaslawsky-trackinsight (via GitHub)" <gi...@apache.org> on 2023/06/11 14:37:07 UTC

[GitHub] [arrow-rs] davidsaslawsky-trackinsight opened a new issue, #4396: Parquet error: Not all children array length are the same! when using RowSelection to read a parquet file

davidsaslawsky-trackinsight opened a new issue, #4396:
URL: https://github.com/apache/arrow-rs/issues/4396

   **Describe the bug**
   
   Reading the attached parquet file with a `RowSelection` that selects more than one row fails with `Parquet error: Not all children array length are the same!`.
   
   **To Reproduce**
   Run the attached Rust program to read the parquet file.
   
   This triggers the error:
   ```rust
   let selection = RowSelection::from(vec![
       RowSelector::skip(10),
       RowSelector::select(1),
       RowSelector::skip(1),
       RowSelector::select(1),
       RowSelector::skip(7),
       RowSelector::select(1)
   ]);
   ```
   
   However, reading works without a `RowSelection`, or with a trivial row selection such as:
   ```rust
   let selection = RowSelection::from(vec![
       RowSelector::skip(10),
       RowSelector::select(1)
   ]);
   ```
   
   **Expected behavior**
   
   The read should succeed without an error and return only the selected rows.
   
   **Additional context**
   
   If you remove the `ticker` column from the file, the error goes away, but the row selection is wrong: in that case the first `RowSelection` above returns 11 rows instead of the expected 3.
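   
   For reference, the expected result of the first selection can be worked out independently of the reader. The sketch below is a std-only model of skip/select semantics, not the `parquet` crate's implementation: `Selector` and `selected_rows` are hypothetical names introduced for illustration, and rows are assumed to be numbered from zero.
   
   ```rust
   /// Std-only stand-in for a skip/select row selector (hypothetical model,
   /// not the `parquet` crate's `RowSelector` type).
   #[derive(Debug, Clone, Copy)]
   enum Selector {
       Skip(usize),
       Select(usize),
   }
   
   /// Returns the zero-based row indices kept by a sequence of selectors.
   fn selected_rows(selectors: &[Selector]) -> Vec<usize> {
       let mut rows = Vec::new();
       let mut cursor = 0; // index of the next row to consider
       for selector in selectors {
           match *selector {
               // Skipped rows only advance the cursor.
               Selector::Skip(n) => cursor += n,
               // Selected rows are recorded, then the cursor advances.
               Selector::Select(n) => {
                   rows.extend(cursor..cursor + n);
                   cursor += n;
               }
           }
       }
       rows
   }
   
   fn main() {
       // Same shape as the failing selection above.
       let selection = [
           Selector::Skip(10),
           Selector::Select(1),
           Selector::Skip(1),
           Selector::Select(1),
           Selector::Skip(7),
           Selector::Select(1),
       ];
       let rows = selected_rows(&selection);
       // Prints: 3 rows selected: [10, 12, 20]
       println!("{} rows selected: {:?}", rows.len(), rows);
   }
   ```
   
   Under this model the selection keeps exactly three rows, which is why a result of 11 rows indicates that the skips after the first `select` are not being applied.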
   
   I'm new to Rust, so I haven't been able to narrow the problem down further. Please let me know if there's anything I can do to help.
   
   [Archive.zip](https://github.com/apache/arrow-rs/files/11714804/Archive.zip)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] davidsaslawsky-trackinsight closed issue #4396: Parquet error: Not all children array length are the same! when using RowSelection to read a parquet file

Posted by "davidsaslawsky-trackinsight (via GitHub)" <gi...@apache.org>.
davidsaslawsky-trackinsight closed issue #4396: Parquet error: Not all children array length are the same! when using RowSelection to read a parquet file
URL: https://github.com/apache/arrow-rs/issues/4396




[GitHub] [arrow-rs] AdamGS commented on issue #4396: Parquet error: Not all children array length are the same! when using RowSelection to read a parquet file

Posted by "AdamGS (via GitHub)" <gi...@apache.org>.
AdamGS commented on issue #4396:
URL: https://github.com/apache/arrow-rs/issues/4396#issuecomment-1586773913

   This seems to be related to [this issue](https://github.com/apache/arrow-rs/issues/4368), which was recently fixed. Try pulling a version of the `parquet` crate from the `master` branch.




[GitHub] [arrow-rs] davidsaslawsky-trackinsight commented on issue #4396: Parquet error: Not all children array length are the same! when using RowSelection to read a parquet file

Posted by "davidsaslawsky-trackinsight (via GitHub)" <gi...@apache.org>.
davidsaslawsky-trackinsight commented on issue #4396:
URL: https://github.com/apache/arrow-rs/issues/4396#issuecomment-1586820619

   I confirm that both the `ParquetError` and the wrong row count are fixed on the `master` branch. Thanks for your help :)

