Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/02 19:27:59 UTC

[GitHub] [arrow-rs] andrei-ionescu opened a new issue #993: Reading List(...) into arrow not supported yet

andrei-ionescu opened a new issue #993:
URL: https://github.com/apache/arrow-rs/issues/993


   **Describe the bug**
   
   When trying to read parquet files with deeply nested fields we end up with the following error:
   
   ```
   Parquet reader thread terminated due to error: ParquetError(ArrowError("reading List(GroupType {
   ...
   }) into arrow not supported yet"))
   ```
   
   **To Reproduce**
   
   This is easily visible from the code found at [array_reader.rs#L1516-L1522](https://github.com/apache/arrow-rs/blob/master/parquet/src/arrow/array_reader.rs#L1516-L1522)
   
   **Expected behavior**
   
   To have support for reading nested parquet files into arrow.
   
   **Additional context**
   
   This issue, in my particular case, has been hidden under the 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] chauhanVritul edited a comment on issue #993: Reading List(...) into arrow not supported yet

Posted by GitBox <gi...@apache.org>.
chauhanVritul edited a comment on issue #993:
URL: https://github.com/apache/arrow-rs/issues/993#issuecomment-1057682186


   Here is a sample parquet file [https://github.com/chauhanVritul/sampleparquet/blob/main/part-00000-8e5acb24-eb4e-491c-8c85-88799f25d1f0-c000.snappy.parquet](https://github.com/chauhanVritul/sampleparquet/blob/main/part-00000-8e5acb24-eb4e-491c-8c85-88799f25d1f0-c000.snappy.parquet) 





[GitHub] [arrow-rs] kesavkolla commented on issue #993: Reading List(...) into arrow not supported yet

Posted by GitBox <gi...@apache.org>.
kesavkolla commented on issue #993:
URL: https://github.com/apache/arrow-rs/issues/993#issuecomment-1055922876


   Nested data structure support has been pending for a long time, and I am also eagerly waiting for it. I have fairly large JSON data that I wrote to parquet files with Apache Spark. I wanted to read them via Rust arrow but have had no success.





[GitHub] [arrow-rs] Igosuki commented on issue #993: Reading List(...) into arrow not supported yet

Posted by GitBox <gi...@apache.org>.
Igosuki commented on issue #993:
URL: https://github.com/apache/arrow-rs/issues/993#issuecomment-1044356976


   @alamb https://github.com/apache/arrow-rs/blob/master/parquet/src/arrow/array_reader.rs#L1622





[GitHub] [arrow-rs] Igosuki commented on issue #993: Reading List(...) into arrow not supported yet

Posted by GitBox <gi...@apache.org>.
Igosuki commented on issue #993:
URL: https://github.com/apache/arrow-rs/issues/993#issuecomment-1057292937


   I have a parquet file that uses List(List(Float64)): https://github.com/Igosuki/arrow2/blob/main/part-00000-b4749aa1-94e4-4ddb-bab2-954c4d3a290f.c000.snappy.parquet
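
For context on the type, here is a minimal sketch of how a List(List(Float64)) column is laid out in Arrow's columnar model: one flat buffer of leaf values plus one offsets buffer per nesting level. It uses only the Rust standard library. `NestedListColumn`, `flatten` and `row` are hypothetical names made up for this illustration and are not part of the arrow or parquet crates, and validity bitmaps (nulls) are left out for brevity.

```rust
/// Simplified Arrow-style layout for a List(List(Float64)) column.
/// Hypothetical illustration type, not an arrow-rs API; nulls are ignored.
#[derive(Debug, PartialEq)]
struct NestedListColumn {
    /// outer_offsets[i]..outer_offsets[i + 1] selects the inner lists of row i.
    outer_offsets: Vec<i32>,
    /// inner_offsets[j]..inner_offsets[j + 1] selects the values of inner list j.
    inner_offsets: Vec<i32>,
    /// All leaf values, stored contiguously.
    values: Vec<f64>,
}

/// Flatten row-oriented nested lists into offsets plus a single value buffer.
fn flatten(rows: &[Vec<Vec<f64>>]) -> NestedListColumn {
    let mut outer_offsets = vec![0i32];
    let mut inner_offsets = vec![0i32];
    let mut values = Vec::new();
    for row in rows {
        for inner in row {
            values.extend_from_slice(inner);
            // Inner offsets count leaf values seen so far.
            inner_offsets.push(values.len() as i32);
        }
        // Outer offsets count inner lists seen so far, not leaf values.
        outer_offsets.push((inner_offsets.len() - 1) as i32);
    }
    NestedListColumn { outer_offsets, inner_offsets, values }
}

/// Rebuild row `i` from the flattened layout.
fn row(col: &NestedListColumn, i: usize) -> Vec<Vec<f64>> {
    let start = col.outer_offsets[i] as usize;
    let end = col.outer_offsets[i + 1] as usize;
    (start..end)
        .map(|j| {
            let s = col.inner_offsets[j] as usize;
            let e = col.inner_offsets[j + 1] as usize;
            col.values[s..e].to_vec()
        })
        .collect()
}

fn main() {
    // Three rows: two inner lists, an empty row, and a row with an empty inner list.
    let rows: Vec<Vec<Vec<f64>>> = vec![
        vec![vec![1.0, 2.0], vec![3.0]],
        vec![],
        vec![vec![], vec![4.0, 5.0, 6.0]],
    ];
    let col = flatten(&rows);
    println!("{:?}", col);
    // Round-trip check: every row can be recovered from the flat buffers.
    for (i, expected) in rows.iter().enumerate() {
        assert_eq!(&row(&col, i), expected);
    }
}
```

A reader that supports this type has to produce both offset buffers from the parquet column, which is why each extra nesting level needs explicit handling.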





[GitHub] [arrow-rs] alamb commented on issue #993: Reading List(...) into arrow not supported yet

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #993:
URL: https://github.com/apache/arrow-rs/issues/993#issuecomment-1057104928


   Hi @kesavkolla  -- could you possibly share an example file we could use to test any fix?





[GitHub] [arrow-rs] alamb commented on issue #993: Reading List(...) into arrow not supported yet

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #993:
URL: https://github.com/apache/arrow-rs/issues/993#issuecomment-1018755432


   This may have been fixed as there has been significant work on reading of nested structures by @helgikrs and others





[GitHub] [arrow-rs] Igosuki commented on issue #993: Reading List(...) into arrow not supported yet

Posted by GitBox <gi...@apache.org>.
Igosuki commented on issue #993:
URL: https://github.com/apache/arrow-rs/issues/993#issuecomment-1057880187


   Yes, this use case is supported in datafusion. For instance, you can read nested avro and json schemas, so the issue should be raised with arrow-rs. I added a comment with our files here: https://github.com/apache/arrow-rs/issues/720

