Posted to issues@arrow.apache.org by "VARSHAJOSHY (via GitHub)" <gi...@apache.org> on 2023/09/08 14:03:29 UTC

[GitHub] [arrow] VARSHAJOSHY opened a new issue, #37631: pyarrow.lib.ArrowInvalid: Expected to read 827474256 metadata bytes, but only read 6635

VARSHAJOSHY opened a new issue, #37631:
URL: https://github.com/apache/arrow/issues/37631

   ### Describe the usage question you have. Please include as many useful details as possible.
   
   
   When I try to read a Parquet byte stream with `RecordBatchStreamReader`, I get the error in the title.
   Here `data` is the Parquet byte stream:
   reader = pa.RecordBatchStreamReader(data)
   
   ### Component(s)
   
   Parquet


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] jorisvandenbossche closed issue #37631: [Python][Parquet] pyarrow.lib.ArrowInvalid: Expected to read 827474256 metadata bytes, but only read 6635

Posted by "jorisvandenbossche (via GitHub)" <gi...@apache.org>.
jorisvandenbossche closed issue #37631: [Python][Parquet] pyarrow.lib.ArrowInvalid: Expected to read 827474256 metadata bytes, but only read 6635
URL: https://github.com/apache/arrow/issues/37631




[GitHub] [arrow] kou commented on issue #37631: [Python][Parquet] pyarrow.lib.ArrowInvalid: Expected to read 827474256 metadata bytes, but only read 6635

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on issue #37631:
URL: https://github.com/apache/arrow/issues/37631#issuecomment-1712267512

   You can't use `pyarrow.RecordBatchStreamReader()` for Parquet data.
   You can use `pyarrow.parquet.read_table()`.
   See https://arrow.apache.org/docs/python/parquet.html for details.

