Posted to issues@arrow.apache.org by "VARSHAJOSHY (via GitHub)" <gi...@apache.org> on 2023/09/08 14:03:29 UTC
[GitHub] [arrow] VARSHAJOSHY opened a new issue, #37631: pyarrow.lib.ArrowInvalid: Expected to read 827474256 metadata bytes, but only read 6635
VARSHAJOSHY opened a new issue, #37631:
URL: https://github.com/apache/arrow/issues/37631
### Describe the usage question you have. Please include as many useful details as possible.
When I try to read a Parquet byte stream with `RecordBatchStreamReader`, I get the error in the title.
import pyarrow as pa
data = ...  # Parquet byte stream
reader = pa.RecordBatchStreamReader(data)
### Component(s)
Parquet
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] jorisvandenbossche closed issue #37631: [Python][Parquet] pyarrow.lib.ArrowInvalid: Expected to read 827474256 metadata bytes, but only read 6635
Posted by "jorisvandenbossche (via GitHub)" <gi...@apache.org>.
jorisvandenbossche closed issue #37631: [Python][Parquet] pyarrow.lib.ArrowInvalid: Expected to read 827474256 metadata bytes, but only read 6635
URL: https://github.com/apache/arrow/issues/37631
[GitHub] [arrow] kou commented on issue #37631: [Python][Parquet] pyarrow.lib.ArrowInvalid: Expected to read 827474256 metadata bytes, but only read 6635
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on issue #37631:
URL: https://github.com/apache/arrow/issues/37631#issuecomment-1712267512
You can't use `pyarrow.RecordBatchStreamReader()` for Parquet data.
You can use `pyarrow.parquet.read_table()`.
See https://arrow.apache.org/docs/python/parquet.html for details.