Posted to dev@parquet.apache.org by "Uwe L. Korn (JIRA)" <ji...@apache.org> on 2018/06/29 13:13:00 UTC
[jira] [Commented] (PARQUET-1343) Unable to read a parquet file
[ https://issues.apache.org/jira/browse/PARQUET-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527609#comment-16527609 ]
Uwe L. Korn commented on PARQUET-1343:
--------------------------------------
This sounds like your file really did get corrupted. If it was working a few days ago and you have not changed artifact versions since then, this looks like a data problem rather than a software problem.
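One way to check this without pyarrow is to inspect the file framing directly. The Parquet format puts the 4-byte magic "PAR1" at both the start and the end of a file, with a 4-byte little-endian footer length immediately before the trailing magic. The sketch below checks only those bytes; the function name check_parquet_footer is made up for illustration, and it assumes an unencrypted file on local disk. It does not parse the footer metadata itself, so a clean result does not prove the file is intact.

```python
import os
import struct
import sys

PARQUET_MAGIC = b"PAR1"


def check_parquet_footer(path):
    """Return a list of problems found in the file's header/footer framing.

    Hypothetical helper for illustration; it only looks at the magic bytes
    and the declared footer length, not at the footer contents.
    """
    size = os.path.getsize(path)
    # Smallest possible layout: header magic (4) + footer length (4) + trailing magic (4).
    if size < 12:
        return ["file is only %d bytes, too small to be a Parquet file" % size]

    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-8, os.SEEK_END)
        tail = f.read(8)

    problems = []
    if head != PARQUET_MAGIC:
        problems.append("leading magic is %r, expected %r" % (head, PARQUET_MAGIC))

    footer_len = struct.unpack("<I", tail[:4])[0]
    if tail[4:] != PARQUET_MAGIC:
        # Typical for a truncated or partially written file.
        problems.append("trailing magic is %r, expected %r" % (tail[4:], PARQUET_MAGIC))
    elif footer_len + 12 > size:
        problems.append(
            "declared footer length %d does not fit in a %d-byte file" % (footer_len, size)
        )
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    found = check_parquet_footer(sys.argv[1])
    print("\n".join(found) if found else "header and footer framing look sane")
```

If the trailing magic is missing, the file was most likely truncated or overwritten after it was written (for example by an interrupted write or copy), and the CSV would need to be converted again.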
> Unable to read a parquet file
> ------------------------------
>
> Key: PARQUET-1343
> URL: https://issues.apache.org/jira/browse/PARQUET-1343
> Project: Parquet
> Issue Type: Bug
> Environment: Linux x86
> anaconda python 3.6
> pyarrow version 0.9.0
> Reporter: Alok
> Priority: Minor
>
> I am unable to read a Parquet file that was created by converting a CSV file to Parquet using pyarrow. The error is as follows:
> ArrowIOError                              Traceback (most recent call last)
> <timed exec> in <module>()
>
> ~/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py in read_table(source, columns, nthreads, metadata, use_pandas_metadata)
>     937         return fs.read_parquet(source, columns=columns, metadata=metadata)
>     938
> --> 939     pf = ParquetFile(source, metadata=metadata)
>     940     return pf.read(columns=columns, nthreads=nthreads,
>     941                    use_pandas_metadata=use_pandas_metadata)
>
> ~/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py in __init__(self, source, metadata, common_metadata)
>      62         self.reader = ParquetReader()
>      63         source = _ensure_file(source)
> ---> 64         self.reader.open(source, metadata=metadata)
>      65         self.common_metadata = common_metadata
>      66         self._nested_paths_by_prefix = self._build_nested_paths()
>
> _parquet.pyx in pyarrow._parquet.ParquetReader.open()
>
> error.pxi in pyarrow.lib.check_status()
>
> ArrowIOError: Invalid parquet file. Corrupt footer.
>
> I was able to read and write Parquet files earlier (a few days ago), and now it has stopped working.