Posted to dev@parquet.apache.org by "Uwe L. Korn (JIRA)" <ji...@apache.org> on 2018/02/18 20:37:00 UTC

[jira] [Updated] (PARQUET-1036) parquet file created via pyarrow 0.4.0 ; version 1.0 - incompatible with Spark

     [ https://issues.apache.org/jira/browse/PARQUET-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe L. Korn updated PARQUET-1036:
---------------------------------
    Fix Version/s: cpp-1.5.0

> parquet file created via pyarrow 0.4.0 ; version 1.0 - incompatible with Spark
> ------------------------------------------------------------------------------
>
>                 Key: PARQUET-1036
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1036
>             Project: Parquet
>          Issue Type: Bug
>            Reporter: Ashima Sood
>            Priority: Blocker
>             Fix For: cpp-1.5.0
>
>
> Spark SQL is unable to read the Parquet file and shows null values, whereas Hive reads the values fine.
> 17/06/19 17:50:36 WARN CorruptStatistics: Ignoring statistics because created_by could not be parsed (see PARQUET-251): parquet-cpp version 1.0.0
> org.apache.parquet.VersionParser$VersionParseException: Could not parse created_by: parquet-cpp version 1.0.0 using format: (.+) version ((.*) )?\(build ?(.*)\)
>                 at org.apache.parquet.VersionParser.parse(VersionParser.java:112)
>                 at org.apache.parquet.CorruptStatistics.shouldIgnoreStatistics(CorruptStatistics.java:60)
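
The parse failure in the log above can be reproduced outside Spark. The following is a minimal sketch, not the parquet-mr implementation: it copies the format string from the exception message into Python's `re` module and assumes the parser requires the whole created_by string to match. The helper name `parse_created_by` and the second sample string are hypothetical, chosen only for illustration.

```python
import re

# Format string copied verbatim from the VersionParseException message.
CREATED_BY_FORMAT = r"(.+) version ((.*) )?\(build ?(.*)\)"


def parse_created_by(created_by):
    """Return (application, version, build) or None if the string does not match.

    Hypothetical helper: it assumes a whole-string match is required.
    """
    match = re.fullmatch(CREATED_BY_FORMAT, created_by)
    if match is None:
        return None
    # Group 2 is the version plus its trailing space, so group 3 is used instead.
    return match.group(1), match.group(3), match.group(4)


# The string reported in the log has no "(build ...)" suffix, so it cannot match.
print(parse_created_by("parquet-cpp version 1.0.0"))

# A made-up string that does carry the suffix, for comparison.
print(parse_created_by("parquet-mr version 1.8.1 (build abc)"))
```

The first call returns None because the pattern makes the "(build ...)" part mandatory, which is consistent with the "could not be parsed" warning quoted above. Only the version number itself is optional in this format.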


