Posted to dev@parquet.apache.org by "Ryan Blue (JIRA)" <ji...@apache.org> on 2018/05/24 18:25:00 UTC

[jira] [Updated] (PARQUET-1309) Parquet Java uses incorrect stats and dictionary filter properties

     [ https://issues.apache.org/jira/browse/PARQUET-1309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Blue updated PARQUET-1309:
-------------------------------
    Description: In SPARK-24251, we found that the changes to use HadoopReadOptions accidentally swapped the [properties that enable stats and dictionary filters|https://github.com/apache/parquet-mr/blob/8bbc6cb95fd9b4b9e86c924ca1e40fd555ecac1d/parquet-hadoop/src/main/java/org/apache/parquet/HadoopReadOptions.java#L83]. Both filters are enabled by default, so it is unlikely that anyone will need to turn them off, and there is an easy workaround, but we should fix the properties for 1.10.1. This doesn't affect the 1.8.x or 1.9.x releases (Spark 2.3.x is on 1.8.x).  (was: the same description, except that it targeted the fix for 1.10.0.)

> Parquet Java uses incorrect stats and dictionary filter properties
> ------------------------------------------------------------------
>
>                 Key: PARQUET-1309
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1309
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>            Reporter: Ryan Blue
>            Priority: Major
>             Fix For: 1.10.1
>
>
> In SPARK-24251, we found that the changes to use HadoopReadOptions accidentally swapped the [properties that enable stats and dictionary filters|https://github.com/apache/parquet-mr/blob/8bbc6cb95fd9b4b9e86c924ca1e40fd555ecac1d/parquet-hadoop/src/main/java/org/apache/parquet/HadoopReadOptions.java#L83]. Both filters are enabled by default, so it is unlikely that anyone will need to turn them off, and there is an easy workaround, but we should fix the properties for 1.10.1. This doesn't affect the 1.8.x or 1.9.x releases (Spark 2.3.x is on 1.8.x).
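
To illustrate the kind of mix-up described above, here is a minimal, self-contained sketch. It does not use the real parquet-mr or Hadoop classes: a plain Map stands in for the Hadoop Configuration, and the class and method names (SwappedFilterProps, swappedRead, intendedRead) are hypothetical. The two property names are assumptions based on the ParquetInputFormat constants and are not stated in this message, so check them against your version. The sketch also assumes the two properties were read crosswise, each filter consulting the other filter's property.

```java
import java.util.HashMap;
import java.util.Map;

public class SwappedFilterProps {
    // Assumed property names (not given in the issue text); verify before relying on them.
    public static final String STATS_FILTERING_ENABLED = "parquet.filter.stats.enabled";
    public static final String DICTIONARY_FILTERING_ENABLED = "parquet.filter.dictionary.enabled";

    // Minimal stand-in for a Configuration-style getBoolean(key, default) lookup.
    public static boolean getBoolean(Map<String, String> conf, String key, boolean dflt) {
        String value = conf.get(key);
        return value == null ? dflt : Boolean.parseBoolean(value);
    }

    // Models the mix-up: each filter reads the other filter's property.
    // Returns {useStatsFilter, useDictionaryFilter}.
    public static boolean[] swappedRead(Map<String, String> conf) {
        boolean useStatsFilter = getBoolean(conf, DICTIONARY_FILTERING_ENABLED, true);
        boolean useDictionaryFilter = getBoolean(conf, STATS_FILTERING_ENABLED, true);
        return new boolean[] {useStatsFilter, useDictionaryFilter};
    }

    // Models the intended behavior: each filter reads its own property.
    public static boolean[] intendedRead(Map<String, String> conf) {
        boolean useStatsFilter = getBoolean(conf, STATS_FILTERING_ENABLED, true);
        boolean useDictionaryFilter = getBoolean(conf, DICTIONARY_FILTERING_ENABLED, true);
        return new boolean[] {useStatsFilter, useDictionaryFilter};
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // The user tries to turn off only the stats filter.
        conf.put(STATS_FILTERING_ENABLED, "false");

        boolean[] swapped = swappedRead(conf);
        boolean[] intended = intendedRead(conf);
        System.out.println("swapped:  stats=" + swapped[0] + " dictionary=" + swapped[1]);
        System.out.println("intended: stats=" + intended[0] + " dictionary=" + intended[1]);

        // One possible workaround (an assumption, not spelled out in the issue):
        // set both properties to the same value so the swap has no effect.
        conf.put(DICTIONARY_FILTERING_ENABLED, "false");
        boolean[] both = swappedRead(conf);
        System.out.println("both off: stats=" + both[0] + " dictionary=" + both[1]);
    }
}
```

In this model, disabling only the stats property leaves the stats filter on and turns the dictionary filter off instead. Setting both properties to the same value, or setting the opposite property, gives the expected result under the swapped lookup. Whether that matches the workaround the reporter had in mind is not stated here.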


