Posted to issues@spark.apache.org by "Davies Liu (JIRA)" <ji...@apache.org> on 2016/03/03 00:04:18 UTC

[jira] [Resolved] (SPARK-13574) Improve parquet dictionary decoding for strings

     [ https://issues.apache.org/jira/browse/SPARK-13574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Davies Liu resolved SPARK-13574.
--------------------------------
       Resolution: Fixed
    Fix Version/s: 2.0.0

Issue resolved by pull request 11454
[https://github.com/apache/spark/pull/11454]

> Improve parquet dictionary decoding for strings
> -----------------------------------------------
>
>                 Key: SPARK-13574
>                 URL: https://issues.apache.org/jira/browse/SPARK-13574
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Nong Li
>            Priority: Minor
>             Fix For: 2.0.0
>
>
> Currently, the Parquet reader copies the dictionary value for each data value. This is bad for string columns because we explode the dictionary during decode. We should instead have the data values point to the same backing memory.
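
The difference between the two decoding strategies described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the Spark reader or the code from pull request 11454: the names build_dictionary, decode_copying and decode_referencing are hypothetical, and the dictionary is modelled as one byte buffer plus (offset, length) slots.

```python
from typing import List, Tuple

Slot = Tuple[int, int]  # hypothetical layout: (offset, length) into the backing buffer


def build_dictionary(values: List[str]) -> Tuple[bytes, List[Slot]]:
    """Pack the distinct strings into one backing buffer and record their slots."""
    buf = bytearray()
    slots: List[Slot] = []
    for v in values:
        raw = v.encode("utf-8")
        slots.append((len(buf), len(raw)))
        buf.extend(raw)
    return bytes(buf), slots


def decode_copying(buf: bytes, slots: List[Slot], ids: List[int]) -> List[bytes]:
    """Materialize a fresh copy of the dictionary bytes for every data value."""
    view = memoryview(buf)
    out: List[bytes] = []
    for i in ids:
        off, n = slots[i]
        out.append(bytes(view[off:off + n]))  # bytes(...) allocates and copies
    return out


def decode_referencing(buf: bytes, slots: List[Slot], ids: List[int]) -> List[memoryview]:
    """Return zero-copy views; every data value points into the same buffer."""
    view = memoryview(buf)
    out: List[memoryview] = []
    for i in ids:
        off, n = slots[i]
        out.append(view[off:off + n])  # slicing a memoryview does not copy
    return out


if __name__ == "__main__":
    buf, slots = build_dictionary(["spark", "parquet"])
    ids = [0, 1, 0, 0, 1]  # dictionary ids as they would appear in a data page
    for v in decode_referencing(buf, slots, ids):
        print(bytes(v).decode("utf-8"))
```

With the copying variant, memory use grows with the number of rows; with the referencing variant it grows with the size of the dictionary, at the cost of keeping the backing buffer alive for as long as any value refers to it.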



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
