Posted to issues@spark.apache.org by "Yin Huai (JIRA)" <ji...@apache.org> on 2015/05/27 19:10:17 UTC

[jira] [Resolved] (SPARK-7847) Fix dynamic partition path escaping

     [ https://issues.apache.org/jira/browse/SPARK-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yin Huai resolved SPARK-7847.
-----------------------------
       Resolution: Fixed
    Fix Version/s: 1.4.0

Issue resolved by pull request 6389
[https://github.com/apache/spark/pull/6389]

> Fix dynamic partition path escaping
> -----------------------------------
>
>                 Key: SPARK-7847
>                 URL: https://issues.apache.org/jira/browse/SPARK-7847
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.0, 1.3.1, 1.4.0
>            Reporter: Cheng Lian
>            Assignee: Cheng Lian
>            Priority: Critical
>             Fix For: 1.4.0
>
>
> Background: when writing dynamic partitions, partition values are converted to strings and escaped if necessary. For example, if a partition column {{p}} of type {{String}} has the value {{A/B}}, the corresponding partition directory name is escaped into {{p=A%2fB}}.
> Currently, there are two issues with dynamic partition path escaping. The first is that escaped strings are not unescaped when partition values are read back. This one is easy to fix.
> The second issue is more subtle. In [PR #5381|https://github.com/apache/spark/pull/5381/files#diff-c69b9e667e93b7e4693812cc72abb65fR492] we tried to use {{Path.toUri.toString}} to fix an escaping issue with S3 credentials that contain the {{/}} character. Unfortunately, {{Path.toUri.toString}} also escapes {{%}} characters in the path. Thus, in the dynamic partitioning case mentioned above, {{p=A%2fB}} is double escaped into {{p=A%252fB}} ({{%}} is escaped into {{%25}}).
> The expected behavior is to escape only the URI user info part (S3 key and secret) and leave all other components untouched.
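
The two issues described above can be reproduced outside Spark with a small Python sketch. It is an illustration only, not the code from pull request 6389: the helper names, the reduced set of escaped characters, the lowercase hex (chosen to match the p=A%2fB example), the s3n URI and the sample credentials are all assumptions, and urllib.parse.quote stands in for the percent-encoding that Path.toUri.toString applies.

```python
from urllib.parse import quote, unquote

# Hypothetical, reduced set of characters that are unsafe in a directory name.
# The set actually used by Spark/Hive is larger.
UNSAFE_CHARS = set("/%=:")


def escape_partition_value(value: str) -> str:
    """Percent-escape unsafe characters in a partition value (write side)."""
    return "".join(
        "%{:02x}".format(ord(ch)) if ch in UNSAFE_CHARS else ch
        for ch in value
    )


def unescape_partition_value(escaped: str) -> str:
    """Reverse the escaping when a partition directory is read back."""
    return unquote(escaped)


def build_uri(scheme: str, key: str, secret: str, host: str, path: str) -> str:
    """Escape only the user info part; leave the path exactly as given."""
    user_info = quote(key, safe="") + ":" + quote(secret, safe="")
    return f"{scheme}://{user_info}@{host}{path}"


# Write side: the value A/B becomes the directory name p=A%2fB.
dir_name = "p=" + escape_partition_value("A/B")

# First issue: the read side must unescape, otherwise it sees "A%2fB".
column, raw_value = dir_name.split("=", 1)
value = unescape_partition_value(raw_value)

# Second issue: percent-encoding the whole path again also escapes '%'.
double_escaped = quote(dir_name, safe="/=")

# Expected behavior: only the key and secret are escaped.
uri = build_uri("s3n", "KEY", "se/cret", "bucket", "/table/" + dir_name)

print(dir_name)
print(value)
print(double_escaped)
print(uri)
```

In this sketch the path passes through build_uri unchanged, so the directory name keeps its single level of escaping while the / in the secret is still encoded.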


