Posted to issues@spark.apache.org by "Cheng Lian (JIRA)" <ji...@apache.org> on 2015/05/24 14:47:17 UTC

[jira] [Created] (SPARK-7847) Fix dynamic partition path escaping

Cheng Lian created SPARK-7847:
---------------------------------

             Summary: Fix dynamic partition path escaping
                 Key: SPARK-7847
                 URL: https://issues.apache.org/jira/browse/SPARK-7847
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.3.1, 1.3.0, 1.4.0
            Reporter: Cheng Lian
            Assignee: Cheng Lian
            Priority: Critical


Background: when writing dynamic partitions, partition values are converted to strings and escaped if necessary. For example, if a partition column {{p}} of type {{String}} has the value {{A/B}}, the corresponding partition directory name is escaped to {{p=A%2fB}}.

Currently, there are two issues with dynamic partition path escaping. The first is that escaped strings are not unescaped when partition values are read back. This one is easy to fix.
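
To make the first issue concrete, here is a minimal round-trip sketch in Python. It is only an illustration, not Spark's code: the function names and the set of escaped characters are assumptions, and lowercase hex digits are used only to match the {{p=A%2fB}} example above.

```python
from typing import Tuple
from urllib.parse import unquote

# Assumed set of characters that are unsafe in a directory name (illustrative only).
UNSAFE = set('/%=:"#\\')

def escape_partition_value(value: str) -> str:
    """Percent-encode unsafe characters, using lowercase hex to match the example."""
    return "".join(f"%{ord(c):02x}" if c in UNSAFE else c for c in value)

def partition_dir(column: str, value: str) -> str:
    """Build the directory name written for one dynamic partition."""
    return f"{column}={escape_partition_value(value)}"

def parse_partition_dir(name: str) -> Tuple[str, str]:
    """Read a partition directory name back into (column, value)."""
    column, _, raw = name.partition("=")
    # Skipping this unquote() call reproduces the reported bug:
    # the caller would see "A%2fB" instead of "A/B".
    return column, unquote(raw)
```

With these helpers, {{parse_partition_dir(partition_dir("p", "A/B"))}} returns the original pair, whereas returning {{raw}} directly would leak the escaped form to the reader.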

The second issue is more subtle. In [PR #5381|https://github.com/apache/spark/pull/5381/files#diff-c69b9e667e93b7e4693812cc72abb65fR492] we tried to use {{Path.toUri.toString}} to fix an escaping issue related to S3 credentials containing the {{/}} character. Unfortunately, {{Path.toUri.toString}} also escapes {{%}} characters in the path. Thus, in the dynamic partitioning case mentioned above, {{p=A%2fB}} is double-escaped into {{p=A%252fB}} ({{%}} is escaped into {{%25}}).
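
The double escaping can be reproduced with any percent-encoder that treats {{%}} as a character to encode. The sketch below uses Python's {{urllib.parse.quote}} as a stand-in for the URI string conversion; it is an analogy for the effect described above, not the Hadoop {{Path}} implementation.

```python
from urllib.parse import quote, unquote

partition_dir = "p=A%2fB"  # already escaped once by the partition writer

# quote() encodes '%' like any other reserved character, so the existing
# escape sequence is escaped a second time ('/' and '=' are kept as-is).
double_escaped = quote(partition_dir, safe="/=")
print(double_escaped)  # p=A%252fB

# A single unescape now yields the escaped form, not the original value.
print(unquote(double_escaped))  # p=A%2fB
```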

The expected behavior is to escape only the URI user info part (S3 key and secret) and leave all other components untouched.
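
One way to picture the expected behavior is the Python sketch below. It is not the actual fix: {{build_location}}, the {{s3n}} scheme, the bucket name, and the credentials are hypothetical values chosen for illustration. Only the user info is percent-encoded, and the path, which is assumed to be escaped already, is appended verbatim.

```python
from urllib.parse import quote

def build_location(scheme: str, key: str, secret: str, host: str, path: str) -> str:
    """Assemble a URI string, percent-encoding only the user info part."""
    user_info = f"{quote(key, safe='')}:{quote(secret, safe='')}"
    # The path is appended unchanged, so an existing "%2f" is not re-escaped.
    return f"{scheme}://{user_info}@{host}{path}"

# Hypothetical credentials and bucket; the secret contains a '/' character.
location = build_location("s3n", "AKEY", "se/cret", "bucket", "/table/p=A%2fB")
print(location)  # s3n://AKEY:se%2Fcret@bucket/table/p=A%2fB
```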


