Posted to issues@hive.apache.org by "Janaki Lahorani (JIRA)" <ji...@apache.org> on 2018/03/01 19:56:00 UTC

[jira] [Assigned] (HIVE-18603) Use Hash For Partition HDFS File Path

     [ https://issues.apache.org/jira/browse/HIVE-18603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Janaki Lahorani reassigned HIVE-18603:
--------------------------------------

    Assignee: Janaki Lahorani

> Use Hash For Partition HDFS File Path
> -------------------------------------
>
>                 Key: HIVE-18603
>                 URL: https://issues.apache.org/jira/browse/HIVE-18603
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 1.2.0, 2.3.0, 3.0.0, 2.4.0
>            Reporter: BELUGA BEHR
>            Assignee: Janaki Lahorani
>            Priority: Minor
>
> Currently, for partitioned tables, Hive uses the literal value of each partition in the HDFS file path.  Instead, perhaps we can use a hash value so that:
>  
>  # The partition values are obscured to a casual observer in HDFS
>  # There is no chance of a very long HDFS file name when a partition value is very long
>  # There is no need to worry about special characters in the partition path name, as the hash value would contain only hex characters.
>  
> The suggestion here is that we retain the partition values, just as is done now, but the default HDFS location for each partition will use the hash of the value instead of the value itself.
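
The proposal can be illustrated with a minimal sketch. It is not Hive code: the function names are hypothetical, SHA-256 is an assumed choice of hash (the issue does not name one), keeping the partition key in the directory name is an assumption, and URL-style escaping is only a stand-in for however Hive escapes special characters today.

```python
import hashlib
from urllib.parse import quote


def literal_partition_dir(key: str, value: str) -> str:
    """Approximates the current layout: key=value, with special characters escaped.

    urllib's percent-encoding is a stand-in here, not Hive's actual escaping.
    """
    return f"{key}={quote(value, safe='')}"


def hashed_partition_dir(key: str, value: str) -> str:
    """Hypothetical layout from the proposal: key=<hex digest of the value>.

    SHA-256 is an assumption; any hash with a hex encoding would give a
    fixed-length name made only of the characters 0-9 and a-f.
    """
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return f"{key}={digest}"


if __name__ == "__main__":
    long_value = "x/y z" * 200  # long value containing special characters
    print(len(literal_partition_dir("region", long_value)))  # grows with the value
    print(len(hashed_partition_dir("region", long_value)))   # fixed length
```

Because a hash cannot be reversed, a layout like this only works if the original partition values are kept elsewhere, which is what the description proposes.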



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)