Posted to commits@hudi.apache.org by "lvhu (Jira)" <ji...@apache.org> on 2023/02/16 04:10:00 UTC

[jira] [Comment Edited] (HUDI-5810) Add hash partition

    [ https://issues.apache.org/jira/browse/HUDI-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689494#comment-17689494 ] 

lvhu edited comment on HUDI-5810 at 2/16/23 4:09 AM:
-----------------------------------------------------

PR 7975 implements hash partitioning.

[https://github.com/apache/hudi/pull/7975]

For how to use hash partitioning with the Spark data source, refer to hudi/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala#testHashPartition


was (Author: JIRAUSER297487):
The PR 7975  implements hash partitioning.

https://github.com/apache/hudi/pull/7975

> Add hash partition
> ------------------
>
>                 Key: HUDI-5810
>                 URL: https://issues.apache.org/jira/browse/HUDI-5810
>             Project: Apache Hudi
>          Issue Type: New Feature
>          Components: spark
>            Reporter: lvhu
>            Assignee: lvhu
>            Priority: Major
>              Labels: pull-request-available
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> It is often difficult to find an appropriate partition key in existing data. Hash partitioning addresses this problem.
> When hash.partition.fields is specified and partition.fields contains _hoodie_hash_partition, a column named _hoodie_hash_partition is added to the table as one of the partition keys.
> If a query contains predicates on the fields listed in hash.partition.fields, a _hoodie_hash_partition = X predicate is automatically added to the query for partition pruning.
> PR 7975 implements hash partitioning.
> [https://github.com/apache/hudi/pull/7975]
> For how to use hash partitioning with the Spark data source, refer to hudi/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala#testHashPartition
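
To illustrate the idea in the description, here is a minimal Python sketch of how a hash partition column could be derived at write time and how a matching pruning predicate could be derived at query time. It is not the code from PR 7975: the bucket count (NUM_BUCKETS), the CRC32 hash, and the helper names hash_partition, add_hash_partition, and pruning_predicate are all assumptions made for illustration. Only the column name _hoodie_hash_partition and the hash.partition.fields concept come from the issue.

```python
import zlib

# Hypothetical bucket count; the issue does not say how it is configured.
NUM_BUCKETS = 16


def hash_partition(values, num_buckets=NUM_BUCKETS):
    """Map the values of the hash fields to a bucket number.

    CRC32 is an assumption chosen because it is deterministic across runs;
    the hash function actually used by Hudi may differ.
    """
    key = "\x00".join(str(v) for v in values).encode("utf-8")
    return zlib.crc32(key) % num_buckets


def add_hash_partition(row, hash_fields, num_buckets=NUM_BUCKETS):
    """Return a copy of the row with a _hoodie_hash_partition column added."""
    out = dict(row)
    out["_hoodie_hash_partition"] = hash_partition(
        [row[f] for f in hash_fields], num_buckets
    )
    return out


def pruning_predicate(equality_predicates, hash_fields, num_buckets=NUM_BUCKETS):
    """Derive the extra (column, value) predicate used for partition pruning.

    equality_predicates maps column name -> literal for "col = literal" filters.
    Returns None unless every hash field is pinned by an equality filter,
    because the bucket cannot be computed otherwise.
    """
    if not all(f in equality_predicates for f in hash_fields):
        return None
    bucket = hash_partition(
        [equality_predicates[f] for f in hash_fields], num_buckets
    )
    return ("_hoodie_hash_partition", bucket)


if __name__ == "__main__":
    hash_fields = ["user_id"]  # stands in for hash.partition.fields
    written = add_hash_partition({"user_id": 42, "name": "a"}, hash_fields)
    print(written)
    # A query filtering on user_id = 42 gets the same bucket added as a filter.
    print(pruning_predicate({"user_id": 42}, hash_fields))
    # A query that does not filter on user_id cannot be pruned this way.
    print(pruning_predicate({"name": "a"}, hash_fields))
```

Because the write path and the query path use the same function, a query that filters on every hash field reads only the partition holding the matching bucket. A query that filters on other columns gets no extra predicate and scans all hash partitions.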



--
This message was sent by Atlassian Jira
(v8.20.10#820010)