Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/01/20 09:12:11 UTC

[GitHub] [hudi] codope commented on a change in pull request #4628: [HUDI-2521] Add doc for hive sync modes

codope commented on a change in pull request #4628:
URL: https://github.com/apache/hudi/pull/4628#discussion_r788544667



##########
File path: website/docs/syncing_metastore.md
##########
@@ -22,6 +22,61 @@ cd hudi-hive
  [hudi-hive]$ ./run_sync_tool.sh --help
 ```
 
+## Hive Sync Configuration
+
+Please take a look at the arguments that can be passed to `run_sync_tool` in [HiveSyncConfig](https://github.com/apache/hudi/blob/master/hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncConfig.java).
+Among them, the following arguments are required:
+```java
+@Parameter(names = {"--database"}, description = "name of the target database in Hive", required = true);
+@Parameter(names = {"--table"}, description = "name of the target table in Hive", required = true);
+@Parameter(names = {"--base-path"}, description = "Basepath of hoodie table to sync", required = true);
+```
+Corresponding datasource options for the most commonly used hive sync configs are as follows:
+
+| HiveSyncConfig | DataSourceWriteOption | Description |
+| -----------   | ----------- | ----------- |
+| --database       | hoodie.datasource.hive_sync.database       | name of the target database in Hive       |
+| --table   | hoodie.datasource.hive_sync.table        | name of the target table in Hive        |
+| --user   | hoodie.datasource.hive_sync.username        | username for hive metastore        | 
+| --pass   | hoodie.datasource.hive_sync.password        | password for hive metastore        | 
+| --use-jdbc   | hoodie.datasource.hive_sync.use_jdbc        | use JDBC to connect to metastore        | 
+| --jdbc-url   | hoodie.datasource.hive_sync.jdbcurl        | Hive metastore url        |
+| --sync-mode   | hoodie.datasource.hive_sync.mode        | Mode to choose for Hive ops. Valid values are hms, jdbc and hiveql.        |
+| --partitioned-by   | hoodie.datasource.hive_sync.partition_fields        | Comma-separated column names in the table to use for determining hive partition.        |
+| --partition-value-extractor   | hoodie.datasource.hive_sync.partition_extractor_class        | Class which implements PartitionValueExtractor to extract the partition values. `SlashEncodedDayPartitionValueExtractor` by default.        |
+| --jdbc-url   | hoodie.datasource.hive_sync.jdbcurl        | Hive metastore url        |

Review comment:
       thanks.. removed it.



