Posted to commits@hudi.apache.org by "Pratyaksh Sharma (Jira)" <ji...@apache.org> on 2019/12/30 15:24:00 UTC

[jira] [Comment Edited] (HUDI-486) Improve documentation for using HiveIncrementalPuller

    [ https://issues.apache.org/jira/browse/HUDI-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005392#comment-17005392 ] 

Pratyaksh Sharma edited comment on HUDI-486 at 12/30/19 3:23 PM:
-----------------------------------------------------------------

Here is the working command:

java -cp /tmp/jars/commons-io-2.6.jar:/tmp/jars/protobuf-java-2.5.0.jar:/tmp/jars/commons-cli-1.2.jar:/tmp/jars/htrace-core-3.1.0-incubating.jar:/tmp/jars/hadoop-hdfs-2.7.3.jar:/tmp/jars/hive-serde-2.3.1.jar:/tmp/jars/hive-common-2.3.1.jar:/tmp/jars/hive-service-2.3.1.jar:/tmp/jars/hive-service-rpc-2.3.1.jar:/tmp/jars/httpcore-4.3.2.jar:/tmp/jars/libthrift-0.12.0.jar:/tmp/jars/slf4j-api-1.7.16.jar:/tmp/jars/hadoop-auth-2.7.3.jar:/var/hoodie/ws/docker/demo/config/commons-lang-2.6.jar:/var/hoodie/ws/docker/demo/config/commons-configuration-1.6.jar:/var/hoodie/ws/docker/demo/config/commons-collections-3.2.2.jar:/var/hoodie/ws/docker/demo/config/guava-15.0.jar:/var/hoodie/ws/docker/demo/config/commons-logging-1.1.3.jar:/var/hoodie/ws/docker/demo/config/hadoop-common-2.7.3.jar:/var/hoodie/ws/docker/demo/config/hive-jdbc-2.3.1.jar:/var/hoodie/ws/docker/demo/config/log4j-1.2.17.jar:/var/hoodie/ws/docker/demo/config/antlr-runtime-3.5.2.jar:$HUDI_UTILITIES_BUNDLE org.apache.hudi.utilities.HiveIncrementalPuller --hiveUrl jdbc:hive2://hiveserver:10000 --hiveUser hive --hivePass hive --extractSQLFile /var/hoodie/ws/docker/demo/config/incr_pull.txt --sourceDb default --sourceTable stock_ticks_cow --targetDb tmp --targetTable tempTable --fromCommitTime 0 --maxCommits 1

Options:
 # Mention the above command somewhere in the documentation
 # List the jars needed
 # Create a script similar to the one we have for HiveSyncTool

We can choose any one of the above. [~vinoth] WDYT?
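
For the script option, a minimal sketch of what such a wrapper might look like is given here. It is hypothetical and not an existing Hudi script: the function names build_classpath and run_incremental_puller are made up for illustration, and it assumes the listed directories hold only the jars the tool needs. Note that it picks up jars in glob (alphabetical) order, not the order used in the command above, which could matter if two jars ship conflicting classes.

```shell
#!/bin/sh
# Hypothetical wrapper sketch; function names are illustrative assumptions.

# Print a colon-separated classpath made of every *.jar found directly
# under the given directories. Directories without jars contribute nothing.
build_classpath() {
  local cp="" dir jar
  for dir in "$@"; do
    for jar in "$dir"/*.jar; do
      [ -e "$jar" ] || continue   # the glob matched nothing in this directory
      cp="${cp:+$cp:}$jar"
    done
  done
  printf '%s\n' "$cp"
}

# Run HiveIncrementalPuller. The first argument is a prebuilt classpath;
# all remaining arguments are passed to the tool unchanged.
# HUDI_UTILITIES_BUNDLE must point at the utilities bundle jar.
run_incremental_puller() {
  local cp="$1"
  shift
  java -cp "${cp:+$cp:}${HUDI_UTILITIES_BUNDLE:?set HUDI_UTILITIES_BUNDLE first}" \
    org.apache.hudi.utilities.HiveIncrementalPuller "$@"
}

# Example call, using the directories and options from the command above:
#
#   run_incremental_puller \
#     "$(build_classpath /tmp/jars /var/hoodie/ws/docker/demo/config)" \
#     --hiveUrl jdbc:hive2://hiveserver:10000 --hiveUser hive --hivePass hive \
#     --extractSQLFile /var/hoodie/ws/docker/demo/config/incr_pull.txt \
#     --sourceDb default --sourceTable stock_ticks_cow \
#     --targetDb tmp --targetTable tempTable --fromCommitTime 0 --maxCommits 1
```

Keeping the classpath construction separate from the java call makes it easy to print the classpath and compare it against the jar list when a NoClassDefFoundError still shows up.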



> Improve documentation for using HiveIncrementalPuller
> -----------------------------------------------------
>
>                 Key: HUDI-486
>                 URL: https://issues.apache.org/jira/browse/HUDI-486
>             Project: Apache Hudi (incubating)
>          Issue Type: Sub-task
>          Components: Incremental Pull
>            Reporter: Pratyaksh Sharma
>            Assignee: Pratyaksh Sharma
>            Priority: Major
>
> To use HiveIncrementalPuller, one needs to have a lot of jars on the classpath. These jars are not listed anywhere. As a result, one has to keep adding jars to the classpath one at a time, with every NoClassDefFoundError that comes up during execution.
> We should list the jars needed so that it is easy for a first-time user to use this tool.


