Posted to commits@hudi.apache.org by "sivabalan narayanan (Jira)" <ji...@apache.org> on 2022/01/02 20:36:00 UTC

[jira] [Commented] (HUDI-3059) save point rollback not working with hudi-cli

    [ https://issues.apache.org/jira/browse/HUDI-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17467698#comment-17467698 ] 

sivabalan narayanan commented on HUDI-3059:
-------------------------------------------

The proper way to create a savepoint via the CLI:
{code:java}
savepoint create --commit 20211230071611857 --sparkMaster local[2] --sparkMemory 2g {code}
You should explicitly pass in sparkMaster and sparkMemory; setting them via configs does not work.
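Since this issue is specifically about savepoint rollback, the same presumably applies there. A sketch of the rollback invocation, assuming the rollback command accepts the same --sparkMaster/--sparkMemory options as create (the instant value is illustrative):
{code:java}
savepoint rollback --savepoint 20211230071611857 --sparkMaster local[2] --sparkMemory 2g {code}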

> save point rollback not working with hudi-cli
> ---------------------------------------------
>
>                 Key: HUDI-3059
>                 URL: https://issues.apache.org/jira/browse/HUDI-3059
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Usability
>            Reporter: sivabalan narayanan
>            Assignee: Harshal Patil
>            Priority: Major
>              Labels: sev:critical
>
> Ref issue:
> [https://github.com/apache/hudi/issues/3870]
>  
>  # create Hudi dataset
>  # add some data so there are multiple commits
>  # create a savepoint
>  # try to rollback savepoint
>  
> I tried this locally.
> 1. During savepoint creation the Spark master is not recognized; it fails even after setting the Spark master via configs.
> {code:java}
> hudi:hudi_trips_cow->set --conf SPARK_MASTER=local[2]
> hudi:hudi_trips_cow->savepoint create --commit 20211217183516921
> 254601 [Spring Shell] INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline  - Loaded instants upto : Option{val=[20211217183516921__commit__COMPLETED]}
> 21/12/17 19:36:45 WARN Utils: Your hostname, Sivabalans-MacBook-Pro.local resolves to a loopback address: 127.0.0.1; using 192.168.70.90 instead (on interface en0)
> 21/12/17 19:36:45 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
> 21/12/17 19:36:45 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> .
> .
> .
> 21/12/17 19:36:18 ERROR SparkContext: Error initializing SparkContext.
> org.apache.spark.SparkException: Could not parse Master URL: ''
>     at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2784)
>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:493)
>     at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
>     at org.apache.hudi.cli.utils.SparkUtil.initJavaSparkConf(SparkUtil.java:115)
>     at org.apache.hudi.cli.utils.SparkUtil.initJavaSparkConf(SparkUtil.java:110)
>     at org.apache.hudi.cli.commands.SparkMain.main(SparkMain.java:88)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>     at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
>     at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
>     at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
>     at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
>     at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 21/12/17 19:36:18 INFO SparkUI: Stopped Spark web UI at http://192.168.70.90:4042
> 21/12/17 19:36:18 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
> 21/12/17 19:36:18 INFO MemoryStore: MemoryStore cleared
> 21/12/17 19:36:18 INFO BlockManager: BlockManager stopped
> 21/12/17 19:36:18 INFO BlockManagerMaster: BlockManagerMaster stopped
> 21/12/17 19:36:18 WARN MetricsSystem: Stopping a MetricsSystem that is not running
> 21/12/17 19:36:18 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
> 21/12/17 19:36:18 INFO SparkContext: Successfully stopped SparkContext
> Exception in thread "main" org.apache.spark.SparkException: Could not parse Master URL: ''
>     at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2784)
>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:493)
>     at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
>     at org.apache.hudi.cli.utils.SparkUtil.initJavaSparkConf(SparkUtil.java:115)
>     at org.apache.hudi.cli.utils.SparkUtil.initJavaSparkConf(SparkUtil.java:110)
>     at org.apache.hudi.cli.commands.SparkMain.main(SparkMain.java:88)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>     at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
>     at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
>     at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
>     at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
>     at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 21/12/17 19:36:18 INFO ShutdownHookManager: Shutdown hook called
> 21/12/17 19:36:18 INFO ShutdownHookManager: Deleting directory /private/var/folders/ym/8yjkm3n90kq8tk4gfmvk7y140000gn/T/spark-db7da71a-bb1c-453b-b43b-640c080aaf2a
> 21/12/17 19:36:18 INFO ShutdownHookManager: Deleting directory /private/var/folders/ym/8yjkm3n90kq8tk4gfmvk7y140000gn/T/spark-ef4048e5-6071-458a-9a54-05bf70157db2 {code}
> I made a local fix for now to get past the issue, but we need to fix it properly.
> {code:java}
> diff --git a/hudi-cli/src/main/java/org/apache/hudi/cli/commands/SparkMain.java b/hudi-cli/src/main/java/org/apache/hudi/cli/commands/SparkMain.java
> index d1ee109f5..f925bdb0c 100644
> --- a/hudi-cli/src/main/java/org/apache/hudi/cli/commands/SparkMain.java
> +++ b/hudi-cli/src/main/java/org/apache/hudi/cli/commands/SparkMain.java
> @@ -82,11 +82,11 @@ public class SparkMain {
>    public static void main(String[] args) throws Exception {
>      ValidationUtils.checkArgument(args.length >= 4);
>      final String commandString = args[0];
> -    LOG.info("Invoking SparkMain: " + commandString);
> +    LOG.warn("Invoking SparkMain: " + commandString);
>      final SparkCommand cmd = SparkCommand.valueOf(commandString);
>  
>      JavaSparkContext jsc = SparkUtil.initJavaSparkConf("hoodie-cli-" + commandString,
> -        Option.of(args[1]), Option.of(args[2]));
> +        Option.of("local[2]"), Option.of(args[2]));
>  
>      int returnCode = 0;
>      try { {code}
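The quick fix in the diff above hardcodes local[2], which would break non-local deployments. A proper fix would presumably keep the CLI-supplied master when present and only fall back to a default otherwise. A minimal sketch of that fallback logic (the class and method names here are hypothetical, not Hudi's actual API):
{code:java}
// Hypothetical sketch of master-URL fallback logic; not actual Hudi code.
public class SparkMasterResolver {

  // Prefer the explicit CLI argument, then an environment-provided value,
  // and only then default to a local master so SparkContext never sees "".
  static String resolveMaster(String cliArg, String envValue) {
    if (cliArg != null && !cliArg.trim().isEmpty()) {
      return cliArg;
    }
    if (envValue != null && !envValue.trim().isEmpty()) {
      return envValue;
    }
    return "local[2]";
  }

  public static void main(String[] args) {
    // An empty CLI argument (the failing case in the log above) now falls
    // back instead of producing "Could not parse Master URL: ''".
    System.out.println(resolveMaster("", System.getenv("SPARK_MASTER")));
  }
}
{code}
The key point is that the empty string passed through as args[1] should never reach SparkContext unvalidated.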



--
This message was sent by Atlassian Jira
(v8.20.1#820001)