Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/01/21 16:42:43 UTC

[GitHub] [hudi] ganczarek opened a new issue #4666: [SUPPORT] Table downgrade fails to delete non-existing file

ganczarek opened a new issue #4666:
URL: https://github.com/apache/hudi/issues/4666


   When downgrading a v2 table to v1 with hudi-cli, Hudi fails while trying to delete a file that doesn't exist (see the stack trace below). I suspect it's fine to simply ignore files that don't exist during deletion.
   
   I don't know how I ended up having only `.hoodie/20220121153537.commit.requested` without corresponding `.hoodie/.temp/20220121153537`, but manual deletion of `.hoodie/20220121153537.commit.requested` fixed the problem, though perhaps it can have dire consequences later.
   
   **Environment Description**
   
   * Hudi version : 0.10.0
   * Spark version : 3.1.1
   * Hadoop version : 3.2.1
   * Storage : S3
   * Running on Docker? : no
   
   **Additional context**
   
   I built hudi-cli myself from `tags/release-0.10.0-rc2` with the following flags
   ```
   mvn clean package -DskipTests -Dscala-2.12 -Dspark3
   ```
   
   Downgrade command in hudi-cli:
   ```
   downgrade table --toVersion ONE --sparkProperties /etc/spark/conf/spark-defaults.conf --sparkMaster local
   ```
   
   **Stacktrace**
   
   ```
   22/01/21 16:25:31 WARN SparkMain: Failed: Could not upgrade/downgrade table at "s3://bucket/table" to version "ONE".
   org.apache.hudi.exception.HoodieIOException: File s3://bucket/table/.hoodie/.temp/20220121153537 does not exist.
       at org.apache.hudi.common.fs.FSUtils.parallelizeSubPathProcess(FSUtils.java:684)
       at org.apache.hudi.table.upgrade.TwoToOneDowngradeHandler.deleteTimelineBasedMarkerFiles(TwoToOneDowngradeHandler.java:129)
       at org.apache.hudi.table.upgrade.TwoToOneDowngradeHandler.convertToDirectMarkers(TwoToOneDowngradeHandler.java:120)
       at org.apache.hudi.table.upgrade.TwoToOneDowngradeHandler.downgrade(TwoToOneDowngradeHandler.java:67)
       at org.apache.hudi.table.upgrade.UpgradeDowngrade.downgrade(UpgradeDowngrade.java:155)
       at org.apache.hudi.table.upgrade.UpgradeDowngrade.run(UpgradeDowngrade.java:125)
       at org.apache.hudi.cli.commands.SparkMain.upgradeOrDowngradeTable(SparkMain.java:462)
       at org.apache.hudi.cli.commands.SparkMain.main(SparkMain.java:230)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
       at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:959)
       at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
       at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
       at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
       at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1038)
       at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1047)
       at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   Caused by: java.io.FileNotFoundException: File s3://bucket/table/.hoodie/.temp/20220121153537 does not exist.
       at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.listStatus(S3NativeFileSystem.java:709)
       at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.listStatus(S3NativeFileSystem.java:636)
       at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.listStatus(EmrFileSystem.java:473)
       at org.apache.hudi.common.fs.FSUtils.parallelizeSubPathProcess(FSUtils.java:677)
       ... 19 more
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] yihua commented on issue #4666: [SUPPORT] Table downgrade fails to delete non-existing file

Posted by GitBox <gi...@apache.org>.
yihua commented on issue #4666:
URL: https://github.com/apache/hudi/issues/4666#issuecomment-1028487024


   > Hey @ganczarek: it looks like there is a bug, https://issues.apache.org/jira/browse/HUDI-3346. We will work on the fix; it should be straightforward. At least in this case, manually deleting the commit meta file for just this instant should be fine. Here is what could have led to this.
   > 
   > Just before the downgrade, a commit was started, but the process crashed before the commit went inflight or before a single marker file could be created. So no marker directory was ever created for this commit. The downgrade code fails to check for the directory's existence in one place (other places do make this check), and so it failed.
   > 
   > I have created a tracking [ticket](https://issues.apache.org/jira/browse/HUDI-3346) here. We are good to close this.
   
   Sorry, I missed this. There should be a check that the marker directory exists before trying to delete it; that check was missing.





[GitHub] [hudi] nsivabalan commented on issue #4666: [SUPPORT] Table downgrade fails to delete non-existing file

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4666:
URL: https://github.com/apache/hudi/issues/4666#issuecomment-1018896431


   @yihua: Can you assist here, please?





[GitHub] [hudi] nsivabalan commented on issue #4666: [SUPPORT] Table downgrade fails to delete non-existing file

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4666:
URL: https://github.com/apache/hudi/issues/4666#issuecomment-1025672081


   Hey @ganczarek: it looks like there is a bug, https://issues.apache.org/jira/browse/HUDI-3346. We will work on the fix; it should be straightforward. At least in this case, manually deleting the commit meta file for just this instant should be fine. Here is what could have led to this.
   
   Just before the downgrade, a commit was started, but the process crashed before the commit went inflight or before a single marker file could be created. So no marker directory was ever created for this commit. The downgrade code fails to check for the directory's existence in one place (other places do make this check), and so it failed.
   
   I have created a tracking [ticket](https://issues.apache.org/jira/browse/HUDI-3346) here. We are good to close this.
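   
   The missing existence check described above can be illustrated with a minimal sketch. This is not Hudi's code and not the actual fix: it uses only `java.nio.file` on a local filesystem rather than the Hadoop `FileSystem` API, and the class name `MarkerDirCleanup` and method `deleteMarkerDirIfExists` are hypothetical names chosen for illustration. It only shows the general pattern of checking that a directory exists before listing and deleting its contents.
   
   ```java
   import java.io.IOException;
   import java.io.UncheckedIOException;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.util.Comparator;
   import java.util.stream.Stream;
   
   // Hypothetical helper, for illustration only (not part of Hudi).
   public class MarkerDirCleanup {
   
       /**
        * Deletes a marker directory and everything under it.
        * Returns false, instead of throwing, when the directory was never created.
        */
       public static boolean deleteMarkerDirIfExists(Path markerDir) {
           // The guard: a commit that crashed early may have left no marker directory.
           if (!Files.exists(markerDir)) {
               return false;
           }
           try (Stream<Path> paths = Files.walk(markerDir)) {
               // Reverse order so that children are deleted before their parents.
               paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                   try {
                       Files.delete(p);
                   } catch (IOException e) {
                       throw new UncheckedIOException(e);
                   }
               });
               return true;
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
       }
   }
   ```
   
   With such a guard, a call on a path like `.hoodie/.temp/20220121153537` that was never created would return `false` rather than failing with a `FileNotFoundException` while listing the directory.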





[GitHub] [hudi] ganczarek commented on issue #4666: [SUPPORT] Table downgrade fails to delete non-existing file

Posted by GitBox <gi...@apache.org>.
ganczarek commented on issue #4666:
URL: https://github.com/apache/hudi/issues/4666#issuecomment-1026769555


   @nsivabalan Thank you for looking into this. I'm sorry, but I'm not able to test the fix: I have already downgraded the table after manually deleting the `*.commit.requested` file.





[GitHub] [hudi] nsivabalan closed issue #4666: [SUPPORT] Table downgrade fails to delete non-existing file

Posted by GitBox <gi...@apache.org>.
nsivabalan closed issue #4666:
URL: https://github.com/apache/hudi/issues/4666


   





[GitHub] [hudi] nsivabalan commented on issue #4666: [SUPPORT] Table downgrade fails to delete non-existing file

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4666:
URL: https://github.com/apache/hudi/issues/4666#issuecomment-1026937387


   got it, thanks!





[GitHub] [hudi] nsivabalan commented on issue #4666: [SUPPORT] Table downgrade fails to delete non-existing file

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4666:
URL: https://github.com/apache/hudi/issues/4666#issuecomment-1025974598


   @ganczarek: Can you verify whether [this fix](https://github.com/apache/hudi/pull/4726) works?

