Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/10/08 01:56:11 UTC

[GitHub] [hudi] prashanthvg89 opened a new issue #2153: [SUPPORT] Failed to delete key: /.hoodie/.temp/20201006182950

prashanthvg89 opened a new issue #2153:
URL: https://github.com/apache/hudi/issues/2153


   **_Tips before filing an issue_**
   
   - Have you gone through our [FAQs](https://cwiki.apache.org/confluence/display/HUDI/FAQ)?
   
   - Join the mailing list to engage in conversations and get faster support at dev-subscribe@hudi.apache.org.
   
   - If you have triaged this as a bug, then file an [issue](https://issues.apache.org/jira/projects/HUDI/issues) directly.
   
   **Describe the problem you faced**
   
   Random "Failed to delete key" error during an UPSERT operation in a Spark Structured Streaming job, even with "hoodie.consistency.check.enabled" set to true
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   UNKNOWN - Occurs randomly after the streaming job has been running, sometimes after 15 hours and other times after a couple of days
   
   **Expected behavior**
   
   UPSERT should complete reliably. If this were an application bug, I would expect it to fail as soon as the application launched, but the error appears only intermittently. The only resolution so far is to restart the job.
   
   **Environment Description**
   
   * Hudi version : 0.5.2-incubating
   
   * Spark version : 2.4.4
   
   * Hive version : 2.3.6
   
   * Hadoop version : EMR 5.29.0
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   This is a simple streaming application that reads from a Kinesis stream with a batch interval of 15 minutes and upserts into a MERGE_ON_READ Hudi table
   
   **Stacktrace**
   
   ```
   Caused by: org.apache.hudi.exception.HoodieIOException: Failed to delete key: <tableName>/.hoodie/.temp/20201006182950
   	at org.apache.hudi.table.HoodieTable.deleteMarkerDir(HoodieTable.java:333)
   	at org.apache.hudi.table.HoodieTable.cleanFailedWrites(HoodieTable.java:409)
   	at org.apache.hudi.table.HoodieTable.finalizeWrite(HoodieTable.java:315)
   	at org.apache.hudi.table.HoodieMergeOnReadTable.finalizeWrite(HoodieMergeOnReadTable.java:317)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.finalizeWrite(AbstractHoodieWriteClient.java:195)
   	... 66 more
   Caused by: java.io.IOException: Failed to delete key: <tableName>/.hoodie/.temp/20201006182950
   	at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.delete(S3NativeFileSystem.java:767)
   	at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.delete(EmrFileSystem.java:337)
   	at org.apache.hudi.common.io.storage.HoodieWrapperFileSystem.delete(HoodieWrapperFileSystem.java:261)
   	at org.apache.hudi.table.HoodieTable.deleteMarkerDir(HoodieTable.java:330)
   	... 70 more
   Caused by: java.io.IOException: 1 exceptions thrown from 5 batch deletes
   	at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.deleteAll(Jets3tNativeFileSystemStore.java:390)
   	at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.doSingleThreadedBatchDelete(S3NativeFileSystem.java:1494)
   	at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.delete(S3NativeFileSystem.java:765)
   	... 73 more
   Caused by: java.io.IOException: MultiObjectDeleteException thrown with 2 keys in error: <tableName>/.hoodie/.temp/20201006182950/195/2ecca5ce-ba13-4d5a-a2e3-79713261dc49-0_2061-53-37527_20201006182950.marker, <tableName>/.hoodie/.temp/20201006182950/290_$folder$
   	at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.deleteAll(Jets3tNativeFileSystemStore.java:375)
   	... 75 more
   Caused by: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.MultiObjectDeleteException: One or more objects could not be deleted (Service: null; Status Code: 200; Error Code: null; Request ID: 457C53995454141D; S3 Extended Request ID: NKQEApW06BHPRG5oQBP4RffTd6OZQXOCNl6jurU690Ee+iE3cgbRbbtNPugjqa3qyADj6x5zqBk=), S3 Extended Request ID: NKQEApW06BHPRG5oQBP4RffTd6OZQXOCNl6jurU690Ee+iE3cgbRbbtNPugjqa3qyADj6x5zqBk=
   	at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.deleteObjects(AmazonS3Client.java:2267)
   	at com.amazon.ws.emr.hadoop.fs.s3.lite.call.DeleteObjectsCall.perform(DeleteObjectsCall.java:24)
   	at com.amazon.ws.emr.hadoop.fs.s3.lite.call.DeleteObjectsCall.perform(DeleteObjectsCall.java:10)
   	at com.amazon.ws.emr.hadoop.fs.s3.lite.executor.GlobalS3Executor.execute(GlobalS3Executor.java:110)
   	at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.invoke(AmazonS3LiteClient.java:189)
   	at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.invoke(AmazonS3LiteClient.java:184)
   	at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.deleteObjects(AmazonS3LiteClient.java:128)
   	at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.deleteAll(Jets3tNativeFileSystemStore.java:370)
   ```
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] prashanthvg89 commented on issue #2153: [SUPPORT] Failed to delete key: /.hoodie/.temp/20201006182950

Posted by GitBox <gi...@apache.org>.
prashanthvg89 commented on issue #2153:
URL: https://github.com/apache/hudi/issues/2153#issuecomment-706450360


   Sure, I will try emr-5.30.0 and hudi-0.6.0. The issue appears intermittently. Since upgrading is the only option right now, maybe we can close the issue. I'll reopen it if I see the error again after the upgrades. Thanks!





[GitHub] [hudi] bvaradar commented on issue #2153: [SUPPORT] Failed to delete key: /.hoodie/.temp/20201006182950

Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2153:
URL: https://github.com/apache/hudi/issues/2153#issuecomment-705418948


   @umehrot2: Can you shed some light here? When would EMR/S3 throw this error? Is this a server-side issue that will go away with a retry?
   





[GitHub] [hudi] bvaradar commented on issue #2153: [SUPPORT] Failed to delete key: /.hoodie/.temp/20201006182950

Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2153:
URL: https://github.com/apache/hudi/issues/2153#issuecomment-706733081


   Thanks





[GitHub] [hudi] prashanthvg89 commented on issue #2153: [SUPPORT] Failed to delete key: /.hoodie/.temp/20201006182950

Posted by GitBox <gi...@apache.org>.
prashanthvg89 commented on issue #2153:
URL: https://github.com/apache/hudi/issues/2153#issuecomment-705764448


   Could happen due to race condition https://stackoverflow.com/questions/38750638/spark-1-6-1-s3-multiobjectdeleteexception
   
   I have about 100 retries on S3 failures in my application
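
For readers wondering what such application-level retries might look like, here is a minimal sketch of retrying a flaky storage call with capped exponential backoff and jitter. It is not the code used in this application: `TransientStorageError` and `retry_with_backoff` are hypothetical names introduced only for illustration, and a real job would catch whatever exception its S3 client actually raises.

```python
import random
import time


class TransientStorageError(Exception):
    """Hypothetical stand-in for an intermittent S3 failure."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.5,
                       max_delay=30.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    Waits base_delay * 2**(attempt - 1) seconds (capped at max_delay),
    plus random jitter, between attempts. The last failure is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientStorageError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out retries from concurrent tasks.
            sleep(delay + random.uniform(0, delay / 2))
```

Note that retries of this kind only help if the failure is transient. If the delete fails deterministically, as it might with a file system bug, every attempt fails and the error is still raised.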





[GitHub] [hudi] bvaradar closed issue #2153: [SUPPORT] Failed to delete key: /.hoodie/.temp/20201006182950

Posted by GitBox <gi...@apache.org>.
bvaradar closed issue #2153:
URL: https://github.com/apache/hudi/issues/2153


   





[GitHub] [hudi] bvaradar commented on issue #2153: [SUPPORT] Failed to delete key: /.hoodie/.temp/20201006182950

Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2153:
URL: https://github.com/apache/hudi/issues/2153#issuecomment-705792876


   With 0.6.0, you can set hoodie.fail.on.timeline.archiving=false to make it non-fatal
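
A minimal sketch of how this setting might be passed alongside the consistency check option from the issue description. The helper `hudi_upsert_options` and the table name `my_table` are hypothetical, introduced only for illustration. Whether this setting covers the marker-directory delete seen in the stack trace is taken from the comment above and has not been verified here.

```python
def hudi_upsert_options(table_name, fail_on_archiving=False):
    """Build a dict of Hudi write options for an upsert (hypothetical helper).

    Option values are passed as strings, as Spark data source options are.
    """
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": "upsert",
        # Already enabled by the reporter, per the issue description.
        "hoodie.consistency.check.enabled": "true",
        # Suggested in this thread for Hudi 0.6.0.
        "hoodie.fail.on.timeline.archiving": str(fail_on_archiving).lower(),
    }


options = hudi_upsert_options("my_table")

# With PySpark and the Hudi bundle on the classpath, the options would be
# applied roughly like this (df and base_path are assumed to exist):
#   df.write.format("org.apache.hudi").options(**options).mode("append").save(base_path)
```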





[GitHub] [hudi] bvaradar commented on issue #2153: [SUPPORT] Failed to delete key: /.hoodie/.temp/20201006182950

Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2153:
URL: https://github.com/apache/hudi/issues/2153#issuecomment-705792507


   @prashanthvg89: As of the 0.6.0 release, this is no longer a fatal error. Can you try that version?





[GitHub] [hudi] umehrot2 commented on issue #2153: [SUPPORT] Failed to delete key: /.hoodie/.temp/20201006182950

Posted by GitBox <gi...@apache.org>.
umehrot2 commented on issue #2153:
URL: https://github.com/apache/hudi/issues/2153#issuecomment-706423177


   I think this is a manifestation of a bug in EmrFS that has been fixed in later EMR releases. Is it possible for you to give any release >= `emr-5.30.0` a shot?

