Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2015/01/17 18:41:34 UTC

[jira] [Commented] (HADOOP-11487) NativeS3FileSystem.getStatus must retry on FileNotFoundException

    [ https://issues.apache.org/jira/browse/HADOOP-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281462#comment-14281462 ] 

Steve Loughran commented on HADOOP-11487:
-----------------------------------------

# Which version of Hadoop?
# Which S3 region? Only US Standard (us-east-1) lacks read-after-create consistency

Blobstores are the bane of our lives. They aren't real filesystems; code working against them needs to recognise this and act on it. But since applications all have standard expectations of files and their metadata, that's not easy.

It's not enough to retry on getFileStatus(), as there are other inconsistencies: directory renames and deletes, blob updates, etc.
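To make that concrete, here's a minimal sketch of the pattern that bites distcp here (bucket name and path are invented; credentials are assumed to come from the usual fs.s3n.* properties): write an object, then immediately ask for its status.

{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class EventualConsistencyDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // hypothetical bucket; credentials via fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey
    FileSystem fs = FileSystem.get(URI.create("s3n://example-bucket/"), conf);
    Path p = new Path("s3n://example-bucket/file.gz");

    fs.create(p).close();   // the PUT happens on close()

    // Asking for the status straight after the write is what DistCpUtils.preserve() does;
    // on an eventually-consistent endpoint this can throw FileNotFoundException
    // even though the PUT succeeded.
    FileStatus status = fs.getFileStatus(p);
    System.out.println("length = " + status.getLen());
  }
}
{code}

Retrying that one call only papers over read-after-create lag; it does nothing for the rename/delete/update cases above.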

There's a new FS client, s3a, in Hadoop 2.6, which is where all future fs/s3 work is going on. Try it and see if it is any better, though I doubt it.
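If you want to try it, about the only client-side change is the URL scheme plus the s3a credential keys. A rough sketch, assuming the hadoop-aws JAR from 2.6 is on the classpath and the stock fs.s3a.access.key / fs.s3a.secret.key property names:

{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class S3aProbe {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // s3a uses its own credential properties, not the s3n ones
    conf.set("fs.s3a.access.key", "YOUR_ACCESS_KEY");   // placeholder
    conf.set("fs.s3a.secret.key", "YOUR_SECRET_KEY");   // placeholder

    FileSystem fs = FileSystem.get(URI.create("s3a://example-bucket/"), conf);
    // Same probe as before, just through the s3a client.
    System.out.println(fs.getFileStatus(new Path("s3a://example-bucket/file.gz")));
  }
}
{code}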

If we were to fix it, the route would be to go with something derived from Netflix's S3mper. Retrying on a 404 alone is not sufficient.
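Roughly, the S3mper approach pairs the blobstore with a store that *is* consistent (Netflix use DynamoDB) and treats that as the source of truth for what ought to exist. A hand-wavy sketch of the shape of it, with a hypothetical MetadataStore interface standing in for the real thing:

{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Hypothetical consistent side-store (DynamoDB in S3mper's case). */
interface MetadataStore {
  void recordCreate(Path path) throws IOException;
  boolean shouldExist(Path path) throws IOException;
}

/** Sketch: every write is logged; a 404 is only trusted if the store agrees. */
class ConsistencyCheckingClient {
  private final FileSystem s3;
  private final MetadataStore store;

  ConsistencyCheckingClient(FileSystem s3, MetadataStore store) {
    this.s3 = s3;
    this.store = store;
  }

  void create(Path path) throws IOException {
    s3.create(path).close();
    store.recordCreate(path);          // consistent record of the write
  }

  FileStatus getFileStatus(Path path) throws IOException {
    try {
      return s3.getFileStatus(path);
    } catch (FileNotFoundException e) {
      if (store.shouldExist(path)) {
        // The listing is lagging, not the data: back off and retry rather than fail.
        throw new IOException("S3 listing inconsistent for " + path, e);
      }
      throw e;                         // genuinely absent
    }
  }
}
{code}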

> NativeS3FileSystem.getStatus must retry on FileNotFoundException
> ----------------------------------------------------------------
>
>                 Key: HADOOP-11487
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11487
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs, fs/s3
>            Reporter: Paulo Motta
>
> I'm trying to copy a large number of files from HDFS to S3 via distcp and I'm getting the following exception:
> {code:java}
> 2015-01-16 20:53:18,187 ERROR [main] org.apache.hadoop.tools.mapred.CopyMapper: Failure in copying hdfs://10.165.35.216/hdfsFolder/file.gz to s3n://s3-bucket/file.gz
> java.io.FileNotFoundException: No such file or directory 's3n://s3-bucket/file.gz'
> 	at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:445)
> 	at org.apache.hadoop.tools.util.DistCpUtils.preserve(DistCpUtils.java:187)
> 	at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:233)
> 	at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> 2015-01-16 20:53:18,276 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.FileNotFoundException: No such file or directory 's3n://s3-bucket/file.gz'
> 	at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:445)
> 	at org.apache.hadoop.tools.util.DistCpUtils.preserve(DistCpUtils.java:187)
> 	at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:233)
> 	at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> {code}
> However, when I try hadoop fs -ls s3n://s3-bucket/file.gz the file is there, so the job failure is probably due to Amazon S3's eventual consistency.
> In my opinion, in order to fix this problem NativeS3FileSystem.getFileStatus must use the fs.s3.maxRetries property to avoid failures like this.
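What the reporter is suggesting might look roughly like the sketch below; fs.s3.maxRetries and fs.s3.sleepTimeSeconds are the existing S3 retry settings, and a real patch would still need to distinguish "not visible yet" from "genuinely deleted".

{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Sketch of the proposed behaviour: bounded retries before giving up on a 404. */
class RetryingStatus {
  static FileStatus getFileStatusWithRetries(FileSystem fs, Path path, Configuration conf)
      throws IOException {
    int maxRetries = conf.getInt("fs.s3.maxRetries", 4);
    long sleepMillis = conf.getLong("fs.s3.sleepTimeSeconds", 10) * 1000;
    for (int attempt = 0; ; attempt++) {
      try {
        return fs.getFileStatus(path);
      } catch (FileNotFoundException e) {
        if (attempt >= maxRetries) {
          throw e;                     // still missing after the configured retries
        }
        try {
          Thread.sleep(sleepMillis);   // wait for the listing to catch up
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          throw e;
        }
      }
    }
  }
}
{code}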



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)