Posted to common-dev@hadoop.apache.org by "Ian Nowland (JIRA)" <ji...@apache.org> on 2009/05/14 19:57:45 UTC
[jira] Created: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail
------------------------------------------------------------------------------------------------
Key: HADOOP-5836
URL: https://issues.apache.org/jira/browse/HADOOP-5836
Project: Hadoop Core
Issue Type: Bug
Components: fs/s3
Affects Versions: 0.18.3
Reporter: Ian Nowland
Some tools that upload to S3 use an object terminated with a "/" as a directory marker, for instance "s3n://mybucket/mydir/". If asked to iterate that "directory" via listStatus(), the current code returns an empty file "", which the InputFormatter happily assigns to a split, and which later causes a task to fail, and probably the job to fail.
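The failure mode can be illustrated with a minimal sketch. It is not the S3N code: the class MarkerFilterSketch, the method listChildren, and the in-memory key list are hypothetical stand-ins for a bucket listing. It shows how stripping the directory prefix from the marker key "mydir/" yields a zero-length child name, and how skipping that entry avoids handing an empty file to the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; names are not taken from the Hadoop source.
final class MarkerFilterSketch {

  /**
   * Returns the names of keys under dirKey, relative to that directory.
   * The "dirKey/" marker object itself has an empty relative name and is
   * skipped, so callers never see a file called "".
   * Nested keys are returned as-is (this sketch does not collapse subdirectories).
   */
  static List<String> listChildren(List<String> keys, String dirKey) {
    String prefix = dirKey.endsWith("/") ? dirKey : dirKey + "/";
    List<String> children = new ArrayList<>();
    for (String key : keys) {
      if (!key.startsWith(prefix)) {
        continue; // not under this directory
      }
      String relative = key.substring(prefix.length());
      if (relative.isEmpty()) {
        continue; // the trailing-"/" directory marker, not a real file
      }
      children.add(relative);
    }
    return children;
  }

  public static void main(String[] args) {
    List<String> keys = Arrays.asList("mydir/", "mydir/part-00000", "other/file");
    System.out.println(listChildren(keys, "mydir"));
  }
}
```

Without the isEmpty() check, the listing for "mydir" would contain an empty name alongside "part-00000", which is the entry the report describes being assigned to a split.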
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714956#action_12714956 ]
Hadoop QA commented on HADOOP-5836:
-----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12409482/HADOOP-5836-2.patch
against trunk revision 780465.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 10 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 Eclipse classpath. The patch retains Eclipse classpath integrity.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
-1 contrib tests. The patch failed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/443/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/443/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/443/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/443/console
This message is automatically generated.
[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Nowland updated HADOOP-5836:
--------------------------------
Attachment: (was: HADOOP-5836-1.patch)
[jira] Commented: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711918#action_12711918 ]
Hadoop QA commented on HADOOP-5836:
-----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12408154/HADOOP-5836-0.patch
against trunk revision 777330.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 11 new or modified tests.
-1 patch. The patch command could not apply the patch.
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/376/console
[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Nowland updated HADOOP-5836:
--------------------------------
Status: Patch Available (was: Open)
[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Nowland updated HADOOP-5836:
--------------------------------
Status: Patch Available (was: Open)
[jira] Commented: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Posted by "Tom White (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713497#action_12713497 ]
Tom White commented on HADOOP-5836:
-----------------------------------
These changes look good. A few comments:
* Have you run Jets3tNativeS3FileSystemContractTest? This isn't run by default since it needs an S3 account to test with. This serves as a good regression test.
* There's a mixture of debug-level and info-level logging here. How noisy is this in practice? Shouldn't it be mainly debug, so folks can enable it when they hit problems?
* Some of the indentation looks wrong in the patch - e.g. in handleServiceException(S3ServiceException).
* The patch doesn't apply cleanly anymore and needs regenerating.
[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Nowland updated HADOOP-5836:
--------------------------------
Attachment: HADOOP-5836-1.patch
* Yes, I have run Jets3tNativeS3FileSystemContractTest. Multiple times, in fact, including for the newest patch :).
* I have reworked logging, making everything debug, except the following:
+ LOG.info("Opening key '" + key + "' for reading at position '" + pos + "'");
+ LOG.info("OutputStream for key '" + key + "' writing to tempfile '" + this.backupFile + "'");
+ LOG.info("OutputStream for key '" + key + "' closed. Now beginning upload");
+ LOG.info("OutputStream for key '" + key + "' upload complete");
+ LOG.info("Opening '" + f + "' for reading");
The basic idea is that I want to always capture in a task's syslog which S3 files it is reading from, as this is very useful when a subset of tasks fail. I also wanted to capture, very specifically, the time spent actually uploading the file to S3.
* Good catch - must have happened as part of the diff I did ignoring whitespace. I have now gone through with a fine-tooth comb and fixed all indentation issues I could see.
* Done
[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Nowland updated HADOOP-5836:
--------------------------------
Attachment: HADOOP-5836-0.patch
[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Nowland updated HADOOP-5836:
--------------------------------
Fix Version/s: 0.21.0
Status: Patch Available (was: Open)
[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Posted by "Tom White (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tom White updated HADOOP-5836:
------------------------------
Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)
I've just committed this. Thanks Ian!
[jira] Commented: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714659#action_12714659 ]
Hadoop QA commented on HADOOP-5836:
-----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12409198/HADOOP-5836-1.patch
against trunk revision 780114.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 11 new or modified tests.
-1 patch. The patch command could not apply the patch.
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/430/console
[jira] Commented: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709509#action_12709509 ]
Ian Nowland commented on HADOOP-5836:
-------------------------------------
The main fix here is to check for this empty file in listStatus() and simply not return it. Along with this, however, I broadened the handling in all S3N methods to cover the different ways of designating directories in S3, as follows:
* A note about directories. S3 of course has no "native" support for them.
* The idiom we choose then is: for any directory created by this class,
* we use an empty object "#{dirpath}_$folder$" as a marker.
* Further, to interoperate with other S3 tools, we also accept the following:
* - an object "#{dirpath}/" denoting a directory marker
* - if any objects exist with the prefix "#{dirpath}/", then the
* directory is said to exist
* - if both a file with the name of a directory and a marker for that
* directory exist, then the *file masks the directory*, and the directory
* is never returned.
In particular this meant fixing delete() and rename() to handle all three possible meanings of directory without failing.
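The rules in the comment above can be sketched as a small classifier. This is a minimal sketch under stated assumptions, not the patch itself: the class DirectoryResolutionSketch, the Kind enum, and the in-memory key set are hypothetical, and the path is assumed to be given without a trailing "/". Only the "_$folder$" suffix and the three directory conventions come from the comment.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; names are not taken from the Hadoop source.
final class DirectoryResolutionSketch {
  static final String FOLDER_SUFFIX = "_$folder$";

  enum Kind { FILE, DIRECTORY, MISSING }

  /** Classifies path (no trailing "/") against the object keys in a bucket. */
  static Kind classify(Set<String> keys, String path) {
    // A plain object with exactly this name masks any directory marker.
    if (keys.contains(path)) {
      return Kind.FILE;
    }
    // Marker written by this file system: "<path>_$folder$".
    if (keys.contains(path + FOLDER_SUFFIX)) {
      return Kind.DIRECTORY;
    }
    // Marker written by other tools ("<path>/"), or any object stored
    // under the prefix "<path>/", implies the directory exists.
    String prefix = path + "/";
    for (String key : keys) {
      if (key.startsWith(prefix)) {
        return Kind.DIRECTORY;
      }
    }
    return Kind.MISSING;
  }

  public static void main(String[] args) {
    Set<String> keys = new TreeSet<>(
        Arrays.asList("a_$folder$", "b/", "c/x", "d", "d_$folder$"));
    for (String p : new String[] {"a", "b", "c", "d", "z"}) {
      System.out.println(p + " -> " + classify(keys, p));
    }
  }
}
```

Checking the plain-file case first is what makes the file mask the directory. A linear scan stands in for the prefix listing a real S3 client would issue.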
This patch also includes the following:
- Add logging any time a file in S3 is accessed for read or write, so when a failure occurs accessing or using a file, its name will be in the task log.
- When opening a file for reading that doesn't exist, immediately throw a FileNotFoundException, rather than hitting a hard-to-debug NPE later when the file is closed.
- Rewrite rename so that it only deletes the source files after every destination file has been written, so you never end up with half the files in each location.
- Set up the retryer so rename automatically retries on S3 errors.
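The rename ordering described in the list above can be sketched as two phases: copy everything, then delete. This is an illustration rather than the committed code: the class RenameSketch, the method renamePrefix, and the in-memory map standing in for the bucket are hypothetical, and the source and destination prefixes are assumed not to overlap. Retries are not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; names are not taken from the Hadoop source.
final class RenameSketch {

  /**
   * Moves every key under srcPrefix to dstPrefix in two phases.
   * Assumes the two prefixes do not overlap.
   */
  static void renamePrefix(Map<String, String> store, String srcPrefix, String dstPrefix) {
    // Snapshot the source keys first so the map can be modified safely below.
    List<String> sources = new ArrayList<>();
    for (String key : store.keySet()) {
      if (key.startsWith(srcPrefix)) {
        sources.add(key);
      }
    }
    // Phase 1: write every destination object. If this phase is interrupted,
    // all source objects are still present.
    for (String key : sources) {
      store.put(dstPrefix + key.substring(srcPrefix.length()), store.get(key));
    }
    // Phase 2: delete the sources only after every copy has succeeded.
    for (String key : sources) {
      store.remove(key);
    }
  }

  public static void main(String[] args) {
    Map<String, String> store = new TreeMap<>();
    store.put("in/a", "1");
    store.put("in/b", "2");
    store.put("keep", "3");
    renamePrefix(store, "in/", "out/");
    System.out.println(store.keySet());
  }
}
```

The point of the ordering is that a failure during the copy phase leaves the source intact, so the operation can be retried without data being split between the two locations.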
[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Nowland updated HADOOP-5836:
--------------------------------
Attachment: HADOOP-5836-2.patch
Regenerating against trunk.
[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Posted by "Tom White (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tom White updated HADOOP-5836:
------------------------------
Assignee: Ian Nowland
Status: Open (was: Patch Available)
[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory
markers using an object with a trailing "/" causes jobs to fail
Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Nowland updated HADOOP-5836:
--------------------------------
Status: Open (was: Patch Available)