Posted to common-dev@hadoop.apache.org by "Ian Nowland (JIRA)" <ji...@apache.org> on 2009/05/14 19:57:45 UTC

[jira] Created: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail
------------------------------------------------------------------------------------------------

                 Key: HADOOP-5836
                 URL: https://issues.apache.org/jira/browse/HADOOP-5836
             Project: Hadoop Core
          Issue Type: Bug
          Components: fs/s3
    Affects Versions: 0.18.3
            Reporter: Ian Nowland


Some tools that upload to S3 use an object terminated with a "/" as a directory marker, for instance "s3n://mybucket/mydir/". If asked to iterate that "directory" via listStatus(), the current code returns an empty file "", which the InputFormat happily assigns to a split; this later causes a task to fail, and probably the job as well.
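
For illustration, the failure mode can be reproduced without S3 at all. The sketch below is not the S3N code: it models a bucket as a sorted set of keys, and the names ListingSketch and childNames are hypothetical. It shows how listing under the prefix "mydir/" produces a child with an empty name when the marker object "mydir/" is itself present, and how skipping that entry avoids it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical model of listing the children of a directory prefix. */
final class ListingSketch {

  /**
   * Returns the names of the immediate children of dirKey.
   * When skipMarker is false, the "dirKey/" marker object itself is
   * reported as a child whose name is the empty string.
   */
  static List<String> childNames(SortedSet<String> keys, String dirKey, boolean skipMarker) {
    String prefix = dirKey + "/";
    List<String> names = new ArrayList<>();
    for (String key : keys) {
      if (!key.startsWith(prefix)) {
        continue;
      }
      String relative = key.substring(prefix.length());
      if (relative.isEmpty() && skipMarker) {
        continue; // the marker denotes the directory, not a file inside it
      }
      // Keep only the first path component below the directory.
      int slash = relative.indexOf('/');
      String name = slash < 0 ? relative : relative.substring(0, slash);
      if (!names.contains(name)) {
        names.add(name);
      }
    }
    return names;
  }

  public static void main(String[] args) {
    SortedSet<String> keys = new TreeSet<>();
    keys.add("mydir/");
    keys.add("mydir/part-00000");
    System.out.println(childNames(keys, "mydir", false)); // includes the empty name
    System.out.println(childNames(keys, "mydir", true));  // marker skipped
  }
}
```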

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714956#action_12714956 ] 

Hadoop QA commented on HADOOP-5836:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12409482/HADOOP-5836-2.patch
  against trunk revision 780465.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 10 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/443/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/443/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/443/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/443/console

This message is automatically generated.



[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Nowland updated HADOOP-5836:
--------------------------------

    Attachment:     (was: HADOOP-5836-1.patch)



[jira] Commented: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711918#action_12711918 ] 

Hadoop QA commented on HADOOP-5836:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12408154/HADOOP-5836-0.patch
  against trunk revision 777330.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 11 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/376/console

This message is automatically generated.



[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Nowland updated HADOOP-5836:
--------------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Nowland updated HADOOP-5836:
--------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713497#action_12713497 ] 

Tom White commented on HADOOP-5836:
-----------------------------------

These changes look good. A few comments:

* Have you run Jets3tNativeS3FileSystemContractTest? This isn't run by default since it needs an S3 account to test with. This serves as a good regression test.
* There's a mixture of debug-level and info-level logging here. How noisy is this in practice? Shouldn't it be mainly debug, so folks can enable it when they hit problems?
* Some of the indentation looks wrong in the patch - e.g. in handleServiceException(S3ServiceException).
* The patch doesn't apply cleanly anymore and needs regenerating.



[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Nowland updated HADOOP-5836:
--------------------------------

    Attachment: HADOOP-5836-1.patch

* Yes, I have run Jets3tNativeS3FileSystemContractTest. Multiple times, in fact, including for the newest patch :).
* I have reworked logging, making everything debug, except the following:
+      LOG.info("Opening key '" + key + "' for reading at position '" + pos + "'");
+      LOG.info("OutputStream for key '" + key + "' writing to tempfile '" + this.backupFile + "'");
+      LOG.info("OutputStream for key '" + key + "' closed. Now beginning upload");
+      LOG.info("OutputStream for key '" + key + "' upload complete");
+    LOG.info("Opening '" + f + "' for reading");

The basic idea is that I want to always capture in a task's syslog which S3 files it is reading from, as this is very useful when a subset of tasks fail. I also wanted to capture, very specifically, the time spent actually uploading the file to S3.

* Good catch - must have happened as part of the diff I did ignoring whitespace. I have now gone through with a fine-tooth comb and fixed all indentation issues I could see.

* Done




[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Nowland updated HADOOP-5836:
--------------------------------

    Attachment: HADOOP-5836-0.patch



[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Nowland updated HADOOP-5836:
--------------------------------

    Fix Version/s: 0.21.0
           Status: Patch Available  (was: Open)



[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-5836:
------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Ian!



[jira] Commented: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714659#action_12714659 ] 

Hadoop QA commented on HADOOP-5836:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12409198/HADOOP-5836-1.patch
  against trunk revision 780114.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 11 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/430/console

This message is automatically generated.



[jira] Commented: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709509#action_12709509 ] 

Ian Nowland commented on HADOOP-5836:
-------------------------------------

The main fix here is to check for this empty file in listStatus() and simply not return it. However, along with this, I broadened the handling in all S3N methods of the different ways of designating directories in S3, as follows:
 
 * A note about directories. S3 of course has no "native" support for them.
 * The idiom we choose then is: for any directory created by this class,
 * we use an empty object "#{dirpath}_$folder$" as a marker.
 * Further, to interoperate with other S3 tools, we also accept the following:
 * - an object "#{dirpath}/" denoting a directory marker
 * - if there exist any objects with the prefix "#{dirpath}/", then the
 *   directory is said to exist
 * - if both a file with the name of a directory and a marker for that
 *   directory exist, then the *file masks the directory*, and the directory
 *   is never returned.
 
In particular this meant fixing delete() and rename() to handle all three possible meanings of directory without failing.
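
As a rough illustration of the conventions above, the following sketch classifies a path against a set of object keys. It is not taken from the patch: the class DirectoryMarkerSketch, the Kind enum, and classify are hypothetical names, and the bucket is modeled as a plain in-memory set rather than an S3 listing.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the directory conventions described in the comment. */
final class DirectoryMarkerSketch {

  /** Suffix of the marker object written for directories created by S3N. */
  static final String FOLDER_SUFFIX = "_$folder$";

  enum Kind { FILE, DIRECTORY, MISSING }

  /** Classifies path (given without a trailing slash) against the bucket's keys. */
  static Kind classify(Set<String> keys, String path) {
    // A plain object with the same name masks any directory marker.
    if (keys.contains(path)) {
      return Kind.FILE;
    }
    // Marker written by this file system, or the "/"-terminated marker of other tools.
    if (keys.contains(path + FOLDER_SUFFIX) || keys.contains(path + "/")) {
      return Kind.DIRECTORY;
    }
    // Implicit directory: at least one object lives under the prefix.
    String prefix = path + "/";
    for (String key : keys) {
      if (key.startsWith(prefix)) {
        return Kind.DIRECTORY;
      }
    }
    return Kind.MISSING;
  }

  public static void main(String[] args) {
    Set<String> keys = new HashSet<>(
        Arrays.asList("a_$folder$", "b/", "c/data", "d", "d_$folder$"));
    for (String path : new String[] {"a", "b", "c", "d", "e"}) {
      System.out.println(path + " -> " + classify(keys, path));
    }
  }
}
```

The order of the checks is what implements the masking rule: the file test runs first, so a marker for the same name is never consulted.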
 
This patch also includes the following:
- Add logging any time a file in S3 is accessed for read or write, so that when you get a failure accessing or using a file, its name will be in the task log.
- When opening a file for reading that doesn't exist, immediately throw a FileNotFoundException, rather than hitting a hard-to-debug NPE later when the file is closed.
- Rewrite rename so that it only deletes the source files after every destination file has been written, so you never end up with half the files in each location.
- Set up a retryer so that rename automatically retries on S3 errors.
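
The rename ordering can be sketched as two phases over an in-memory store. This is only an illustration under stated assumptions: RenameSketch and renameDirectory are hypothetical names, the store is a map rather than S3, the source and destination prefixes are assumed not to overlap, and no retry logic is shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model of a two-phase directory rename. */
final class RenameSketch {

  /**
   * Moves every key under srcDir/ to dstDir/. Sources are deleted only
   * after all destination objects have been written, so a failure during
   * copying leaves the source directory complete.
   */
  static void renameDirectory(Map<String, byte[]> store, String srcDir, String dstDir) {
    String srcPrefix = srcDir + "/";
    String dstPrefix = dstDir + "/";

    // Snapshot the source keys so the map is not modified while iterating.
    List<String> sources = new ArrayList<>();
    for (String key : store.keySet()) {
      if (key.startsWith(srcPrefix)) {
        sources.add(key);
      }
    }

    // Phase 1: write every destination object.
    for (String key : sources) {
      store.put(dstPrefix + key.substring(srcPrefix.length()), store.get(key));
    }

    // Phase 2: only now remove the sources.
    for (String key : sources) {
      store.remove(key);
    }
  }

  public static void main(String[] args) {
    Map<String, byte[]> store = new TreeMap<>();
    store.put("in/a", new byte[] {1});
    store.put("in/sub/b", new byte[] {2});
    store.put("other", new byte[] {3});
    renameDirectory(store, "in", "out");
    System.out.println(store.keySet());
  }
}
```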




[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Nowland updated HADOOP-5836:
--------------------------------

    Attachment: HADOOP-5836-2.patch

Regenerating against trunk.



[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-5836:
------------------------------

    Assignee: Ian Nowland
      Status: Open  (was: Patch Available)



[jira] Updated: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

Posted by "Ian Nowland (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Nowland updated HADOOP-5836:
--------------------------------

    Status: Open  (was: Patch Available)
