Posted to common-dev@hadoop.apache.org by "Ben Slusky (JIRA)" <ji...@apache.org> on 2009/06/22 19:58:07 UTC

[jira] Created: (HADOOP-6097) Multiple bugs w/ Hadoop archives

Multiple bugs w/ Hadoop archives
--------------------------------

                 Key: HADOOP-6097
                 URL: https://issues.apache.org/jira/browse/HADOOP-6097
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs
    Affects Versions: 0.20.0, 0.19.1, 0.19.0, 0.18.3, 0.18.2, 0.18.1, 0.18.0
            Reporter: Ben Slusky
             Fix For: 0.20.1


Found and fixed several bugs involving Hadoop archives:

- In makeQualified(), the lossy round-trip conversion from Path to URI and back mangles the path whenever it contains a character that requires URI escaping.
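A minimal sketch of the round-trip hazard, using java.net.URI directly rather than Hadoop's Path class (the scheme, host, and path below are illustrative): the multi-argument URI constructor percent-escapes the path, so treating the string form of the URI as the raw path yields a different path than the one you started with.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Stand-in illustration of the Path <-> URI round-trip bug.
public class RoundTrip {
    public static void main(String[] args) throws URISyntaxException {
        String path = "/data/file with spaces";

        // The multi-argument constructor percent-escapes the space.
        URI uri = new URI("hdfs", "namenode", path, null);
        System.out.println(uri); // hdfs://namenode/data/file%20with%20spaces

        // Sloppy: reading the path back out of the escaped string form
        // keeps the "%20" sequences -- the path has been mangled.
        String sloppy = uri.toString().substring("hdfs://namenode".length());
        System.out.println(sloppy.equals(path)); // false

        // Correct: getPath() decodes the escapes back to the original.
        System.out.println(uri.getPath().equals(path)); // true
    }
}
```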

- fileStatusInIndex() may have to read more than one segment of the index, and the LineReader and the count of bytes read must be reset for each block.
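A sketch of the fix pattern, with hypothetical segment contents and method names (this is not Hadoop's actual fileStatusInIndex()): when a lookup spans several index segments, the line reader and the bytes-read counter are re-initialized inside the per-segment loop rather than carried over from the previous segment.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;

// Hypothetical multi-segment index scan illustrating the reset-per-block pattern.
public class IndexScan {
    static String findEntry(List<String> segments, String wanted) throws IOException {
        for (String segment : segments) {
            // Reset for each block: fresh reader, counter back to zero.
            BufferedReader reader = new BufferedReader(new StringReader(segment));
            long bytesRead = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                bytesRead += line.length() + 1; // +1 for the stripped newline
                if (line.startsWith(wanted + " ")) {
                    return line; // entry found, possibly in a later segment
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        List<String> index = Arrays.asList(
                "/a 0 10\n/b 10 20\n",
                "/c 30 5\n/d 35 8\n");
        // The match lives in the second segment, which is only found
        // because the reader state does not leak across segments.
        System.out.println(findEntry(index, "/c")); // /c 30 5
    }
}
```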

- har:// filesystems cannot be cached by (scheme, authority, username) alone -- the path is significant as well. Caching them by that key limits a Hadoop client to opening one archive per underlying filesystem. It appears safe not to cache them at all, since they wrap another filesystem connection that does the actual networking.
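The caching collision can be sketched with a toy cache (the key shape and URIs below are illustrative, not Hadoop's actual FileSystem cache): because the key ignores the path, two distinct archives on the same host collapse to one cache entry, and the second lookup silently receives the filesystem opened for the first archive.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Toy filesystem cache keyed by scheme + authority only.
public class FsCache {
    static final Map<String, String> cache = new HashMap<>();

    static String get(URI uri) {
        String key = uri.getScheme() + "://" + uri.getAuthority(); // path ignored!
        return cache.computeIfAbsent(key, k -> "fs-for:" + uri);
    }

    public static void main(String[] args) {
        URI one = URI.create("har://hdfs-namenode/user/a/one.har");
        URI two = URI.create("har://hdfs-namenode/user/a/two.har");
        // Both URIs collapse to the key "har://hdfs-namenode", so the
        // client can effectively open only one archive per filesystem.
        System.out.println(get(one));
        System.out.println(get(two));
        System.out.println(get(one).equals(get(two))); // true -- the collision
    }
}
```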


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6097) Multiple bugs w/ Hadoop archives

Posted by "Ben Slusky (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ben Slusky updated HADOOP-6097:
-------------------------------

    Attachment: HADOOP-6097.patch




[jira] Updated: (HADOOP-6097) Multiple bugs w/ Hadoop archives

Posted by "Ben Slusky (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ben Slusky updated HADOOP-6097:
-------------------------------

    Status: Patch Available  (was: Open)

