Posted to mapreduce-issues@hadoop.apache.org by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org> on 2010/03/09 07:15:27 UTC

[jira] Created: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

archive: check and possibly replace the space character in paths
----------------------------------------------------------------

                 Key: MAPREDUCE-1579
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: harchive
            Reporter: Tsz Wo (Nicholas), SZE
            Assignee: Mahadev konar


Since the space character is used as a separator in the index files, archiving does not work if there are spaces in a path (see also HADOOP-6591).  The archive tools should
# detect whether there are spaces in the paths and
# provide an option to replace them with some other character.
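
To illustrate why a space-separated index cannot hold paths that contain spaces, here is a minimal sketch. It assumes a simplified, hypothetical index line of the form "<path> <type> <length>"; the real har index layout is not shown in this thread, and the class and method names below are invented for the example.

```java
import java.util.Arrays;

class SpaceSeparatedIndexDemo {
    // Hypothetical, simplified index line: "<path> <type> <length>".
    // This is NOT the actual har index format, only an illustration.
    static String formatLine(String path, String type, long length) {
        return path + " " + type + " " + length;
    }

    // Naive reader that splits a line on the space separator.
    static String[] parseLine(String line) {
        return line.split(" ");
    }

    public static void main(String[] args) {
        String ok = formatLine("/test/sub_1/file", "file", 4);
        String broken = formatLine("/test/sub 1/file", "file", 4);
        // Three fields: path, type, length.
        System.out.println(Arrays.toString(parseLine(ok)));
        // Four fields: the path itself was split at its space.
        System.out.println(Arrays.toString(parseLine(broken)));
    }
}
```

A reader that expects three fields would misread the second line, which is why the paths have to be checked (or rewritten) before they are written to the index.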

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Rodrigo Schmidt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844599#action_12844599 ] 

Rodrigo Schmidt commented on MAPREDUCE-1579:
--------------------------------------------

+1

Everything looks good to me.


[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844277#action_12844277 ] 

Tsz Wo (Nicholas), SZE commented on MAPREDUCE-1579:
---------------------------------------------------

Yes, my previous post is actually copied from the output of a new unit test.  I will add some more tests before posting the new patch.


[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Rodrigo Schmidt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843262#action_12843262 ] 

Rodrigo Schmidt commented on MAPREDUCE-1579:
--------------------------------------------

+1 to Nicholas' suggestion.


[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843788#action_12843788 ] 

Tsz Wo (Nicholas), SZE commented on MAPREDUCE-1579:
---------------------------------------------------

> Nicholas, the patch looks good and you did some great refactoring also. ...
Thanks, Rodrigo, for taking a look.  By the refactoring, you probably mean the change of the map input value class from Text to HarEntry (and the removal of MapStat).  It is necessary because the original code uses Text to store paths in a SequenceFile, which again has the space problem.
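
One way to keep delimiter characters from corrupting a record is to write each field with an explicit length instead of joining fields into one delimited string. The sketch below shows that idea with java.io only. The class name and its two fields are hypothetical and are not taken from the patch; the actual HarEntry may be structured differently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical record type; only an illustration of length-prefixed fields.
class PathEntrySketch {
    final String path;
    final String[] children;

    PathEntrySketch(String path, String[] children) {
        this.path = path;
        this.children = children;
    }

    // writeUTF stores a length before the characters, so no separator is needed.
    void write(DataOutput out) throws IOException {
        out.writeUTF(path);
        out.writeInt(children.length);
        for (String child : children) {
            out.writeUTF(child);
        }
    }

    static PathEntrySketch read(DataInput in) throws IOException {
        String path = in.readUTF();
        String[] children = new String[in.readInt()];
        for (int i = 0; i < children.length; i++) {
            children[i] = in.readUTF();
        }
        return new PathEntrySketch(path, children);
    }

    // Serializes to a byte array and reads the entry back.
    static PathEntrySketch roundTrip(PathEntrySketch entry) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            entry.write(new DataOutputStream(buf));
            return read(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        PathEntrySketch back = roundTrip(
            new PathEntrySketch("/test/sub 1", new String[] {"file x y z", "x"}));
        System.out.println(back.path + " -> " + String.join(", ", back.children));
    }
}
```

Because every string carries its own length, names such as "sub 1" or "file x y z" survive the round trip unchanged.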


[jira] Updated: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-1579:
----------------------------------------------

    Status: Patch Available  (was: Open)


[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Rodrigo Schmidt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843794#action_12843794 ] 

Rodrigo Schmidt commented on MAPREDUCE-1579:
--------------------------------------------

Yes, that's exactly what I mean. The code looks much better with HarEntry.

Another thing I just noticed in your patch is that you are calling checkSpace() inside relPathToRoot() and also outside it, in writeTopLevelDirs().

I think it is better to check for spaces outside relPathToRoot(), since checking for spaces is something we would like to do only at the beginning of the execution (and relPathToRoot() is an auxiliary function we might want to use several times throughout the execution).

If you agree with that, you can just remove checkSpace() from inside relPathToRoot() and place it after the relPathToRoot() call in the archive() method as well.

If you prefer to leave the checkSpace() call inside relPathToRoot(), then you can probably remove the checkSpace(relPath) call in writeTopLevelDirs().
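
The first option (validate once, up front, and keep the helper free of checks) could look roughly like the sketch below. Only the method names relPathToRoot(), checkSpace() and archive() come from this discussion; the signatures, the String-based paths and the exception type are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CheckSpacePlacementSketch {
    // Auxiliary helper with no validation, so it can be reused anywhere.
    // Assumes root has no trailing slash (hypothetical String-based signature).
    static String relPathToRoot(String fullPath, String root) {
        if (fullPath.equals(root)) {
            return "/";
        }
        if (!fullPath.startsWith(root + "/")) {
            throw new IllegalArgumentException(fullPath + " is not under " + root);
        }
        return fullPath.substring(root.length());
    }

    // The exception type is an assumption; the thread only says a check is made.
    static void checkSpace(String relPath) {
        if (relPath.indexOf(' ') >= 0) {
            throw new IllegalArgumentException("Space in path: " + relPath);
        }
    }

    // Validation happens once here, right after each relPathToRoot() call.
    static List<String> archive(List<String> srcPaths, String root) {
        List<String> relPaths = new ArrayList<>();
        for (String src : srcPaths) {
            String rel = relPathToRoot(src, root);
            checkSpace(rel);
            relPaths.add(rel);
        }
        return relPaths;
    }

    public static void main(String[] args) {
        System.out.println(archive(Arrays.asList("/user/tsz/test/a", "/user/tsz/test/b"), "/user/tsz"));
    }
}
```

With this layout, later callers of relPathToRoot() do not pay for, or get surprised by, a check that was only meant to run at the start of archiving.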


[jira] Updated: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-1579:
----------------------------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 0.20.3)
                       (was: 0.21.0)
         Assignee: Tsz Wo (Nicholas), SZE  (was: Mahadev konar)
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

I have committed this.


[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843260#action_12843260 ] 

Tsz Wo (Nicholas), SZE commented on MAPREDUCE-1579:
---------------------------------------------------

I suggest having two config properties for the version 1 HadoopArchives:
- har.space.replace.enable = <true|false>
- har.space.replacement = <REPLACEMENT_STRING>
If space replacement is not enabled, an exception will be thrown if there are spaces in the paths.
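
A minimal sketch of how these two properties might be consulted follows. It uses java.util.Properties as a stand-in for Hadoop's configuration object; the class and method names, the IllegalArgumentException, and the "_" fallback value are all assumptions for illustration and are not taken from the patch.

```java
import java.util.Properties;

class HarSpaceConfigSketch {
    static final String ENABLE_KEY = "har.space.replace.enable";
    static final String REPLACEMENT_KEY = "har.space.replacement";

    // Returns the path to store in the archive, or throws if the path has
    // spaces and replacement is not enabled.
    static String checkOrReplace(String path, Properties conf) {
        if (path.indexOf(' ') < 0) {
            return path; // no spaces, nothing to do
        }
        boolean enabled = Boolean.parseBoolean(conf.getProperty(ENABLE_KEY, "false"));
        if (!enabled) {
            // Exception type is an assumption; the proposal only says "an exception".
            throw new IllegalArgumentException(
                "Path contains a space and " + ENABLE_KEY + " is not true: " + path);
        }
        // The "_" fallback is an assumption, not a documented default.
        String replacement = conf.getProperty(REPLACEMENT_KEY, "_");
        if (replacement.indexOf(' ') >= 0) {
            throw new IllegalArgumentException(REPLACEMENT_KEY + " must not contain a space");
        }
        return path.replace(" ", replacement);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(ENABLE_KEY, "true");
        conf.setProperty(REPLACEMENT_KEY, "_");
        System.out.println(checkOrReplace("test/sub 1/file x y z", conf));
    }
}
```

Note that replacement can make two distinct source names collide (for example "c c" and "c_c"); the sketch does not attempt to detect that.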


[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Rodrigo Schmidt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843759#action_12843759 ] 

Rodrigo Schmidt commented on MAPREDUCE-1579:
--------------------------------------------

Nicholas, the patch looks good and you did some great refactoring also. Just two minor things:

1) I think your patch file also contains the changes from MAPREDUCE-1578 and it probably won't merge well. You might want to merge it with trunk and submit it again.

2) The LOG.info() calls might not help much, especially the first one.



[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845104#action_12845104 ] 

Hudson commented on MAPREDUCE-1579:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk #258 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/258/])
    


[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843815#action_12843815 ] 

Tsz Wo (Nicholas), SZE commented on MAPREDUCE-1579:
---------------------------------------------------

Need some tests ...


[jira] Updated: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-1579:
----------------------------------------------

         Priority: Blocker  (was: Major)
    Fix Version/s: 0.22.0
                   0.21.0
                   0.20.3


[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844664#action_12844664 ] 

Hudson commented on MAPREDUCE-1579:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #276 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/276/])
    Move  from 0.21 to trunk in CHANGES.txt.



[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Rodrigo Schmidt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844326#action_12844326 ] 

Rodrigo Schmidt commented on MAPREDUCE-1579:
--------------------------------------------

Your patch looks good, Nicholas.

One minor thing: do you really want to create a new test source instead of using the existing TestHarFileSystem.java?  Wouldn't it be better to merge all Har-related tests into the same .java file?


[jira] Updated: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-1579:
----------------------------------------------

    Attachment: m1579_20100311.patch

m1579_20100311.patch: added unit tests.


[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844190#action_12844190 ] 

Tsz Wo (Nicholas), SZE commented on MAPREDUCE-1579:
---------------------------------------------------

Tested the space replacement option:
- Original paths
{noformat}
-rw-r--r--   2 tsz supergroup          1 2010-03-11 18:43 /user/tsz/test/a
-rw-r--r--   2 tsz supergroup          1 2010-03-11 18:43 /user/tsz/test/b
-rw-r--r--   2 tsz supergroup          1 2010-03-11 18:43 /user/tsz/test/c
-rw-r--r--   2 tsz supergroup          3 2010-03-11 18:43 /user/tsz/test/c c
drwxr-xr-x   - tsz supergroup          0 2010-03-11 18:43 /user/tsz/test/sub 1
-rw-r--r--   2 tsz supergroup          4 2010-03-11 18:43 /user/tsz/test/sub 1/file
-rw-r--r--   2 tsz supergroup         10 2010-03-11 18:43 /user/tsz/test/sub 1/file x y z
-rw-r--r--   2 tsz supergroup          1 2010-03-11 18:43 /user/tsz/test/sub 1/x
-rw-r--r--   2 tsz supergroup          1 2010-03-11 18:43 /user/tsz/test/sub 1/y
-rw-r--r--   2 tsz supergroup          1 2010-03-11 18:43 /user/tsz/test/sub 1/z
drwxr-xr-x   - tsz supergroup          0 2010-03-11 18:43 /user/tsz/test/sub 2
{noformat}
- Replaced paths
{noformat}
drw-r--r--   - tsz supergroup          0 2010-03-11 18:43 /user/tsz/tmp/bar.har/test
-rw-r--r--   5 tsz supergroup          1 2010-03-11 18:43 /user/tsz/tmp/bar.har/test/a
-rw-r--r--   5 tsz supergroup          1 2010-03-11 18:43 /user/tsz/tmp/bar.har/test/b
-rw-r--r--   5 tsz supergroup          1 2010-03-11 18:43 /user/tsz/tmp/bar.har/test/c
drw-r--r--   - tsz supergroup          0 2010-03-11 18:43 /user/tsz/tmp/bar.har/test/sub_1
-rw-r--r--   5 tsz supergroup         10 2010-03-11 18:43 /user/tsz/tmp/bar.har/test/sub_1/file_x_y_z
-rw-r--r--   5 tsz supergroup          1 2010-03-11 18:43 /user/tsz/tmp/bar.har/test/sub_1/x
-rw-r--r--   5 tsz supergroup          1 2010-03-11 18:43 /user/tsz/tmp/bar.har/test/sub_1/y
-rw-r--r--   5 tsz supergroup          1 2010-03-11 18:43 /user/tsz/tmp/bar.har/test/sub_1/z
-rw-r--r--   5 tsz supergroup          4 2010-03-11 18:43 /user/tsz/tmp/bar.har/test/sub_1/file
drw-r--r--   - tsz supergroup          0 2010-03-11 18:43 /user/tsz/tmp/bar.har/test/sub_2
-rw-r--r--   5 tsz supergroup          3 2010-03-11 18:43 /user/tsz/tmp/bar.har/test/c_c
{noformat}


[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844631#action_12844631 ] 

Hudson commented on MAPREDUCE-1579:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #275 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/275/])
    . archive: check and possibly replace the space character in source paths.



[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Rodrigo Schmidt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842938#action_12842938 ] 

Rodrigo Schmidt commented on MAPREDUCE-1579:
--------------------------------------------

I agree, but I really think MAPREDUCE-1578 should be solved first. The way it is now, if we ever change the protocol on HarFileSystem, HadoopArchives will start generating old files with new version numbers, which is much more dangerous in my opinion.


[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844586#action_12844586 ] 

Tsz Wo (Nicholas), SZE commented on MAPREDUCE-1579:
---------------------------------------------------

> One minor thing: Do you really want to create a new test source, instead of using the existing TestHarFileSystem.java. Wouldn't it be better to merge all Har-related tests on the same .java file?

I actually began by adding the new tests to TestHarFileSystem.  However, there are subtle differences between the new tests and the existing ones, so I created a new file.

> archive: check and possibly replace the space charater in paths
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-1579
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: harchive
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Mahadev konar
>         Attachments: m1579_20100310.patch, m1579_20100310b.patch, m1579_20100311.patch
>
>
> Since the space character is used as a separator in the index files, it won't work if there are spaces in the path (see also HADOOP-6591).  The archive tools should 
> # detect if there are spaces in the paths and 
> # provide an option to replace it with some other characters.



[jira] Updated: (MAPREDUCE-1579) archive: check and possibly replace the space charater in paths

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-1579:
----------------------------------------------

    Attachment: m1579_20100310.patch

m1579_20100310.patch: first patch for reviewing.

> archive: check and possibly replace the space charater in paths
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-1579
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: harchive
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Mahadev konar
>         Attachments: m1579_20100310.patch
>
>
> Since the space character is used as a separator in the index files, it won't work if there are spaces in the path (see also HADOOP-6591).  The archive tools should 
> # detect if there are spaces in the paths and 
> # provide an option to replace it with some other characters.



[jira] Updated: (MAPREDUCE-1579) archive: check and possibly replace the space charater in paths

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-1579:
----------------------------------------------

    Attachment: m1579_20100311_y0.20.patch

m1579_20100311_y0.20.patch: for y0.20

> archive: check and possibly replace the space charater in paths
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-1579
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: harchive
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>            Priority: Blocker
>             Fix For: 0.22.0
>
>         Attachments: m1579_20100310.patch, m1579_20100310b.patch, m1579_20100311.patch, m1579_20100311_y0.20.patch
>
>
> Since the space character is used as a separator in the index files, it won't work if there are spaces in the path (see also HADOOP-6591).  The archive tools should 
> # detect if there are spaces in the paths and 
> # provide an option to replace it with some other characters.



[jira] Updated: (MAPREDUCE-1579) archive: check and possibly replace the space charater in paths

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-1579:
----------------------------------------------

    Attachment: m1579_20100310b.patch

m1579_20100310b.patch:
- removed the changes from MAPREDUCE-1578
- removed the first LOG.info() calls
- checkSpace() is now called right before writing to the SequenceFile.
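
For readers following the thread, here is a minimal sketch of the kind of check the issue asks for. It is not taken from the patch: the class name, the indexLine() layout, and the signature of checkSpace() below are all assumptions made for illustration. It only shows why a space-separated index line breaks when a path contains a space, and how a check with an optional replacement could guard against that.

```java
// Illustrative sketch only; names and the index layout are hypothetical,
// not the actual HadoopArchives implementation.
public class SpaceCheckSketch {

    /** Hypothetical index-line layout: fields joined by single spaces. */
    static String indexLine(String path, String type, long offset, long length) {
        return path + " " + type + " " + offset + " " + length;
    }

    /**
     * Returns the path unchanged if it contains no space. Otherwise either
     * substitutes the given replacement, or fails when no replacement
     * (null) was supplied.
     */
    static String checkSpace(String path, String replacement) {
        if (path.indexOf(' ') < 0) {
            return path;
        }
        if (replacement == null) {
            throw new IllegalArgumentException(
                "Space character found in path: \"" + path + "\"");
        }
        return path.replace(" ", replacement);
    }

    public static void main(String[] args) {
        // A path with a space yields one field too many when the line is split.
        String broken = indexLine("/user/a b.txt", "file", 0, 10);
        System.out.println(broken.split(" ").length);

        // After replacing the space, the line splits into the expected 4 fields.
        String safe = checkSpace("/user/a b.txt", "_");
        System.out.println(indexLine(safe, "file", 0, 10).split(" ").length);
    }
}
```

Running the check immediately before each record is written, as the note above describes, means no path can reach the index without having been checked.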

> archive: check and possibly replace the space charater in paths
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-1579
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: harchive
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Mahadev konar
>         Attachments: m1579_20100310.patch, m1579_20100310b.patch
>
>
> Since the space character is used as a separator in the index files, it won't work if there are spaces in the path (see also HADOOP-6591).  The archive tools should 
> # detect if there are spaces in the paths and 
> # provide an option to replace it with some other characters.



[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space charater in paths

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844590#action_12844590 ] 

Hadoop QA commented on MAPREDUCE-1579:
--------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12438566/m1579_20100311.patch
  against trunk revision 922047.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/524/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/524/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/524/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/524/console

This message is automatically generated.

> archive: check and possibly replace the space charater in paths
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-1579
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: harchive
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Mahadev konar
>         Attachments: m1579_20100310.patch, m1579_20100310b.patch, m1579_20100311.patch
>
>
> Since the space character is used as a separator in the index files, it won't work if there are spaces in the path (see also HADOOP-6591).  The archive tools should 
> # detect if there are spaces in the paths and 
> # provide an option to replace it with some other characters.



[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space charater in paths

Posted by "Rodrigo Schmidt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843819#action_12843819 ] 

Rodrigo Schmidt commented on MAPREDUCE-1579:
--------------------------------------------

Looks great to me! A couple of unit tests would be good.

> archive: check and possibly replace the space charater in paths
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-1579
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: harchive
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Mahadev konar
>         Attachments: m1579_20100310.patch, m1579_20100310b.patch
>
>
> Since the space character is used as a separator in the index files, it won't work if there are spaces in the path (see also HADOOP-6591).  The archive tools should 
> # detect if there are spaces in the paths and 
> # provide an option to replace it with some other characters.



[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space charater in paths

Posted by "Rodrigo Schmidt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844193#action_12844193 ] 

Rodrigo Schmidt commented on MAPREDUCE-1579:
--------------------------------------------

Tests look good, Nicholas! Do you intend to add unit tests to your patch?

> archive: check and possibly replace the space charater in paths
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-1579
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: harchive
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Mahadev konar
>         Attachments: m1579_20100310.patch, m1579_20100310b.patch
>
>
> Since the space character is used as a separator in the index files, it won't work if there are spaces in the path (see also HADOOP-6591).  The archive tools should 
> # detect if there are spaces in the paths and 
> # provide an option to replace it with some other characters.
