Posted to common-dev@hadoop.apache.org by "Andrew Gudkov (JIRA)" <ji...@apache.org> on 2008/07/21 12:51:33 UTC

[jira] Created: (HADOOP-3800) DistributedCache parses Paths with scheme or port components incorrectly

DistributedCache parses Paths with scheme or port components incorrectly
------------------------------------------------------------------------

                 Key: HADOOP-3800
                 URL: https://issues.apache.org/jira/browse/HADOOP-3800
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.17.1, 0.17.0
         Environment: linux ("path.separator" is ":")
hdfs filesystem (not "local")
            Reporter: Andrew Gudkov


When paths with a scheme or port component (like
"hdfs://localhost:9000/deploy/hello") are passed to DistributedCache.addFileToClassPath, they are appended to the configuration option "mapred.job.classpath.files" using the value of "path.separator" as the delimiter, which is ":" on Linux.
This confuses DistributedCache.getFileClassPaths: the same character separates the parts within a single Path and the whole paths from one another.
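
The ambiguity can be reproduced without Hadoop. The following is a minimal stand-alone sketch, not the DistributedCache source: the class `ClasspathSeparatorDemo` and its helpers `append` and `split` are hypothetical names, and the separator is hard-coded to ":" on the assumption of a Linux "path.separator".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Stand-alone illustration; it does not use any Hadoop classes.
class ClasspathSeparatorDemo {
    // Assumption: the Linux value of the "path.separator" system property.
    static final String SEPARATOR = ":";

    // Hypothetical helper: appends one entry to a separator-joined list.
    static String append(String existing, String entry) {
        return existing == null ? entry : existing + SEPARATOR + entry;
    }

    // Hypothetical helper: splits the joined list back into its entries.
    static List<String> split(String joined) {
        List<String> parts = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(joined, SEPARATOR);
        while (tokens.hasMoreTokens()) {
            parts.add(tokens.nextToken());
        }
        return parts;
    }

    public static void main(String[] args) {
        // One entry goes in, but the colons after the scheme and before
        // the port cut it into three pieces on the way out.
        String qualified = append(null, "hdfs://localhost:9000/deploy/hello");
        System.out.println(split(qualified)); // [hdfs, //localhost, 9000/deploy/hello]

        // A path without scheme and port survives the round trip.
        String plain = append(null, "/deploy/hello");
        System.out.println(split(plain)); // [/deploy/hello]
    }
}
```

The last fragment, "9000/deploy/hello", has the same prefix as the incorrect path reported in the example in this report.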


Example:
I have some jars and conf files in the hdfs directory "/deploy". The following code adds them to the job's classpath:
{code:title=Test.java}
      Path deployPath = new Path("/deploy");
      FileSystem fs = deployPath.getFileSystem(new Configuration());

      // listStatus returns fully qualified paths (scheme, host and port included)
      FileStatus[] jars = fs.listStatus(deployPath);
      for (int i = 0; i < jars.length; i++) {
        System.out.println(jars[i].getPath());
        // "job" is the job's configuration object
        DistributedCache.addFileToClassPath(jars[i].getPath(), job);
      }
{code}

Launching the task gives the following stdout output:
{code}
hdfs://localhost:9000/deploy/hello
{code}
DistributedCache sets "mapred.job.classpath.files" to "hdfs://localhost:9000/deploy/hello".
DistributedCache.getFileClassPaths then returns incorrect paths like "9000/deploy/hello/home/gudok/Work/test/bin/../conf".

For now, I've worked around this problem by submitting Paths without a scheme and port ("/deploy/hello").
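
One way to get such a path is to keep only the path component of the URI. This is a minimal sketch using java.net.URI rather than Hadoop classes; the class `StripSchemeDemo` and the helper `pathOnly` are hypothetical names. With a Hadoop Path, `toUri().getPath()` should give the same string, assuming the stripped path is then resolved against the job's default filesystem.

```java
import java.net.URI;

// Stand-alone illustration; it does not use any Hadoop classes.
class StripSchemeDemo {
    // Hypothetical helper: drops scheme, host and port, keeps the path.
    static String pathOnly(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        System.out.println(pathOnly("hdfs://localhost:9000/deploy/hello")); // /deploy/hello
    }
}
```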

Other DistributedCache methods need to be reviewed too.
