Posted to dev@chukwa.apache.org by "Eric Yang (JIRA)" <ji...@apache.org> on 2010/05/20 21:11:16 UTC

[jira] Commented: (CHUKWA-488) Hadoop cannot find custom Demux class

    [ https://issues.apache.org/jira/browse/CHUKWA-488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12869728#action_12869728 ] 

Eric Yang commented on CHUKWA-488:
----------------------------------

+1. Looks good, and works in my test environment.

> Hadoop cannot find custom Demux class
> -------------------------------------
>
>                 Key: CHUKWA-488
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-488
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: MR Data Processors
>    Affects Versions: 0.4.0
>         Environment: Linux x86-64
> Java 1.6.0_20
>            Reporter: Kirk True
>         Attachments: Demux.diff
>
>
> I'm getting ClassNotFoundException errors during Hadoop's map phase: Hadoop cannot find my class org.apache.hadoop.chukwa.extraction.demux.processor.mapper.XmlBasedDemux, which I've packaged in a JAR named data-collection-demux-0.1.jar.
> The problem seems to be in the values of these two properties in the Hadoop job configuration:
> {code}
> <property>
>     <name>mapred.job.classpath.files</name>
>     <value>hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar</value>
> </property>
> <property>
>     <name>mapred.cache.files</name>
>     <value>hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar</value>
> </property>
> {code}
> The cause appears to be that the call to DistributedCache.addFileToClassPath passes in a Path in URI form (i.e. hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar), whereas the DistributedCache API expects a filesystem-based path (i.e. /chukwa/demux/data-collection-demux-0.1.jar). I'm not sure why, but the FileStatus objects returned by FileSystem.listStatus carry a URI-based path instead of a filesystem-based path.
> I kludged the Demux class's addParsers method to strip the "hdfs://localhost:9000" portion of the string, and now my class is found. I will attempt to provide a patch today that determines the value of Hadoop's fs.default.name and strips it from the returned path in Demux.java.
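
The stripping step described in the issue can be sketched roughly as follows. This is only an illustration, not the attached Demux.diff (which is not shown in this message): the class and method names are hypothetical, and it uses plain java.net.URI so that it runs without Hadoop on the classpath. Inside Hadoop, org.apache.hadoop.fs.Path.toUri() returns a java.net.URI, so the same getPath() call would presumably apply there. Note that this takes the path component directly rather than looking up fs.default.name, which is a different approach from the one proposed above.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of Chukwa or Hadoop.
class PathStripper {

    // Returns only the path component of a location, dropping any
    // scheme and authority such as "hdfs://localhost:9000".
    // A location that is already a bare path is returned unchanged.
    static String stripSchemeAndAuthority(String location) {
        URI uri = URI.create(location);
        return uri.getPath();
    }

    public static void main(String[] args) {
        String full = "hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar";
        // prints /chukwa/demux/data-collection-demux-0.1.jar
        System.out.println(stripSchemeAndAuthority(full));
    }
}
```

Taking the path component avoids hard-coding "hdfs://localhost:9000", so the same logic should hold if the namenode host or port differs.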
