You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mahout.apache.org by "Sergey (JIRA)" <ji...@apache.org> on 2014/04/21 13:59:14 UTC

[jira] [Comment Edited] (MAHOUT-1498) DistributedCache.setCacheFiles in DictionaryVectorizer overwrites jars pushed using oozie

    [ https://issues.apache.org/jira/browse/MAHOUT-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975537#comment-13975537 ] 

Sergey edited comment on MAHOUT-1498 at 4/21/14 11:58 AM:
----------------------------------------------------------

Yes, it's possible

1. http://commons.apache.org/patches.html
is it a valid guide?

2. Which branch do I have to checkout for patching? Right now it's done 'quick-and-dirty'. It can't be used a a patch...



was (Author: serega_sheypak):
Yes, it's possible

1. http://commons.apache.org/patches.html
is it a valid guide?

2. Which branch do I have to checkout for patching? Right now it's done 'quick-and-dirty'. It can be used a a patch...


> DistributedCache.setCacheFiles in DictionaryVectorizer overwrites jars pushed using oozie
> -----------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-1498
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1498
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.7
>         Environment: mahout-core-0.7-cdh4.4.0.jar
>            Reporter: Sergey
>             Fix For: 1.0
>
>
> Hi, I get exception 
> {code}
> <<< Invocation of Main class completed <<<
> Failing Oozie Launcher, Main class [org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles], main() threw exception, Job failed!
> java.lang.IllegalStateException: Job failed!
> at org.apache.mahout.vectorizer.DictionaryVectorizer.makePartialVectors(DictionaryVectorizer.java:329)
> at org.apache.mahout.vectorizer.DictionaryVectorizer.createTermFrequencyVectors(DictionaryVectorizer.java:199)
> at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.run(SparseVectorsFromSequenceFiles.java:271)
> {code}
> The root cause is:
> {code}
> Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
> at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:247
> {code}
> Looks like it happens because of 
> DictionaryVectorizer.makePartialVectors method.
> It has code:
> {code}
> DistributedCache.setCacheFiles(new URI[] {dictionaryFilePath.toUri()}, conf);
> {code}
> which overrides jars pushed with job by oozie:
> {code}
> public static void More ...setCacheFiles(URI[] files, Configuration conf) {
>          String sfiles = StringUtils.uriToString(files);
>          conf.set("mapred.cache.files", sfiles);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)