Posted to dev@mahout.apache.org by "leon lee (JIRA)" <ji...@apache.org> on 2010/08/13 09:23:16 UTC

[jira] Commented: (MAHOUT-476) bug when running org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorDriver on hadoop

    [ https://issues.apache.org/jira/browse/MAHOUT-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898113#action_12898113 ] 

leon lee commented on MAHOUT-476:
---------------------------------

A similar error occurs when running the 20 Newsgroups dataset:

$HADOOP_HOME/bin/hadoop jar $MAHOUT_HOME/examples/target/mahout-examples-0.3.job org.apache.mahout.classifier.bayes.TrainClassifier --gramSize 3 --input 20news-input --output newsmodel --classifierType bayes -source hdfs

Status : FAILED
Error: org.apache.mahout.classifier.bayes.mapreduce.common.BayesFeatureMapper$IteratorTokenStream.addAttribute(Ljava/lang/Class;)Lorg/apache/lucene/util/Attribute;



> bug when running org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorDriver on hadoop
> -------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-476
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-476
>             Project: Mahout
>          Issue Type: Bug
>          Components: Classification
>    Affects Versions: 0.3
>         Environment: hadoop 0.20.2
> mahout-0.3
> ubuntu
>            Reporter: leon lee
>
> I followed the wiki instructions at https://cwiki.apache.org/MAHOUT/wikipedia-bayes-example.html
> (by the way, the Bayes example documentation on the wiki needs to be updated for 0.3)
> to run step 5:
> Create the countries based Split of wikipedia dataset.
> I used the following command:
> $HADOOP_HOME/bin/hadoop jar $MAHOUT_HOME/examples/target/mahout-examples-0.3.job  org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorDriver -i $MAHOUT_HOME/examples/work/wikipedia/chunks -o $MAHOUT_HOME/examples/work/wikipediainput  -c  $MAHOUT_HOME/examples/src/test/resources/country.txt
> The job failed on Hadoop.
> The Hadoop log shows:
> Error: org.apache.lucene.wikipedia.analysis.WikipediaTokenizer.addAttribute(Ljava/lang/Class;)Lorg/apache/lucene/util/Attribute
