Posted to dev@mahout.apache.org by "Qiuzhuang Lian (JIRA)" <ji...@apache.org> on 2012/12/31 09:36:13 UTC

[jira] [Commented] (MAHOUT-1034) ERROR in Navie Bayes Training(update: seqdirectory does not give output)

    [ https://issues.apache.org/jira/browse/MAHOUT-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541319#comment-13541319 ] 

Qiuzhuang Lian commented on MAHOUT-1034:
----------------------------------------

I also ran into this problem. The cause seems to be that extractLabels reads the sequence files in the 20news-train-vectors directory, and those files contain no entries, so the loop in the following BayesUtils code has nothing to iterate over:

    try {
      for (Object label : labels) {
        String theLabel = SLASH.split(((Pair<?, ?>) label).getFirst().toString())[1];
        if (!seen.contains(theLabel)) {
          writer.append(new Text(theLabel), new IntWritable(i++));
          seen.add(theLabel);
        }
      }
    } finally {
      Closeables.closeQuietly(writer);
    }

Even when I hard-wire the label size to 20, I still run into other errors similar to the one above. Any clues?
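
A minimal sketch of the failure mode described above, in plain Java without the Hadoop or Mahout classes. It assumes the keys look like "/<label>/<docId>", which is why index 1 of the split is taken as the label. The class LabelIndexSketch and its methods are hypothetical names for illustration, not part of Mahout. With an empty input the label index is empty, which would be consistent with the IllegalArgumentException in the trace if WeightsMapper.setup checks for a positive label count (an assumption, since that source is not shown here).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for the label extraction loop; not Mahout code.
final class LabelIndexSketch {
    private static final Pattern SLASH = Pattern.compile("/");

    // Builds a label -> index map from keys assumed to look like "/<label>/<docId>".
    static Map<String, Integer> extractLabels(List<String> keys) {
        Map<String, Integer> index = new LinkedHashMap<>();
        for (String key : keys) {
            String[] parts = SLASH.split(key);
            if (parts.length < 2) {
                continue; // skip keys that do not carry a label segment
            }
            // Assign the next free index only the first time a label is seen.
            index.putIfAbsent(parts[1], index.size());
        }
        return index;
    }

    // Same as above, but fails early with a readable message when nothing was found.
    static Map<String, Integer> extractLabelsOrFail(List<String> keys) {
        Map<String, Integer> index = extractLabels(keys);
        if (index.isEmpty()) {
            throw new IllegalStateException(
                "No labels found; check that the input vectors are not empty");
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
            "/alt.atheism/51060", "/alt.atheism/51119", "/sci.space/60151");
        System.out.println(extractLabels(keys));
        // An empty input yields an empty index, mirroring the empty sequence files.
        System.out.println(extractLabels(Collections.<String>emptyList()).size());
    }
}
```

Checking the label count right after extraction, as extractLabelsOrFail does, would surface the empty-input problem before the training job starts rather than inside a mapper.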
                
> ERROR in Navie Bayes Training(update: seqdirectory does not give output)
> ------------------------------------------------------------------------
>
>                 Key: MAHOUT-1034
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1034
>             Project: Mahout
>          Issue Type: Bug
>          Components: Classification
>    Affects Versions: 0.7
>         Environment: Ubuntu 11.04
>            Reporter: Leting Wu
>            Assignee: Robin Anil
>
> When running either examples/classify-20newsgroups.sh or asf-email-examples.sh, trainnb always fails:
> {noformat}
> INFO mapred.JobClient: Task Id : attempt_201206281546_0003_m_000000_0, Status : FAILED
> java.lang.IllegalArgumentException
> 	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
> 	at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:264)
> {noformat}
