Posted to dev@mahout.apache.org by "Lance Norskog (JIRA)" <ji...@apache.org> on 2011/02/03 06:41:28 UTC

[jira] Commented: (MAHOUT-602) "Partial Implementation" throws exceptions

    [ https://issues.apache.org/jira/browse/MAHOUT-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989968#comment-12989968 ] 

Lance Norskog commented on MAHOUT-602:
--------------------------------------

There is an off-by-one error somewhere. The code generates two files, each holding the requested number of trees, instead of one.

To make things easier I created 10 trees instead of 100. Two files of trees are created instead of just one. The patch prints the hashCode() of each tree's toString() output, and the hash codes show that the two files contain different trees. I have included the value for each tree in the attached log 10_hashCode.log. (10_toString.log shows the actual string dump for each tree.)
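
For anyone who wants to repeat the comparison by hand, here is a minimal sketch of the idea. It is not the attached patch: the class and method names (TreeHashCompare, hashes, sharedHashes) are hypothetical, and each tree is represented only by its string dump rather than by a real Mahout tree object.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Mahout or of the attached patch.
public class TreeHashCompare {

    // Collect the hashCode() of every tree dump found in one output file.
    static Set<Integer> hashes(List<String> treeDumps) {
        Set<Integer> result = new HashSet<Integer>();
        for (String dump : treeDumps) {
            result.add(dump.hashCode());
        }
        return result;
    }

    // Hash codes that appear in both files. An empty result means no tree
    // dump in the first file matches any dump in the second.
    static Set<Integer> sharedHashes(List<String> fileA, List<String> fileB) {
        Set<Integer> shared = hashes(fileA);
        shared.retainAll(hashes(fileB));
        return shared;
    }

    public static void main(String[] args) {
        // Placeholder strings standing in for tree.toString() output.
        List<String> fileA = Arrays.asList("dump-of-tree-1", "dump-of-tree-2");
        List<String> fileB = Arrays.asList("dump-of-tree-3", "dump-of-tree-4");
        System.out.println("shared hash codes: " + sharedHashes(fileA, fileB).size());
    }
}
```

Equal hash codes do not prove that two dumps are identical, so a match should be confirmed against the full string dump. Differing hash codes do prove that the dumps differ.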

To recreate the experiment, apply the attached patch, PartialImplementationBug1.patch. Whatever number of trees N you request, the run always produces two files of N trees instead of one.

This was the command line, as per the wiki:
$HADOOP_HOME/bin/hadoop jar /Users/lancenorskog/Documents/open/mahout/examples/target/mahout-examples-0.5-SNAPSHOT-job.jar org.apache.mahout.df.mapreduce.BuildForest -Dmapred.max.split.size=1874231 -oob -d ../../datasets/KDDTrain/KDDTrain+_20Percent.arff -ds ../../datasets/KDDTrain/KDDTrain+_20Percent.info  -sl 5 -p -t 10 -o nsl-forest


> "Partial Implementation" throws exceptions
> ------------------------------------------
>
>                 Key: MAHOUT-602
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-602
>             Project: Mahout
>          Issue Type: Bug
>         Environment: Mac OS X
> java version "1.6.0_22"
> Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-10M3261)
> Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode)
>            Reporter: Lance Norskog
>         Attachments: partialImp_fullKDD_errors.log
>
>
> The "Partial Implementation" described on the wiki page [Partial Implementation|https://cwiki.apache.org/confluence/display/MAHOUT/Partial+Implementation] fails with the given dataset and operations.
