Posted to user@mahout.apache.org by "S. Zhou" <my...@yahoo.com> on 2013/04/05 02:01:12 UTC

"Wrong number of attributes in the string" when running Partial Implementation algorithm

Hi there,
I am trying out Mahout's Partial Implementation algorithm, following the instructions at https://cwiki.apache.org/confluence/display/MAHOUT/Partial+Implementation. I created a toy data set (see attached), but I ran into the "Wrong number of attributes in the string" error. The repro steps and the detailed error messages are below. The error was thrown in the "build forest" step. BTW, I am using Mahout 0.7. Thanks for your help!


Toy data set (save it as test-train.csv). Also attached in the e-mail.

1,1,100
1,2,102
2,1,103
2,3,105
3,1,106


Repro steps:
# import data
hadoop fs -mkdir testdata
hadoop fs -put test-train.csv testdata

# generate file descriptor
hadoop jar $MAHOUTHOME/mahout-core-0.7-job.jar org.apache.mahout.classifier.df.tools.Describe -p testdata/test-train.csv -f testdata/test-train.info -d 2 N L -r

# build forest
hadoop jar $MAHOUTHOME/mahout-examples-0.7-job.jar org.apache.mahout.classifier.df.mapreduce.BuildForest -Dmapred.max.split.size=1874231 -d testdata/test-train.csv -ds testdata/test-train.info -sl 1 -p -t 5 -o test-forest

and the error in the "build" step is:

13/04/04 15:27:34 INFO mapred.JobClient: Task Id : attempt_201303291046_0007_m_000020_0, Status : FAILED
java.lang.IllegalArgumentException: Wrong number of attributes in the string
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
    at org.apache.mahout.classifier.df.data.DataConverter.convert(DataConverter.java:44)
    at org.apache.mahout.classifier.df.mapreduce.partial.Step1Mapper.map(Step1Mapper.java:140)
    at org.apache.mahout.classifier.df.mapreduce.partial.Step1Mapper.map(Step1Mapper.java:45)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
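
One way to rule out a malformed input line before rerunning the build is to check the field count of every line locally. The following is a minimal Python sketch, not Mahout's own code: the stack trace only shows that DataConverter.convert rejects a line whose attribute count does not match what it expects. The helper names expand_descriptor and find_bad_lines are hypothetical, and the sketch assumes comma-separated fields and that a number in the descriptor repeats the type that follows it (so "2 N L" stands for N N L).

```python
def expand_descriptor(descriptor):
    """Expand a compact descriptor such as "2 N L" into ["N", "N", "L"].

    Assumption: a number applies to the single type token that follows it.
    """
    expanded = []
    repeat = 1
    for token in descriptor.split():
        if token.isdigit():
            repeat = int(token)
        else:
            expanded.extend([token.upper()] * repeat)
            repeat = 1
    return expanded


def find_bad_lines(text, descriptor):
    """Return (line_number, field_count, line) for every line whose number
    of comma-separated fields differs from the descriptor's column count.

    Blank lines are reported too, since they contain a single empty field.
    """
    expected = len(expand_descriptor(descriptor))
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        fields = line.split(",")
        if len(fields) != expected:
            bad.append((number, len(fields), line))
    return bad


if __name__ == "__main__":
    # The toy data set from the post; replace with the contents of the
    # file that was actually uploaded to HDFS.
    toy = "1,1,100\n1,2,102\n2,1,103\n2,3,105\n3,1,106\n"
    for number, count, line in find_bad_lines(toy, "2 N L"):
        print(f"line {number}: {count} field(s): {line!r}")
```

With the five toy lines above and the descriptor 2 N L, find_bad_lines returns an empty list; a blank or short line in the uploaded file would be reported with its line number.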

RE: Does Mahout decision forest support regression?

Posted by Daniel Donahoe <da...@q.com>.
Andy,

This YouTube presentation may help some members understand the big picture. ASME Utah / University of Utah recently hosted Prof. Adele Cutler (Utah State University), co-creator of Random Forests:
http://www.youtube.com/watch?v=ldKCz1-SEAs&feature=youtu.be; reference: http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm

Regards,
Dan

Daniel N. Donahoe, Ph.D., P.E.
1000 kilometers®
dan@1000kilometers.com
http://www.1000kilometers.com

Licensed Professional Engineer:			Certified Reliability Engineer:
Arizona 16068 expires Sept. 30, 2014		ASQ 6379 expires Dec. 13, 2013
California 20262 expires June 30, 2013
Utah 7328543-2002 expires March 31, 2015



-----Original Message-----
From: Andy Twigg [mailto:andy.twigg@gmail.com] 
Sent: Tuesday, April 09, 2013 8:37 AM
To: Dan Filimon
Cc: user@mahout.apache.org; S. Zhou
Subject: Re: Does Mahout decision forest support regression?

no, it's not.


Re: Does Mahout decision forest support regression?

Posted by Andy Twigg <an...@gmail.com>.
no, it's not.

Re: Does Mahout decision forest support regression?

Posted by Dan Filimon <da...@gmail.com>.
Though, if you want to try Andy's experimental repo:
https://github.com/andytwigg/mahout

Andy, is it usable?

Re: Does Mahout decision forest support regression?

Posted by Som Satpathy <so...@gmail.com>.
The attribute I'm trying to predict is continuous. Also, I'm running the
distributed version of Mahout's random forest.

I pass the regression parameter as true in DataLoader.generateDataset() and
store the dataset descriptor in HDFS. I then use the PartialBuilder to build
the decision forest.

During testing, forest.classify() on a test instance returns the
predicted attribute value.


Thanks,

Som

On Wed, Jul 10, 2013 at 5:42 PM, Ted Dunning <te...@gmail.com> wrote:

> How did you do it?
>
> Could you post some explanation / description of your method?
>
>
> On Wed, Jul 10, 2013 at 2:56 PM, Som Satpathy <so...@gmail.com>
> wrote:
>
> > I am able to get regression to work via Mahout 0.7's random forest.
> >
> > Thanks,
> > Som
> >
> > On Fri, Apr 5, 2013 at 4:48 PM, S. Zhou <my...@yahoo.com> wrote:
> >
> > > I am using Mahout 0.7. Thanks
> >
>

Re: Does Mahout decision forest support regression?

Posted by Ted Dunning <te...@gmail.com>.
How did you do it?

Could you post some explanation / description of your method?


On Wed, Jul 10, 2013 at 2:56 PM, Som Satpathy <so...@gmail.com> wrote:

> I am able to get regression to work via Mahout 0.7's random forest.
>
> Thanks,
> Som
>
> On Fri, Apr 5, 2013 at 4:48 PM, S. Zhou <my...@yahoo.com> wrote:
>
> > I am using Mahout 0.7. Thanks
>

Re: Does Mahout decision forest support regression?

Posted by Som Satpathy <so...@gmail.com>.
I am able to get regression to work via Mahout 0.7's random forest.

Thanks,
Som

On Fri, Apr 5, 2013 at 4:48 PM, S. Zhou <my...@yahoo.com> wrote:

> I am using Mahout 0.7. Thanks

Re: Does Mahout decision forest support regression?

Posted by Ted Dunning <te...@gmail.com>.
Not at this time.


On Fri, Apr 5, 2013 at 4:48 PM, S. Zhou <my...@yahoo.com> wrote:

> I am using Mahout 0.7. Thanks

Does Mahout decision forest support regression?

Posted by "S. Zhou" <my...@yahoo.com>.
I am using Mahout 0.7. Thanks