Posted to user@mahout.apache.org by deneche abdelhakim <a_...@yahoo.fr> on 2011/04/06 06:14:10 UTC

Re: Re : Partial Implementation of Random Forest

There was a new bug in the code, and I have fixed it. Please update the code and try again. I am also using Cloudera's Hadoop, and it is running just fine.

--- On Thu 31.3.11, praneet mhatre <pr...@gmail.com> wrote:

From: praneet mhatre <pr...@gmail.com>
Subject: Re: Re : Partial Implementation of Random Forest
To: user@mahout.apache.org
Cc: "deneche abdelhakim" <a_...@yahoo.fr>
Date: Thursday, March 31, 2011, 23:34

Hi Deneche,

I used the trunk and still encounter the same error. By the way, I am running Mahout on top of Cloudera's Linux image. I was wondering whether that has anything to do with the error.

Exception in thread "main" java.lang.IllegalStateException: java.io.EOFException
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
    at org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
    at org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
    at org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
    at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
    at org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
    at org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:180)
    at java.io.DataInputStream.readFully(DataInputStream.java:152)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
    ... 13 more
cloudera@cloudera-demo:~/Downloads/trunk$ 

Thanks,


On Thu, Mar 31, 2011 at 5:15 AM, deneche abdelhakim <a_...@yahoo.fr> wrote:

Hi Praneet,

I have fixed various bugs since 0.4. Could you try the trunk and see whether it still happens?



--- On Tue 29.3.11, praneet mhatre <pr...@gmail.com> wrote:

From: praneet mhatre <pr...@gmail.com>
Subject: Partial Implementation of Random Forest
To: user@mahout.apache.org
Date: Tuesday, March 29, 2011, 18:30



I think my previous mail did not get through.

---------- Forwarded message ----------
From: praneet mhatre <pr...@gmail.com>
Date: Mon, Mar 28, 2011 at 10:50 PM
Subject: Partial Implementation of Random Forest
To: user@mahout.apache.org

Hello all,

I very recently started working on Mahout. To get the feel of things, I was
trying to run the sample implementation of Random Forest posted on the Wiki
(https://cwiki.apache.org/confluence/display/MAHOUT/Partial+Implementation).
However, even when I issue the exact same commands, I get an EOFException
error as follows:

Exception in thread "main" java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:180)
    at java.io.DataInputStream.readFully(DataInputStream.java:152)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
    at org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:145)
    at org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:119)
    at org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
    at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:338)
    at org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
    at org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:236)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
cloudera@cloudera-demo:~/Downloads/mahout-distribution-0.4$



Can you please tell me what the problem is?

Thank you,

--
Praneet Mhatre
Graduate Student
Donald Bren School of ICS
University of California, Irvine




-- 
Praneet Mhatre
Graduate Student
Donald Bren School of ICS
University of California, Irvine




Re: Re : Partial Implementation of Random Forest

Posted by Sean Owen <sr...@gmail.com>.
The _SUCCESS file has nothing to do with Mahout; it's Cloudera behavior. My
change would ignore the file. Deneche, if that doesn't fix it, I figure it's
good policy anyway to apply the standard filter to ignore _logs, _SUCCESS,
.crc, etc.
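
A minimal sketch of the kind of filter described above, not the actual Mahout change: it treats any output entry whose name starts with `_` or `.` (for example `_SUCCESS`, `_logs`, `.part-r-00000.crc`) as bookkeeping rather than data. The class name, method names, and sample file names are hypothetical, and the check works on plain file names. In Hadoop code the same test could presumably sit inside an `org.apache.hadoop.fs.PathFilter`, applied to `path.getName()`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Mahout or Hadoop.
final class OutputFileFilter {

  private OutputFileFilter() {
  }

  // Returns true when the name looks like real job output rather than a
  // marker file, log directory, or checksum file.
  // Assumption: such bookkeeping entries always start with '_' or '.'.
  static boolean isDataFile(String name) {
    return !name.isEmpty() && !name.startsWith("_") && !name.startsWith(".");
  }

  // Keeps only the names accepted by isDataFile, preserving their order.
  static List<String> dataFiles(List<String> names) {
    List<String> kept = new ArrayList<String>();
    for (String name : names) {
      if (isDataFile(name)) {
        kept.add(name);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    // Sample listing of a job output directory (made-up names).
    List<String> listing = Arrays.asList(
        "_SUCCESS", "_logs", ".part-r-00000.crc", "part-r-00000");
    System.out.println(dataFiles(listing));
  }
}
```

With the sample listing in `main`, only `part-r-00000` survives the filter, so code that opens every remaining entry as a SequenceFile would never try to read `_SUCCESS`.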

On Mon, Apr 18, 2011 at 11:37 PM, praneet mhatre <pr...@gmail.com> wrote:

> Sean,
>
> I still see the _SUCCESS file in my output path. And I'm sure that I used
> the latest trunk. I'm in class right now, so I'll repeat the whole process
> carefully again in some time. Just wanted to let you know.
>
>

Re: Re : Partial Implementation of Random Forest

Posted by praneet mhatre <pr...@gmail.com>.
It's CDH3

On Mon, Apr 18, 2011 at 3:54 PM, Ted Dunning <te...@gmail.com> wrote:

> Praneet,
>
> What version of CDH is that?
>
>
> On Mon, Apr 18, 2011 at 3:37 PM, praneet mhatre <pr...@gmail.com> wrote:
>
>> Sean,
>>
>> I still see the _SUCCESS file in my output path. And I'm sure that I used
>> the latest trunk. I'm in class right now, so I'll repeat the whole process
>> carefully again in some time. Just wanted to let you know.
>>
>> On Mon, Apr 18, 2011 at 3:04 PM, Ted Dunning <te...@gmail.com> wrote:
>>
>> > Deneche,
>> >
>> > My local map-reduce guy says that he doesn't see this in the CDH sources.
>> > Can you say which version of CDH you were using?
>> >
>> >
>> > On Mon, Apr 18, 2011 at 2:36 PM, Sean Owen <sr...@gmail.com> wrote:
>> >
>> > > I can easily add a bit to this method call that will cause it to skip files
>> > > and directories like .crc, _logs, etc. Seems like the right thing to do here
>> > > as it's evidently causing a problem otherwise.
>> > >
>> > > On Mon, Apr 18, 2011 at 7:14 PM, deneche abdelhakim <a_deneche@yahoo.fr> wrote:
>> > >
>> > > > OK, I was finally able to reproduce this bug. It appears when using
>> > > > Cloudera's distribution of Hadoop. Apparently this distribution contains
>> > > > some patches from Hadoop 0.21 that create a _SUCCESS file in the output
>> > > > path. The current code does not expect such a file, so it cannot parse it.
>> > > > I tried the standard Hadoop 0.20 distribution and it works just fine.
>> > > > So for now I think it's safe to just use the standard distribution.
>> > > >
>> > > > --- On Mon 11.4.11, praneet mhatre <praneetmhatre@gmail.com> wrote:
>> > > >
>> > > > From: praneet mhatre <pr...@gmail.com>
>> > > > Subject: Re: Re : Partial Implementation of Random Forest
>> > > > To: user@mahout.apache.org
>> > > > Cc: "deneche abdelhakim" <a_...@yahoo.fr>
>> > > > Date: Monday, April 11, 2011, 23:18
>> > > >
>> > > > Me too. Used the latest code. Still the exact same error as before.
>> > > >
>> > > > Thanks,
>> > > >
>> > > > On Mon, Apr 11, 2011 at 2:06 PM, deneche abdelhakim <a_deneche@yahoo.fr> wrote:
>> > > >
>> > > > > Hmm, I will take a look and see what's causing this.
>> > > > >
>> > > > > --- On Mon 11.4.11, ext-ranjit.chellappannair@nokia.com <ext-ranjit.chellappannair@nokia.com> wrote:
>> > > > >
>> > > > > From: ext-ranjit.chellappannair@nokia.com <ext-ranjit.chellappannair@nokia.com>
>> > > > > Subject: RE: Re : Partial Implementation of Random Forest
>> > > > > To: user@mahout.apache.org
>> > > > > Date: Monday, April 11, 2011, 15:58
>> > > > >
>> > > > > Hi Deneche,
>> > > > >
>> > > > > I used the latest Mahout code from the trunk, and while running
>> > > > > BuildForest on the KDD dataset I get an EOFException. Please find
>> > > > > the exception below:
>> > > > >
>> > > > > Exception in thread "main" java.lang.IllegalStateException: java.io.EOFException
>> > > > >         at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
>> > > > >         at org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
>> > > > >         at org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
>> > > > >         at org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
>> > > > >         at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
>> > > > >         at org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
>> > > > >         at org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
>> > > > >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> > > > >         at org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
>> > > > >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> > > > >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> > > > >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> > > > >         at java.lang.reflect.Method.invoke(Method.java:597)
>> > > > >         at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
>> > > > > Caused by: java.io.EOFException
>> > > > >         at java.io.DataInputStream.readFully(DataInputStream.java:180)
>> > > > >         at java.io.DataInputStream.readFully(DataInputStream.java:152)
>> > > > >         at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
>> > > > >         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
>> > > > >         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>> > > > >         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>> > > > >         at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
>> > > > >         at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
>> > > > >         ... 13 more
>> > > > >
>> > > > > Any help in resolving the above error will be greatly appreciated.
>> > > > >
>> > > > > Thanks and Regards,
>> > > > > Ranjit.C
>> > > > >
>> > > >
>> > > >
>> > > > --
>> > > > Praneet Mhatre
>> > > > Graduate Student
>> > > > Donald Bren School of ICS
>> > > > University of California, Irvine
>> > > >
>> > >
>> >
>>
>>
>>
>> --
>> Praneet Mhatre
>> Graduate Student
>> Donald Bren School of ICS
>> University of California, Irvine
>>
>
>


-- 
Praneet Mhatre
Graduate Student
Donald Bren School of ICS
University of California, Irvine

Re: Re : Partial Implementation of Random Forest

Posted by Ted Dunning <te...@gmail.com>.
Praneet,

What version of CDH is that?

> > > > >
> > > > >     at
> > > > >
> org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:236)
> > > > >
> > > > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > > >
> > > > >     at
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > > > >
> > > > >     at
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > > >
> > > > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > > > >
> > > > >     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> > > > >
> > > > > cloudera@cloudera-demo:~/Downloads/mahout-distribution-0.4$
> > > > >
> > > > >
> > > > >
> > > > > Can you please tell me what the problem is?
> > > > >
> > > > >
> > > > >
> > > > > Thank you,
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Praneet Mhatre
> > > > >
> > > > > Graduate Student
> > > > >
> > > > > Donald Bren School of ICS
> > > > >
> > > > > University of California, Irvine
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Praneet Mhatre
> > > > > Graduate Student
> > > > > Donald Bren School of ICS
> > > > > University of California, Irvine
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Praneet Mhatre
> > > > Graduate Student
> > > > Donald Bren School of ICS
> > > > University of California, Irvine
> > > >
> > >
> >
>
>
>
> --
> Praneet Mhatre
> Graduate Student
> Donald Bren School of ICS
> University of California, Irvine
>

Re: Re : Partial Implementation of Random Forest

Posted by praneet mhatre <pr...@gmail.com>.
Sean,

I still see the _SUCCESS file in my output path. And I'm sure that I used
the latest trunk. I'm in class right now, so I'll repeat the whole process
carefully again in some time. Just wanted to let you know.
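
A minimal sketch of the kind of filter Sean describes in the quoted message below: skip bookkeeping entries such as _SUCCESS, _logs and .crc files when listing a job's output directory. This is not the actual Mahout patch. The class and method names are made up for illustration, and it works on plain file names so that it runs without Hadoop on the classpath; in a real job the same test would more likely live in an org.apache.hadoop.fs.PathFilter passed to FileSystem.listStatus.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Mahout or Hadoop. */
final class OutputFileFilter {

    private OutputFileFilter() {
    }

    /**
     * Returns true if the name looks like real job output, i.e. it does not
     * start with "_" (such as _SUCCESS or _logs) or "." (such as .crc files).
     */
    static boolean isDataFile(String fileName) {
        return !fileName.isEmpty()
            && !fileName.startsWith("_")
            && !fileName.startsWith(".");
    }

    /** Keeps only the names accepted by isDataFile, preserving their order. */
    static List<String> dataFiles(List<String> fileNames) {
        List<String> kept = new ArrayList<>();
        for (String name : fileNames) {
            if (isDataFile(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> listing =
            Arrays.asList("_SUCCESS", "_logs", ".part-r-00000.crc", "part-r-00000");
        // Only the actual output file survives the filter.
        System.out.println(dataFiles(listing)); // prints [part-r-00000]
    }
}
```

With a filter like this in place, the code that parses the job output would only ever open the part files and would never try to read _SUCCESS as a SequenceFile.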

On Mon, Apr 18, 2011 at 3:04 PM, Ted Dunning <te...@gmail.com> wrote:

> Deneche,
>
> My local map-reduce guy says that he doesn't see this in the CDH sources.
> Can you say which version of CDH you were using?
>
>
> On Mon, Apr 18, 2011 at 2:36 PM, Sean Owen <sr...@gmail.com> wrote:
>
> > I can easily add a bit to this method call that will cause it to skip
> > files and directories like .crc, _logs, etc. Seems like the right thing
> > to do here as it's evidently causing a problem otherwise.
> >
> > On Mon, Apr 18, 2011 at 7:14 PM, deneche abdelhakim <a_deneche@yahoo.fr
> > >wrote:
> >
> > > OK, I was finally able to reproduce this bug. It appears when using
> > > Cloudera's distribution of Hadoop. Apparently this distribution
> > > contains some patches from Hadoop 0.21 that create a _SUCCESS file in
> > > the output path. The current code doesn't expect such a file, so it
> > > fails when it tries to parse it. I tried the standard Hadoop 0.20
> > > distribution and it's working just fine, so for now I think it's safe
> > > to just use the standard distribution.
> > >
> > > --- On Mon 11.4.11, praneet mhatre <pr...@gmail.com> wrote:
> > >
> > > From: praneet mhatre <pr...@gmail.com>
> > > Subject: Re: Re : Partial Implementation of Random Forest
> > > To: user@mahout.apache.org
> > > Cc: "deneche abdelhakim" <a_...@yahoo.fr>
> > > Date: Monday, April 11, 2011, 23:18
> > >
> > > Me too. Used the latest code. Still the exact same error as before.
> > >
> > > Thanks,
> > >
> > > On Mon, Apr 11, 2011 at 2:06 PM, deneche abdelhakim <
> a_deneche@yahoo.fr
> > > >wrote:
> > >
> > > > hmm, I will give it a look and see what's causing this
> > > >
> > > > --- On Mon 11.4.11, ext-ranjit.chellappannair@nokia.com <
> > > > ext-ranjit.chellappannair@nokia.com> wrote:
> > > >
> > > > From: ext-ranjit.chellappannair@nokia.com <
> > > > ext-ranjit.chellappannair@nokia.com>
> > > > Subject: RE: Re : Partial Implementation of Random Forest
> > > > To: user@mahout.apache.org
> > > > Date: Monday, April 11, 2011, 15:58
> > > >
> > > > Hi Deneche,
> > > >
> > > > I used the mahout latest code from the trunk and while running the
> > > > BuildForest on KDD dataset I am getting an EOF exception. Please find
> > the
> > > > exception I am getting below:-
> > > >
> > > > Exception in thread "main" java.lang.IllegalStateException:
> > > > java.io.EOFException
> > > >         at
> > > >
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
> > > >         at
> > > >
> > >
> >
> org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
> > > >         at
> > > >
> org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
> > > >         at
> > > >
> > >
> >
> org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
> > > >         at
> > org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
> > > >         at
> > > >
> > >
> >
> org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
> > > >         at
> > > > org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
> > > >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > >         at
> > > > org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
> > > >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> > > >         at
> > > >
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > > >         at
> > > >
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > >         at java.lang.reflect.Method.invoke(Method.java:597)
> > > >         at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> > > > Caused by: java.io.EOFException
> > > >         at
> java.io.DataInputStream.readFully(DataInputStream.java:180)
> > > >         at
> java.io.DataInputStream.readFully(DataInputStream.java:152)
> > > >         at
> > > > org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
> > > >         at
> > > >
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
> > > >         at
> > > >
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
> > > >         at
> > > >
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
> > > >         at
> > > >
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
> > > >         at
> > > >
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
> > > >         ... 13 more
> > > >
> > > > Any help in resolving the above error will be greatly appreciated.
> > > >
> > > > Thanks and Regards,
> > > > Ranjit.C
> > > >
> > > > -----Original Message-----
> > > > From: ext deneche abdelhakim [mailto:a_deneche@yahoo.fr]
> > > > Sent: Wednesday, April 06, 2011 9:44 AM
> > > > To: user@mahout.apache.org
> > > > Subject: Re: Re : Partial Implementation of Random Forest
> > > >
> > > > There was a new bug in the code and I fixed it. Please try again
> after
> > > > updating the code. I am also using Cloudera's Hadoop and it's running
> > > just
> > > > fine
> > > >
> > > > --- On Thu 31.3.11, praneet mhatre <praneetmhatre@gmail.com> wrote:
> > > >
> > > > From: praneet mhatre <pr...@gmail.com>
> > > > Subject: Re: Re : Partial Implementation of Random Forest
> > > > To: user@mahout.apache.org
> > > > Cc: "deneche abdelhakim" <a_...@yahoo.fr>
> > > > Date: Thursday, March 31, 2011, 23:34
> > > >
> > > > Hi Deneche,
> > > >
> > > > I used the trunk. I still encounter the same error. By the way, I am
> > > > running mahout on top of Cloudera's Linux image. I was just wondering
> > if
> > > > that has anything to do with the error.
> > > >
> > > > Exception in thread "main" java.lang.IllegalStateException:
> > > > java.io.EOFException
> > > >
> > > >     at
> > > >
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
> > > >     at
> > > >
> > >
> >
> org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
> > > >     at
> > > >
> org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
> > > >
> > > >     at
> > > >
> > >
> >
> org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
> > > >     at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
> > > >     at
> > > >
> > >
> >
> org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
> > > >
> > > >     at
> > > org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
> > > >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > >     at
> > > > org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
> > > >
> > > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >     at
> > > >
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > > >     at
> > > >
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > >
> > > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > > >     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> > > > Caused by: java.io.EOFException
> > > >     at java.io.DataInputStream.readFully(DataInputStream.java:180)
> > > >
> > > >     at java.io.DataInputStream.readFully(DataInputStream.java:152)
> > > >     at
> > > > org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
> > > >     at
> > > >
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
> > > >
> > > >     at
> > > >
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
> > > >     at
> > > >
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
> > > >     at
> > > >
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
> > > >
> > > >     at
> > > >
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
> > > >     ... 13 more
> > > > cloudera@cloudera-demo:~/Downloads/trunk$
> > > >
> > > > Thanks,
> > > >
> > > >
> > > > On Thu, Mar 31, 2011 at 5:15 AM, deneche abdelhakim <
> > a_deneche@yahoo.fr>
> > > > wrote:
> > > >
> > > > Hi Praneet,
> > > >
> > > >
> > > >
> > > > I fixed various bugs since 0.4, could you try using the trunk, and
> see
> > if
> > > > it's happening again ?
> > > >
> > > >
> > > >
> > > > --- On Tue 29.3.11, praneet mhatre <praneetmhatre@gmail.com> wrote:
> > > >
> > > > From: praneet mhatre <pr...@gmail.com>
> > > > Subject: Partial Implementation of Random Forest
> > > > To: user@mahout.apache.org
> > > > Date: Tuesday, March 29, 2011, 18:30
> > > >
> > > >
> > > >
> > > > I think my previous mail did not get through.
> > > >
> > > >
> > > >
> > > > ---------- Forwarded message ----------
> > > >
> > > > From: praneet mhatre <pr...@gmail.com>
> > > >
> > > > Date: Mon, Mar 28, 2011 at 10:50 PM
> > > >
> > > > Subject: Partial Implementation of Random Forest
> > > >
> > > > To: user@mahout.apache.org
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > Hello all,
> > > >
> > > >
> > > >
> > > > I very recently started working on Mahout. To get the feel of things,
> I
> > > was
> > > >
> > > > trying to run  the sample implementation of Random Forest posted on
> the
> > > > Wiki
> > > >
> > > > (
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/MAHOUT/Partial+Implementation
> > > > ).
> > > >
> > > > However, even when I issue the exact same commands, I get an
> > > >
> > > > EOFException
> > > >
> > > > error as follows:
> > > >
> > > > Exception in thread "main" java.io.EOFException
> > > >
> > > >     at java.io.DataInputStream.readFully(DataInputStream.java:180)
> > > >
> > > >     at java.io.DataInputStream.readFully(DataInputStream.java:152)
> > > >
> > > >     at
> > > > org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
> > > >
> > > >     at
> > > >
> > > >
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
> > > >
> > > >     at
> > > >
> > > >
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
> > > >
> > > >     at
> > > >
> > > >
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
> > > >
> > > >     at
> > > >
> > > >
> > > >
> > >
> >
> org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:145)
> > > >
> > > >     at
> > > >
> > > >
> org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:119)
> > > >
> > > >     at
> > > >
> > > >
> > > >
> > >
> >
> org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
> > > >
> > > >     at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:338)
> > > >
> > > >     at
> > > >
> > > >
> > > >
> > >
> >
> org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
> > > >
> > > >     at
> > > org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
> > > >
> > > >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > >
> > > >     at
> > > > org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:236)
> > > >
> > > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >
> > > >     at
> > > >
> > > >
> > > >
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > > >
> > > >     at
> > > >
> > > >
> > > >
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > >
> > > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > > >
> > > >     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> > > >
> > > > cloudera@cloudera-demo:~/Downloads/mahout-distribution-0.4$
> > > >
> > > >
> > > >
> > > > Can you please tell me what the problem is?
> > > >
> > > >
> > > >
> > > > Thank you,
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Praneet Mhatre
> > > >
> > > > Graduate Student
> > > >
> > > > Donald Bren School of ICS
> > > >
> > > > University of California, Irvine
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Praneet Mhatre
> > > > Graduate Student
> > > > Donald Bren School of ICS
> > > > University of California, Irvine
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > > --
> > > Praneet Mhatre
> > > Graduate Student
> > > Donald Bren School of ICS
> > > University of California, Irvine
> > >
> >
>



-- 
Praneet Mhatre
Graduate Student
Donald Bren School of ICS
University of California, Irvine

Re: Re : Partial Implementation of Random Forest

Posted by deneche abdelhakim <a_...@yahoo.fr>.
It's hadoop-0.20.2-cdh3u0. For now I'm running it in standalone mode.
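
The EOFException in the trace quoted below is consistent with an empty marker file being opened as a SequenceFile: the reader tries to read a fixed-size header with DataInputStream.readFully and hits end of stream immediately. The sketch below reproduces that behaviour with plain JDK streams only. The 4-byte header length (the "SEQ" magic plus a version byte) is an assumption about the SequenceFile format, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical demo class; it does not use or depend on Hadoop. */
final class EmptyHeaderDemo {

    // Assumed header size: 'S', 'E', 'Q' plus one version byte.
    private static final int HEADER_LENGTH = 4;

    private EmptyHeaderDemo() {
    }

    /**
     * Returns true if a full header can be read from the given contents,
     * false if the stream ends first (readFully throws EOFException).
     */
    static boolean hasFullHeader(byte[] contents) {
        byte[] header = new byte[HEADER_LENGTH];
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(contents))) {
            in.readFully(header);
            return true;
        } catch (EOFException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A zero-length file, like an empty marker file: no header to read.
        System.out.println("empty file: " + hasFullHeader(new byte[0]));
        // A file that at least starts with four header bytes.
        System.out.println("with header: " + hasFullHeader(new byte[] {'S', 'E', 'Q', 6}));
    }
}
```

A check of this kind only explains the symptom. Skipping the marker file by name when listing the output directory avoids opening it at all.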

--- On Tue 19.4.11, Ted Dunning <te...@gmail.com> wrote:

From: Ted Dunning <te...@gmail.com>
Subject: Re: Re : Partial Implementation of Random Forest
To: user@mahout.apache.org
Cc: "Sean Owen" <sr...@gmail.com>
Date: Tuesday, April 19, 2011, 00:04

Deneche,

My local map-reduce guy says that he doesn't see this in the CDH sources.
Can you say which version of CDH you were using?


On Mon, Apr 18, 2011 at 2:36 PM, Sean Owen <sr...@gmail.com> wrote:

> I can easily add a bit to this method call that will cause it to skip files
> and directories like .crc, _logs, etc. Seems like the right thing to do here
> as it's evidently causing a problem otherwise.
>
> On Mon, Apr 18, 2011 at 7:14 PM, deneche abdelhakim <a_deneche@yahoo.fr
> >wrote:
>
> > OK, I was finally able to reproduce this bug. It appears when using
> > Cloudera's distribution of Hadoop. Apparently this distribution contains
> > some patches from Hadoop 0.21 that create a _SUCCESS file in the output
> > path. The current code doesn't expect such a file, so it fails when it
> > tries to parse it. I tried the standard Hadoop 0.20 distribution and it's
> > working just fine, so for now I think it's safe to just use the standard
> > distribution.
> >
> > --- On Mon 11.4.11, praneet mhatre <pr...@gmail.com> wrote:
> >
> > From: praneet mhatre <pr...@gmail.com>
> > Subject: Re: Re : Partial Implementation of Random Forest
> > To: user@mahout.apache.org
> > Cc: "deneche abdelhakim" <a_...@yahoo.fr>
> > Date: Monday, April 11, 2011, 23:18
> >
> > Me too. Used the latest code. Still the exact same error as before.
> >
> > Thanks,
> >
> > On Mon, Apr 11, 2011 at 2:06 PM, deneche abdelhakim <a_deneche@yahoo.fr
> > >wrote:
> >
> > > hmm, I will give it a look and see what's causing this
> > >
> > > --- On Mon 11.4.11, ext-ranjit.chellappannair@nokia.com <
> > > ext-ranjit.chellappannair@nokia.com> wrote:
> > >
> > > From: ext-ranjit.chellappannair@nokia.com <
> > > ext-ranjit.chellappannair@nokia.com>
> > > Subject: RE: Re : Partial Implementation of Random Forest
> > > To: user@mahout.apache.org
> > > Date: Monday, April 11, 2011, 15:58
> > >
> > > Hi Deneche,
> > >
> > > I used the mahout latest code from the trunk and while running the
> > > BuildForest on KDD dataset I am getting an EOF exception. Please find
> the
> > > exception I am getting below:-
> > >
> > > Exception in thread "main" java.lang.IllegalStateException:
> > > java.io.EOFException
> > >         at
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
> > >         at
> > >
> >
> org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
> > >         at
> > > org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
> > >         at
> > >
> >
> org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
> > >         at
> org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
> > >         at
> > >
> >
> org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
> > >         at
> > > org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
> > >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > >         at
> > > org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
> > >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >         at
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > >         at
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >         at java.lang.reflect.Method.invoke(Method.java:597)
> > >         at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> > > Caused by: java.io.EOFException
> > >         at java.io.DataInputStream.readFully(DataInputStream.java:180)
> > >         at java.io.DataInputStream.readFully(DataInputStream.java:152)
> > >         at
> > > org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
> > >         at
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
> > >         at
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
> > >         at
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
> > >         at
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
> > >         at
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
> > >         ... 13 more
> > >
> > > Any help in resolving the above error will be greatly appreciated.
> > >
> > > Thanks and Regards,
> > > Ranjit.C
> > >
> > > -----Original Message-----
> > > From: ext deneche abdelhakim [mailto:a_deneche@yahoo.fr]
> > > Sent: Wednesday, April 06, 2011 9:44 AM
> > > To: user@mahout.apache.org
> > > Subject: Re: Re : Partial Implementation of Random Forest
> > >
> > > There was a new bug in the code and I fixed it. Please try again after
> > > updating the code. I am also using Cloudera's Hadoop and it's running
> > just
> > > fine
> > >
> > > --- On Thu 31.3.11, praneet mhatre <pr...@gmail.com> wrote:
> > >
> > > From: praneet mhatre <pr...@gmail.com>
> > > Subject: Re: Re : Partial Implementation of Random Forest
> > > To: user@mahout.apache.org
> > > Cc: "deneche abdelhakim" <a_...@yahoo.fr>
> > > Date: Thursday, March 31, 2011, 23:34
> > >
> > > Hi Deneche,
> > >
> > > I used the trunk. I still encounter the same error. By the way, I am
> > > running mahout on top of Cloudera's Linux image. I was just wondering
> if
> > > that has anything to do with the error.
> > >
> > > Exception in thread "main" java.lang.IllegalStateException:
> > > java.io.EOFException
> > >
> > >     at
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
> > >     at
> > >
> >
> org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
> > >     at
> > > org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
> > >
> > >     at
> > >
> >
> org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
> > >     at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
> > >     at
> > >
> >
> org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
> > >
> > >     at
> > org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
> > >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > >     at
> > > org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
> > >
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >     at
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > >     at
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> > > Caused by: java.io.EOFException
> > >     at java.io.DataInputStream.readFully(DataInputStream.java:180)
> > >
> > >     at java.io.DataInputStream.readFully(DataInputStream.java:152)
> > >     at
> > > org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
> > >     at
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
> > >
> > >     at
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
> > >     at
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
> > >     at
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
> > >
> > >     at
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
> > >     ... 13 more
> > > cloudera@cloudera-demo:~/Downloads/trunk$
> > >
> > > Thanks,
> > >
> > >
> > > On Thu, Mar 31, 2011 at 5:15 AM, deneche abdelhakim <
> a_deneche@yahoo.fr>
> > > wrote:
> > >
> > > Hi Praneet,
> > >
> > >
> > >
> > > I fixed various bugs since 0.4, could you try using the trunk, and see
> if
> > > it's happening again ?
> > >
> > >
> > >
> > > --- On Tue 29.3.11, praneet mhatre <pr...@gmail.com> wrote:
> > >
> > > From: praneet mhatre <pr...@gmail.com>
> > > Subject: Partial Implementation of Random Forest
> > > To: user@mahout.apache.org
> > > Date: Tuesday, March 29, 2011, 18:30
> > >
> > >
> > >
> > > I think my previous mail did not get through.
> > >
> > >
> > >
> > > ---------- Forwarded message ----------
> > >
> > > From: praneet mhatre <pr...@gmail.com>
> > >
> > > Date: Mon, Mar 28, 2011 at 10:50 PM
> > >
> > > Subject: Partial Implementation of Random Forest
> > >
> > > To: user@mahout.apache.org
> > >
> > >
> > >
> > >
> > >
> > > Hello all,
> > >
> > >
> > >
> > > I very recently started working on Mahout. To get the feel of things, I
> > was
> > >
> > > trying to run  the sample implementation of Random Forest posted on the
> > > Wiki
> > >
> > > (
> > >
> >
> https://cwiki.apache.org/confluence/display/MAHOUT/Partial+Implementation
> > > ).
> > >
> > > However, even when I issue the exact same commands, I get an
> > >
> > > EOFException
> > >
> > > error as follows:
> > >
> > > Exception in thread "main" java.io.EOFException
> > >
> > >     at java.io.DataInputStream.readFully(DataInputStream.java:180)
> > >
> > >     at java.io.DataInputStream.readFully(DataInputStream.java:152)
> > >
> > >     at
> > > org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
> > >
> > >     at
> > >
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
> > >
> > >     at
> > >
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
> > >
> > >     at
> > >
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
> > >
> > >     at
> > >
> > >
> > >
> >
> org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:145)
> > >
> > >     at
> > >
> > > org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:119)
> > >
> > >     at
> > >
> > >
> > >
> >
> org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
> > >
> > >     at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:338)
> > >
> > >     at
> > >
> > >
> > >
> >
> org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
> > >
> > >     at
> > org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
> > >
> > >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > >
> > >     at
> > > org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:236)
> > >
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >
> > >     at
> > >
> > >
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > >
> > >     at
> > >
> > >
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >
> > >     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> > >
> > > cloudera@cloudera-demo:~/Downloads/mahout-distribution-0.4$
> > >
> > >
> > >
> > > Can you please tell me what the problem is?
> > >
> > >
> > >
> > > Thank you,
> > >
> > >
> > >
> > > --
> > >
> > > Praneet Mhatre
> > >
> > > Graduate Student
> > >
> > > Donald Bren School of ICS
> > >
> > > University of California, Irvine
> > >
> > >
> > >
> > >
> > > --
> > > Praneet Mhatre
> > > Graduate Student
> > > Donald Bren School of ICS
> > > University of California, Irvine
> > >
> > >
> > >
> > >
> >
> >
> > --
> > Praneet Mhatre
> > Graduate Student
> > Donald Bren School of ICS
> > University of California, Irvine
> >
>

Re: Re : Partial Implementation of Random Forest

Posted by Ted Dunning <te...@gmail.com>.
Deneche,

My local map-reduce guy says that he doesn't see this in the CDH sources.
Can you say which version of CDH you were using?


On Mon, Apr 18, 2011 at 2:36 PM, Sean Owen <sr...@gmail.com> wrote:

> I can easily add a bit to this method call that will cause it to skip files
> and directories like .crc, _logs, etc. Seems like the right thing to do here
> as it's evidently causing a problem otherwise.
>
> On Mon, Apr 18, 2011 at 7:14 PM, deneche abdelhakim <a_deneche@yahoo.fr
> >wrote:
>
> > OK, I was finally able to reproduce this bug. It appears when using
> > Cloudera's distribution of Hadoop. Apparently this distribution contains
> > some patches from Hadoop 0.21 that create a _SUCCESS file in the output
> > path. The current code doesn't expect such a file, so it fails when it
> > tries to parse it. I tried the standard Hadoop 0.20 distribution and it's
> > working just fine, so for now I think it's safe to just use the standard
> > distribution.
> >
> > --- On Mon 11.4.11, praneet mhatre <pr...@gmail.com> wrote:
> >
> > From: praneet mhatre <pr...@gmail.com>
> > Subject: Re: Re : Partial Implementation of Random Forest
> > To: user@mahout.apache.org
> > Cc: "deneche abdelhakim" <a_...@yahoo.fr>
> > Date: Monday, April 11, 2011, 23:18
> >
> > Me too. Used the latest code. Still the exact same error as before.
> >
> > Thanks,
> >
> > On Mon, Apr 11, 2011 at 2:06 PM, deneche abdelhakim <a_deneche@yahoo.fr
> > >wrote:
> >
> > > hmm, I will give it a look and see what's causing this
> > >
> > > --- On Mon 11.4.11, ext-ranjit.chellappannair@nokia.com <
> > > ext-ranjit.chellappannair@nokia.com> wrote:
> > >
> > > From: ext-ranjit.chellappannair@nokia.com <
> > > ext-ranjit.chellappannair@nokia.com>
> > > Subject: RE: Re : Partial Implementation of Random Forest
> > > To: user@mahout.apache.org
> > > Date: Monday, April 11, 2011, 15:58
> > >
> > > Hi Deneche,
> > >
> > > I used the mahout latest code from the trunk and while running the
> > > BuildForest on KDD dataset I am getting an EOF exception. Please find
> the
> > > exception I am getting below:-
> > >
> > > Exception in thread "main" java.lang.IllegalStateException:
> > > java.io.EOFException
> > >         at
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
> > >         at
> > >
> >
> org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
> > >         at
> > > org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
> > >         at
> > >
> >
> org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
> > >         at
> org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
> > >         at
> > >
> >
> org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
> > >         at
> > > org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
> > >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > >         at
> > > org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
> > >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >         at
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > >         at
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >         at java.lang.reflect.Method.invoke(Method.java:597)
> > >         at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> > > Caused by: java.io.EOFException
> > >         at java.io.DataInputStream.readFully(DataInputStream.java:180)
> > >         at java.io.DataInputStream.readFully(DataInputStream.java:152)
> > >         at
> > > org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
> > >         at
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
> > >         at
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
> > >         at
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
> > >         at
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
> > >         at
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
> > >         ... 13 more
> > >
> > > Any help in resolving the above error will be greately appreciated.
> > >
> > > Thanks and Regards,
> > > Ranjit.C
> > >
> > > -----Original Message-----
> > > From: ext deneche abdelhakim [mailto:a_deneche@yahoo.fr]
> > > Sent: Wednesday, April 06, 2011 9:44 AM
> > > To: user@mahout.apache.org
> > > Subject: Re: Re : Partial Implementation of Random Forest
> > >
> > > There was a new bug in the code and I fixed it. Please try again after
> > > updating the code. I am also using Cloudera's Hadoop and it's running
> > just
> > > fine
> > >
> > > --- En date de : Jeu 31.3.11, praneet mhatre <pr...@gmail.com>
> a
> > > écrit :
> > >
> > > De: praneet mhatre <pr...@gmail.com>
> > > Objet: Re: Re : Partial Implementation of Random Forest
> > > À: user@mahout.apache.org
> > > Cc: "deneche abdelhakim" <a_...@yahoo.fr>
> > > Date: Jeudi 31 mars 2011, 23h34
> > >
> > > Hi Deneche,
> > >
> > > I used the trunk. I still encounter the same error. By the way, I am
> > > running mahout on top of Cloudera's Linux image. I was just wondering
> if
> > > that has anything to do with the error.
> > >
> > > Exception in thread "main" java.lang.IllegalStateException:
> > > java.io.EOFException
> > >
> > >     at
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
> > >     at
> > >
> >
> org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
> > >     at
> > > org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
> > >
> > >     at
> > >
> >
> org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
> > >     at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
> > >     at
> > >
> >
> org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
> > >
> > >     at
> > org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
> > >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > >     at
> > > org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
> > >
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >     at
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > >     at
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> > > Caused by: java.io.EOFException
> > >     at java.io.DataInputStream.readFully(DataInputStream.java:180)
> > >
> > >     at java.io.DataInputStream.readFully(DataInputStream.java:152)
> > >     at
> > > org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
> > >     at
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
> > >
> > >     at
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
> > >     at
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
> > >     at
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
> > >
> > >     at
> > >
> >
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
> > >     ... 13 more
> > > cloudera@cloudera-demo:~/Downloads/trunk$
> > >
> > > Thanks,
> > >
> > >
> > > On Thu, Mar 31, 2011 at 5:15 AM, deneche abdelhakim <
> a_deneche@yahoo.fr>
> > > wrote:
> > >
> > > Hi Prannet,
> > >
> > >
> > >
> > > I fixed various bugs since 0.4, could you try using the trunk, and see
> if
> > > it's happening again ?
> > >
> > >
> > >
> > > --- En date de : Mar 29.3.11, praneet mhatre <pr...@gmail.com>
> a
> > > écrit :
> > >
> > >
> > >
> > > De: praneet mhatre <pr...@gmail.com>
> > >
> > > Objet: Partial Implementation of Random Forest
> > >
> > > À: user@mahout.apache.org
> > >
> > > Date: Mardi 29 mars 2011, 18h30
> > >
> > >
> > >
> > > I think my previous mail did not get through.
> > >
> > >
> > >
> > > ---------- Forwarded message ----------
> > >
> > > From: praneet mhatre <pr...@gmail.com>
> > >
> > > Date: Mon, Mar 28, 2011 at 10:50 PM
> > >
> > > Subject: Partial Implementation of Random Forest
> > >
> > > To: user@mahout.apache.org
> > >
> > >
> > >
> > >
> > >
> > > Hello all,
> > >
> > >
> > >
> > > I very recently started working on Mahout. To get the feel of things, I
> > was
> > >
> > > trying to run  the sample implementation of Random Forest posted on the
> > > Wiki
> > >
> > > (
> > >
> >
> https://cwiki.apache.org/confluence/display/MAHOUT/Partial+Implementation
> > > ).
> > >
> > > However, even when I issue the exact same commands, I get an
> > >
> > > EOFException
> > >
> > > error as follows:
> > >
> > > Exception in thread "main" java.io.EOFException
> > >
> > >     at java.io.DataInputStream.readFully(DataInputStream.java:180)
> > >
> > >     at java.io.DataInputStream.readFully(DataInputStream.java:152)
> > >
> > >     at
> > > org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
> > >
> > >     at
> > >
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
> > >
> > >     at
> > >
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
> > >
> > >     at
> > >
> > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
> > >
> > >     at
> > >
> > >
> > >
> >
> org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:145)
> > >
> > >     at
> > >
> > > org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:119)
> > >
> > >     at
> > >
> > >
> > >
> >
> org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
> > >
> > >     at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:338)
> > >
> > >     at
> > >
> > >
> > >
> >
> org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
> > >
> > >     at
> > org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
> > >
> > >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > >
> > >     at
> > > org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:236)
> > >
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >
> > >     at
> > >
> > >
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > >
> > >     at
> > >
> > >
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >
> > >     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> > >
> > > cloudera@cloudera-demo:~/Downloads/mahout-distribution-0.4$
> > >
> > >
> > >
> > > Can you please tell me what the problem is?
> > >
> > >
> > >
> > > Thank you,
> > >
> > >
> > >
> > > --
> > >
> > > Praneet Mhatre
> > >
> > > Graduate Student
> > >
> > > Donald Bren School of ICS
> > >
> > > University of California, Irvine
> > >
> > >
> > >
> > >
> > > --
> > > Praneet Mhatre
> > > Graduate Student
> > > Donald Bren School of ICS
> > > University of California, Irvine
> > >
> > >
> > >
> > >
> >
> >
> > --
> > Praneet Mhatre
> > Graduate Student
> > Donald Bren School of ICS
> > University of California, Irvine
> >
>

Re: Re : Partial Implementation of Random Forest

Posted by Sean Owen <sr...@gmail.com>.
I committed my change just now -- try it out.

On Mon, Apr 18, 2011 at 10:46 PM, praneet mhatre <pr...@gmail.com> wrote:

> That would be really helpful. It'll save me the trouble of reverting to an
> older version of Hadoop! Please update us here after it is done.
>
> Thank you,
>
>

Re: Re : Partial Implementation of Random Forest

Posted by praneet mhatre <pr...@gmail.com>.
That would be really helpful. It'll save me the trouble of reverting to an
older version of Hadoop! Please update us here after it is done.

Thank you,

On Mon, Apr 18, 2011 at 2:36 PM, Sean Owen <sr...@gmail.com> wrote:

> I can easily add a bit to this method call that will cause it to skip files
> and directories like .crc, _logs, etc. Seems like the right thing to do here
> as it's evidently causing a problem otherwise.
>
> On Mon, Apr 18, 2011 at 7:14 PM, deneche abdelhakim <a_deneche@yahoo.fr> wrote:
>
> > OK, I was finally able to reproduce this bug. It appears when using
> > Cloudera's distribution of Hadoop. Apparently this distribution contains
> > some patches from Hadoop 0.21 that create a _SUCCESS file in the output
> > path; the current code doesn't expect such a file, so it fails to parse it.
> > I tried the standard Hadoop 0.20 distribution and it works just fine.
> > So for now I think it's safe to just use the standard distribution.
> >
> > --- On Mon, Apr 11, 2011, praneet mhatre <pr...@gmail.com> wrote:
> >
> > From: praneet mhatre <pr...@gmail.com>
> > Subject: Re: Re : Partial Implementation of Random Forest
> > To: user@mahout.apache.org
> > Cc: "deneche abdelhakim" <a_...@yahoo.fr>
> > Date: Monday, April 11, 2011, 23:18
> >
> > Me too. Used the latest code. Still the exact same error as before.
> >
> > Thanks,
> >
> >
> >
> > --
> > Praneet Mhatre
> > Graduate Student
> > Donald Bren School of ICS
> > University of California, Irvine
> >
>



-- 
Praneet Mhatre
Graduate Student
Donald Bren School of ICS
University of California, Irvine

Re: Re : Partial Implementation of Random Forest

Posted by deneche abdelhakim <a_...@yahoo.fr>.
OK, I was finally able to reproduce this bug. It appears when using Cloudera's distribution of Hadoop. Apparently this distribution contains some patches from Hadoop 0.21 that create a _SUCCESS file in the output path; the current code doesn't expect such a file, so it fails when it tries to parse it as a SequenceFile.
I tried the standard Hadoop 0.20 distribution and it works just fine. So for now I think it's safe to just use the standard distribution.
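
To illustrate the kind of check that would avoid this, here is a minimal sketch in plain Java. It is not the actual Mahout code or fix: the class name OutputFileFilter and its methods are hypothetical, and it works on file names as strings instead of Hadoop Path objects. It assumes that bookkeeping entries in a job output directory start with "_" (for example _SUCCESS or _logs) or "." (for example .crc checksum files) and should be skipped before the remaining files are opened as SequenceFiles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Mahout or Hadoop.
public class OutputFileFilter {

    // Returns true when the name looks like real job output rather than
    // bookkeeping such as _SUCCESS, _logs or .part-r-00000.crc.
    public static boolean isDataFile(String name) {
        return !name.isEmpty() && !name.startsWith("_") && !name.startsWith(".");
    }

    // Keeps only the names that should be opened as SequenceFiles.
    public static List<String> dataFiles(List<String> names) {
        List<String> kept = new ArrayList<String>();
        for (String name : names) {
            if (isDataFile(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Example listing of an output directory (made-up file names).
        List<String> listing = Arrays.asList(
            "_SUCCESS", "_logs", ".part-r-00000.crc", "part-r-00000", "part-r-00001");
        System.out.println(dataFiles(listing));
    }
}
```

Running main prints [part-r-00000, part-r-00001]. With the real Hadoop API, the same test would probably be wrapped in an org.apache.hadoop.fs.PathFilter and passed to FileSystem.listStatus, so that only the part files reach the SequenceFile reader.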

--- On Mon, Apr 11, 2011, praneet mhatre <pr...@gmail.com> wrote:

From: praneet mhatre <pr...@gmail.com>
Subject: Re: Re : Partial Implementation of Random Forest
To: user@mahout.apache.org
Cc: "deneche abdelhakim" <a_...@yahoo.fr>
Date: Monday, April 11, 2011, 23:18

Me too. Used the latest code. Still the exact same error as before.

Thanks,

On Mon, Apr 11, 2011 at 2:06 PM, deneche abdelhakim <a_...@yahoo.fr> wrote:

> hmm, I will give it a look and see what's causing this
>
> --- On Mon, Apr 11, 2011, ext-ranjit.chellappannair@nokia.com <ext-ranjit.chellappannair@nokia.com> wrote:
>
> From: ext-ranjit.chellappannair@nokia.com <ext-ranjit.chellappannair@nokia.com>
> Subject: RE: Re : Partial Implementation of Random Forest
> To: user@mahout.apache.org
> Date: Monday, April 11, 2011, 15:58
>
> Hi Deneche,
>
> I used the latest Mahout code from trunk, and while running BuildForest on
> the KDD dataset I get an EOFException. The exception is below:
>
> Exception in thread "main" java.lang.IllegalStateException: java.io.EOFException
>         at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
>         at org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
>         at org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
>         at org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
>         at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
>         at org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
>         at org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readFully(DataInputStream.java:180)
>         at java.io.DataInputStream.readFully(DataInputStream.java:152)
>         at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>         at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
>         at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
>         ... 13 more
>
> Any help in resolving the above error would be greatly appreciated.
>
> Thanks and Regards,
> Ranjit.C
>
> -----Original Message-----
> From: ext deneche abdelhakim [mailto:a_deneche@yahoo.fr]
> Sent: Wednesday, April 06, 2011 9:44 AM
> To: user@mahout.apache.org
> Subject: Re: Re : Partial Implementation of Random Forest
>
> There was a new bug in the code and I fixed it. Please try again after
> updating the code. I am also using Cloudera's Hadoop and it's running just
> fine.
>
> --- On Thu, Mar 31, 2011, praneet mhatre <pr...@gmail.com> wrote:
>
> From: praneet mhatre <pr...@gmail.com>
> Subject: Re: Re : Partial Implementation of Random Forest
> To: user@mahout.apache.org
> Cc: "deneche abdelhakim" <a_...@yahoo.fr>
> Date: Thursday, March 31, 2011, 23:34
>
> Hi Deneche,
>
> I used the trunk. I still encounter the same error. By the way, I am
> running Mahout on top of Cloudera's Linux image. I was just wondering if
> that has anything to do with the error.
>
> Exception in thread "main" java.lang.IllegalStateException: java.io.EOFException
>     at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
>     at org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
>     at org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
>     at org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
>     at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
>     at org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
>     at org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> Caused by: java.io.EOFException
>     at java.io.DataInputStream.readFully(DataInputStream.java:180)
>     at java.io.DataInputStream.readFully(DataInputStream.java:152)
>     at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>     at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
>     at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
>     ... 13 more
> cloudera@cloudera-demo:~/Downloads/trunk$
>
> Thanks,
>
>
> On Thu, Mar 31, 2011 at 5:15 AM, deneche abdelhakim <a_...@yahoo.fr> wrote:
>
> Hi Praneet,
>
> I have fixed various bugs since 0.4. Could you try using trunk and see
> whether it still happens?
>
> --- On Tue, Mar 29, 2011, praneet mhatre <pr...@gmail.com> wrote:
>
> From: praneet mhatre <pr...@gmail.com>
> Subject: Partial Implementation of Random Forest
> To: user@mahout.apache.org
> Date: Tuesday, March 29, 2011, 18:30
>
> I think my previous mail did not get through.
>
> ---------- Forwarded message ----------
> From: praneet mhatre <pr...@gmail.com>
> Date: Mon, Mar 28, 2011 at 10:50 PM
> Subject: Partial Implementation of Random Forest
> To: user@mahout.apache.org
>
> Hello all,
>
> I very recently started working with Mahout. To get a feel for things, I was
> trying to run the sample implementation of Random Forest posted on the wiki
> (https://cwiki.apache.org/confluence/display/MAHOUT/Partial+Implementation).
> However, even when I issue the exact same commands, I get an EOFException
> as follows:
>
> Exception in thread "main" java.io.EOFException
>     at java.io.DataInputStream.readFully(DataInputStream.java:180)
>     at java.io.DataInputStream.readFully(DataInputStream.java:152)
>     at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>     at org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:145)
>     at org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:119)
>     at org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
>     at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:338)
>     at org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
>     at org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:236)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
>
> cloudera@cloudera-demo:~/Downloads/mahout-distribution-0.4$
>
> Can you please tell me what the problem is?
>
> Thank you,
>
> --
> Praneet Mhatre
> Graduate Student
> Donald Bren School of ICS
> University of California, Irvine
>
> --
> Praneet Mhatre
> Graduate Student
> Donald Bren School of ICS
> University of California, Irvine
>


-- 
Praneet Mhatre
Graduate Student
Donald Bren School of ICS
University of California, Irvine

Re: Re : Partial Implementation of Random Forest

Posted by praneet mhatre <pr...@gmail.com>.
Me too. Used the latest code. Still the exact same error as before.

Thanks,

On Mon, Apr 11, 2011 at 2:06 PM, deneche abdelhakim <a_...@yahoo.fr> wrote:

> hmm, I will give it a look and see what's causing this
>
> --- On Mon, Apr 11, 2011, ext-ranjit.chellappannair@nokia.com <ext-ranjit.chellappannair@nokia.com> wrote:
>
> From: ext-ranjit.chellappannair@nokia.com <ext-ranjit.chellappannair@nokia.com>
> Subject: RE: Re : Partial Implementation of Random Forest
> To: user@mahout.apache.org
> Date: Monday, April 11, 2011, 15:58
>
> Hi Deneche,
>
> I used the latest Mahout code from trunk, and while running BuildForest on
> the KDD dataset I get an EOFException. The exception is below:
>
> Exception in thread "main" java.lang.IllegalStateException: java.io.EOFException
>         at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
>         at org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
>         at org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
>         at org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
>         at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
>         at org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
>         at org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readFully(DataInputStream.java:180)
>         at java.io.DataInputStream.readFully(DataInputStream.java:152)
>         at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>         at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
>         at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
>         ... 13 more
>
> Any help in resolving the above error will be greately appreciated.
>
> Thanks and Regards,
> Ranjit.C
>
> -----Original Message-----
> From: ext deneche abdelhakim [mailto:a_deneche@yahoo.fr]
> Sent: Wednesday, April 06, 2011 9:44 AM
> To: user@mahout.apache.org
> Subject: Re: Re : Partial Implementation of Random Forest
>
> There was a new bug in the code and I fixed it. Please try again after
> updating the code. I am also using Cloudera's Hadoop and it's running just
> fine
>
> --- En date de : Jeu 31.3.11, praneet mhatre <pr...@gmail.com> a
> écrit :
>
> De: praneet mhatre <pr...@gmail.com>
> Objet: Re: Re : Partial Implementation of Random Forest
> À: user@mahout.apache.org
> Cc: "deneche abdelhakim" <a_...@yahoo.fr>
> Date: Jeudi 31 mars 2011, 23h34
>
> Hi Deneche,
>
> I used the trunk. I still encounter the same error. By the way, I am
> running mahout on top of Cloudera's Linux image. I was just wondering if
> that has anything to do with the error.
>
> Exception in thread "main" java.lang.IllegalStateException:
> java.io.EOFException
>
>     at
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
>     at
> org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
>     at
> org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
>
>     at
> org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
>     at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
>     at
> org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
>
>     at org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at
> org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
>
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> Caused by: java.io.EOFException
>     at java.io.DataInputStream.readFully(DataInputStream.java:180)
>
>     at java.io.DataInputStream.readFully(DataInputStream.java:152)
>     at
> org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
>     at
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
>
>     at
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>     at
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>     at
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
>
>     at
> org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
>     ... 13 more
> cloudera@cloudera-demo:~/Downloads/trunk$
>
> Thanks,
>
>
> On Thu, Mar 31, 2011 at 5:15 AM, deneche abdelhakim <a_...@yahoo.fr>
> wrote:
>
> Hi Prannet,
>
>
>
> I fixed various bugs since 0.4, could you try using the trunk, and see if
> it's happening again ?
>
>
>
> --- En date de : Mar 29.3.11, praneet mhatre <pr...@gmail.com> a
> écrit :
>
>
>
> De: praneet mhatre <pr...@gmail.com>
>
> Objet: Partial Implementation of Random Forest
>
> À: user@mahout.apache.org
>
> Date: Mardi 29 mars 2011, 18h30
>
>
>
> I think my previous mail did not get through.
>
>
>
> ---------- Forwarded message ----------
>
> From: praneet mhatre <pr...@gmail.com>
>
> Date: Mon, Mar 28, 2011 at 10:50 PM
>
> Subject: Partial Implementation of Random Forest
>
> To: user@mahout.apache.org
>
>
>
>
>
> Hello all,
>
>
>
> I very recently started working on Mahout. To get the feel of things, I was
>
> trying to run  the sample implementation of Random Forest posted on the
> Wiki
>
> (
> https://cwiki.apache.org/confluence/display/MAHOUT/Partial+Implementation
> ).
>
> However, even when I issue the exact same commands, I get an
>
> EOFException
>
> error as follows:
>
> Exception in thread "main" java.io.EOFException
>
>     at java.io.DataInputStream.readFully(DataInputStream.java:180)
>
>     at java.io.DataInputStream.readFully(DataInputStream.java:152)
>
>     at
> org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
>
>     at
>
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
>
>     at
>
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>
>     at
>
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>
>     at
>
>
> org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:145)
>
>     at
>
> org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:119)
>
>     at
>
>
> org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
>
>     at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:338)
>
>     at
>
>
> org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
>
>     at org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
>
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>
>     at
> org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:236)
>
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>     at
>
>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>
>     at
>
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>
>     at java.lang.reflect.Method.invoke(Method.java:597)
>
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
>
> cloudera@cloudera-demo:~/Downloads/mahout-distribution-0.4$
>
>
>
> Can you please tell me what the problem is?
>
>
>
> Thank you,
>
>
>
> --
>
> Praneet Mhatre
>
> Graduate Student
>
> Donald Bren School of ICS
>
> University of California, Irvine
>
>
>
>
> --
> Praneet Mhatre
> Graduate Student
> Donald Bren School of ICS
> University of California, Irvine
>
>
>
>


-- 
Praneet Mhatre
Graduate Student
Donald Bren School of ICS
University of California, Irvine

RE: Re : Partial Implementation of Random Forest

Posted by deneche abdelhakim <a_...@yahoo.fr>.
hmm, I will give it a look and see what's causing this

--- On Mon 11.4.11, ext-ranjit.chellappannair@nokia.com <ex...@nokia.com> wrote:

From: ext-ranjit.chellappannair@nokia.com <ex...@nokia.com>
Subject: RE: Re : Partial Implementation of Random Forest
To: user@mahout.apache.org
Date: Monday, April 11, 2011, 15:58


RE: Re : Partial Implementation of Random Forest

Posted by ex...@nokia.com.
Hi Deneche,

I used the latest Mahout code from the trunk, and while running BuildForest on the KDD dataset I get an EOFException. The full exception is below:

Exception in thread "main" java.lang.IllegalStateException: java.io.EOFException
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
        at org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
        at org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
        at org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
        at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
        at org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
        at org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:180)
        at java.io.DataInputStream.readFully(DataInputStream.java:152)
        at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
        ... 13 more

Any help in resolving the above error will be greatly appreciated.

Thanks and Regards,
Ranjit.C
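
The "Caused by" frames in the trace above show DataInputStream.readFully failing inside SequenceFile$Reader.init, that is, while the start of a file is being read. That would be consistent with a file in the job output directory being empty or too short to hold a SequenceFile header, though nothing in this thread confirms that as the cause. As a minimal sketch for checking that hypothesis, the class below (SeqHeaderCheck and looksLikeSequenceFile are hypothetical names, not part of Mahout or Hadoop) tests whether a local file starts with the "SEQ" magic bytes that a SequenceFile begins with. It uses only the Java standard library and reads local files, so a file stored in HDFS would first have to be copied to local disk.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of Mahout or Hadoop.
public class SeqHeaderCheck {

    // A SequenceFile begins with the bytes 'S', 'E', 'Q' followed by one version byte.
    private static final int HEADER_LENGTH = 4;

    /** Returns true if the file holds at least 4 bytes and starts with "SEQ". */
    public static boolean looksLikeSequenceFile(Path file) throws IOException {
        byte[] header = new byte[HEADER_LENGTH];
        try (InputStream raw = Files.newInputStream(file);
             DataInputStream in = new DataInputStream(raw)) {
            // readFully throws EOFException when fewer than 4 bytes are available,
            // which is the same kind of failure as in the "Caused by" frames.
            in.readFully(header);
        } catch (EOFException e) {
            return false;
        }
        return header[0] == 'S' && header[1] == 'E' && header[2] == 'Q';
    }

    public static void main(String[] args) throws IOException {
        // Each argument is the path of a local file to inspect.
        for (String arg : args) {
            Path p = Paths.get(arg);
            String verdict = looksLikeSequenceFile(p) ? "SEQ header found" : "no SEQ header";
            System.out.println(p + " -> " + verdict);
        }
    }
}
```

Running it over local copies of every file in the output directory would show whether any of them lacks the header. If one does, that file would be a candidate for what the reader is failing on.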

-----Original Message-----
From: ext deneche abdelhakim [mailto:a_deneche@yahoo.fr] 
Sent: Wednesday, April 06, 2011 9:44 AM
To: user@mahout.apache.org
Subject: Re: Re : Partial Implementation of Random Forest

There was a new bug in the code and I fixed it. Please try again after updating the code. I am also using Cloudera's Hadoop and it's running just fine.

--- On Thu 31.3.11, praneet mhatre <pr...@gmail.com> wrote:

From: praneet mhatre <pr...@gmail.com>
Subject: Re: Re : Partial Implementation of Random Forest
To: user@mahout.apache.org
Cc: "deneche abdelhakim" <a_...@yahoo.fr>
Date: Thursday, March 31, 2011, 23:34

Hi Deneche,

I used the trunk. I still encounter the same error. By the way, I am running mahout on top of Cloudera's Linux image. I was just wondering if that has anything to do with the error.

Exception in thread "main" java.lang.IllegalStateException: java.io.EOFException
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
    at org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:142)
    at org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:120)
    at org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
    at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:324)
    at org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
    at org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:239)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:180)
    at java.io.DataInputStream.readFully(DataInputStream.java:152)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:59)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
    ... 13 more
cloudera@cloudera-demo:~/Downloads/trunk$

Thanks,


On Thu, Mar 31, 2011 at 5:15 AM, deneche abdelhakim <a_...@yahoo.fr> wrote:

Hi Praneet,

I fixed various bugs since 0.4. Could you try using the trunk and see if it happens again?

--- On Tue 29.3.11, praneet mhatre <pr...@gmail.com> wrote:

From: praneet mhatre <pr...@gmail.com>
Subject: Partial Implementation of Random Forest
To: user@mahout.apache.org
Date: Tuesday, March 29, 2011, 18:30



I think my previous mail did not get through.

---------- Forwarded message ----------
From: praneet mhatre <pr...@gmail.com>
Date: Mon, Mar 28, 2011 at 10:50 PM
Subject: Partial Implementation of Random Forest
To: user@mahout.apache.org

Hello all,

I very recently started working with Mahout. To get a feel for things, I was trying to run the sample implementation of Random Forest posted on the Wiki (https://cwiki.apache.org/confluence/display/MAHOUT/Partial+Implementation). However, even when I issue the exact same commands, I get an EOFException, as follows:

Exception in thread "main" java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:180)
    at java.io.DataInputStream.readFully(DataInputStream.java:152)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
    at org.apache.mahout.df.mapreduce.partial.Step0Job.parseOutput(Step0Job.java:145)
    at org.apache.mahout.df.mapreduce.partial.Step0Job.run(Step0Job.java:119)
    at org.apache.mahout.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:115)
    at org.apache.mahout.df.mapreduce.Builder.build(Builder.java:338)
    at org.apache.mahout.df.mapreduce.BuildForest.buildForest(BuildForest.java:195)
    at org.apache.mahout.df.mapreduce.BuildForest.run(BuildForest.java:159)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.df.mapreduce.BuildForest.main(BuildForest.java:236)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
cloudera@cloudera-demo:~/Downloads/mahout-distribution-0.4$



Can you please tell me what the problem is?

Thank you,

--
Praneet Mhatre
Graduate Student
Donald Bren School of ICS
University of California, Irvine