Posted to user@mahout.apache.org by WangRamon <ra...@hotmail.com> on 2011/10/19 10:16:02 UTC

Exception during running RowSimilarityJob-Mapper-EntriesToVectorsReducer job




Hi guys,

I'm continuing to run the test case with a 1GB data file that contains 600,000 users and 2,000,000 items. All the jobs run on a two-node cluster; each node has 32GB RAM and an 8-core CPU. The RecommenderJob runs fine until it reaches the RowSimilarityJob-Mapper-EntriesToVectorsReducer job. See below for the error log:

11/10/18 23:09:34 INFO mapred.JobClient:  map 11% reduce 1%
11/10/18 23:12:46 INFO mapred.JobClient:  map 11% reduce 2%
11/10/18 23:13:55 INFO mapred.JobClient:  map 12% reduce 2%
11/10/18 23:18:22 INFO mapred.JobClient:  map 13% reduce 2%
11/10/18 23:22:50 INFO mapred.JobClient:  map 14% reduce 2%
11/10/18 23:27:08 INFO mapred.JobClient:  map 15% reduce 2%
11/10/18 23:28:15 INFO mapred.JobClient:  map 15% reduce 3%
11/10/18 23:31:42 INFO mapred.JobClient:  map 16% reduce 3%
11/10/18 23:33:36 INFO mapred.JobClient: Task Id : attempt_201110181002_0007_r_000000_0, Status : FAILED
java.io.IOException: Task: attempt_201110181002_0007_r_000000_0 - The reduce copier failed
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:380)
 at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.io.IOException: Intermediate merge failed
 at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2576)
 at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2501)
Caused by: java.lang.RuntimeException: java.io.EOFException
 at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:103)
 at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:373)
 at org.apache.hadoop.util.PriorityQueue.upHeap(PriorityQueue.java:123)
 at org.apache.hadoop.util.PriorityQueue.put(PriorityQueue.java:50)
 at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:447)
 at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
 at org.apache.hadoop.mapred.Merger.merge(Merger.java:107)
 at org.apache.hadoop.mapred.Merger.merge(Merger.java:93)
 at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2551)
 ... 1 more
Caused by: java.io.EOFException
 at java.io.DataInputStream.readByte(DataInputStream.java:250)
 at org.apache.mahout.math.Varint.readUnsignedVarInt(Varint.java:159)
 at org.apache.mahout.math.Varint.readSignedVarInt(Varint.java:140)
 at org.apache.mahout.math.hadoop.similarity.SimilarityMatrixEntryKey.readFields(SimilarityMatrixEntryKey.java:64)
 at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:97)
 ... 9 more
11/10/18 23:33:37 INFO mapred.JobClient:  map 16% reduce 0%
11/10/18 23:35:57 INFO mapred.JobClient:  map 17% reduce 0%

I googled a lot and found that I should increase the "mapred.reduce.tasks" property in Hadoop, so I set it to 8 in my environment and restarted this job only. So far so good; the job is still running now, although still a little slowly. So here come my questions:

1) Is it normal for the RowSimilarityJob-Mapper-EntriesToVectorsReducer job to be this slow?
2) What does the "mapred.reduce.tasks" property do, and why does it affect this job? (Maybe I should ask this second question on the Hadoop user list, but I think people here are pros at Hadoop too :))
3) What can I do to speed this job up? Any ideas?

Thanks in advance!
Ramon
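
For context: "mapred.reduce.tasks" controls how many reduce tasks a job's intermediate data is partitioned across, and the stock Hadoop default is 1, which funnels every map output through a single reducer. Below is a minimal sketch of the usual ways to raise it on Hadoop 0.20.x, assuming 8 reducers as in the message above; it is illustrative code, not Mahout's own.

import org.apache.hadoop.mapred.JobConf;

// Sketch: two equivalent ways to request 8 reducers for a job. On the
// command line the same effect usually comes from passing
// -Dmapred.reduce.tasks=8 ahead of the job's own arguments.
public class ReduceTaskTuning {
  public static void main(String[] args) {
    JobConf conf = new JobConf();
    conf.setNumReduceTasks(8);              // typed setter
    conf.setInt("mapred.reduce.tasks", 8);  // same property, set by name
    System.out.println("reduce tasks = " + conf.getNumReduceTasks());
  }
}

With more reducers, each one merges a smaller share of the map output, which both spreads the work across the nodes and shrinks the per-reducer merge that failed above.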

RE: Exception during running RowSimilarityJob-Mapper-EntriesToVectorsReducer job

Posted by WangRamon <ra...@hotmail.com>.
Thanks Ted. I agree that 2 nodes is an unusual cluster setup, but it's only for testing and to serve as a benchmark for future planning. Yes, I'm going to upgrade to Mahout 0.6, and with luck I have resolved the problem by increasing the map/reduce task settings; 2 tasks each (the default) was too small for my previous test, since each machine has a dual quad-core CPU anyway.
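
The per-node concurrency mentioned here is capped by the tasktracker slot properties, which default to 2 map and 2 reduce slots in Hadoop 0.20.x. A small sketch that prints the effective values (stock Hadoop property names; the Configuration simply reads whatever *-site.xml files are on the classpath):

import org.apache.hadoop.conf.Configuration;

// Sketch: read the tasktracker slot settings that limit how many map and
// reduce tasks run concurrently on each node (both default to 2).
public class SlotCheck {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    int mapSlots = conf.getInt("mapred.tasktracker.map.tasks.maximum", 2);
    int reduceSlots = conf.getInt("mapred.tasktracker.reduce.tasks.maximum", 2);
    System.out.println("map slots per node    = " + mapSlots);
    System.out.println("reduce slots per node = " + reduceSlots);
  }
}

On an 8-core box, raising both well above 2 (and restarting the tasktrackers) is the usual first step before tuning anything job-side.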
 > Subject: Re: Exception during running RowSimilarityJob-Mapper-EntriesToVectorsReducer job
> From: ted.dunning@gmail.com
> Date: Wed, 19 Oct 2011 07:58:19 -0600
> To: user@mahout.apache.org
> 
> The latest version has many bug fixes. You may be seeing such a bug. 
> 
> Sent from my iPhone
> 
> On Oct 19, 2011, at 2:27, WangRamon <ra...@hotmail.com> wrote:
> 
> > [...]

Re: Exception during running RowSimilarityJob-Mapper-EntriesToVectorsReducer job

Posted by Ted Dunning <te...@gmail.com>.
The latest version has many bug fixes. You may be seeing such a bug. 

Sent from my iPhone

On Oct 19, 2011, at 2:27, WangRamon <ra...@hotmail.com> wrote:

> 
> Yes, I'm still using version 0.5. The plan is to verify that it works on 0.5 and get some benchmarks first, then move forward to 0.6. Sebastian, do you think this is a problem related to Mahout (not Hadoop)? And do you think 0.6 will bring us a huge performance increase? Thanks. Cheers, Ramon
>> [...]

RE: Exception during running RowSimilarityJob-Mapper-EntriesToVectorsReducer job

Posted by WangRamon <ra...@hotmail.com>.
Thank you, I'll do it right now ;)
 > Date: Wed, 19 Oct 2011 10:29:31 +0200
> From: ssc@apache.org
> To: user@mahout.apache.org
> Subject: Re: Exception during running RowSimilarityJob-Mapper-EntriesToVectorsReducer job
> 
> As I'm the author of RowSimilarityJob you should save yourself some time
> and believe me that the best thing is to move to 0.6 immediately.
> 
> --sebastian
> 
> On 19.10.2011 10:27, WangRamon wrote:
> > [...]

Re: Exception during running RowSimilarityJob-Mapper-EntriesToVectorsReducer job

Posted by Sebastian Schelter <ss...@apache.org>.
Yes.
On 19.10.2011 10:53, WangRamon wrote:
> 
> Hi Sebastian, can Mahout 0.6 work with Hadoop 0.20.2? Thanks, Ramon
>> [...]


RE: Exception during running RowSimilarityJob-Mapper-EntriesToVectorsReducer job

Posted by WangRamon <ra...@hotmail.com>.
Hi Sebastian, can Mahout 0.6 work with Hadoop 0.20.2? Thanks, Ramon

> Date: Wed, 19 Oct 2011 10:29:31 +0200
> From: ssc@apache.org
> To: user@mahout.apache.org
> Subject: Re: Exception during running RowSimilarityJob-Mapper-EntriesToVectorsReducer job
> 
> As I'm the author of RowSimilarityJob you should save yourself some time
> and believe me that the best thing is to move to 0.6 immediately.
> 
> --sebastian
> 
> On 19.10.2011 10:27, WangRamon wrote:
> > [...]

Re: Exception during running RowSimilarityJob-Mapper-EntriesToVectorsReducer job

Posted by Sebastian Schelter <ss...@apache.org>.
As I'm the author of RowSimilarityJob, you should save yourself some time
and believe me that the best thing is to move to 0.6 immediately.

--sebastian

On 19.10.2011 10:27, WangRamon wrote:
> 
> Yes, I'm still using version 0.5. The plan is to verify that it works on 0.5 and get some benchmarks first, then move forward to 0.6. Sebastian, do you think this is a problem related to Mahout (not Hadoop)? And do you think 0.6 will bring us a huge performance increase? Thanks. Cheers, Ramon
>> [...]


RE: Exception during running RowSimilarityJob-Mapper-EntriesToVectorsReducer job

Posted by WangRamon <ra...@hotmail.com>.
Yes, I'm still using version 0.5. The plan is to verify that it works on 0.5 and get some benchmarks first, then move forward to 0.6. Sebastian, do you think this is a problem related to Mahout (not Hadoop)? And do you think 0.6 will bring us a huge performance increase? Thanks.

Cheers,
Ramon
 > Date: Wed, 19 Oct 2011 10:20:24 +0200
> From: ssc@apache.org
> To: user@mahout.apache.org
> Subject: Re: Exception during running RowSimilarityJob-Mapper-EntriesToVectorsReducer job
> 
> It seems like you're still not using Mahout 0.6. Please use the latest
> version and apply appropriate down sampling to your input data. You
> should also try to get access to a cluster with more than 2 machines.
> 
> --sebastian
> 
> On 19.10.2011 10:16, WangRamon wrote:
> > [...]

Re: Exception during running RowSimilarityJob-Mapper-EntriesToVectorsReducer job

Posted by Sebastian Schelter <ss...@apache.org>.
It seems like you're still not using Mahout 0.6. Please use the latest
version and apply appropriate down sampling to your input data. You
should also try to get access to a cluster with more than 2 machines.

--sebastian

On 19.10.2011 10:16, WangRamon wrote:
> [...]
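
To make the down sampling suggestion above concrete, here is one minimal sketch of the idea: cap the number of preferences kept per user before the file goes into RecommenderJob. The userID,itemID,value layout and the cap of 500 are assumptions for illustration, not anything Mahout prescribes, and a production version would sample randomly rather than keep the first N per user:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Sketch: emit at most MAX_PREFS_PER_USER lines per user from a
// userID,itemID,value file to stdout. Assumes well-formed lines and that
// one counter per user fits comfortably in memory.
public class DownsamplePrefs {
  private static final int MAX_PREFS_PER_USER = 500; // assumed cap

  public static void main(String[] args) throws IOException {
    Map<String, Integer> kept = new HashMap<String, Integer>();
    BufferedReader in = new BufferedReader(new FileReader(args[0]));
    String line;
    while ((line = in.readLine()) != null) {
      String user = line.substring(0, line.indexOf(','));
      int count = kept.containsKey(user) ? kept.get(user) : 0;
      if (count < MAX_PREFS_PER_USER) {
        System.out.println(line);   // keep this preference
        kept.put(user, count + 1);
      }                             // else drop it, this user is capped
    }
    in.close();
  }
}

Heavy users with tens of thousands of interactions dominate the pairwise similarity cost, so even a generous cap cuts the work dramatically.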


Re: Exception during running RowSimilarityJob-Mapper-EntriesToVectorsReducer job

Posted by Ted Dunning <te...@gmail.com>.
First of all, 2 nodes is a very small Hadoop cluster. It is not uncommon to have odd problems with such a small cluster, and you are very unlikely to see any significant speedup.

Are you running out of disk space?

Sent from my iPhone

On Oct 19, 2011, at 2:16, WangRamon <ra...@hotmail.com> wrote:

> [...]
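
One quick way to answer the disk space question on each node is to check free space under the shuffle's spill directories. A sketch, assuming a placeholder path; the real locations are whatever mapred.local.dir (and hadoop.tmp.dir) point to in your configuration:

import java.io.File;

// Sketch: print free space for the directories the reduce-side merge
// spills into. Replace the path with your actual mapred.local.dir entries.
public class LocalDirSpace {
  public static void main(String[] args) {
    String[] dirs = { "/tmp/hadoop-mapred/local" }; // assumed path
    for (String d : dirs) {
      File f = new File(d);
      System.out.printf("%s: %.1f GB free of %.1f GB%n",
          d, f.getUsableSpace() / 1e9, f.getTotalSpace() / 1e9);
    }
  }
}

A reduce copier that dies mid-merge on truncated records is at least consistent with a spill directory filling up, so this is worth ruling out before blaming the code.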