Posted to user@hadoop.apache.org by Li Li <fa...@gmail.com> on 2014/04/03 02:04:09 UTC

how to solve reducer memory problem?

I have a map reduce program that does some matrix operations. In the
reducer, it averages many large matrices (each matrix takes up 400+ MB,
according to the "Map output bytes" counter). So if 50 matrices go to one
reducer, the total memory usage is 20 GB, and the reduce task got this exception:

FATAL org.apache.hadoop.mapred.Child: Error running child :
java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:344)
at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:406)
at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:238)
at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:438)
at org.apache.hadoop.mapred.Merger.merge(Merger.java:142)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2539)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:661)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

One method I can come up with is to use a Combiner to save partial sums of
some matrices together with their count. But that alone can't fully solve
the problem, because the combiner is not fully controlled by me: the
framework decides if and how often it runs.
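
A minimal sketch of that combiner idea, reusing this thread's TrainingWeights
type and addWeights helper. The count field, the copyOf helper, and the
NullWritable key are hypothetical illustration, not code from the actual job:

    import java.io.IOException;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.mapreduce.Reducer;

    // Combiner: collapse many matrices into one partial sum plus a count,
    // so each reducer only has to merge a few pre-summed values.
    public static class SumCombiner
        extends Reducer<NullWritable, TrainingWeights, NullWritable, TrainingWeights> {
      @Override
      protected void reduce(NullWritable key, Iterable<TrainingWeights> values,
                            Context context) throws IOException, InterruptedException {
        TrainingWeights partial = null;
        for (TrainingWeights w : values) {
          if (partial == null) {
            // must deep-copy: Hadoop reuses the value object across iterations
            partial = copyOf(w);
          } else {
            addWeights(partial, w);                       // element-wise sum
            partial.setCount(partial.getCount() + w.getCount());
          }
        }
        if (partial != null) {
          context.write(NullWritable.get(), partial);     // partial sum + its count
        }
      }
    }

The reducer would then sum the partials the same way and divide once by the
total count at the end.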

RE: how to solve reducer memory problem?

Posted by java8964 <ja...@hotmail.com>.
There are several issues that could come together here; since you know your data, we can only guess:
1) The mapred.child.java.opts=-Xmx2g setting only works IF you didn't set "mapred.map.child.java.opts" or "mapred.reduce.child.java.opts"; otherwise, the latter overrides "mapred.child.java.opts". So double-check the settings and make sure the reducers really did get the 2G heap you want.
2) In your implementation, you could OOM as you store more and more data in the "TrainingWeights result". So the question is: for each reducer group, or key, how much data can there be? If a key can carry big values, then all of those values get folded into the in-memory "result" instance, which requires a lot of memory. If so, either you have to provide that much memory, or you redesign your key to be more fine-grained, so each group requires less memory.
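For instance, a minimal sketch of making the reducer heap explicit (property names as above; the -Xmx values are placeholders):

    import org.apache.hadoop.conf.Configuration;

    // MRv1 job configuration: the reduce-specific opts, if set, win for reducers
    Configuration conf = new Configuration();
    conf.set("mapred.child.java.opts", "-Xmx2g");          // default for all child tasks
    conf.set("mapred.reduce.child.java.opts", "-Xmx4g");   // overrides the line above for reducers

And a hedged sketch of the key redesign: instead of shipping a whole 400 MB matrix under one key, the mapper could emit one pair per row, so each reduce group only ever holds a single row in memory (RowWritable is hypothetical):

    context.write(new IntWritable(rowIndex), new RowWritable(row));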
Yong

Date: Thu, 3 Apr 2014 17:53:57 +0800
Subject: Re: how to solve reducer memory problem?
From: fancyerii@gmail.com
To: user@hadoop.apache.org

you can think of each TrainingWeights as a very large double[] whose
length is about 10,000,000

    TrainingWeights result = null;
    int total = 0;
    for (TrainingWeights weights : values) {
        if (result == null) {
            result = weights;
        } else {
            addWeights(result, weights);
        }
        total++;
    }
    if (total > 1) {
        divideWeights(result, total);
    }
    context.write(NullWritable.get(), result);


On Thu, Apr 3, 2014 at 5:49 PM, Gordon Wang <gw...@gopivotal.com> wrote:

> What is the work in the reducer? Do you have any memory-intensive work in
> the reducer (e.g. caching a lot of data in memory)? I guess the OOM error
> comes from your code in the reducer.
>
> On Thu, Apr 3, 2014 at 5:10 PM, Li Li <fa...@gmail.com> wrote:
>
>> mapred.child.java.opts=-Xmx2g
>>
>> On Thu, Apr 3, 2014 at 5:10 PM, Li Li <fa...@gmail.com> wrote:
>>
>>> 2g
>>>
>>> On Thu, Apr 3, 2014 at 1:30 PM, Stanley Shi <ss...@gopivotal.com> wrote:
>>>
>>>> This doesn't seem to be related to the data size.
>>>> How much memory do you use for the reducer?
>>>>
>>>> Regards,
>>>> Stanley Shi
>>>>
>>>> On Thu, Apr 3, 2014 at 8:04 AM, Li Li <fa...@gmail.com> wrote:
>>>>
>>>>> I have a map reduce program that does some matrix operations. [...]
>
> --
> Regards
> Gordon Wang


Re: how to solve reducer memory problem?

Posted by Gordon Wang <gw...@gopivotal.com>.
I've no idea what your program is doing, but you'd better estimate the
memory consumption of your reducer and then pick a proper Xmx size for it.
It looks like your reducer needs a lot of memory to cache the TrainingWeights.
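
As a rough back-of-the-envelope sketch (using the double[] of length about
10,000,000 quoted below): 10,000,000 doubles × 8 bytes is roughly 80 MB of
heap per TrainingWeights, so a reducer that ever holds k of them alive at
once needs on the order of 80·k MB, plus whatever the framework's
shuffle/merge buffers take on top.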


On Thu, Apr 3, 2014 at 5:53 PM, Li Li <fa...@gmail.com> wrote:

>  you can think of each TrainingWeights as a very large double[] whose
> length is about 10,000,000
>
>     TrainingWeights result = null;
>     int total = 0;
>     for (TrainingWeights weights : values) {
>         if (result == null) {
>             result = weights;
>         } else {
>             addWeights(result, weights);
>         }
>         total++;
>     }
>     if (total > 1) {
>         divideWeights(result, total);
>     }
>     context.write(NullWritable.get(), result);
>
>
> On Thu, Apr 3, 2014 at 5:49 PM, Gordon Wang <gw...@gopivotal.com> wrote:
>
>> What is the work in the reducer?
>> Do you have any memory-intensive work in the reducer (e.g. caching a lot
>> of data in memory)? I guess the OOM error comes from your code in the
>> reducer.
>>
>> On Thu, Apr 3, 2014 at 5:10 PM, Li Li <fa...@gmail.com> wrote:
>>
>>> mapred.child.java.opts=-Xmx2g
>>>
>>> On Thu, Apr 3, 2014 at 5:10 PM, Li Li <fa...@gmail.com> wrote:
>>>
>>>> 2g
>>>>
>>>> On Thu, Apr 3, 2014 at 1:30 PM, Stanley Shi <ss...@gopivotal.com> wrote:
>>>>
>>>>> This doesn't seem to be related to the data size.
>>>>>
>>>>> How much memory do you use for the reducer?
>>>>>
>>>>> Regards,
>>>>> Stanley Shi
>>>>>
>>>>> On Thu, Apr 3, 2014 at 8:04 AM, Li Li <fa...@gmail.com> wrote:
>>>>>
>>>>>> I have a map reduce program that does some matrix operations. [...]
>>
>> --
>> Regards
>> Gordon Wang


-- 
Regards
Gordon Wang


Re: how to solve reducer memory problem?

Posted by Li Li <fa...@gmail.com>.
You can think of each TrainingWeights as a very large double[] whose
length is about 10,000,000.
TrainingWeights result = null;
int total = 0;
for (TrainingWeights weights : values) {
    if (result == null) {
        // Caution: this aliases the iterator's value object. Hadoop reuses
        // that single instance across iterations, so result should really
        // be a deep copy of the first value.
        result = weights;
    } else {
        addWeights(result, weights);
    }
    total++;
}
if (total > 1) {
    divideWeights(result, total);
}
context.write(NullWritable.get(), result);
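
For reference, here is a minimal reuse-safe sketch of the same loop. It is an
assumption-laden rewrite, not the actual class: it assumes a hypothetical
getWeights() accessor and array constructor on TrainingWeights. The point is
that Hadoop hands reduce() the same value instance on every iteration, so the
running sum must live in a freshly allocated array rather than alias the
first value:

import java.io.IOException;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Assumes TrainingWeights (the poster's Writable) is on the classpath.
public class AveragingReducer
        extends Reducer<Text, TrainingWeights, NullWritable, TrainingWeights> {
    @Override
    protected void reduce(Text key, Iterable<TrainingWeights> values, Context context)
            throws IOException, InterruptedException {
        double[] sum = null;
        int total = 0;
        for (TrainingWeights weights : values) {
            double[] w = weights.getWeights(); // hypothetical accessor
            if (sum == null) {
                sum = w.clone(); // deep copy: Hadoop reuses the value object
            } else {
                for (int i = 0; i < sum.length; i++) {
                    sum[i] += w[i];
                }
            }
            total++;
        }
        if (sum != null) {
            for (int i = 0; i < sum.length; i++) {
                sum[i] /= total;
            }
            context.write(NullWritable.get(), new TrainingWeights(sum)); // hypothetical ctor
        }
    }
}

Written this way, only two matrices' worth of data are live at once (the
running sum and the reused input), so the values can stream through a 2 GB
heap. The Text input key is a guess; substitute whatever the job really uses.
A combiner variant of the original post's idea would emit (sum, count) pairs
instead, which stays correct no matter how many times the framework chooses
to run the combiner.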


On Thu, Apr 3, 2014 at 5:49 PM, Gordon Wang <gw...@gopivotal.com> wrote:

> What is the work in reducer ?
> Do you have any memory intensive work in reducer(eg. cache a lot of data
> in memory) ? I guess the OOM error comes from your code in reducer.
>
>
> On Thu, Apr 3, 2014 at 5:10 PM, Li Li <fa...@gmail.com> wrote:
>
>> *mapred.child.java.opts=-Xmx2g*
>>
>>
>> On Thu, Apr 3, 2014 at 5:10 PM, Li Li <fa...@gmail.com> wrote:
>>
>>> 2g
>>>
>>>
>>> On Thu, Apr 3, 2014 at 1:30 PM, Stanley Shi <ss...@gopivotal.com> wrote:
>>>
>>>> This doesn't seem like related with the data size.
>>>>
>>>> How much memory do you use for the reducer?
>>>>
>>>> Regards,
>>>> *Stanley Shi,*
>>>>
>>>>
>>>>
>>>> On Thu, Apr 3, 2014 at 8:04 AM, Li Li <fa...@gmail.com> wrote:
>>>>
>>>>> I have a map reduce program that do some matrix operations. in the
>>>>> reducer, it will average many large matrix(each matrix takes up
>>>>> 400+MB(said by Map output bytes). so if there 50 matrix to a reducer,
>>>>> then the total memory usage is 20GB. so the reduce task got exception:
>>>>>
>>>>> FATAL org.apache.hadoop.mapred.Child: Error running child :
>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>> at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:344)
>>>>> at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:406)
>>>>> at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:238)
>>>>> at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:438)
>>>>> at org.apache.hadoop.mapred.Merger.merge(Merger.java:142)
>>>>> at
>>>>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2539)
>>>>> at
>>>>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:661)
>>>>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
>>>>> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>> at
>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>>>>> at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>>
>>>>> one method I can come up with is use Combiner to save sums of some
>>>>> matrixs and their count
>>>>> but it still can solve the problem because the combiner is not fully
>>>>> controled by me.
>>>>>
>>>>
>>>>
>>>
>>
>
>
> --
> Regards
> Gordon Wang
>

Re: how to solve reducer memory problem?

Posted by Gordon Wang <gw...@gopivotal.com>.
What is the work in the reducer?
Do you have any memory-intensive work in the reducer (e.g. caching a lot of
data in memory)? I guess the OOM error comes from your code in the reducer.


On Thu, Apr 3, 2014 at 5:10 PM, Li Li <fa...@gmail.com> wrote:

> *mapred.child.java.opts=-Xmx2g*
>
>
> On Thu, Apr 3, 2014 at 5:10 PM, Li Li <fa...@gmail.com> wrote:
>
>> 2g
>>
>>
>> On Thu, Apr 3, 2014 at 1:30 PM, Stanley Shi <ss...@gopivotal.com> wrote:
>>
>>> This doesn't seem like related with the data size.
>>>
>>> How much memory do you use for the reducer?
>>>
>>> Regards,
>>> *Stanley Shi,*
>>>
>>>
>>>
>>> On Thu, Apr 3, 2014 at 8:04 AM, Li Li <fa...@gmail.com> wrote:
>>>
>>>> I have a map reduce program that do some matrix operations. in the
>>>> reducer, it will average many large matrix(each matrix takes up
>>>> 400+MB(said by Map output bytes). so if there 50 matrix to a reducer,
>>>> then the total memory usage is 20GB. so the reduce task got exception:
>>>>
>>>> FATAL org.apache.hadoop.mapred.Child: Error running child :
>>>> java.lang.OutOfMemoryError: Java heap space
>>>> at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:344)
>>>> at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:406)
>>>> at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:238)
>>>> at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:438)
>>>> at org.apache.hadoop.mapred.Merger.merge(Merger.java:142)
>>>> at
>>>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2539)
>>>> at
>>>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:661)
>>>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
>>>> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>> at
>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>>>> at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>
>>>> one method I can come up with is use Combiner to save sums of some
>>>> matrixs and their count
>>>> but it still can solve the problem because the combiner is not fully
>>>> controled by me.
>>>>
>>>
>>>
>>
>


-- 
Regards
Gordon Wang

Re: how to solve reducer memory problem?

Posted by Li Li <fa...@gmail.com>.
*mapred.child.java.opts=-Xmx2g*
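
A note on that property: in Hadoop 1.x the task-type-specific keys
mapred.map.child.java.opts and mapred.reduce.child.java.opts, if set anywhere
(for example in mapred-site.xml), take precedence over the generic
mapred.child.java.opts for their task type. A minimal driver-side sketch; the
class name, job name, and the 4g value are placeholders, not recommendations:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SubmitWithHeap {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapred.child.java.opts", "-Xmx2g");        // generic default for all tasks
        conf.set("mapred.reduce.child.java.opts", "-Xmx4g"); // if set, wins for reduce tasks
        Job job = new Job(conf, "matrix-average");           // placeholder job name
        // ... set mapper/reducer/input/output as usual, then job.waitForCompletion(true)
    }
}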


On Thu, Apr 3, 2014 at 5:10 PM, Li Li <fa...@gmail.com> wrote:

> 2g
>
>
> On Thu, Apr 3, 2014 at 1:30 PM, Stanley Shi <ss...@gopivotal.com> wrote:
>
>> This doesn't seem like related with the data size.
>>
>> How much memory do you use for the reducer?
>>
>> Regards,
>> *Stanley Shi,*
>>
>>
>>
>> On Thu, Apr 3, 2014 at 8:04 AM, Li Li <fa...@gmail.com> wrote:
>>
>>> I have a map reduce program that do some matrix operations. in the
>>> reducer, it will average many large matrix(each matrix takes up
>>> 400+MB(said by Map output bytes). so if there 50 matrix to a reducer,
>>> then the total memory usage is 20GB. so the reduce task got exception:
>>>
>>> FATAL org.apache.hadoop.mapred.Child: Error running child :
>>> java.lang.OutOfMemoryError: Java heap space
>>> at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:344)
>>> at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:406)
>>> at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:238)
>>> at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:438)
>>> at org.apache.hadoop.mapred.Merger.merge(Merger.java:142)
>>> at
>>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2539)
>>> at
>>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:661)
>>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
>>> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>> at
>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>>> at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>
>>> one method I can come up with is use Combiner to save sums of some
>>> matrixs and their count
>>> but it still can solve the problem because the combiner is not fully
>>> controled by me.
>>>
>>
>>
>

Re: how to solve reducer memory problem?

Posted by Li Li <fa...@gmail.com>.
2g


On Thu, Apr 3, 2014 at 1:30 PM, Stanley Shi <ss...@gopivotal.com> wrote:

> This doesn't seem like related with the data size.
>
> How much memory do you use for the reducer?
>
> Regards,
> *Stanley Shi,*
>
>
>
> On Thu, Apr 3, 2014 at 8:04 AM, Li Li <fa...@gmail.com> wrote:
>
>> I have a map reduce program that do some matrix operations. in the
>> reducer, it will average many large matrix(each matrix takes up
>> 400+MB(said by Map output bytes). so if there 50 matrix to a reducer,
>> then the total memory usage is 20GB. so the reduce task got exception:
>>
>> FATAL org.apache.hadoop.mapred.Child: Error running child :
>> java.lang.OutOfMemoryError: Java heap space
>> at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:344)
>> at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:406)
>> at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:238)
>> at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:438)
>> at org.apache.hadoop.mapred.Merger.merge(Merger.java:142)
>> at
>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2539)
>> at
>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:661)
>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
>> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:415)
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>> at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>
>> one method I can come up with is use Combiner to save sums of some
>> matrixs and their count
>> but it still can solve the problem because the combiner is not fully
>> controled by me.
>>
>
>

Re: how to solve reducer memory problem?

Posted by Stanley Shi <ss...@gopivotal.com>.
This doesn't seem to be related to the data size.

How much memory do you use for the reducer?

Regards,
*Stanley Shi,*



On Thu, Apr 3, 2014 at 8:04 AM, Li Li <fa...@gmail.com> wrote:

> I have a map reduce program that do some matrix operations. in the
> reducer, it will average many large matrix(each matrix takes up
> 400+MB(said by Map output bytes). so if there 50 matrix to a reducer,
> then the total memory usage is 20GB. so the reduce task got exception:
>
> FATAL org.apache.hadoop.mapred.Child: Error running child :
> java.lang.OutOfMemoryError: Java heap space
> at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:344)
> at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:406)
> at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:238)
> at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:438)
> at org.apache.hadoop.mapred.Merger.merge(Merger.java:142)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2539)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:661)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
> one method I can come up with is use Combiner to save sums of some
> matrixs and their count
> but it still can solve the problem because the combiner is not fully
> controled by me.
>

Re: how to solve reducer memory problem?

Posted by Fengyun RAO <ra...@gmail.com>.
It doesn't need 20 GB of memory.

The reducer doesn't load all the data into memory at once; instead it would
use the disk, since it does a "merge sort".
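
That said, the OOM in the quoted trace happens inside that shuffle-side merge
(ReduceCopier.createKVIterator), so the in-memory shuffle buffers themselves
can exhaust a small heap before reduce() ever runs. A hedged sketch of the MR1
knobs that control how much of the merge is held in memory; the 0.40 value is
illustrative only, not a recommendation:

import org.apache.hadoop.conf.Configuration;

public class ShuffleTuning {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Fraction of reducer heap used to buffer fetched map outputs (default 0.70).
        conf.setFloat("mapred.job.shuffle.input.buffer.percent", 0.40f);
        // Fraction of map outputs allowed to stay in memory when reduce() starts (default 0.0).
        conf.setFloat("mapred.job.reduce.input.buffer.percent", 0.0f);
        System.out.println(conf.get("mapred.job.shuffle.input.buffer.percent"));
    }
}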


2014-04-03 8:04 GMT+08:00 Li Li <fa...@gmail.com>:

> I have a map reduce program that do some matrix operations. in the
> reducer, it will average many large matrix(each matrix takes up
> 400+MB(said by Map output bytes). so if there 50 matrix to a reducer,
> then the total memory usage is 20GB. so the reduce task got exception:
>
> FATAL org.apache.hadoop.mapred.Child: Error running child :
> java.lang.OutOfMemoryError: Java heap space
> at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:344)
> at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:406)
> at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:238)
> at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:438)
> at org.apache.hadoop.mapred.Merger.merge(Merger.java:142)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2539)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:661)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
> one method I can come up with is use Combiner to save sums of some
> matrixs and their count
> but it still can solve the problem because the combiner is not fully
> controled by me.
>
