Posted to common-user@hadoop.apache.org by Marcus Herou <ma...@tailsweep.com> on 2008/12/15 09:06:14 UTC

DataNode/TaskTracker memory constraints.

Hi.

All Hadoop components are started with -Xmx1000M by default. I am
planning to add some data/task nodes here and there in my architecture.
However, most of these machines have only 4G of physical RAM, so allocating
2G plus overhead (roughly 2.5G) to Hadoop is a little risky: the boxes could
very well become inaccessible if Hadoop has to compete with other processes
for RAM. I have seen this happen many times with Java processes going
haywire on machines where I run other services in parallel.
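For reference, this is the rough back-of-the-envelope budget I am assuming
for one 4G worker, based on the stock defaults as I understand them (please
correct me if I am off):

  DataNode heap                              1000 MB
  TaskTracker heap                           1000 MB
  JVM native overhead for the two daemons    ~250-500 MB
  Task child JVMs (2 map + 2 reduce slots
  at the default mapred.child.java.opts
  of -Xmx200m)                               up to ~800 MB
  ---------------------------------------------------------
  Total                                      roughly 3-3.3 GB

which leaves very little for the OS, the page cache and anything else
running on the box.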
Anyway, I would like to understand the reasoning behind allocating 1G per
process. I figure the DataNode could survive with somewhat less, as could
the TaskTracker if the jobs running in it do not consume much memory. Of
course each process would be happy with even more than 1G, but if I need to
cut down I would like to know which one to cut and what I lose by doing so.
I sketched below the kind of settings I have in mind.
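Something like this in conf/hadoop-env.sh is what I am considering (untested
values, just to illustrate; my understanding is that a per-daemon -Xmx in
the *_OPTS variables is appended after the global heap flag and therefore
wins, but I have not verified that):

  # Global default heap for all Hadoop daemons, in MB (the source of the 1000M).
  export HADOOP_HEAPSIZE=1000

  # Hypothetical per-daemon overrides to shrink the worker daemons on 4G boxes.
  export HADOOP_DATANODE_OPTS="-Xmx512m"
  export HADOOP_TASKTRACKER_OPTS="-Xmx512m"

On top of that, the task JVMs themselves are sized by mapred.child.java.opts
in hadoop-site.xml (default -Xmx200m), so the number of map/reduce slots
times that value also has to fit in the budget.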

Any thoughts? Trial and error is of course an option, but I would like to
hear the basic thinking on how memory should be divided to get the most out
of the boxes.

Kindly

//Marcus





-- 
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.herou@tailsweep.com
http://www.tailsweep.com/
http://blogg.tailsweep.com/