Posted to user@hadoop.apache.org by Viswanathan J <ja...@gmail.com> on 2013/10/11 19:08:55 UTC

Hadoop Jobtracker heap size calculation and OOME

Hi,

I'm running a 14-node Hadoop cluster, with datanodes and tasktrackers
running on all nodes.

*Apache Hadoop :* 1.2.1

The cluster summary currently shows the heap size as follows:

*Cluster Summary (Heap Size is 5.7/8.89 GB)*

In the above summary, what does the *8.89* GB indicate? Does *8.89* define
the maximum heap size for the Jobtracker, and if so, how is it calculated?

I assume *5.7* is the heap currently used by running jobs; how is that
calculated?

I have set the Jobtracker default memory size in hadoop-env.sh:

*HADOOP_HEAPSIZE="1024"*

I have also set the mapred.child.java.opts value in mapred-site.xml as:

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048m</value>
</property>

Even after setting the above property, I am getting Jobtracker OOME issues.
Why does the Jobtracker memory gradually increase? After restarting the JT,
I get an OOME within a week.

How can this be resolved? It is in production and critical. Please help.
Thanks in advance.

-- 
Regards,
Viswa.J

Re: Hadoop Jobtracker heap size calculation and OOME

Posted by Viswanathan J <ja...@gmail.com>.
Hi Harsh,

Appreciate the response.

Thanks Reyane.

Thanks,
Viswa.J
On Oct 12, 2013 5:04 AM, "Reyane Oukpedjo" <ou...@gmail.com> wrote:

> Hi there,
> I had a similar issue with hadoop-1.2.0  JobTracker keep crashing until I
> set HADOOP_HEAPSIZE="2048"  I did not have this kind of issue with
> previous versions. But you can try this if you have memory and see. In my
> case the issue was gone after I set as above.
>
> Thanks
>
>
> Reyane OUKPEDJO
>
>
> On 11 October 2013 13:08, Viswanathan J <ja...@gmail.com> wrote:
>
>> Hi,
>>
>> I'm running a 14 nodes of Hadoop cluster with datanodes,tasktrackers
>> running in all nodes.
>>
>> *Apache Hadoop :* 1.2.1
>>
>> It shows the heap size currently as follows:
>>
>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>
>> In the above summary, what does the *8.89* GB indicate? Does *8.89* define
>> the maximum heap size for the Jobtracker, and if so, how is it calculated?
>>
>> Hope *5.7* is currently running jobs heap-size, how it is calculated.
>>
>> Have set the jobtracker default memory size in hadoop-env.sh
>>
>> *HADOOP_HEAPSIZE="1024"*
>> Have set the mapred.child.java.opts value in mapred-site.xml as,
>>
>>  <property>
>>   <name>mapred.child.java.opts</name>
>>   <value>-Xmx2048m</value>
>> </property>
>>
>> Even after setting the above property, getting Jobtracker OOME issue. How
>> the jobtracker memory gradually increasing. After restart the JT, within a
>> week getting OOME.
>>
>> How to resolve this, it is in production and critical? Please help.
>> Thanks in advance.
>>
>> --
>> Regards,
>> Viswa.J
>>
>
>

Re: Hadoop Jobtracker heap size calculation and OOME

Posted by Reyane Oukpedjo <ou...@gmail.com>.
Hi there,
I had a similar issue with hadoop-1.2.0: the JobTracker kept crashing until I
set HADOOP_HEAPSIZE="2048". I did not have this kind of issue with previous
versions. You can try this if you have the memory available and see; in my
case the issue was gone after I set it as above.

Thanks


Reyane OUKPEDJO


On 11 October 2013 13:08, Viswanathan J <ja...@gmail.com> wrote:

> Hi,
>
> I'm running a 14 nodes of Hadoop cluster with datanodes,tasktrackers
> running in all nodes.
>
> *Apache Hadoop :* 1.2.1
>
> It shows the heap size currently as follows:
>
> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>
> In the above summary, what does the *8.89* GB indicate? Does *8.89* define
> the maximum heap size for the Jobtracker, and if so, how is it calculated?
>
> Hope *5.7* is currently running jobs heap-size, how it is calculated.
>
> Have set the jobtracker default memory size in hadoop-env.sh
>
> *HADOOP_HEAPSIZE="1024"*
> Have set the mapred.child.java.opts value in mapred-site.xml as,
>
> <property>
>   <name>mapred.child.java.opts</name>
>   <value>-Xmx2048m</value>
> </property>
>
> Even after setting the above property, getting Jobtracker OOME issue. How
> the jobtracker memory gradually increasing. After restart the JT, within a
> week getting OOME.
>
> How to resolve this, it is in production and critical? Please help. Thanks
> in advance.
>
> --
> Regards,
> Viswa.J
>
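For context on why raising HADOOP_HEAPSIZE helps: in Hadoop 1.x, bin/hadoop turns that variable into the -Xmx flag for every daemon it launches, including the JobTracker, while mapred.child.java.opts only sizes the per-task child JVMs. A simplified sketch of that logic (not the literal script; per-daemon overrides such as HADOOP_JOBTRACKER_OPTS also exist):

```shell
# Simplified sketch of how bin/hadoop (1.x) derives the daemon heap flag.
HADOOP_HEAPSIZE="2048"        # as set in hadoop-env.sh
JAVA_HEAP_MAX="-Xmx1000m"     # bin/hadoop's built-in default
if [ -n "$HADOOP_HEAPSIZE" ]; then
    JAVA_HEAP_MAX="-Xmx${HADOOP_HEAPSIZE}m"
fi
echo "$JAVA_HEAP_MAX"         # prints -Xmx2048m; passed to the daemon JVM
```

So with HADOOP_HEAPSIZE="2048" the JobTracker runs with -Xmx2048m; the second figure in the cluster summary's "Heap Size is x/y GB" line is, as far as I know, simply the maximum heap the JVM reports for itself.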

Re: Hadoop Jobtracker heap size calculation and OOME

Posted by Viswanathan J <ja...@gmail.com>.
Hi,

Not yet updated in the production environment. Will keep you posted once it
is done.

In which Apache Hadoop release will this issue be fixed? Or is it already
fixed in hadoop-1.2.1, as in the link below?

https://issues.apache.org/jira/i#browse/MAPREDUCE-5351?issueKey=MAPREDUCE-5351&serverRenderedViewIssue=true

Please confirm.

Thanks,
On Oct 15, 2013 3:43 AM, "Antwnis" <an...@gmail.com> wrote:

> Viswana,
>
> please confirm :) whether the issue was fixed - for future readers of this
> thread
>
> with this configuration, after restarting the JobTracker you should see on
> the jobtracker page that the memory usage remains low over time
>
> Antonios
>
>
> On Mon, Oct 14, 2013 at 10:56 AM, Antwnis <an...@gmail.com> wrote:
>
>> After changing mapred-site.xml , you will have to restart the JobTracker
>> to have the changes applied to it
>>
>>
>> On Mon, Oct 14, 2013 at 10:37 AM, Viswanathan J <
>> jayamviswanathan@gmail.com> wrote:
>>
>>> Thanks a lot and lot Antonio.
>>>
>>> I'm using Apache Hadoop; I hope this issue will be resolved in
>>> upcoming Apache Hadoop releases.
>>>
>>> Do I need to restart the whole cluster after changing the mapred-site
>>> conf as you mentioned?
>>>
>>> What is the following bug id,
>>>
>>>
>>> https://issues.apache.org/jira/i#browse/MAPREDUCE-5351?issueKey=MAPREDUCE-5351&serverRenderedViewIssue=true
>>>
>>> Is this issue different from the OOME? They mentioned that the issue is
>>> fixed.
>>>
>>> Thanks,
>>> Viswa.J
>>>  On Oct 14, 2013 2:44 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
>>> wrote:
>>>
>>>> In *mapred-site.xml* you need the following snippet:
>>>>
>>>> <property>
>>>>   <name>mapreduce.jobtracker.retiredjobs.cache.size</name>
>>>>   <value>100</value>
>>>> </property>
>>>> <property>
>>>>   <name>keep.failed.task.files</name>
>>>>   <value>true</value>
>>>> </property>
>>>> <property>
>>>>   <name>keep.task.files.pattern</name>
>>>>   <value>shouldnevereverevermatch</value>
>>>> </property>
>>>>
>>>>
>>>> This will work around the memory leak issue (the official fix, I think,
>>>> is available in Cloudera's CDH 4.6 distribution).
>>>> It will cause another issue: the .staging files under /user/*/.staging/
>>>> will no longer be removed.
>>>>
>>>>
>>>> To overcome this, run a daily Jenkins job (or cron) such as:
>>>>
>>>> #!/bin/bash
>>>> LAST_DATE=$(date -ud '-7days' +%s)
>>>> hdfs dfs -ls /user/*/.staging \
>>>>   | awk -v l_date="$LAST_DATE" '/^d/ {
>>>>       m_date=$6; gsub("-"," ",m_date);
>>>>       ep_date=strftime("%s", mktime(m_date" 00 00 00"));
>>>>       if (ep_date <= l_date) print $8 }' \
>>>>   | xargs -P 2 --verbose hdfs dfs -rm -r -skipTrash
>>>>
>>>>
>>>> The above removes all staging directories created more than 7 days ago
>>>> and keeps your HDFS clean.
>>>>
>>>>
>>>>
>>>> On Monday, 14 October 2013 09:52:41 UTC+1, Viswanathan J wrote:
>>>>>
>>>>> Hi guys,
>>>>>
>>>>> Appreciate your response.
>>>>>
>>>>> Thanks,
>>>>> Viswa.J
>>>>> On Oct 12, 2013 11:29 PM, "Viswanathan J" <ja...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Guys,
>>>>>>
>>>>>> But I can see the Jobtracker OOME issue marked as fixed in the
>>>>>> hadoop-1.2.1 release notes, as below.
>>>>>>
>>>>>> Please check this URL,
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/MAPREDUCE-5351
>>>>>>
>>>>>> Why does the issue still persist? Am I asking a valid thing?
>>>>>>
>>>>>> Do I need to configure anything, or am I missing anything?
>>>>>>
>>>>>> Please help. Appreciate your response.
>>>>>>
>>>>>> Thanks,
>>>>>> Viswa.J
>>>>>> On Oct 12, 2013 7:57 PM, "Viswanathan J" <ja...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Antonio, I hope the memory leak issue will be resolved. It's
>>>>>>> really a nightmare every week.
>>>>>>>
>>>>>>> In which release this issue will be resolved?
>>>>>>>
>>>>>>> How do we solve this issue? Please help, because we are facing it in
>>>>>>> a production environment.
>>>>>>>
>>>>>>> Please share the configuration and cron to do that cleanup process.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Viswa
>>>>>>> On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> "After restart the JT, within a week getting OOME."
>>>>>>>>
>>>>>>>> Viswa, we were having the same issue in our cluster as well -
>>>>>>>> roughly every 5-7 days getting OOME.
>>>>>>>> The heap size of the Job Tracker was constantly increasing due to a
>>>>>>>> memory leak that will hopefully be fixed in newest releases.
>>>>>>>>
>>>>>>>> There is a configuration change in the JobTracker that disables
>>>>>>>> the functionality that cleans up staging files (i.e.
>>>>>>>> /user/build/.staging/*) - but that means you will have to
>>>>>>>> handle the staging files through a cron / Jenkins task.
>>>>>>>>
>>>>>>>> I'll get you the configuration on Monday..
>>>>>>>>
>>>>>>>> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm running a 14 nodes of Hadoop cluster with
>>>>>>>>> datanodes,tasktrackers running in all nodes.
>>>>>>>>>
>>>>>>>>> *Apache Hadoop :* 1.2.1
>>>>>>>>>
>>>>>>>>> It shows the heap size currently as follows:
>>>>>>>>>
>>>>>>>>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>>>>>>>>
>>>>>>>>> In the above summary, what does the *8.89* GB indicate? Does *8.89*
>>>>>>>>> define the maximum heap size for the Jobtracker, and if so, how is
>>>>>>>>> it calculated?
>>>>>>>>>
>>>>>>>>> Hope *5.7* is currently running jobs heap-size, how it
>>>>>>>>> is calculated.
>>>>>>>>>
>>>>>>>>> Have set the jobtracker default memory size in hadoop-env.sh
>>>>>>>>>
>>>>>>>>> *HADOOP_HEAPSIZE="1024"*
>>>>>>>>> Have set the mapred.child.java.opts value in mapred-site.xml as,
>>>>>>>>>
>>>>>>>>>  <property>
>>>>>>>>>   <name>mapred.child.java.opts</name>
>>>>>>>>>   <value>-Xmx2048m</value>
>>>>>>>>>  </property>
>>>>>>>>>
>>>>>>>>> Even after setting the above property, getting Jobtracker OOME
>>>>>>>>> issue. How the jobtracker memory gradually increasing. After restart the
>>>>>>>>> JT, within a week getting OOME.
>>>>>>>>>
>>>>>>>>> How to resolve this, it is in production and critical? Please
>>>>>>>>> help. Thanks in advance.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Regards,
>>>>>>>>> Viswa.J
>>>>>>>>>
>>>>>>>>  --
>>>>>>>>
>>>>>>>> ---
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "CDH Users" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to cdh-user+u...@cloudera.org.
>>>>>>>> For more options, visit
>>>>>>>> https://groups.google.com/a/cloudera.org/groups/opt_out.
>>>>>>>>
>>>>>>>   --
>>>>
>>>>
>>>
>>
>>
>> --
>>
>
>
>
> --
>
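For reference, the staging-cleanup one-liner quoted in this thread is quite dense; it can be written out as a more readable standalone sketch. The 7-day retention and the /user/*/.staging path follow the thread, GNU awk is assumed for mktime, and the actual HDFS delete is left commented out since it needs an HDFS client on PATH:

```shell
#!/bin/bash
# Print the paths (column 8) of directories (lines starting with "d") in an
# `hdfs dfs -ls` listing whose modification date (column 6, YYYY-MM-DD) is at
# or before the given cutoff epoch. Reads the listing on stdin so the filter
# can be exercised without a live cluster.
older_than() {
    local cutoff="$1"
    awk -v cutoff="$cutoff" '/^d/ {
        split($6, d, "-")                                   # "2013-10-01" -> y m d
        epoch = mktime(d[1] " " d[2] " " d[3] " 00 00 00")  # midnight, local time
        if (epoch <= cutoff) print $8
    }'
}

CUTOFF=$(date -ud '-7days' +%s)   # anything 7+ days old is a deletion candidate

# Real invocation (requires an HDFS client on PATH):
# hdfs dfs -ls /user/*/.staging | older_than "$CUTOFF" \
#   | xargs -r -P 2 --verbose hdfs dfs -rm -r -skipTrash
```

Note that mktime already returns epoch seconds, so the strftime("%s", ...) wrapper in the original one-liner is redundant.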

Re: Hadoop Jobtracker heap size calculation and OOME

Posted by Viswanathan J <ja...@gmail.com>.
Hi,

Not yet updated in production environment. Will keep you posted once it is
done.

In which Apache hadoop release this issue will be fixed? Or this issue
already fixed in hadoop-1.2.1 version as in the given below link,

https://issues.apache.org/jira/i#browse/MAPREDUCE-5351?issueKey=MAPREDUCE-5351&serverRenderedViewIssue=true

Please confirm.

Thanks,
On Oct 15, 2013 3:43 AM, "Antwnis" <an...@gmail.com> wrote:

> Viswana,
>
> please confirm :) whether the issue was fixed - for future readers of this
> thread
>
> with this configuration, after restarting the JobTracker you should see on
> the jobtracker page that the memory usage remains low over time
>
> Antonios
>
>
> On Mon, Oct 14, 2013 at 10:56 AM, Antwnis <an...@gmail.com> wrote:
>
>> After changing mapred-site.xml , you will have to restart the JobTracker
>> to have the changes applied to it
>>
>>
>> On Mon, Oct 14, 2013 at 10:37 AM, Viswanathan J <
>> jayamviswanathan@gmail.com> wrote:
>>
>>> Thanks a lot and lot Antonio.
>>>
>>> I'm using the Apache hadoop, hope this issue will be resolved in
>>> upcoming apache hadoop releases.
>>>
>>> Do I need the restart whole cluster after changing the mapred site conf
>>> as you mentioned?
>>>
>>> What is the following bug id,
>>>
>>>
>>> https://issues.apache.org/jira/i#browse/MAPREDUCE-5351?issueKey=MAPREDUCE-5351&amp;serverRenderedViewIssue=true<https://issues.apache.org/jira/i#browse/MAPREDUCE-5351?issueKey=MAPREDUCE-5351&serverRenderedViewIssue=true>
>>>
>>> Is this issue was different from OOME, but they mentioned that issue is
>>> fixed.
>>>
>>> Thanks,
>>> Viswa.J
>>>  On Oct 14, 2013 2:44 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
>>> wrote:
>>>
>>>> In *mapred-site.xml* you need the following snipset:
>>>>
>>>> <property>
>>>> <name>mapreduce.jobtracker.retiredjobs.cache.size</name>
>>>> <value>100</value>
>>>> </property>
>>>> <property>
>>>> <name>keep.failed.task.files</name>
>>>> <value>true</value>
>>>> </property>
>>>> <property>
>>>> <name>keep.task.files.pattern</name>
>>>> <value>shouldnevereverevermatch</value>
>>>> </property>
>>>>
>>>>
>>>> This will fix the memory leak issue ( the official fix i think is
>>>> available in Cloudera's 4.6 distribution )
>>>> It will cause another issue - that is not removing the .staging files
>>>> from the /user/*/.staging/ location
>>>>
>>>>
>>>> To overcome this use a daily Jenkins job ( or cron ) and
>>>>
>>>> #!/bin/bash
>>>> LAST_DATE=$(date -ud '-7days' +%s)
>>>> hdfs dfs -ls /user/*/.staging | awk '/^d/ {m_date=$6;gsub("-","
>>>> ",m_date); ep_date=strftime("%s", mktime(m_date" 00 00 00")); if ( ep_date
>>>> <= l_date ) print $8 }' l_date=$LAST_DATE | xargs -P 2 --verbose hdfs
>>>> dfs -rm -r -skipTrash
>>>>
>>>>
>>>> ^ The above will remove all directories that were created more than 7
>>>> days ago .. and will keep your HDFS clean
>>>>
>>>>
>>>>
>>>> On Monday, 14 October 2013 09:52:41 UTC+1, Viswanathan J wrote:
>>>>>
>>>>> Hi guys,
>>>>>
>>>>> Appreciate your response.
>>>>>
>>>>> Thanks,
>>>>> Viswa.J
>>>>> On Oct 12, 2013 11:29 PM, "Viswanathan J" <ja...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Guys,
>>>>>>
>>>>>> But I can see the jobtracker OOME issue fixed in hadoop - 1.2.1
>>>>>> version as per the hadoop release notes as below.
>>>>>>
>>>>>> Please check this URL,
>>>>>>
>>>>>> https://issues.apache.org/**jira/browse/MAPREDUCE-5351<https://issues.apache.org/jira/browse/MAPREDUCE-5351>
>>>>>>
>>>>>> How come the issue still persist? I'm I asking a valid thing.
>>>>>>
>>>>>> Do I need to configure anything our I missing anything.
>>>>>>
>>>>>> Please help. Appreciate your response.
>>>>>>
>>>>>> Thanks,
>>>>>> Viswa.J
>>>>>> On Oct 12, 2013 7:57 PM, "Viswanathan J" <ja...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Antonio, hope the memory leak issue will be resolved. Its
>>>>>>> really nightmare every week.
>>>>>>>
>>>>>>> In which release this issue will be resolved?
>>>>>>>
>>>>>>> How to solve this issue, please help because we are facing in
>>>>>>> production environment.
>>>>>>>
>>>>>>> Please share the configuration and cron to do that cleanup process.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Viswa
>>>>>>> On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> "After restart the JT, within a week getting OOME."
>>>>>>>>
>>>>>>>> Viswa, we were having the same issue in our cluster as well -
>>>>>>>> roughly every 5-7 days getting OOME.
>>>>>>>> The heap size of the Job Tracker was constantly increasing due to a
>>>>>>>> memory leak that will hopefully be fixed in newest releases.
>>>>>>>>
>>>>>>>> There is a configuration change in the JobTracker that will disable
>>>>>>>> a functionality regarding cleaning up staging files i.e.
>>>>>>>> /user/build/.staging/* - but that means that you will have to
>>>>>>>> handle the staging files through a cron / jenkins task
>>>>>>>>
>>>>>>>> I'll get you the configuration on Monday..
>>>>>>>>
>>>>>>>> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm running a 14 nodes of Hadoop cluster with
>>>>>>>>> datanodes,tasktrackers running in all nodes.
>>>>>>>>>
>>>>>>>>> *Apache Hadoop :* 1.2.1
>>>>>>>>>
>>>>>>>>> It shows the heap size currently as follows:
>>>>>>>>>
>>>>>>>>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>>>>>>>> *
>>>>>>>>> *
>>>>>>>>> In the above summary what is the *8.89* GB defines? Is the *8.89*defines maximum heap size for Jobtracker, if yes how it has
>>>>>>>>> been calculated.
>>>>>>>>>
>>>>>>>>> Hope *5.7* is currently running jobs heap-size, how it
>>>>>>>>> is calculated.
>>>>>>>>>
>>>>>>>>> Have set the jobtracker default memory size in hadoop-env.sh
>>>>>>>>>
>>>>>>>>> *HADOOP_HEAPSIZE="1024"*
>>>>>>>>> *
>>>>>>>>> *
>>>>>>>>> Have set the mapred.child.java.opts value in mapred-site.xml as,
>>>>>>>>>
>>>>>>>>>  <property>
>>>>>>>>>   <name>mapred.child.java.opts</****name>
>>>>>>>>>   <value>-Xmx2048m</value>
>>>>>>>>>  </property>
>>>>>>>>>
>>>>>>>>> Even after setting the above property, getting Jobtracker OOME
>>>>>>>>> issue. How the jobtracker memory gradually increasing. After restart the
>>>>>>>>> JT, within a week getting OOME.
>>>>>>>>>
>>>>>>>>> How to resolve this, it is in production and critical? Please
>>>>>>>>> help. Thanks in advance.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Regards,
>>>>>>>>> Viswa.J
>>>>>>>>>
>>>>>>>>  --
>>>>>>>>
>>>>>>>> ---
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "CDH Users" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to cdh-user+u...@cloudera.**org.
>>>>>>>> For more options, visit https://groups.google.com/a/**
>>>>>>>> cloudera.org/groups/opt_out<https://groups.google.com/a/cloudera.org/groups/opt_out>
>>>>>>>> .
>>>>>>>>
>>>>>>>   --
>>>>
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "CDH Users" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to cdh-user+unsubscribe@cloudera.org.
>>>>
>>>> For more options, visit
>>>> https://groups.google.com/a/cloudera.org/groups/opt_out.
>>>>
>>>
>>
>>
>> --
>>
>
>
>
> --
>

Re: Hadoop Jobtracker heap size calculation and OOME

Posted by Viswanathan J <ja...@gmail.com>.
Hi,

Not yet updated in production environment. Will keep you posted once it is
done.

In which Apache hadoop release this issue will be fixed? Or this issue
already fixed in hadoop-1.2.1 version as in the given below link,

https://issues.apache.org/jira/i#browse/MAPREDUCE-5351?issueKey=MAPREDUCE-5351&serverRenderedViewIssue=true

Please confirm.

Thanks,
On Oct 15, 2013 3:43 AM, "Antwnis" <an...@gmail.com> wrote:

> Viswana,
>
> please confirm :) whether the issue was fixed - for future readers of this
> thread
>
> with this configuration, after restarting the JobTracker you should see on
> the jobtracker page that the memory usage remains low over time
>
> Antonios
>
>
> On Mon, Oct 14, 2013 at 10:56 AM, Antwnis <an...@gmail.com> wrote:
>
>> After changing mapred-site.xml , you will have to restart the JobTracker
>> to have the changes applied to it
>>
>>
>> On Mon, Oct 14, 2013 at 10:37 AM, Viswanathan J <
>> jayamviswanathan@gmail.com> wrote:
>>
>>> Thanks a lot and lot Antonio.
>>>
>>> I'm using the Apache hadoop, hope this issue will be resolved in
>>> upcoming apache hadoop releases.
>>>
>>> Do I need the restart whole cluster after changing the mapred site conf
>>> as you mentioned?
>>>
>>> What is the following bug id,
>>>
>>>
>>> https://issues.apache.org/jira/i#browse/MAPREDUCE-5351?issueKey=MAPREDUCE-5351&amp;serverRenderedViewIssue=true<https://issues.apache.org/jira/i#browse/MAPREDUCE-5351?issueKey=MAPREDUCE-5351&serverRenderedViewIssue=true>
>>>
>>> Is this issue was different from OOME, but they mentioned that issue is
>>> fixed.
>>>
>>> Thanks,
>>> Viswa.J
>>>  On Oct 14, 2013 2:44 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
>>> wrote:
>>>
>>>> In *mapred-site.xml* you need the following snipset:
>>>>
>>>> <property>
>>>> <name>mapreduce.jobtracker.retiredjobs.cache.size</name>
>>>> <value>100</value>
>>>> </property>
>>>> <property>
>>>> <name>keep.failed.task.files</name>
>>>> <value>true</value>
>>>> </property>
>>>> <property>
>>>> <name>keep.task.files.pattern</name>
>>>> <value>shouldnevereverevermatch</value>
>>>> </property>
>>>>
>>>>
>>>> This will fix the memory leak issue ( the official fix i think is
>>>> available in Cloudera's 4.6 distribution )
>>>> It will cause another issue - that is not removing the .staging files
>>>> from the /user/*/.staging/ location
>>>>
>>>>
>>>> To overcome this use a daily Jenkins job ( or cron ) and
>>>>
>>>> #!/bin/bash
>>>> LAST_DATE=$(date -ud '-7days' +%s)
>>>> hdfs dfs -ls /user/*/.staging | awk '/^d/ {m_date=$6;gsub("-","
>>>> ",m_date); ep_date=strftime("%s", mktime(m_date" 00 00 00")); if ( ep_date
>>>> <= l_date ) print $8 }' l_date=$LAST_DATE | xargs -P 2 --verbose hdfs
>>>> dfs -rm -r -skipTrash
>>>>
>>>>
>>>> ^ The above will remove all directories that were created more than 7
>>>> days ago .. and will keep your HDFS clean
>>>>
>>>>
>>>>
>>>> On Monday, 14 October 2013 09:52:41 UTC+1, Viswanathan J wrote:
>>>>>
>>>>> Hi guys,
>>>>>
>>>>> Appreciate your response.
>>>>>
>>>>> Thanks,
>>>>> Viswa.J
>>>>> On Oct 12, 2013 11:29 PM, "Viswanathan J" <ja...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Guys,
>>>>>>
>>>>>> But I can see the jobtracker OOME issue fixed in hadoop - 1.2.1
>>>>>> version as per the hadoop release notes as below.
>>>>>>
>>>>>> Please check this URL,
>>>>>>
>>>>>> https://issues.apache.org/**jira/browse/MAPREDUCE-5351<https://issues.apache.org/jira/browse/MAPREDUCE-5351>
>>>>>>
>>>>>> How come the issue still persists? Am I asking a valid thing?
>>>>>>
>>>>>> Do I need to configure anything, or am I missing anything?
>>>>>>
>>>>>> Please help. Appreciate your response.
>>>>>>
>>>>>> Thanks,
>>>>>> Viswa.J
>>>>>> On Oct 12, 2013 7:57 PM, "Viswanathan J" <ja...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Antonio, hope the memory leak issue will be resolved. It's
>>>>>>> really a nightmare every week.
>>>>>>>
>>>>>>> In which release this issue will be resolved?
>>>>>>>
>>>>>>> How to solve this issue? Please help, because we are facing it in a
>>>>>>> production environment.
>>>>>>>
>>>>>>> Please share the configuration and cron to do that cleanup process.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Viswa
>>>>>>> On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> "After restart the JT, within a week getting OOME."
>>>>>>>>
>>>>>>>> Viswa, we were having the same issue in our cluster as well -
>>>>>>>> roughly every 5-7 days getting OOME.
>>>>>>>> The heap size of the Job Tracker was constantly increasing due to a
>>>>>>>> memory leak that will hopefully be fixed in newest releases.
>>>>>>>>
>>>>>>>> There is a configuration change in the JobTracker that will disable
>>>>>>>> a functionality regarding cleaning up staging files i.e.
>>>>>>>> /user/build/.staging/* - but that means that you will have to
>>>>>>>> handle the staging files through a cron / jenkins task
>>>>>>>>
>>>>>>>> I'll get you the configuration on Monday..
>>>>>>>>
>>>>>>>> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm running a 14 nodes of Hadoop cluster with
>>>>>>>>> datanodes,tasktrackers running in all nodes.
>>>>>>>>>
>>>>>>>>> *Apache Hadoop :* 1.2.1
>>>>>>>>>
>>>>>>>>> It shows the heap size currently as follows:
>>>>>>>>>
>>>>>>>>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>>>>>>>> *
>>>>>>>>> *
>>>>>>>>> In the above summary, what does the *8.89* GB mean? Does *8.89*
>>>>>>>>> define the maximum heap size for the Jobtracker, and if so, how is
>>>>>>>>> it calculated?
>>>>>>>>>
>>>>>>>>> I assume *5.7* is the heap used by currently running jobs; how is it
>>>>>>>>> calculated?
>>>>>>>>>
>>>>>>>>> Have set the jobtracker default memory size in hadoop-env.sh
>>>>>>>>>
>>>>>>>>> *HADOOP_HEAPSIZE="1024"*
>>>>>>>>> *
>>>>>>>>> *
>>>>>>>>> Have set the mapred.child.java.opts value in mapred-site.xml as,
>>>>>>>>>
>>>>>>>>>  <property>
>>>>>>>>>   <name>mapred.child.java.opts</name>
>>>>>>>>>   <value>-Xmx2048m</value>
>>>>>>>>>  </property>
>>>>>>>>>
>>>>>>>>> Even after setting the above property, I am getting a Jobtracker OOME.
>>>>>>>>> Why does the jobtracker memory gradually increase? After restarting
>>>>>>>>> the JT, I get an OOME within a week.
>>>>>>>>>
>>>>>>>>> How to resolve this? It is in production and critical. Please
>>>>>>>>> help. Thanks in advance.
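One hedged note on the settings quoted above: mapred.child.java.opts only sizes the child JVMs that the TaskTrackers launch for map/reduce tasks; it does not affect the JobTracker's own heap. In Hadoop 1.x the daemon heap comes from hadoop-env.sh, along these lines (the 4096m value is illustrative, not a recommendation):

```shell
# hadoop-env.sh (Hadoop 1.x) -- daemon heap settings.
# HADOOP_HEAPSIZE is the default max heap, in MB, for every Hadoop daemon;
# HADOOP_JOBTRACKER_OPTS appends extra JVM flags for the JobTracker only,
# so a per-daemon -Xmx here overrides the global default.
export HADOOP_HEAPSIZE=1024
export HADOOP_JOBTRACKER_OPTS="-Xmx4096m $HADOOP_JOBTRACKER_OPTS"
```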
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Regards,
>>>>>>>>> Viswa.J
>>>>>>>>>
>>>>>>>>  --
>>>>>>>>
>>>>>>>> ---
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "CDH Users" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to cdh-user+u...@cloudera.org.
>>>>>>>> For more options, visit
>>>>>>>> https://groups.google.com/a/cloudera.org/groups/opt_out.
>>>>>>>>
>>>>
>>>
>>
>>
>> --
>>
>
>
>
> --
>

Re: Hadoop Jobtracker heap size calculation and OOME

Posted by Viswanathan J <ja...@gmail.com>.
Hi,

Not yet updated in production environment. Will keep you posted once it is
done.

In which Apache Hadoop release will this issue be fixed? Or is it already
fixed in hadoop-1.2.1, as in the link below?

https://issues.apache.org/jira/browse/MAPREDUCE-5351

Please confirm.

Thanks,
On Oct 15, 2013 3:43 AM, "Antwnis" <an...@gmail.com> wrote:

> Viswa,
>
> please confirm :) whether the issue was fixed - for future readers of this
> thread
>
> with this configuration, after restarting the JobTracker you should see on
> the jobtracker page that the memory usage remains low over time
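One way to verify this over time, besides watching the jobtracker page, is to sample the JobTracker JVM with `jstat -gc <jobtracker-pid> 60000` and compare used vs. committed heap. A small sketch of summarizing one jstat line follows; the numbers in the here-document are made up, in the JDK 6 column order (S0C S1C S0U S1U EC EU OC OU ...), all in KB:

```shell
#!/bin/bash
# Turn one line of `jstat -gc` output (KB columns) into "used/capacity MB".
# In practice pipe `jstat -gc <jobtracker-pid> 60000` in; the sample line
# below uses hypothetical values.
LC_ALL=C awk '{ used = $3 + $4 + $6 + $8;   # S0U + S1U + EU + OU (KB)
                cap  = $1 + $2 + $5 + $7;   # S0C + S1C + EC + OC (KB)
                printf "heap used %.1f/%.1f MB\n", used/1024, cap/1024 }' <<'EOF'
1024.0 1024.0 0.0 512.0 8192.0 4096.0 65536.0 32768.0 21248.0 20480.0
EOF
# prints: heap used 36.5/74.0 MB
```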
>
> Antonios
>
>
> On Mon, Oct 14, 2013 at 10:56 AM, Antwnis <an...@gmail.com> wrote:
>
>> After changing mapred-site.xml, you will have to restart the JobTracker
>> to have the changes applied to it
>>
>>
>> On Mon, Oct 14, 2013 at 10:37 AM, Viswanathan J <
>> jayamviswanathan@gmail.com> wrote:
>>
>>> Thanks a lot, Antonio.
>>>
>>> I'm using Apache Hadoop; I hope this issue will be resolved in
>>> upcoming Apache Hadoop releases.
>>>
>>> Do I need to restart the whole cluster after changing the mapred-site
>>> conf as you mentioned?
>>>
>>> What about the following bug id:
>>>
>>>
>>> https://issues.apache.org/jira/browse/MAPREDUCE-5351
>>>
>>> Is this issue different from OOME? They mentioned that the issue is
>>> fixed.
>>>
>>> Thanks,
>>> Viswa.J
>>>  On Oct 14, 2013 2:44 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
>>> wrote:
>>>
>>>> In *mapred-site.xml* you need the following snippet:
>>>>
>>>> <property>
>>>> <name>mapreduce.jobtracker.retiredjobs.cache.size</name>
>>>> <value>100</value>
>>>> </property>
>>>> <property>
>>>> <name>keep.failed.task.files</name>
>>>> <value>true</value>
>>>> </property>
>>>> <property>
>>>> <name>keep.task.files.pattern</name>
>>>> <value>shouldnevereverevermatch</value>
>>>> </property>
>>>>
>>>>
>>>> This will fix the memory leak issue (the official fix, I think, is
>>>> available in Cloudera's 4.6 distribution).
>>>> It will cause another issue: the .staging files will no longer be removed
>>>> from the /user/*/.staging/ location.
>>>>
>>>>
>>>> To overcome this, use a daily Jenkins job (or cron):
>>>>
>>>> #!/bin/bash
>>>> LAST_DATE=$(date -ud '-7days' +%s)
>>>> hdfs dfs -ls /user/*/.staging | awk '/^d/ { m_date=$6; gsub("-"," ",m_date);
>>>> ep_date=strftime("%s", mktime(m_date" 00 00 00")); if (ep_date <= l_date)
>>>> print $8 }' l_date=$LAST_DATE | xargs -P 2 --verbose hdfs dfs -rm -r -skipTrash
>>>>
>>>>
>>>> ^ The above will remove all staging directories that were created more
>>>> than 7 days ago and will keep your HDFS clean.
>>>>
>>>>
>>>>
>>>> On Monday, 14 October 2013 09:52:41 UTC+1, Viswanathan J wrote:
>>>>>
>>>>> Hi guys,
>>>>>
>>>>> Appreciate your response.
>>>>>
>>>>> Thanks,
>>>>> Viswa.J
>>>>> On Oct 12, 2013 11:29 PM, "Viswanathan J" <ja...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Guys,
>>>>>>
>>>>>> But I can see the jobtracker OOME issue fixed in hadoop - 1.2.1
>>>>>> version as per the hadoop release notes as below.
>>>>>>
>>>>>> Please check this URL,
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/MAPREDUCE-5351
>>>>>>
>>>>>> How come the issue still persists? Am I asking a valid thing?
>>>>>>
>>>>>> Do I need to configure anything, or am I missing anything?
>>>>>>
>>>>>> Please help. Appreciate your response.
>>>>>>
>>>>>> Thanks,
>>>>>> Viswa.J
>>>>>> On Oct 12, 2013 7:57 PM, "Viswanathan J" <ja...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Antonio, hope the memory leak issue will be resolved. It's
>>>>>>> really a nightmare every week.
>>>>>>>
>>>>>>> In which release this issue will be resolved?
>>>>>>>
>>>>>>> How to solve this issue? Please help, because we are facing it in a
>>>>>>> production environment.
>>>>>>>
>>>>>>> Please share the configuration and cron to do that cleanup process.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Viswa
>>>>>>> On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> "After restart the JT, within a week getting OOME."
>>>>>>>>
>>>>>>>> Viswa, we were having the same issue in our cluster as well -
>>>>>>>> roughly every 5-7 days getting OOME.
>>>>>>>> The heap size of the Job Tracker was constantly increasing due to a
>>>>>>>> memory leak that will hopefully be fixed in newest releases.
>>>>>>>>
>>>>>>>> There is a configuration change in the JobTracker that will disable
>>>>>>>> a functionality regarding cleaning up staging files i.e.
>>>>>>>> /user/build/.staging/* - but that means that you will have to
>>>>>>>> handle the staging files through a cron / jenkins task
>>>>>>>>
>>>>>>>> I'll get you the configuration on Monday..
>>>>>>>>
>>>>>>>> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm running a 14 nodes of Hadoop cluster with
>>>>>>>>> datanodes,tasktrackers running in all nodes.
>>>>>>>>>
>>>>>>>>> *Apache Hadoop :* 1.2.1
>>>>>>>>>
>>>>>>>>> It shows the heap size currently as follows:
>>>>>>>>>
>>>>>>>>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>>>>>>>> *
>>>>>>>>> *
>>>>>>>>> In the above summary, what does the *8.89* GB mean? Does *8.89*
>>>>>>>>> define the maximum heap size for the Jobtracker, and if so, how is
>>>>>>>>> it calculated?
>>>>>>>>>
>>>>>>>>> I assume *5.7* is the heap used by currently running jobs; how is it
>>>>>>>>> calculated?
>>>>>>>>>
>>>>>>>>> Have set the jobtracker default memory size in hadoop-env.sh
>>>>>>>>>
>>>>>>>>> *HADOOP_HEAPSIZE="1024"*
>>>>>>>>> *
>>>>>>>>> *
>>>>>>>>> Have set the mapred.child.java.opts value in mapred-site.xml as,
>>>>>>>>>
>>>>>>>>>  <property>
>>>>>>>>>   <name>mapred.child.java.opts</name>
>>>>>>>>>   <value>-Xmx2048m</value>
>>>>>>>>>  </property>
>>>>>>>>>
>>>>>>>>> Even after setting the above property, I am getting a Jobtracker OOME.
>>>>>>>>> Why does the jobtracker memory gradually increase? After restarting
>>>>>>>>> the JT, I get an OOME within a week.
>>>>>>>>>
>>>>>>>>> How to resolve this? It is in production and critical. Please
>>>>>>>>> help. Thanks in advance.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Regards,
>>>>>>>>> Viswa.J
>>>>>>>>>
>>>
>>
>>
>> --
>>
>
>
>
> --
>

Re: Hadoop Jobtracker heap size calculation and OOME

Posted by Viswanathan J <ja...@gmail.com>.
Thanks a lot, Antonio.

I'm using Apache Hadoop; I hope this issue will be resolved in upcoming
Apache Hadoop releases.

Do I need to restart the whole cluster after changing the mapred-site conf
as you mentioned?

What about the following bug id:

https://issues.apache.org/jira/browse/MAPREDUCE-5351

Is this issue different from OOME? They mentioned that the issue is
fixed.

Thanks,
Viswa.J
 On Oct 14, 2013 2:44 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
wrote:

> In *mapred-site.xml* you need the following snippet:
>
> <property>
> <name>mapreduce.jobtracker.retiredjobs.cache.size</name>
> <value>100</value>
> </property>
> <property>
> <name>keep.failed.task.files</name>
> <value>true</value>
> </property>
> <property>
> <name>keep.task.files.pattern</name>
> <value>shouldnevereverevermatch</value>
> </property>
>
>
> This will fix the memory leak issue (the official fix, I think, is
> available in Cloudera's 4.6 distribution).
> It will cause another issue: the .staging files will no longer be removed
> from the /user/*/.staging/ location.
>
>
> To overcome this, use a daily Jenkins job (or cron):
>
> #!/bin/bash
> LAST_DATE=$(date -ud '-7days' +%s)
> hdfs dfs -ls /user/*/.staging | awk '/^d/ { m_date=$6; gsub("-"," ",m_date);
> ep_date=strftime("%s", mktime(m_date" 00 00 00")); if (ep_date <= l_date)
> print $8 }' l_date=$LAST_DATE | xargs -P 2 --verbose hdfs dfs -rm -r -skipTrash
>
>
> ^ The above will remove all staging directories that were created more than
> 7 days ago and will keep your HDFS clean.
>
>
>
> On Monday, 14 October 2013 09:52:41 UTC+1, Viswanathan J wrote:
>>
>> Hi guys,
>>
>> Appreciate your response.
>>
>> Thanks,
>> Viswa.J
>> On Oct 12, 2013 11:29 PM, "Viswanathan J" <ja...@gmail.com> wrote:
>>
>>> Hi Guys,
>>>
>>> But I can see the jobtracker OOME issue fixed in hadoop - 1.2.1 version
>>> as per the hadoop release notes as below.
>>>
>>> Please check this URL,
>>>
>>> https://issues.apache.org/jira/browse/MAPREDUCE-5351
>>>
>>> How come the issue still persists? Am I asking a valid thing?
>>>
>>> Do I need to configure anything, or am I missing anything?
>>>
>>> Please help. Appreciate your response.
>>>
>>> Thanks,
>>> Viswa.J
>>> On Oct 12, 2013 7:57 PM, "Viswanathan J" <ja...@gmail.com> wrote:
>>>
>>>> Thanks Antonio, hope the memory leak issue will be resolved. It's really
>>>> a nightmare every week.
>>>>
>>>> In which release this issue will be resolved?
>>>>
>>>> How to solve this issue? Please help, because we are facing it in a
>>>> production environment.
>>>>
>>>> Please share the configuration and cron to do that cleanup process.
>>>>
>>>> Thanks,
>>>> Viswa
>>>> On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
>>>> wrote:
>>>>
>>>>> "After restart the JT, within a week getting OOME."
>>>>>
>>>>> Viswa, we were having the same issue in our cluster as well - roughly
>>>>> every 5-7 days getting OOME.
>>>>> The heap size of the Job Tracker was constantly increasing due to a
>>>>> memory leak that will hopefully be fixed in newest releases.
>>>>>
>>>>> There is a configuration change in the JobTracker that will disable a
>>>>> functionality regarding cleaning up staging files i.e.
>>>>> /user/build/.staging/* - but that means that you will have to handle
>>>>> the staging files through a cron / jenkins task
>>>>>
>>>>> I'll get you the configuration on Monday..
>>>>>
>>>>> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm running a 14 nodes of Hadoop cluster with datanodes,tasktrackers
>>>>>> running in all nodes.
>>>>>>
>>>>>> *Apache Hadoop :* 1.2.1
>>>>>>
>>>>>> It shows the heap size currently as follows:
>>>>>>
>>>>>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>>>>> *
>>>>>> *
>>>>>> In the above summary, what does the *8.89* GB mean? Does *8.89*
>>>>>> define the maximum heap size for the Jobtracker, and if so, how is it
>>>>>> calculated?
>>>>>>
>>>>>> I assume *5.7* is the heap used by currently running jobs; how is it
>>>>>> calculated?
>>>>>>
>>>>>> Have set the jobtracker default memory size in hadoop-env.sh
>>>>>>
>>>>>> *HADOOP_HEAPSIZE="1024"*
>>>>>> *
>>>>>> *
>>>>>> Have set the mapred.child.java.opts value in mapred-site.xml as,
>>>>>>
>>>>>>  <property>
>>>>>>   <name>mapred.child.java.opts</name>
>>>>>>   <value>-Xmx2048m</value>
>>>>>>  </property>
>>>>>>
>>>>>> Even after setting the above property, I am getting a Jobtracker OOME.
>>>>>> Why does the jobtracker memory gradually increase? After restarting the
>>>>>> JT, I get an OOME within a week.
>>>>>>
>>>>>> How to resolve this? It is in production and critical. Please help.
>>>>>> Thanks in advance.
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Viswa.J
>>>>>>
>

Re: Hadoop Jobtracker heap size calculation and OOME

Posted by Viswanathan J <ja...@gmail.com>.
Thanks a lot and lot Antonio.

I'm using the Apache hadoop, hope this issue will be resolved in upcoming
apache hadoop releases.

Do I need the restart whole cluster after changing the mapred site conf as
you mentioned?

What is the following bug id,

https://issues.apache.org/jira/i#browse/MAPREDUCE-5351?issueKey=MAPREDUCE-5351&amp;serverRenderedViewIssue=true

Is this issue was different from OOME, but they mentioned that issue is
fixed.

Thanks,
Viswa.J
 On Oct 14, 2013 2:44 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
wrote:

> In *mapred-site.xml* you need the following snipset:
>
> <property>
> <name>mapreduce.jobtracker.retiredjobs.cache.size</name>
> <value>100</value>
> </property>
> <property>
> <name>keep.failed.task.files</name>
> <value>true</value>
> </property>
> <property>
> <name>keep.task.files.pattern</name>
> <value>shouldnevereverevermatch</value>
> </property>
>
>
> This will fix the memory leak issue ( the official fix i think is
> available in Cloudera's 4.6 distribution )
> It will cause another issue - that is not removing the .staging files from
> the /user/*/.staging/ location
>
>
> To overcome this use a daily Jenkins job ( or cron ) and
>
> #!/bin/bash
> LAST_DATE=$(date -ud '-7days' +%s)
> hdfs dfs -ls /user/*/.staging | awk '/^d/ {m_date=$6;gsub("-","
> ",m_date); ep_date=strftime("%s", mktime(m_date" 00 00 00")); if ( ep_date
> <= l_date ) print $8 }' l_date=$LAST_DATE | xargs -P 2 --verbose hdfs dfs
> -rm -r -skipTrash
>
>
> ^ The above will remove all directories that were created more than 7 days
> ago .. and will keep your HDFS clean
>
>
>
> On Monday, 14 October 2013 09:52:41 UTC+1, Viswanathan J wrote:
>>
>> Hi guys,
>>
>> Appreciate your response.
>>
>> Thanks,
>> Viswa.J
>> On Oct 12, 2013 11:29 PM, "Viswanathan J" <ja...@gmail.com> wrote:
>>
>>> Hi Guys,
>>>
>>> But I can see the jobtracker OOME issue fixed in hadoop - 1.2.1 version
>>> as per the hadoop release notes as below.
>>>
>>> Please check this URL,
>>>
>>> https://issues.apache.org/**jira/browse/MAPREDUCE-5351<https://issues.apache.org/jira/browse/MAPREDUCE-5351>
>>>
>>> How come the issue still persist? I'm I asking a valid thing.
>>>
>>> Do I need to configure anything our I missing anything.
>>>
>>> Please help. Appreciate your response.
>>>
>>> Thanks,
>>> Viswa.J
>>> On Oct 12, 2013 7:57 PM, "Viswanathan J" <ja...@gmail.com> wrote:
>>>
>>>> Thanks Antonio, hope the memory leak issue will be resolved. Its really
>>>> nightmare every week.
>>>>
>>>> In which release this issue will be resolved?
>>>>
>>>> How to solve this issue, please help because we are facing in
>>>> production environment.
>>>>
>>>> Please share the configuration and cron to do that cleanup process.
>>>>
>>>> Thanks,
>>>> Viswa
>>>> On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
>>>> wrote:
>>>>
>>>>> "After restart the JT, within a week getting OOME."
>>>>>
>>>>> Viswa, we were having the same issue in our cluster as well - roughly
>>>>> every 5-7 days getting OOME.
>>>>> The heap size of the Job Tracker was constantly increasing due to a
>>>>> memory leak that will hopefully be fixed in newest releases.
>>>>>
>>>>> There is a configuration change in the JobTracker that will disable a
>>>>> functionality regarding cleaning up staging files i.e.
>>>>> /user/build/.staging/* - but that means that you will have to handle
>>>>> the staging files through a cron / jenkins task
>>>>>
>>>>> I'll get you the configuration on Monday..
>>>>>
>>>>> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm running a 14 nodes of Hadoop cluster with datanodes,tasktrackers
>>>>>> running in all nodes.
>>>>>>
>>>>>> *Apache Hadoop :* 1.2.1
>>>>>>
>>>>>> It shows the heap size currently as follows:
>>>>>>
>>>>>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>>>>> *
>>>>>> *
>>>>>> In the above summary what is the *8.89* GB defines? Is the *8.89*defines maximum heap size for Jobtracker, if yes how it has
>>>>>> been calculated.
>>>>>>
>>>>>> Hope *5.7* is currently running jobs heap-size, how it is calculated.
>>>>>>
>>>>>> Have set the jobtracker default memory size in hadoop-env.sh
>>>>>>
>>>>>> *HADOOP_HEAPSIZE="1024"*
>>>>>> *
>>>>>> *
>>>>>> Have set the mapred.child.java.opts value in mapred-site.xml as,
>>>>>>
>>>>>>  <property>
>>>>>>   <name>mapred.child.java.opts</****name>
>>>>>>   <value>-Xmx2048m</value>
>>>>>>  </property>
>>>>>>
>>>>>> Even after setting the above property, getting Jobtracker OOME issue.
>>>>>> How the jobtracker memory gradually increasing. After restart the JT,
>>>>>> within a week getting OOME.
>>>>>>
>>>>>> How to resolve this, it is in production and critical? Please help.
>>>>>> Thanks in advance.
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Viswa.J
>>>>>>
>>>>>  --
>>>>>
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "CDH Users" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to cdh-user+u...@cloudera.**org.
>>>>> For more options, visit https://groups.google.com/a/**
>>>>> cloudera.org/groups/opt_out<https://groups.google.com/a/cloudera.org/groups/opt_out>
>>>>> .
>>>>>
>>>>   --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "CDH Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to cdh-user+unsubscribe@cloudera.org.
> For more options, visit
> https://groups.google.com/a/cloudera.org/groups/opt_out.
>

Re: Hadoop Jobtracker heap size calculation and OOME

Posted by Viswanathan J <ja...@gmail.com>.
Thanks a lot and lot Antonio.

I'm using the Apache hadoop, hope this issue will be resolved in upcoming
apache hadoop releases.

Do I need the restart whole cluster after changing the mapred site conf as
you mentioned?

What is the following bug id,

https://issues.apache.org/jira/i#browse/MAPREDUCE-5351?issueKey=MAPREDUCE-5351&amp;serverRenderedViewIssue=true

Is this issue was different from OOME, but they mentioned that issue is
fixed.

Thanks,
Viswa.J
 On Oct 14, 2013 2:44 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
wrote:

> In *mapred-site.xml* you need the following snipset:
>
> <property>
> <name>mapreduce.jobtracker.retiredjobs.cache.size</name>
> <value>100</value>
> </property>
> <property>
> <name>keep.failed.task.files</name>
> <value>true</value>
> </property>
> <property>
> <name>keep.task.files.pattern</name>
> <value>shouldnevereverevermatch</value>
> </property>
>
>
> This will fix the memory leak issue ( the official fix i think is
> available in Cloudera's 4.6 distribution )
> It will cause another issue - that is not removing the .staging files from
> the /user/*/.staging/ location
>
>
> To overcome this use a daily Jenkins job ( or cron ) and
>
> #!/bin/bash
> LAST_DATE=$(date -ud '-7days' +%s)
> hdfs dfs -ls /user/*/.staging | awk '/^d/ {m_date=$6;gsub("-","
> ",m_date); ep_date=strftime("%s", mktime(m_date" 00 00 00")); if ( ep_date
> <= l_date ) print $8 }' l_date=$LAST_DATE | xargs -P 2 --verbose hdfs dfs
> -rm -r -skipTrash
>
>
> ^ The above will remove all directories that were created more than 7 days
> ago .. and will keep your HDFS clean
>
>
>
> On Monday, 14 October 2013 09:52:41 UTC+1, Viswanathan J wrote:
>>
>> Hi guys,
>>
>> Appreciate your response.
>>
>> Thanks,
>> Viswa.J
>> On Oct 12, 2013 11:29 PM, "Viswanathan J" <ja...@gmail.com> wrote:
>>
>>> Hi Guys,
>>>
>>> But I can see the jobtracker OOME issue fixed in hadoop - 1.2.1 version
>>> as per the hadoop release notes as below.
>>>
>>> Please check this URL,
>>>
>>> https://issues.apache.org/**jira/browse/MAPREDUCE-5351<https://issues.apache.org/jira/browse/MAPREDUCE-5351>
>>>
>>> How come the issue still persist? I'm I asking a valid thing.
>>>
>>> Do I need to configure anything our I missing anything.
>>>
>>> Please help. Appreciate your response.
>>>
>>> Thanks,
>>> Viswa.J
>>> On Oct 12, 2013 7:57 PM, "Viswanathan J" <ja...@gmail.com> wrote:
>>>
>>>> Thanks Antonio, hope the memory leak issue will be resolved. Its really
>>>> nightmare every week.
>>>>
>>>> In which release this issue will be resolved?
>>>>
>>>> How to solve this issue, please help because we are facing in
>>>> production environment.
>>>>
>>>> Please share the configuration and cron to do that cleanup process.
>>>>
>>>> Thanks,
>>>> Viswa
>>>> On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
>>>> wrote:
>>>>
>>>>> "After restart the JT, within a week getting OOME."
>>>>>
>>>>> Viswa, we were having the same issue in our cluster as well - roughly
>>>>> every 5-7 days getting OOME.
>>>>> The heap size of the Job Tracker was constantly increasing due to a
>>>>> memory leak that will hopefully be fixed in newest releases.
>>>>>
>>>>> There is a configuration change in the JobTracker that will disable a
>>>>> functionality regarding cleaning up staging files i.e.
>>>>> /user/build/.staging/* - but that means that you will have to handle
>>>>> the staging files through a cron / jenkins task
>>>>>
>>>>> I'll get you the configuration on Monday..
>>>>>
>>>>> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm running a 14 nodes of Hadoop cluster with datanodes,tasktrackers
>>>>>> running in all nodes.
>>>>>>
>>>>>> *Apache Hadoop :* 1.2.1
>>>>>>
>>>>>> It shows the heap size currently as follows:
>>>>>>
>>>>>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>>>>> *
>>>>>> *
>>>>>> In the above summary what is the *8.89* GB defines? Is the *8.89*defines maximum heap size for Jobtracker, if yes how it has
>>>>>> been calculated.
>>>>>>
>>>>>> Hope *5.7* is currently running jobs heap-size, how it is calculated.
>>>>>>
>>>>>> Have set the jobtracker default memory size in hadoop-env.sh
>>>>>>
>>>>>> *HADOOP_HEAPSIZE="1024"*
>>>>>> *
>>>>>> *
>>>>>> Have set the mapred.child.java.opts value in mapred-site.xml as,
>>>>>>
>>>>>>  <property>
>>>>>>   <name>mapred.child.java.opts</****name>
>>>>>>   <value>-Xmx2048m</value>
>>>>>>  </property>
>>>>>>
>>>>>> Even after setting the above property, getting Jobtracker OOME issue.
>>>>>> How the jobtracker memory gradually increasing. After restart the JT,
>>>>>> within a week getting OOME.
>>>>>>
>>>>>> How to resolve this, it is in production and critical? Please help.
>>>>>> Thanks in advance.
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Viswa.J
>>>>>>
>>>>>  --
>>>>>
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "CDH Users" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to cdh-user+u...@cloudera.**org.
>>>>> For more options, visit https://groups.google.com/a/**
>>>>> cloudera.org/groups/opt_out<https://groups.google.com/a/cloudera.org/groups/opt_out>
>>>>> .
>>>>>
>>>>   --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "CDH Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to cdh-user+unsubscribe@cloudera.org.
> For more options, visit
> https://groups.google.com/a/cloudera.org/groups/opt_out.
>

Re: Hadoop Jobtracker heap size calculation and OOME

Posted by Viswanathan J <ja...@gmail.com>.
Thanks a lot and lot Antonio.

I'm using the Apache hadoop, hope this issue will be resolved in upcoming
apache hadoop releases.

Do I need the restart whole cluster after changing the mapred site conf as
you mentioned?

What is the following bug id,

https://issues.apache.org/jira/i#browse/MAPREDUCE-5351?issueKey=MAPREDUCE-5351&amp;serverRenderedViewIssue=true

Is this issue was different from OOME, but they mentioned that issue is
fixed.

Thanks,
Viswa.J
 On Oct 14, 2013 2:44 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
wrote:

> In *mapred-site.xml* you need the following snipset:
>
> <property>
> <name>mapreduce.jobtracker.retiredjobs.cache.size</name>
> <value>100</value>
> </property>
> <property>
> <name>keep.failed.task.files</name>
> <value>true</value>
> </property>
> <property>
> <name>keep.task.files.pattern</name>
> <value>shouldnevereverevermatch</value>
> </property>
>
>
> This will fix the memory leak issue ( the official fix i think is
> available in Cloudera's 4.6 distribution )
> It will cause another issue - that is not removing the .staging files from
> the /user/*/.staging/ location
>
>
> To overcome this use a daily Jenkins job ( or cron ) and
>
> #!/bin/bash
> LAST_DATE=$(date -ud '-7days' +%s)
> hdfs dfs -ls /user/*/.staging | awk '/^d/ {m_date=$6;gsub("-","
> ",m_date); ep_date=strftime("%s", mktime(m_date" 00 00 00")); if ( ep_date
> <= l_date ) print $8 }' l_date=$LAST_DATE | xargs -P 2 --verbose hdfs dfs
> -rm -r -skipTrash
>
>
> ^ The above will remove all directories that were created more than 7 days
> ago .. and will keep your HDFS clean
>
>
>
> On Monday, 14 October 2013 09:52:41 UTC+1, Viswanathan J wrote:
>>
>> Hi guys,
>>
>> Appreciate your response.
>>
>> Thanks,
>> Viswa.J
>> On Oct 12, 2013 11:29 PM, "Viswanathan J" <ja...@gmail.com> wrote:
>>
>>> Hi Guys,
>>>
>>> But I can see the jobtracker OOME issue fixed in hadoop - 1.2.1 version
>>> as per the hadoop release notes as below.
>>>
>>> Please check this URL,
>>>
>>> https://issues.apache.org/**jira/browse/MAPREDUCE-5351<https://issues.apache.org/jira/browse/MAPREDUCE-5351>
>>>
>>> How come the issue still persist? I'm I asking a valid thing.
>>>
>>> Do I need to configure anything our I missing anything.
>>>
>>> Please help. Appreciate your response.
>>>
>>> Thanks,
>>> Viswa.J
>>> On Oct 12, 2013 7:57 PM, "Viswanathan J" <ja...@gmail.com> wrote:
>>>
>>>> Thanks Antonio, hope the memory leak issue will be resolved. Its really
>>>> nightmare every week.
>>>>
>>>> In which release this issue will be resolved?
>>>>
>>>> How to solve this issue, please help because we are facing in
>>>> production environment.
>>>>
>>>> Please share the configuration and cron to do that cleanup process.
>>>>
>>>> Thanks,
>>>> Viswa
>>>> On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
>>>> wrote:
>>>>
>>>>> "After restart the JT, within a week getting OOME."
>>>>>
>>>>> Viswa, we were having the same issue in our cluster as well - roughly
>>>>> every 5-7 days getting OOME.
>>>>> The heap size of the Job Tracker was constantly increasing due to a
>>>>> memory leak that will hopefully be fixed in newest releases.
>>>>>
>>>>> There is a configuration change in the JobTracker that will disable a
>>>>> functionality regarding cleaning up staging files i.e.
>>>>> /user/build/.staging/* - but that means that you will have to handle
>>>>> the staging files through a cron / jenkins task
>>>>>
>>>>> I'll get you the configuration on Monday..
>>>>>
>>>>> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm running a 14-node Hadoop cluster with DataNodes and TaskTrackers
>>>>>> running on all nodes.
>>>>>>
>>>>>> *Apache Hadoop:* 1.2.1
>>>>>>
>>>>>> It shows the heap size currently as follows:
>>>>>>
>>>>>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>>>>>
>>>>>> In the above summary, what does the *8.89* GB denote? Is *8.89* the
>>>>>> maximum heap size for the JobTracker, and if so, how is it calculated?
>>>>>>
>>>>>> I assume *5.7* is the heap used by currently running jobs; how is that
>>>>>> calculated?
>>>>>>
>>>>>> I have set the JobTracker default memory size in hadoop-env.sh:
>>>>>>
>>>>>> *HADOOP_HEAPSIZE="1024"*
>>>>>>
>>>>>> I have set the mapred.child.java.opts value in mapred-site.xml as:
>>>>>>
>>>>>> <property>
>>>>>>   <name>mapred.child.java.opts</name>
>>>>>>   <value>-Xmx2048m</value>
>>>>>> </property>
>>>>>>
>>>>>> Even after setting the above property, I am still getting JobTracker
>>>>>> OOMEs. Why does the JobTracker memory gradually increase? After
>>>>>> restarting the JT, I get an OOME within a week.
>>>>>>
>>>>>> How can I resolve this? It is in production and critical. Please help.
>>>>>> Thanks in advance.
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Viswa.J
>>>>>>
>>>>>  --
>>>>>
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "CDH Users" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to cdh-user+u...@cloudera.org.
>>>>> For more options, visit
>>>>> https://groups.google.com/a/cloudera.org/groups/opt_out.
>>>>>
>>>>   --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "CDH Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to cdh-user+unsubscribe@cloudera.org.
> For more options, visit
> https://groups.google.com/a/cloudera.org/groups/opt_out.
>
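[Editor's note] One point worth separating out from the quoted exchange: mapred.child.java.opts only sizes the per-task child JVMs that the TaskTrackers launch; the JobTracker daemon's own heap comes from hadoop-env.sh. A minimal sketch, assuming the stock Hadoop 1.x hadoop-env.sh layout (the 8192m value is an example, not a recommendation):

```shell
# hadoop-env.sh (Hadoop 1.x) -- sketch; values are examples only.
# Default maximum heap, in MB, applied to every Hadoop daemon:
export HADOOP_HEAPSIZE=1024
# Per-daemon _OPTS are appended later on the java command line, and in
# HotSpot the last -Xmx wins, so this overrides the default for the JT only:
export HADOOP_JOBTRACKER_OPTS="-Xmx8192m ${HADOOP_JOBTRACKER_OPTS}"
```

If the dashboard reports a maximum heap that does not match HADOOP_HEAPSIZE, it is worth checking whether a per-daemon _OPTS override like this is in effect.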

Re: Hadoop Jobtracker heap size calculation and OOME

Posted by Viswanathan J <ja...@gmail.com>.
Hi guys,

Appreciate your response.

Thanks,
Viswa.J
On Oct 12, 2013 11:29 PM, "Viswanathan J" <ja...@gmail.com>
wrote:

> Hi Guys,
>
> But I can see that the JobTracker OOME issue was fixed in the Hadoop 1.2.1
> release, as per the Hadoop release notes.
>
> Please check this URL:
>
> https://issues.apache.org/jira/browse/MAPREDUCE-5351
>
> Why does the issue still persist? Am I asking a valid thing?
>
> Do I need to configure anything, or am I missing anything?
>
> Please help. Appreciate your response.
>
> Thanks,
> Viswa.J
> On Oct 12, 2013 7:57 PM, "Viswanathan J" <ja...@gmail.com>
> wrote:
>
>> Thanks Antonio, hope the memory leak issue will be resolved. It's really
>> a nightmare every week.
>>
>> In which release will this issue be resolved?
>>
>> How can we solve this issue? Please help, because we are facing it in a
>> production environment.
>>
>> Please share the configuration and cron to do that cleanup process.
>>
>> Thanks,
>> Viswa
>> On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
>> wrote:
>>
>>> "After restart the JT, within a week getting OOME."
>>>
>>> Viswa, we were having the same issue in our cluster as well - roughly
>>> every 5-7 days getting OOME.
>>> The heap size of the Job Tracker was constantly increasing due to a
>>> memory leak that will hopefully be fixed in newest releases.
>>>
>>> There is a configuration change in the JobTracker that will disable a
>>> functionality regarding cleaning up staging files i.e.
>>> /user/build/.staging/* - but that means that you will have to handle the
>>> staging files through a cron / jenkins task
>>>
>>> I'll get you the configuration on Monday..
>>>
>>> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm running a 14-node Hadoop cluster with DataNodes and TaskTrackers
>>>> running on all nodes.
>>>>
>>>> *Apache Hadoop:* 1.2.1
>>>>
>>>> It shows the heap size currently as follows:
>>>>
>>>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>>>
>>>> In the above summary, what does the *8.89* GB denote? Is *8.89* the
>>>> maximum heap size for the JobTracker, and if so, how is it calculated?
>>>>
>>>> I assume *5.7* is the heap used by currently running jobs; how is that
>>>> calculated?
>>>>
>>>> I have set the JobTracker default memory size in hadoop-env.sh:
>>>>
>>>> *HADOOP_HEAPSIZE="1024"*
>>>>
>>>> I have set the mapred.child.java.opts value in mapred-site.xml as:
>>>>
>>>> <property>
>>>>   <name>mapred.child.java.opts</name>
>>>>   <value>-Xmx2048m</value>
>>>> </property>
>>>>
>>>> Even after setting the above property, I am still getting JobTracker
>>>> OOMEs. Why does the JobTracker memory gradually increase? After
>>>> restarting the JT, I get an OOME within a week.
>>>>
>>>> How can I resolve this? It is in production and critical. Please help.
>>>> Thanks in advance.
>>>>
>>>> --
>>>> Regards,
>>>> Viswa.J
>>>>
>>>  --
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "CDH Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to cdh-user+unsubscribe@cloudera.org.
>>> For more options, visit
>>> https://groups.google.com/a/cloudera.org/groups/opt_out.
>>>
>>

Re: Hadoop Jobtracker heap size calculation and OOME

Posted by Viswanathan J <ja...@gmail.com>.
Hi Guys,

But I can see that the JobTracker OOME issue was fixed in the Hadoop 1.2.1
release, as per the Hadoop release notes.

Please check this URL:

https://issues.apache.org/jira/browse/MAPREDUCE-5351

Why does the issue still persist? Am I asking a valid thing?

Do I need to configure anything, or am I missing anything?

Please help. Appreciate your response.

Thanks,
Viswa.J
On Oct 12, 2013 7:57 PM, "Viswanathan J" <ja...@gmail.com> wrote:

> Thanks Antonio, hope the memory leak issue will be resolved. It's really
> a nightmare every week.
>
> In which release will this issue be resolved?
>
> How can we solve this issue? Please help, because we are facing it in a
> production environment.
>
> Please share the configuration and cron to do that cleanup process.
>
> Thanks,
> Viswa
> On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" <an...@gmail.com>
> wrote:
>
>> "After restart the JT, within a week getting OOME."
>>
>> Viswa, we were having the same issue in our cluster as well - roughly
>> every 5-7 days getting OOME.
>> The heap size of the Job Tracker was constantly increasing due to a
>> memory leak that will hopefully be fixed in newest releases.
>>
>> There is a configuration change in the JobTracker that will disable a
>> functionality regarding cleaning up staging files i.e.
>> /user/build/.staging/* - but that means that you will have to handle the
>> staging files through a cron / jenkins task
>>
>> I'll get you the configuration on Monday..
>>
>> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>>>
>> Hi,
>>
>> I'm running a 14-node Hadoop cluster with DataNodes and TaskTrackers
>> running on all nodes.
>>
>> *Apache Hadoop:* 1.2.1
>>
>> It shows the heap size currently as follows:
>>
>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>
>> In the above summary, what does the *8.89* GB denote? Is *8.89* the
>> maximum heap size for the JobTracker, and if so, how is it calculated?
>>
>> I assume *5.7* is the heap used by currently running jobs; how is that
>> calculated?
>>
>> I have set the JobTracker default memory size in hadoop-env.sh:
>>
>> *HADOOP_HEAPSIZE="1024"*
>>
>> I have set the mapred.child.java.opts value in mapred-site.xml as:
>>
>> <property>
>>   <name>mapred.child.java.opts</name>
>>   <value>-Xmx2048m</value>
>> </property>
>>
>> Even after setting the above property, I am still getting JobTracker
>> OOMEs. Why does the JobTracker memory gradually increase? After
>> restarting the JT, I get an OOME within a week.
>>
>> How can I resolve this? It is in production and critical. Please help.
>> Thanks in advance.
>>
>> --
>> Regards,
>> Viswa.J
>>>
>>  --
>>
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "CDH Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to cdh-user+unsubscribe@cloudera.org.
>> For more options, visit
>> https://groups.google.com/a/cloudera.org/groups/opt_out.
>>
>
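[Editor's note] The cron-based cleanup Antonios describes can be sketched as below; this is an illustrative sketch, not his actual configuration. Assumptions: Hadoop 1.x `hadoop fs -ls` output with the modification date in column 6 (YYYY-MM-DD) and the path in column 8, the /user/build/.staging path mentioned above, and the helper name `stale_staging_dirs`, which is invented here.

```shell
#!/bin/sh
# Pure filter: reads `hadoop fs -ls` output on stdin and prints job staging
# directories whose modification date (column 6, YYYY-MM-DD) is older than
# the cutoff date passed as $1. YYYY-MM-DD dates sort lexicographically, so
# a plain string comparison in awk is sufficient.
stale_staging_dirs() {
  cutoff="$1"
  awk -v cutoff="$cutoff" '$6 < cutoff && $8 ~ /job_/ { print $8 }'
}

# Intended nightly cron usage (requires a live cluster, so commented out):
# hadoop fs -ls /user/build/.staging \
#   | stale_staging_dirs "$(date -d '-7 days' +%Y-%m-%d)" \
#   | while read -r d; do hadoop fs -rmr "$d"; done
```

Keeping the filter separate from the `hadoop fs -rmr` side effect makes the retention logic easy to test without touching HDFS.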

Re: Hadoop Jobtracker heap size calculation and OOME

Posted by Viswanathan J <ja...@gmail.com>.
Thanks Antonio, hope the memory leak issue will be resolved. It's really
a nightmare every week.

In which release will this issue be resolved?

How can we solve this issue? Please help, because we are facing it in a
production environment.

Please share the configuration and cron to do that cleanup process.

Thanks,
Viswa
On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" <an...@gmail.com> wrote:

> "After restart the JT, within a week getting OOME."
>
> Viswa, we were having the same issue in our cluster as well - roughly
> every 5-7 days getting OOME.
> The heap size of the Job Tracker was constantly increasing due to a memory
> leak that will hopefully be fixed in newest releases.
>
> There is a configuration change in the JobTracker that will disable a
> functionality regarding cleaning up staging files i.e.
> /user/build/.staging/* - but that means that you will have to handle the
> staging files through a cron / jenkins task
>
> I'll get you the configuration on Monday..
>
> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>>
>> Hi,
>>
>> I'm running a 14-node Hadoop cluster with DataNodes and TaskTrackers
>> running on all nodes.
>>
>> *Apache Hadoop:* 1.2.1
>>
>> It shows the heap size currently as follows:
>>
>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>
>> In the above summary, what does the *8.89* GB denote? Is *8.89* the
>> maximum heap size for the JobTracker, and if so, how is it calculated?
>>
>> I assume *5.7* is the heap used by currently running jobs; how is that
>> calculated?
>>
>> I have set the JobTracker default memory size in hadoop-env.sh:
>>
>> *HADOOP_HEAPSIZE="1024"*
>>
>> I have set the mapred.child.java.opts value in mapred-site.xml as:
>>
>> <property>
>>   <name>mapred.child.java.opts</name>
>>   <value>-Xmx2048m</value>
>> </property>
>>
>> Even after setting the above property, I am still getting JobTracker
>> OOMEs. Why does the JobTracker memory gradually increase? After
>> restarting the JT, I get an OOME within a week.
>>
>> How can I resolve this? It is in production and critical. Please help.
>> Thanks in advance.
>>
>> --
>> Regards,
>> Viswa.J
>>
>  --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "CDH Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to cdh-user+unsubscribe@cloudera.org.
> For more options, visit
> https://groups.google.com/a/cloudera.org/groups/opt_out.
>
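[Editor's note] On the original 5.7/8.89 GB question: figures of that shape are typically a JVM's used heap versus maximum heap, converted to GB — most likely derived from Runtime.totalMemory()-freeMemory() and Runtime.maxMemory() inside the JobTracker (an assumption; the JT web UI source would confirm). The conversion itself is just bytes divided by 1024^3:

```shell
# Sketch: convert a raw heap byte count to the GB figure shown on the page.
bytes_to_gb() {
  awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

bytes_to_gb 2147483648   # a 2 GiB heap -> 2.00
```

Note that a maximum heap of 8.89 GB cannot come from HADOOP_HEAPSIZE=1024 (1 GB), which suggests some other setting is sizing the JobTracker JVM in this cluster.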

Re: Hadoop Jobtracker heap size calculation and OOME

Posted by Viswanathan J <ja...@gmail.com>.
Thanks Antonio, hope the memory leak issue will be resolved. Its really
nightmare every week.

In which release this issue will be resolved?

How to solve this issue, please help because we are facing in production
environment.

Please share the configuration and cron to do that cleanup process.

Thanks,
Viswa
On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" <an...@gmail.com> wrote:

> "After restart the JT, within a week getting OOME."
>
> Viswa, we were having the same issue in our cluster as well - roughly
> every 5-7 days getting OOME.
> The heap size of the Job Tracker was constantly increasing due to a memory
> leak that will hopefully be fixed in newest releases.
>
> There is a configuration change in the JobTracker that will disable a
> functionality regarding cleaning up staging files i.e.
> /user/build/.staging/* - but that means that you will have to handle the
> staging files through a cron / jenkins task
>
> I'll get you the configuration on Monday..
>
> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>>
>> Hi,
>>
>> I'm running a 14 nodes of Hadoop cluster with datanodes,tasktrackers
>> running in all nodes.
>>
>> *Apache Hadoop :* 1.2.1
>>
>> It shows the heap size currently as follows:
>>
>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>> *
>> *
>> In the above summary what is the *8.89* GB defines? Is the *8.89*defines maximum heap size for Jobtracker, if yes how it has
>> been calculated.
>>
>> Hope *5.7* is currently running jobs heap-size, how it is calculated.
>>
>> Have set the jobtracker default memory size in hadoop-env.sh
>>
>> *HADOOP_HEAPSIZE="1024"*
>> *
>> *
>> Have set the mapred.child.java.opts value in mapred-site.xml as,
>>
>> <property>
>>   <name>mapred.child.java.opts</**name>
>>   <value>-Xmx2048m</value>
>> </property>
>>
>> Even after setting the above property, getting Jobtracker OOME issue. How
>> the jobtracker memory gradually increasing. After restart the JT, within a
>> week getting OOME.
>>
>> How to resolve this, it is in production and critical? Please help.
>> Thanks in advance.
>>
>> --
>> Regards,
>> Viswa.J
>>
>  --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "CDH Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to cdh-user+unsubscribe@cloudera.org.
> For more options, visit
> https://groups.google.com/a/cloudera.org/groups/opt_out.
>
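For reference, the two figures in the quoted Cluster Summary line appear to
be the JVM's currently used heap (5.7 GB) and its maximum heap (8.89 GB,
i.e. the effective -Xmx); that mapping is an assumption, not confirmed by
the thread. A sketch for cross-checking it on the JobTracker host, assuming
the JDK's jps/jstat tools are on the PATH:

```shell
#!/bin/sh
# Convert a KB figure (the unit jstat reports) to the UI's "x.xx GB" style.
kb_to_gb() {
  awk -v kb="$1" 'BEGIN { printf "%.2f", kb / (1024 * 1024) }'
}

# Look up the JobTracker JVM and dump its heap generation capacities.
# Summing the max new and old generation columns (NGCMX + OGCMX, in KB)
# should approximate the "8.89 GB" maximum from the Cluster Summary.
JT_PID=$(jps 2>/dev/null | awk '/JobTracker/ { print $1 }')
if [ -n "$JT_PID" ]; then
  jstat -gccapacity "$JT_PID"
fi
```

If the reported maximum does not match HADOOP_HEAPSIZE, some other setting
(for example an -Xmx in HADOOP_OPTS or a per-daemon override in
hadoop-env.sh) is likely taking precedence.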
