Posted to common-user@hadoop.apache.org by Marcin Mejran <ma...@hooklogic.com> on 2013/04/16 19:47:10 UTC

Jobtracker memory issues due to FileSystem$Cache

We've recently run into JobTracker memory issues on our new Hadoop cluster. A heap dump shows thousands of copies of DistributedFileSystem kept in FileSystem$Cache, a bit over one per job run on the cluster, and their JobConf objects support this view. I believe these are created when the .staging directories get cleaned up, but I may be wrong on that.

From what I can tell in the dump, the username (probably not the UGI, hard to tell), scheme, and authority parts of the Cache$Key are the same across multiple objects in FileSystem$Cache. I can only assume that the UserGroupInformation piece differs somehow every time it's created.
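For anyone poking at this outside the JobTracker, here is a minimal sketch of the cache behavior in question (the user name is a placeholder and file:/// is used so it runs without a cluster; this is just the public FileSystem/UGI API, not JobTracker code): two separately created UGIs for the same user end up as two separate FileSystem$Cache entries, because the cache key includes the UGI itself and freshly created UGIs for the same user don't compare equal.

import java.net.URI;
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.UserGroupInformation;

public class FsCacheGrowth {
  public static void main(String[] args) throws Exception {
    final Configuration conf = new Configuration();
    // file:/// so this runs without a cluster; hdfs:// / DistributedFileSystem
    // is cached the same way, which is what shows up in the heap dump.
    final URI uri = URI.create("file:///");

    // Two separately created UGIs for the same user name.
    final UserGroupInformation ugiA = UserGroupInformation.createRemoteUser("someuser");
    final UserGroupInformation ugiB = UserGroupInformation.createRemoteUser("someuser");

    // FileSystem$Cache keys on (scheme, authority, ugi); fresh UGIs for the
    // same user don't compare equal, so each lookup creates a new entry.
    FileSystem fsA = ugiA.doAs(new PrivilegedExceptionAction<FileSystem>() {
      public FileSystem run() throws Exception { return FileSystem.get(uri, conf); }
    });
    FileSystem fsB = ugiB.doAs(new PrivilegedExceptionAction<FileSystem>() {
      public FileSystem run() throws Exception { return FileSystem.get(uri, conf); }
    });
    System.out.println(fsA == fsB); // false: two cache entries for one user

    // Entries stay cached until something closes them explicitly.
    FileSystem.closeAllForUGI(ugiA);
    FileSystem.closeAllForUGI(ugiB);
  }
}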

We're using CDH4.2, MR1, CentOS 6.3, and Java 1.6_31. Kerberos, LDAP, and so on are not enabled.

Is there any known reason for this type of behavior?

Thanks,
-Marcin

Re: Jobtracker memory issues due to FileSystem$Cache

Posted by "agile.java@gmail.com" <ag...@gmail.com>.
The reason you described is true, and I verified it in our environment. Thank
you very much.
I tried setting keep.failed.task.files=true, but all jobs failed due to
MAPREDUCE-5047 <https://issues.apache.org/jira/browse/MAPREDUCE-5047>, because
our Hadoop cluster has Kerberos turned on. :(
The only thing we can do for now is to restart the jobtracker on a schedule
until the CDH 4.3 release.
Do you know any other solution?
Thanks.

On Sat, Apr 27, 2013 at 11:07 AM, agile.java@gmail.com <agile.java@gmail.com
> wrote:

> We hit the same problem. I haven't found the reason yet; I'm still debugging it.
>
>
> On Wed, Apr 17, 2013 at 11:14 PM, Marcin Mejran <
> marcin.mejran@hooklogic.com> wrote:
>
>> In case anyone is wondering, I tracked this down to a race condition in
>> JobInProgress or a failure to clean up FileSystems in CleanupQueue
>> (depending on how you look at it).
>>
>> FileSystem.closeAllForUGI is what keeps the cache from leaking memory, but
>> it is not called from every thread involved. JobInProgress calls
>> closeAllForUGI on a UGI that was also passed to the CleanupQueue thread.
>> If closeAllForUGI is called by JobInProgress before CleanupQueue calls
>> FileSystem.get with that UGI, there's a leak: since CleanupQueue never
>> calls closeAllForUGI itself, the FileSystem it creates is left cached
>> perpetually.
>>
>> Setting, for example, keep.failed.task.files=true or
>> keep.task.files.pattern=<dummy text> prevents CleanupQueue from getting
>> called, which seems to solve my issues. You get junk left in .staging, but
>> that can be dealt with.
>>
>> -Marcin
>>
>> From: Marcin Mejran [mailto:marcin.mejran@hooklogic.com]
>> Sent: Tuesday, April 16, 2013 1:47 PM
>> To: user@hadoop.apache.org
>> Subject: Jobtracker memory issues due to FileSystem$Cache
>>
>> We've recently run into JobTracker memory issues on our new Hadoop
>> cluster. A heap dump shows thousands of copies of DistributedFileSystem
>> kept in FileSystem$Cache, a bit over one per job run on the cluster, and
>> their JobConf objects support this view. I believe these are created when
>> the .staging directories get cleaned up, but I may be wrong on that.
>>
>> From what I can tell in the dump, the username (probably not the UGI, hard
>> to tell), scheme, and authority parts of the Cache$Key are the same across
>> multiple objects in FileSystem$Cache. I can only assume that the
>> UserGroupInformation piece differs somehow every time it's created.
>>
>> We're using CDH4.2, MR1, CentOS 6.3, and Java 1.6_31. Kerberos, LDAP, and
>> so on are not enabled.
>>
>> Is there any known reason for this type of behavior?
>>
>> Thanks,
>> -Marcin
>
>
>
> --
> d0ngd0ng
>



-- 
d0ngd0ng

Re: Jobtracker memory issues due to FileSystem$Cache

Posted by "agile.java@gmail.com" <ag...@gmail.com>.
We hit the same problem. I haven't found the reason yet; I'm still debugging it.


On Wed, Apr 17, 2013 at 11:14 PM, Marcin Mejran <marcin.mejran@hooklogic.com
> wrote:

> In case anyone is wondering, I tracked this down to a race condition in
> JobInProgress or a failure to clean up FileSystems in CleanupQueue
> (depending on how you look at it).
>
> FileSystem.closeAllForUGI is what keeps the cache from leaking memory, but
> it is not called from every thread involved. JobInProgress calls
> closeAllForUGI on a UGI that was also passed to the CleanupQueue thread.
> If closeAllForUGI is called by JobInProgress before CleanupQueue calls
> FileSystem.get with that UGI, there's a leak: since CleanupQueue never
> calls closeAllForUGI itself, the FileSystem it creates is left cached
> perpetually.
>
> Setting, for example, keep.failed.task.files=true or
> keep.task.files.pattern=<dummy text> prevents CleanupQueue from getting
> called, which seems to solve my issues. You get junk left in .staging, but
> that can be dealt with.
>
> -Marcin
>
> From: Marcin Mejran [mailto:marcin.mejran@hooklogic.com]
> Sent: Tuesday, April 16, 2013 1:47 PM
> To: user@hadoop.apache.org
> Subject: Jobtracker memory issues due to FileSystem$Cache
>
> We've recently run into JobTracker memory issues on our new Hadoop
> cluster. A heap dump shows thousands of copies of DistributedFileSystem
> kept in FileSystem$Cache, a bit over one per job run on the cluster, and
> their JobConf objects support this view. I believe these are created when
> the .staging directories get cleaned up, but I may be wrong on that.
>
> From what I can tell in the dump, the username (probably not the UGI, hard
> to tell), scheme, and authority parts of the Cache$Key are the same across
> multiple objects in FileSystem$Cache. I can only assume that the
> UserGroupInformation piece differs somehow every time it's created.
>
> We're using CDH4.2, MR1, CentOS 6.3, and Java 1.6_31. Kerberos, LDAP, and
> so on are not enabled.
>
> Is there any known reason for this type of behavior?
>
> Thanks,
> -Marcin
>



-- 
d0ngd0ng

RE: Jobtracker memory issues due to FileSystem$Cache

Posted by Marcin Mejran <ma...@hooklogic.com>.
In case anyone is wondering, I tracked this down to a race condition in JobInProgress, or a failure to clean up FileSystems in CleanupQueue, depending on how you look at it.

FileSystem.closeAllForUGI is what keeps the cache from leaking memory, but it is not called from every thread involved. JobInProgress calls closeAllForUGI on a UGI that was also passed to the CleanupQueue thread. If closeAllForUGI is called by JobInProgress before CleanupQueue calls FileSystem.get with that UGI, there's a leak: since CleanupQueue never calls closeAllForUGI itself, the FileSystem it creates is left cached perpetually.
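To make the bad ordering concrete, here is a rough sketch with the two steps collapsed into one thread (the user name and the file:/// URI are placeholders; in the real JobTracker the two steps run on the JobInProgress and CleanupQueue threads as described above):

import java.net.URI;
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.UserGroupInformation;

public class CleanupRaceSketch {
  public static void main(String[] args) throws Exception {
    final Configuration conf = new Configuration();
    final URI uri = URI.create("file:///"); // placeholder; same story for hdfs://
    final UserGroupInformation jobUgi = UserGroupInformation.createRemoteUser("jobuser");

    // Step 1 -- what JobInProgress does when the job finishes:
    // close (and uncache) every FileSystem opened under the job's UGI.
    FileSystem.closeAllForUGI(jobUgi);

    // Step 2 -- what the CleanupQueue thread does afterwards when it deletes
    // .staging: look up a FileSystem under the same UGI. Because step 1 has
    // already run, this creates a brand-new cache entry.
    FileSystem fs = jobUgi.doAs(new PrivilegedExceptionAction<FileSystem>() {
      public FileSystem run() throws Exception { return FileSystem.get(uri, conf); }
    });

    // Nothing calls closeAllForUGI(jobUgi) again, so this instance sits in
    // FileSystem$Cache for the life of the JVM -- one stranded entry per job.
    System.out.println("stranded in cache: " + fs);
  }
}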

Setting, for example, keep.failed.task.files=true or keep.task.files.pattern=<dummy text> prevents CleanupQueue from getting called, which seems to solve my issues. You get junk left in .staging, but that can be dealt with.
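If you want to flip those flags programmatically rather than in mapred-site.xml, a rough sketch via JobConf (the class name is made up; setKeepFailedTaskFiles/setKeepTaskFilesPattern are the standard JobConf setters for those two properties):

import org.apache.hadoop.mapred.JobConf;

public class KeepTaskFilesWorkaround {
  public static void main(String[] args) {
    JobConf conf = new JobConf();

    // keep.failed.task.files=true
    conf.setKeepFailedTaskFiles(true);

    // ...or, equivalently, keep.task.files.pattern=<some pattern>
    // conf.setKeepTaskFilesPattern("dummy");

    // With either set, CleanupQueue is not invoked for the job (per the
    // explanation above), at the cost of .staging junk that has to be
    // cleaned out of band.
    System.out.println(conf.get("keep.failed.task.files"));
  }
}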

-Marcin

From: Marcin Mejran [mailto:marcin.mejran@hooklogic.com]
Sent: Tuesday, April 16, 2013 1:47 PM
To: user@hadoop.apache.org
Subject: Jobtracker memory issues due to FileSystem$Cache

We've recently run into JobTracker memory issues on our new Hadoop cluster. A heap dump shows thousands of copies of DistributedFileSystem kept in FileSystem$Cache, a bit over one per job run on the cluster, and their JobConf objects support this view. I believe these are created when the .staging directories get cleaned up, but I may be wrong on that.

From what I can tell in the dump, the username (probably not the UGI, hard to tell), scheme, and authority parts of the Cache$Key are the same across multiple objects in FileSystem$Cache. I can only assume that the UserGroupInformation piece differs somehow every time it's created.

We're using CDH4.2, MR1, CentOS 6.3, and Java 1.6_31. Kerberos, LDAP, and so on are not enabled.

Is there any known reason for this type of behavior?

Thanks,
-Marcin
