Posted to common-user@hadoop.apache.org by Andrew <gu...@nigma.ru> on 2009/02/04 11:26:57 UTC

Task tracker archive contains too many files

I've noticed that the task tracker unpacks all of the jars into 
${hadoop.tmp.dir}/mapred/local/taskTracker.

We are using a lot of external libraries, which are deployed via the "-libjars" 
option. The total number of files after unpacking is about 20 thousand.
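
For reference, our driver looks roughly like the sketch below (the class name and 
the placeholder comments are illustrative, not our actual code). GenericOptionsParser 
handles the "-libjars" option and ships the listed jars to the task trackers, which 
is how they end up under the directory mentioned above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.GenericOptionsParser;

public class ExampleDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Consumes generic options such as "-libjars a.jar,b.jar" and records the
    // jars in the job configuration so they are shipped to the task trackers.
    String[] remaining = new GenericOptionsParser(conf, args).getRemainingArgs();

    JobConf job = new JobConf(conf, ExampleDriver.class);
    // ... set mapper/reducer and input/output paths from `remaining` here ...
    JobClient.runJob(job);
  }
}

The job is then launched with something like
"hadoop jar example.jar ExampleDriver -libjars lib1.jar,lib2.jar,... <input> <output>".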

After running a number of jobs, tasks start to be killed with a timeout 
("Task attempt_200901281518_0011_m_000173_2 failed to report status for 601 
seconds. Killing!"). All killed tasks are in the "initializing" state. I've 
looked through the tasktracker logs and found messages like this:


Thread 20926 (Thread-10368):
  State: BLOCKED
  Blocked count: 3611
  Waited count: 24
  Blocked on java.lang.ref.Reference$Lock@e48ed6
  Blocked by 20882 (Thread-10341)
  Stack:
    java.lang.StringCoding$StringEncoder.encode(StringCoding.java:232)
    java.lang.StringCoding.encode(StringCoding.java:272)
    java.lang.String.getBytes(String.java:947)
    java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
    java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:228)
    java.io.File.isDirectory(File.java:754)
    org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:427)
    org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
    org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
    org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
    org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
    org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
    org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
    org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
    org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
    org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
    org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
    org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
    org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
    org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)


This is exactly as in HADOOP-4780. 
As I understand it, the patch adds code that keeps a map of directories along 
with their disk usage (DU), reducing the number of DU calls. That should help, but 
deleting 20000 files still takes too long: I manually deleted the archive after 
10 jobs had run, and it took over 30 minutes on XFS. Three times longer than the 
default timeout for tasks!
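
As far as I understand it, the idea is roughly the following (class and method 
names here are mine for illustration, they are not taken from the actual patch): 
compute the size of a localized directory once, remember it in a map, and only 
re-walk the tree when the entry is dropped.

import java.io.File;
import java.util.concurrent.ConcurrentHashMap;

public class CachedDiskUsage {
  // Cached size in bytes, keyed by the absolute path of the directory.
  private final ConcurrentHashMap<String, Long> sizes =
      new ConcurrentHashMap<String, Long>();

  // Return the cached size for dir, computing and storing it on first use.
  public long getDU(File dir) {
    Long cached = sizes.get(dir.getAbsolutePath());
    if (cached != null) {
      return cached.longValue();
    }
    long size = computeDU(dir);
    sizes.put(dir.getAbsolutePath(), Long.valueOf(size));
    return size;
  }

  // Drop the cached entry when the localized directory is deleted.
  public void remove(File dir) {
    sizes.remove(dir.getAbsolutePath());
  }

  // Recursive walk, equivalent to what FileUtil.getDU is doing in the
  // stack trace above.
  private long computeDU(File dir) {
    if (!dir.isDirectory()) {
      return dir.length();
    }
    long size = 0;
    File[] children = dir.listFiles();
    if (children != null) {
      for (File child : children) {
        size += computeDU(child);
      }
    }
    return size;
  }
}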

Is there a way to prohibit unpacking of the jars? Or at least a way not to keep the 
unpacked archive around? Or some other, better way to solve this problem?

Hadoop version: 0.19.0.


-- 
Andrew Gudkov
PGP key id: CB9F07D8 (cryptonomicon.mit.edu)
Jabber: gudok@jabber.ru

Re: Task tracker archive contains too many files

Posted by Andrew <gu...@nigma.ru>.
On Wednesday 04 February 2009 14:25:44 Amareshwari Sriramadasu wrote:
> Currently, there is no way to stop DistributedCache from unpacking the
> jars. I think it should have an option (through configuration) for whether
> to unpack or not.
> Can you raise a jira for the same?

OK!
https://issues.apache.org/jira/browse/HADOOP-5175

-- 
Andrew Gudkov
PGP key id: CB9F07D8 (cryptonomicon.mit.edu)
Jabber: gudok@jabber.ru

Re: Task tracker archive contains too many files

Posted by Amareshwari Sriramadasu <am...@yahoo-inc.com>.
Andrew wrote:
> I've noticed that the task tracker unpacks all of the jars into 
> ${hadoop.tmp.dir}/mapred/local/taskTracker.
>
> Is there a way to prohibit unpacking of the jars? Or at least a way not to keep the 
> unpacked archive around? Or some other, better way to solve this problem?
>
> Hadoop version: 0.19.0.
>
>
>   
Currently, there is no way to stop DistributedCache from unpacking the 
jars. I think it should have an option (through configuration) for whether 
to unpack or not.
Can you raise a jira for the same?
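
Something along these lines, say (the property name "mapred.cache.unpack.jars" 
and the helper below are purely hypothetical placeholders; the actual option, if 
any, would be decided in the jira):

import java.io.File;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.RunJar;

public class JarLocalizer {
  // Unpack the localized jar only when the (hypothetical) option allows it.
  public static void localize(Configuration conf, File localJar, File workDir)
      throws IOException {
    if (conf.getBoolean("mapred.cache.unpack.jars", true)) {
      // Current behaviour: explode the jar under the task tracker's local directory.
      RunJar.unJar(localJar, workDir);
    } else {
      // Proposed behaviour: keep the jar packed and just add it to the task
      // classpath, so there are no thousands of small files to track and delete.
    }
  }
}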

Thanks
Amareshwari