Posted to mapreduce-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2014/06/28 17:25:24 UTC
[jira] [Commented] (MAPREDUCE-5949) Tasktracker's java threads hunging

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046885#comment-14046885 ] 

Steve Loughran commented on MAPREDUCE-5949:
-------------------------------------------

This is clearly the regular Jetty issue from MAPREDUCE-2386; still lurking.

Dmitry: the fix for this in Hadoop 2 was to take Jetty out of the process; otherwise it's going to happen. Some work went into trying to detect the problem in Hadoop 1.x, so try running Hadoop 1.3 to see if that helps. Otherwise, there's nothing anyone can do except say "upgrade to Hadoop 2". Sorry.
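
To gauge how often the selector workaround fires, the log can be scanned for the "JVM BUG(s)" records quoted in the report below. This is only a rough sketch for reading the log after the fact, not the detection logic that went into Hadoop 1.x; the names `WORKAROUND` and `count_workarounds` are made up for illustration, and the pattern assumes the line format is exactly as quoted.

```python
import re
from collections import defaultdict

# Matches the org.mortbay (Jetty) selector-workaround records as quoted in the
# report. "\s*" before the count covers the "delay59" form, which has no space.
WORKAROUND = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ INFO org\.mortbay\.log: "
    r"(?P<selector>\S+) JVM BUG\(s\) - "
    r"(?P<action>injecting delay|recreating selector)\s*(?P<count>\d+) times"
)


def count_workarounds(lines):
    """Return {selector: {action: summed count}} over all matching lines."""
    totals = defaultdict(lambda: defaultdict(int))
    for line in lines:
        m = WORKAROUND.search(line)  # search() tolerates a leading "> " quote prefix
        if m:
            totals[m.group("selector")][m.group("action")] += int(m.group("count"))
    return {selector: dict(actions) for selector, actions in totals.items()}


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python count_workarounds.py tasktracker.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for selector, actions in count_workarounds(log).items():
            print(selector, actions)
```

A count that keeps climbing on an otherwise idle tasktracker would match the symptom described in the report.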

> Tasktracker's java threads hunging
> ----------------------------------
>
>                 Key: MAPREDUCE-5949
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5949
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 1.2.1
>         Environment: FreeBSD-10/stable
> openjdk version "1.7.0_60"
> OpenJDK Runtime Environment (build 1.7.0_60-b19)
> OpenJDK 64-Bit Server VM (build 24.60-b09, mixed mode)
>            Reporter: Dmitry Sivachenko
>         Attachments: task1.txt
>
>
> I set up hadoop-1.2.1 (from ports) on FreeBSD-10/stable with openjdk version 1.7.0_60.
> At first glance it is doing well, except for one annoying thing: after executing some tasks, the tasktracker process starts to eat CPU when idle.
> Sometimes it is 10-20% (numbers from top(1) output), sometimes it is 100-150%.
> In the tasktracker's log I see numerous records like this:
> 2014-06-09 13:08:29,858 INFO org.mortbay.log: org.mortbay.io.nio.SelectorManager$SelectSet@abdcc1c JVM BUG(s) - injecting delay59 times
> 2014-06-09 13:08:29,859 INFO org.mortbay.log: org.mortbay.io.nio.SelectorManager$SelectSet@abdcc1c JVM BUG(s) - recreating selector 59 times, canceled keys 944 times
> 2014-06-09 13:09:29,862 INFO org.mortbay.log: org.mortbay.io.nio.SelectorManager$SelectSet@abdcc1c JVM BUG(s) - injecting delay58 times
> 2014-06-09 13:09:29,862 INFO org.mortbay.log: org.mortbay.io.nio.SelectorManager$SelectSet@abdcc1c JVM BUG(s) - recreating selector 58 times, canceled keys 928 times
> 2014-06-09 13:10:29,901 INFO org.mortbay.log: org.mortbay.io.nio.SelectorManager$SelectSet@abdcc1c JVM BUG(s) - injecting delay58 times
> 2014-06-09 13:10:29,901 INFO org.mortbay.log: org.mortbay.io.nio.SelectorManager$SelectSet@abdcc1c JVM BUG(s) - recreating selector 58 times, canceled keys 928 times
> <...>
> The more jobs I run, the more java threads start to consume CPU after all tasks have finished. After several job executions, top(1) output looks like this (split by thread, same PID):
> PID USERNAME     PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
> 79045 hadoop        47    0  1948M   867M uwait   2  20:49  37.50% java{java}
> 79045 hadoop        31    0  1948M   867M uwait  31   1:45  19.29% java{java}
> 79045 hadoop        33    0  1948M   867M uwait  21   2:51  19.19% java{java}
> 79045 hadoop        30    0  1948M   867M uwait  17   2:51  18.65% java{java}
> 79045 hadoop        30    0  1948M   867M uwait  11   1:52  18.36% java{java}
> 79045 hadoop        30    0  1948M   867M uwait  22   1:45  18.36% java{java}
> 79045 hadoop        31    0  1948M   867M uwait  29   2:50  18.26% java{java}
> 79045 hadoop        31    0  1948M   867M uwait   6   1:57  18.16% java{java}
> 79045 hadoop        31    0  1948M   867M uwait  13   4:55  17.97% java{java}
> 79045 hadoop        31    0  1948M   867M uwait  26   3:39  17.77% java{java}
> 79045 hadoop        33    0  1948M   867M uwait   8   1:21  17.48% java{java}
> 79045 hadoop        30    0  1948M   867M uwait   1   3:32  16.70% java{java}
> 79045 hadoop        32    0  1948M   867M uwait  24   3:12  16.70% java{java}
> 79045 hadoop        26    0  1948M   867M uwait   4   1:27  10.35% java{java}
> 72417 root          20    0 19828K  3252K CPU21  21   0:00   0.29% top
> 836 root          20    0 36104K  1952K select 14   6:51   0.00% snmpd
> 79045 hadoop        20    0  1948M   867M uwait  20   6:51   0.00% java{java}
> 79045 hadoop        20    0  1948M   867M uwait  27   3:45   0.00% java{java}
> 79045 hadoop        20    0  1948M   867M uwait  30   2:37   0.00% java{java}
> 79045 hadoop        20    0  1948M   867M uwait  15   0:54   0.00% java{java}
> 79045 hadoop        20    0  1948M   867M uwait   2   0:48   0.00% java{java}
> 79045 hadoop        20    0  1948M   867M uwait  14   0:48   0.00% java{java}
> 79045 hadoop        20    0  1948M   867M uwait   2   0:48   0.00% java{java}
> <....>
> This is on an absolutely idle cluster; not a single task is running.
> I am attaching truss(1) output for that java process:
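
The per-thread rows in the quoted top(1) output can be added up to get the total CPU share of the tasktracker process. A minimal sketch, assuming rows shaped like the ones quoted (numeric PID first, WCPU as the second-to-last field); `wcpu_by_pid` is a hypothetical helper name.

```python
from collections import defaultdict


def wcpu_by_pid(top_lines):
    """Sum the WCPU column per PID from per-thread top(1) rows."""
    totals = defaultdict(float)
    for line in top_lines:
        fields = line.lstrip("> ").split()  # drop a mail quote prefix, if any
        # Data rows start with a numeric PID and end with "<WCPU>% <COMMAND>";
        # the header row and markers such as "<....>" are skipped.
        if len(fields) < 2 or not fields[0].isdigit() or not fields[-2].endswith("%"):
            continue
        totals[int(fields[0])] += float(fields[-2].rstrip("%"))
    return dict(totals)


if __name__ == "__main__":
    import sys

    # Usage: pipe saved top(1) output into the script on stdin.
    for pid, wcpu in sorted(wcpu_by_pid(sys.stdin).items()):
        print(f"{pid}: {wcpu:.2f}%")
```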


