Posted to hdfs-user@hadoop.apache.org by Sean McNamara <Se...@Webtrends.com> on 2012/12/03 02:41:03 UTC

TaskTracker slow to start/join

I have a TaskTracker on a particular node that is very slow to join the JobTracker.  When I start it with ./hadoop-daemon.sh start tasktracker, I see the daemon start up and show as running in top, where it sits using 50% CPU.  If it helps, this cluster is on Hadoop 1.0.3.  Does anyone know what the TT could be up to?

Here is the log output:

2012-12-03 01:28:17,310 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-12-03 01:28:17,320 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-12-03 01:28:17,321 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-12-03 01:28:17,321 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
2012-12-03 01:28:17,515 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-12-03 01:28:17,679 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2012-12-03 01:28:17,726 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2012-12-03 01:28:17,728 INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context WepAppsContext
2012-12-03 01:28:17,728 INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2012-12-03 01:28:17,728 INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2012-12-03 01:28:17,747 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-12-03 01:28:17,751 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as hadoop
2012-12-03 01:28:17,752 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /mnt/hdfs/data1/mapred,/mnt/hdfs/data2/mapred,/mnt/hdfs/data3/mapred,/mnt/hdfs/data4/mapred,…

It takes 10+ minutes to get past this last line, and then it finally continues on and registers OK with the JT.


Thanks

Re: TaskTracker slow to start/join

Posted by Sean McNamara <Se...@Webtrends.com>.
Ahh that makes perfect sense.  Thank you!

On 12/2/12 9:25 PM, "Harsh J" <ha...@cloudera.com> wrote:

>Hi,
>
>This is cause of the TT's behavior of deleting the mapred.local.dir
>contents every time you restart it. In your version, 1.0.3, that
>process is synchronous and hence it appears like the TT hangs when
>there's a lot of data to purge out from those dirs.
>
>On Mon, Dec 3, 2012 at 7:11 AM, Sean McNamara
><Se...@webtrends.com> wrote:
>> I have a TaskTracker on a particular node that is very slow to join the
>> jobtracker.  When I start it up with ./hadoop-daemon.sh start
>>tasktracker I
>> see the daemon fire up and running in top.  The TaskTracker daemon will
>>sit
>> there using 50% cpu according to top.   If it helps any this cluster is
>>on
>> hadoop 1.0.3.  Does anyone know what the TT could be up to?
>>
>> Here is the log output:
>>
>> 2012-12-03 01:28:17,310 INFO
>>org.apache.hadoop.metrics2.impl.MetricsConfig:
>> loaded properties from hadoop-metrics2.properties
>> 2012-12-03 01:28:17,320 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> MetricsSystem,sub=Stats registered.
>> 2012-12-03 01:28:17,321 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>>period
>> at 10 second(s).
>> 2012-12-03 01:28:17,321 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics
>> system started
>> 2012-12-03 01:28:17,515 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>ugi
>> registered.
>> 2012-12-03 01:28:17,679 INFO org.mortbay.log: Logging to
>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>> org.mortbay.log.Slf4jLog
>> 2012-12-03 01:28:17,726 INFO org.apache.hadoop.http.HttpServer: Added
>>global
>> filtersafety 
>>(class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>> 2012-12-03 01:28:17,728 INFO org.apache.hadoop.http.HttpServer: Added
>>filter
>> static_user_filter
>> (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter)
>>to
>> context WepAppsContext
>> 2012-12-03 01:28:17,728 INFO org.apache.hadoop.http.HttpServer: Added
>>filter
>> static_user_filter
>> (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter)
>>to
>> context static
>> 2012-12-03 01:28:17,728 INFO org.apache.hadoop.http.HttpServer: Added
>>filter
>> static_user_filter
>> (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter)
>>to
>> context logs
>> 2012-12-03 01:28:17,747 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
>> Initializing logs' truncater with mapRetainSize=-1 and
>>reduceRetainSize=-1
>> 2012-12-03 01:28:17,751 INFO org.apache.hadoop.mapred.TaskTracker:
>>Starting
>> tasktracker with owner as hadoop
>> 2012-12-03 01:28:17,752 INFO org.apache.hadoop.mapred.TaskTracker: Good
>> mapred local directories are:
>> 
>>/mnt/hdfs/data1/mapred,/mnt/hdfs/data2/mapred,/mnt/hdfs/data3/mapred,/mnt
>>/hdfs/data4/mapred,…
>>
>> It takes 10+ minutes to get past this last line, and then it finally
>> continues on and registers in ok with the JT.
>>
>>
>> Thanks
>
>
>
>-- 
>Harsh J


Re: TaskTracker slow to start/join

Posted by Harsh J <ha...@cloudera.com>.
Hi,

This is because the TT deletes the contents of mapred.local.dir
every time you restart it. In your version, 1.0.3, that process is
synchronous, so the TT appears to hang when there is a lot of data
to purge from those dirs.
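
One way to check whether this is what you are hitting is to count what
is sitting under the local dirs before restarting the TT. The script
below is only a rough sketch, not anything that ships with Hadoop: the
function names count_entries and report are made up for illustration,
and the directory list is copied from the "Good mapred local
directories" log line, which is truncated there, so it is incomplete
and needs to be replaced with your real mapred.local.dir value. It only
reads metadata and deletes nothing.

```python
import os


def count_entries(root):
    """Return (files, dirs, bytes) found under root.

    Symlinks are not followed, so linked directories are counted
    as entries but not descended into.
    """
    files = dirs = size = 0
    # onerror swallows unreadable directories instead of aborting the walk
    for dirpath, dirnames, filenames in os.walk(root, onerror=lambda e: None):
        dirs += len(dirnames)
        for name in filenames:
            files += 1
            try:
                size += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return files, dirs, size


def report(local_dirs):
    """Map each existing directory in local_dirs to its (files, dirs, bytes)."""
    return {d: count_entries(d) for d in local_dirs if os.path.isdir(d)}


if __name__ == "__main__":
    # ASSUMPTION: placeholder list taken from the log line in this thread.
    # The log truncates the list, so use your own mapred.local.dir setting.
    LOCAL_DIRS = [
        "/mnt/hdfs/data1/mapred",
        "/mnt/hdfs/data2/mapred",
        "/mnt/hdfs/data3/mapred",
        "/mnt/hdfs/data4/mapred",
    ]
    for d, (nfiles, ndirs, nbytes) in report(LOCAL_DIRS).items():
        print("%s: %d files, %d dirs, %.1f MB" % (d, nfiles, ndirs, nbytes / 1e6))
```

If the counts run into the hundreds of thousands of files, a startup
delay of several minutes while they are removed would not be
surprising.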

On Mon, Dec 3, 2012 at 7:11 AM, Sean McNamara
<Se...@webtrends.com> wrote:
> I have a TaskTracker on a particular node that is very slow to join the
> jobtracker.  When I start it up with ./hadoop-daemon.sh start tasktracker I
> see the daemon fire up and running in top.  The TaskTracker daemon will sit
> there using 50% cpu according to top.   If it helps any this cluster is on
> hadoop 1.0.3.  Does anyone know what the TT could be up to?
>
> Here is the log output:
>
> 2012-12-03 01:28:17,310 INFO org.apache.hadoop.metrics2.impl.MetricsConfig:
> loaded properties from hadoop-metrics2.properties
> 2012-12-03 01:28:17,320 INFO
> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
> MetricsSystem,sub=Stats registered.
> 2012-12-03 01:28:17,321 INFO
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period
> at 10 second(s).
> 2012-12-03 01:28:17,321 INFO
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics
> system started
> 2012-12-03 01:28:17,515 INFO
> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
> registered.
> 2012-12-03 01:28:17,679 INFO org.mortbay.log: Logging to
> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> org.mortbay.log.Slf4jLog
> 2012-12-03 01:28:17,726 INFO org.apache.hadoop.http.HttpServer: Added global
> filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
> 2012-12-03 01:28:17,728 INFO org.apache.hadoop.http.HttpServer: Added filter
> static_user_filter
> (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to
> context WepAppsContext
> 2012-12-03 01:28:17,728 INFO org.apache.hadoop.http.HttpServer: Added filter
> static_user_filter
> (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to
> context static
> 2012-12-03 01:28:17,728 INFO org.apache.hadoop.http.HttpServer: Added filter
> static_user_filter
> (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to
> context logs
> 2012-12-03 01:28:17,747 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-12-03 01:28:17,751 INFO org.apache.hadoop.mapred.TaskTracker: Starting
> tasktracker with owner as hadoop
> 2012-12-03 01:28:17,752 INFO org.apache.hadoop.mapred.TaskTracker: Good
> mapred local directories are:
> /mnt/hdfs/data1/mapred,/mnt/hdfs/data2/mapred,/mnt/hdfs/data3/mapred,/mnt/hdfs/data4/mapred,…
>
> It takes 10+ minutes to get past this last line, and then it finally
> continues on and registers in ok with the JT.
>
>
> Thanks



-- 
Harsh J
