Posted to mapreduce-user@hadoop.apache.org by sam liu <sa...@gmail.com> on 2014/05/07 05:40:59 UTC

Questions about Hadoop logs and mapred.local.dir

Hi Experts,

1. The size of mapred.local.dir is large (30 GB). What methods can be used
to clean it up safely?
2. Are the logs of the NameNode/DataNode/JobTracker/TaskTracker all rolling
logs? What is their maximum size? I cannot find the specific settings for
them in log4j.properties.
3. The sizes of dfs.name.dir and dfs.data.dir are very large now. Can any
files under them actually be removed, or should nothing under those two
folders be removed at all?

Thanks!

Re: Questions about Hadoop logs and mapred.local.dir

Posted by Mohammad Tariq <do...@gmail.com>.
Hi Sam,

1. I'm sorry, I didn't quite follow "how many methods could clean it
correctly?".

Since this directory contains only temporary files, it should get cleaned
up after your jobs finish. If unnecessary data is still present there, you
can delete it. Make sure no jobs are running while you clean this
directory.
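
For illustration, the check-then-clean steps might look like the shell
sketch below. The layout and job ID are made up, and it operates on a
throwaway directory so it is safe to run anywhere; substitute the real
value of mapred.local.dir from your mapred-site.xml.

```shell
# Stand-in for mapred.local.dir (a throwaway directory for this sketch).
MAPRED_LOCAL=$(mktemp -d)
mkdir -p "$MAPRED_LOCAL/taskTracker/alice/jobcache/job_201405070001_0001"

# Check how much space the directory is using.
du -sh "$MAPRED_LOCAL"

# With the TaskTracker stopped and no jobs running, stale per-job
# caches left under taskTracker/<user>/jobcache can be removed.
rm -rf "$MAPRED_LOCAL"/taskTracker/*/jobcache/*

# The jobcache directories are now empty.
ls "$MAPRED_LOCAL/taskTracker/alice/jobcache"
```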

2. All the daemons use log4j with the DailyRollingFileAppender, which does
not have retention settings. You can change this behavior by switching to
an appender of your choice in the *log4j.properties* file under the
*HADOOP_HOME/conf* directory. The associated property is
*log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender*.
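
As a sketch, one way to cap log size is to swap DRFA for log4j's
size-bounded RollingFileAppender in log4j.properties; the size and backup
count below are illustrative values, not recommendations:

```
# Replace the daily appender with a size-capped rolling appender.
log4j.appender.DRFA=org.apache.log4j.RollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
# Roll at 256 MB and keep at most 10 rolled files per daemon.
log4j.appender.DRFA.MaxFileSize=256MB
log4j.appender.DRFA.MaxBackupIndex=10
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
```

With this in place each daemon's log is bounded to roughly
MaxFileSize * (MaxBackupIndex + 1) on disk.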

3. You must never touch the contents of these 2 directories. This is the
actual HDFS *data+metadata*, which you don't want to lose.

You can find more on log files here:
http://blog.cloudera.com/blog/2010/11/hadoop-log-location-and-retention/

HTH

*Warm regards,*
*Mohammad Tariq*
*cloudfront.blogspot.com <http://cloudfront.blogspot.com>*


On Wed, May 7, 2014 at 9:10 AM, sam liu <sa...@gmail.com> wrote:

> Hi Experts,
>
> 1. The size of mapred.local.dir is big(30 GB), how many methods could
> clean it correctly?
> 2. For logs of NameNode/DataNode/JobTracker/TaskTracker, are they all
> rolling type log? What's their max size? I can not find the specific
> settings for them in log4j.properties.
> 3. I find the size of dfs.name.dir and dfs.data.dir is very big now, are
> there any files under them could be removed actually? Or all files under
> the two folders could not be removed at all?
>
> Thanks!
>
