Posted to mapreduce-user@hadoop.apache.org by Tang <sh...@gmail.com> on 2014/10/29 06:11:21 UTC

MapReduce job temp input files

hi,

We are running MapReduce jobs on Hadoop clusters. The job inputs come from logs that are not in HDFS, so we first copy them into HDFS and delete them after the job finishes.
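The staging workflow described above can be sketched with the standard HDFS shell. All paths here are hypothetical examples, not the poster's actual layout:

```shell
# Stage the day's logs into a temporary HDFS input directory (example path).
hadoop fs -mkdir -p /tmp/job-input/2014-10-29
hadoop fs -put /var/log/app/*.log /tmp/job-input/2014-10-29/

# ... run the MapReduce job against /tmp/job-input/2014-10-29 ...

# Remove the temporary input once the job finishes. -skipTrash frees the
# space immediately instead of moving the files into the user's .Trash,
# where they would keep consuming disk until the trash interval expires.
hadoop fs -rm -r -skipTrash /tmp/job-input/2014-10-29
```

One thing worth checking in a setup like this: if the deletes go through the trash (the default when `fs.trash.interval` is enabled), the "deleted" inputs still occupy disk until trash checkpointing runs, which can look like invalid blocks piling up.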
Recently the cluster has become very unstable: the HDFS disks tend to fill up, even though the valid files total only a few gigabytes. Many invalid blocks remain on the disks. After a reboot of the cluster they are deleted automatically. It seems that restarting only the datanodes does not help; the namenode will not send the delete-block command to the datanodes.

Any ideas for this case?

Regards
Tang

Re: MapReduce job temp input files

Posted by Tang <sh...@gmail.com>.
Hi,
I also see this on the webUI:
Number of Blocks Pending Deletion: 14444

How can we delete the invalid blocks immediately, without restarting the cluster?
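A couple of `hdfs dfsadmin` commands can at least make the pending deletions visible and, on newer releases, nudge them along. The filename and the datanode host:port below are examples:

```shell
# Dump NameNode metadata, including the list of blocks waiting to be
# deleted, into a file under the NameNode's log directory (example name):
hdfs dfsadmin -metasave pending-blocks.txt

# On Hadoop 2.7 and later, ask a specific DataNode to send a block report,
# which gives the NameNode an earlier chance to schedule invalidations
# (host and IPC port here are examples):
hdfs dfsadmin -triggerBlockReport datanode01:50020
```

The rate at which the NameNode hands out deletions per heartbeat is also tunable via `dfs.namenode.invalidate.work.pct.per.iteration` and `dfs.block.invalidate.limit` in hdfs-site.xml; raising them may drain a large pending-deletion backlog faster, though the defaults are conservative for a reason.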

Thanks
Tang

On 2014/10/29 13:11:28, Tang <sh...@gmail.com> wrote:
hi,

We are running MapReduce jobs on Hadoop clusters. The job inputs come from logs that are not in HDFS, so we first copy them into HDFS and delete them after the job finishes.
Recently the cluster has become very unstable: the HDFS disks tend to fill up, even though the valid files total only a few gigabytes. Many invalid blocks remain on the disks. After a reboot of the cluster they are deleted automatically. It seems that restarting only the datanodes does not help; the namenode will not send the delete-block command to the datanodes.

Any ideas for this case?

Regards
Tang
