Posted to common-user@hadoop.apache.org by Raymond Jennings III <ra...@yahoo.com> on 2010/04/23 19:48:44 UTC

Decommissioning a node

I've got a dead machine on my cluster.  I want to safely update HDFS so that nothing references this machine, then I want to rebuild it and put it back in service in the cluster.

Does anyone have any pointers on how to do this (the first part - updating HDFS so that it's no longer referenced)?  Thank you.


      

Re: Decommissioning a node

Posted by rtejac <rt...@gmail.com>.
Maybe your isSplitable() returning false is being overridden somewhere.

On Dec 3, 2013, at 10:06 AM, Adam Kawa <ka...@gmail.com> wrote:

> 
> I have overridden the InputFormat and set isSplitable to return false.
> My entire file has to go to one mapper only, since I set isSplitable to false?
> 
> Yes.
> 
> 1) Could you double-check that you have only 1 input file in the input directory?
> 2) Did you configure your job to use your custom InputFormat (instead of the default one)?
> 


Re: Decommissioning a node

Posted by Adam Kawa <ka...@gmail.com>.
> I have overridden the InputFormat and set isSplitable to return false.
> My entire file has to go to one mapper only, since I set isSplitable to false?
>

Yes.

1) Could you double-check that you have only 1 input file in the input directory?
2) Did you configure your job to use your custom InputFormat (instead of the default one)?

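A minimal sketch (not from the original posts) of what such an override and the job wiring can look like with the newer mapreduce API; the class name and job name below are hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

    // Hypothetical non-splitting InputFormat: one map task per input file.
    public class WholeFileTextInputFormat extends TextInputFormat {
      @Override
      protected boolean isSplitable(JobContext context, Path file) {
        return false;   // never split, no matter how large the file is
      }

      // Driver-side wiring; without setInputFormatClass() the job falls
      // back to the default, splitting TextInputFormat.
      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "whole-file-job");
        job.setInputFormatClass(WholeFileTextInputFormat.class);
        // ... set mapper, input/output paths, etc., then job.waitForCompletion(true)
      }
    }

Note that isSplitable() returning false gives one map task per input file, not per job, so three separate files in the input directory would still produce three mappers; hence the question above about the number of input files.
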
Re: Decommissioning a node

Posted by senthilselvan <sn...@gmail.com>.
Hi,

I have overridden the InputFormat and set isSplitable to return false. But in
the log I can see three map tasks. How is that possible? Shouldn't my entire
file go to only one mapper, since I set isSplitable to false?

Thanks


Re: Decommissioning a node

Posted by Allen Wittenauer <aw...@linkedin.com>.
On Apr 23, 2010, at 2:50 PM, Alex Kozlov wrote:

> The best way to resolve an argument is to look at the code:

I didn't realize we were having an argument.

But I will say this:

I've never had a node removed from both dfs.hosts and dfs.hosts.exclude actually disappear from the dead list in the web ui, at least under 0.20.2, without bouncing the nn.



Re: Decommissioning a node

Posted by Alex Kozlov <al...@cloudera.com>.
The best way to resolve an argument is to look at the code:

  /**
   * Rereads the config to get hosts and exclude list file names.
   * Rereads the files to update the hosts and exclude lists.  It
   * checks if any of the hosts have changed states:
   * 1. Added to hosts  --> no further work needed here.
   * 2. Removed from hosts --> mark AdminState as decommissioned.
   * 3. Added to exclude --> start decommission.
   * 4. Removed from exclude --> stop decommission.
   */
  public void refreshNodes(Configuration conf) throws IOException {
    checkSuperuserPrivilege();
    // Reread the config to get dfs.hosts and dfs.hosts.exclude filenames.
    // Update the file names and refresh internal includes and excludes list
    if (conf == null)
      conf = new Configuration();
    hostsReader.updateFileNames(conf.get("dfs.hosts",""),
                                conf.get("dfs.hosts.exclude", ""));
    hostsReader.refresh();
    synchronized (this) {
      for (Iterator<DatanodeDescriptor> it = datanodeMap.values().iterator();
           it.hasNext();) {
        DatanodeDescriptor node = it.next();
        // Check if not include.
        if (!inHostsList(node, null)) {
          node.setDecommissioned();  // case 2.
        } else {
          if (inExcludedHostsList(node, null)) {
            if (!node.isDecommissionInProgress() &&
                !node.isDecommissioned()) {
              startDecommission(node);   // case 3.
            }
          } else {
            if (node.isDecommissionInProgress() ||
                node.isDecommissioned()) {
              stopDecommission(node);   // case 4.
            }
          }
        }
      }
    }
  }

The machine is already dead, so there is no point in decommissioning it.  HDFS
will re-replicate the blocks anyway, since it is risky to operate under a
reduced replication factor.

There may still be an argument about whether it makes sense to physically move
the blocks...

Alex K

On Fri, Apr 23, 2010 at 2:20 PM, Allen Wittenauer
<aw...@linkedin.com>wrote:

>
> On Apr 23, 2010, at 1:56 PM, Alex Kozlov wrote:
>
> > I think Raymond says that the machine is already dead...
>
> Right.  But he wants to re-add it later.  So dfs.exclude is still a better
> way to go.  dfs.hosts, iirc, doesn't get re-read so it would require a nn
> bounce to clear.
>
>

Re: Decommissioning a node

Posted by Allen Wittenauer <aw...@linkedin.com>.
On Apr 23, 2010, at 1:56 PM, Alex Kozlov wrote:

> I think Raymond says that the machine is already dead...

Right.  But he wants to re-add it later.  So dfs.exclude is still a better way to go.  dfs.hosts, iirc, doesn't get re-read so it would require a nn bounce to clear.


Re: Decommissioning a node

Posted by Alex Kozlov <al...@cloudera.com>.
I think Raymond says that the machine is already dead...

At this point, you can just remove it from the dfs.hosts list and let HDFS
restore the lost blocks...

But before that, if you have the disks intact, you can stop HDFS and manually
copy the blocks together with their CRC files from the dead machine's
dfs.data.dir to existing machines.  The copied blocks will be found and
recognized by HDFS on restart.
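
A very rough sketch of that manual copy (the mount point and dfs.data.dir paths
below are hypothetical; on 0.20 the block files blk_* and their *.meta checksum
files live under current/, possibly spread across subdir* directories):

    # on a surviving datanode, with HDFS stopped
    # (the dead node's disk is assumed to be mounted read-only at /mnt/deadnode)
    cp /mnt/deadnode/hadoop/dfs/data/current/blk_* /hadoop/dfs/data/current/
    # the blk_*.meta checksum files match the same glob; on restart the
    # datanode scans current/ and reports the copied blocks to the namenode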

Alex K

On Fri, Apr 23, 2010 at 11:31 AM, Allen Wittenauer <awittenauer@linkedin.com
> wrote:

>
> On Apr 23, 2010, at 10:48 AM, Raymond Jennings III wrote:
>
> > I've got a dead machine on my cluster.  I want to safely update HDFS so
> that nothing references this machine then I want to rebuild it and put it
> back in service in the cluster.
> >
> > Does anyone have any pointers how to do this (the first part - updating
> HDFS so that it's no longer referenced.)
>
> 1. Add node to dfs.exclude
> 2. hadoop dfsadmin -refreshNodes
>
> That will start the decommissioning process.
>
> When you want to add it back in, remove it from dfs.exclude and re-run
> refreshNodes.

Re: Decommissioning a node

Posted by Allen Wittenauer <aw...@linkedin.com>.
On Apr 23, 2010, at 10:48 AM, Raymond Jennings III wrote:

> I've got a dead machine on my cluster.  I want to safely update HDFS so that nothing references this machine then I want to rebuild it and put it back in service in the cluster.
> 
> Does anyone have any pointers how to do this (the first part - updating HDFS so that it's no longer referenced.) 

1. Add node to dfs.exclude
2. hadoop dfsadmin -refreshNodes

That will start the decommissioning process.

When you want to add it back in, remove it from dfs.exclude and re-run refreshNodes.
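
For completeness, a minimal sketch of what those two steps can look like; the
exclude-file path and the hostname are examples, not from the thread, and on
0.20 the exclude file is the one named by the dfs.hosts.exclude property that
the refreshNodes() code quoted earlier reads:

    <!-- hdfs-site.xml on the namenode; the path is an example -->
    <property>
      <name>dfs.hosts.exclude</name>
      <value>/etc/hadoop/conf/dfs.exclude</value>
    </property>

    # 1. add the dead node's hostname to the exclude file
    echo deadnode.example.com >> /etc/hadoop/conf/dfs.exclude

    # 2. ask the namenode to re-read the include/exclude files
    hadoop dfsadmin -refreshNodes

    # later, after rebuilding the machine: remove the hostname from the
    # exclude file and run "hadoop dfsadmin -refreshNodes" again

The -refreshNodes step matters because the namenode only re-reads those files
when asked to, as the refreshNodes() code earlier in the thread shows.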