Posted to hdfs-user@hadoop.apache.org by lei liu <li...@gmail.com> on 2012/10/25 10:27:50 UTC

HDFS HA IO Fencing

I want to use the HDFS HA function, but I find the IO fencing function in
hadoop 2.0 complex. I think we could use a file lock to implement IO
fencing; that would be simpler.

Thanks,

LiuLei

Re: HDFS HA IO Fencing

Posted by "Balaji Narayanan (பாலாஜி நாராயணன்)" <li...@balajin.net>.
If you use NFSv4 you should be able to use locks, and when a machine dies or
fails to renew its lease, the other machine can take over.

On Friday, October 26, 2012, Todd Lipcon wrote:

> NFS Locks typically last forever if you disconnect abruptly. So they are
> not sufficient -- your standby wouldn't be able to take over without manual
> intervention to remove the lock.
>
> If you want to build an unreliable system that might corrupt your data,
> you could set up 'shell(/bin/true)' as a second fencing method. But, it's
> really a bad idea. There are failure scenarios which could cause split
> brain if you do this, and you'd very likely lose data.
>
> -Todd
>
> On Fri, Oct 26, 2012 at 1:59 AM, lei liu <liulei412@gmail.com> wrote:
>
>> We are using NFS for shared storage. Can we use the Linux nfslock service
>> to implement IO fencing?
>>
>>
>> 2012/10/26 Steve Loughran <stevel@hortonworks.com>
>>
>>>
>>>
>>> On 25 October 2012 14:08, Todd Lipcon <todd@cloudera.com> wrote:
>>>
>>>> Hi Liu,
>>>>
>>>> Locks are not sufficient, because there is no way to enforce a lock in
>>>> a distributed system without unbounded blocking. What you might be
>>>> referring to is a lease, but leases are still problematic unless you can
>>>> put bounds on the speed with which clocks progress on different machines,
>>>> _and_ have strict guarantees on the way each node's scheduler works. With
>>>> Linux and Java, the latter is tough.
>>>>
>>>>
>>> on any OS running in any virtual environment, including EC2, time is
>>> entirely unpredictable, just to make things worse.
>>>
>>>
>>> On a single machine you can use file locking, as the OS knows when the
>>> process is dead and closes the file; other programs can attempt to open
>>> the same file with exclusive locking and, by getting the right failures,
>>> know that something else holds the file, hence the other process is live.
>>> With shared NFS storage you need to mount with the soft option set,
>>> precisely to stop file locks lasting until some lease has expired,
>>> because the on-host liveness probes detect failure faster and want to
>>> react to it.
>>>
>>>
>>> -Steve
>>>
>>
>>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>


-- 
Thanks
-balaji

--
http://balajin.net/blog/
http://flic.kr/balajijegan

Re: HDFS HA IO Fencing

Posted by lei liu <li...@gmail.com>.
I used NFSv4 to test the Java FileLock.

The 192.168.1.233 machine is the NFS server; the NFS configuration in the
/etc/exports file is:
/home/hdfs.ha/share  192.168.1.221(rw,sync,no_root_squash)
/home/hdfs.ha/share  192.168.1.222(rw,sync,no_root_squash)

I ran the following commands to start the NFS server:
service nfs start
service nfslock start

The 192.168.1.221 and 192.168.1.222 machines are NFS clients; the mount
entry in the /etc/fstab file is:
192.168.1.223:/home/hdfs.ha/share /home/hdfs.ha/share  nfs
rsize=8192,wsize=8192,timeo=14,intr

I ran the following commands to start the NFS client:
service nfs start
service nfslock start

I wrote a small program to acquire the file lock:
package lock;

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

public class FileLockTest {
  FileLock lock;

  public void lock(String path, boolean isShared) throws IOException {
    this.lock = tryLock(path, isShared);
    if (lock == null) {
      String msg = "Cannot lock storage " + path
          + ". The directory is already locked.";
      System.out.println(msg);
      throw new IOException(msg);
    }
  }

  private FileLock tryLock(String path, boolean isShared) throws IOException {
    boolean deletionHookAdded = false;
    File lockF = new File(path);
    if (!lockF.exists()) {
      lockF.deleteOnExit();
      deletionHookAdded = true;
    }
    RandomAccessFile file = new RandomAccessFile(lockF, "rws");
    FileLock res = null;
    try {
      // Returns null when another process already holds the lock.
      res = file.getChannel().tryLock(0, Long.MAX_VALUE, isShared);
    } catch (OverlappingFileLockException oe) {
      // Another thread in this JVM already holds the lock.
      file.close();
      return null;
    } catch (IOException e) {
      e.printStackTrace();
      file.close();
      throw e;
    }
    if (res != null && !deletionHookAdded) {
      // If the file existed prior to our startup, we didn't call
      // deleteOnExit above. But since we successfully locked it,
      // we can take care of cleaning it up.
      lockF.deleteOnExit();
    }
    return res;
  }

  public static void main(String[] args) {
    FileLockTest fileLockTest = new FileLockTest();
    try {
      fileLockTest.lock(args[0], Boolean.valueOf(args[1]));
      // Hold the lock for one hour.
      Thread.sleep(1000 * 60 * 60);
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

I ran two test cases.

1. The network is OK
I ran the "java -cp ./filelock.jar lock.FileLockTest
/home/hdfs.ha/share/test.lock false" command on 192.168.1.221 to hold the
file lock, then ran the same command on 192.168.1.222 to acquire the same
lock; it threw the exception below:
Cannot lock storage /home/hdfs.ha/share/test.lock. The directory is already
locked.
java.io.IOException: Cannot lock storage /home/hdfs.ha/share/test.lock. The
directory is already locked.
        at lock.FileLockTest.lock(FileLockTest.java:18)
        at lock.FileLockTest.main(FileLockTest.java:53)

2. The machine holding the file lock is disconnected
I ran the "java -cp ./filelock.jar lock.FileLockTest
/home/hdfs.ha/share/test.lock false" command on 192.168.1.221, then
disconnected 192.168.1.221 from the network. After three minutes, I ran the
same command on 192.168.1.222, which was able to acquire the file lock.
I used the "mount | grep nfs" command to examine the NFS mounts on
192.168.1.221; the shared directory /home/hdfs.ha/share/ had disappeared on
that machine. So I think that when a machine is disconnected for a long
time, another machine can acquire the same file lock.
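[Editor's note: a plausible explanation for the takeover after three minutes
is the NFSv4 lease, since the server only holds a dead client's lock state
until the lease expires. On a Linux NFS server the lease length can usually
be inspected and tuned via the nfsd proc interface; the paths and default
below are kernel-dependent assumptions, not verified on this thread's setup.]

```
# Show the NFSv4 lease time in seconds (commonly 90 by default)
cat /proc/fs/nfsd/nfsv4leasetime

# Lower it (must be written before nfsd starts) so a disconnected
# client's locks are released sooner
echo 30 > /proc/fs/nfsd/nfsv4leasetime
```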


Re: HDFS HA IO Fencing

Posted by Steve Loughran <st...@hortonworks.com>.
On 26 October 2012 15:37, Todd Lipcon <to...@cloudera.com> wrote:

> NFS Locks typically last forever if you disconnect abruptly. So they are
> not sufficient -- your standby wouldn't be able to take over without manual
> intervention to remove the lock.



+1. This is why you are told to mount your shared edit log NFS servers with
the "soft" option; otherwise you can't swap over to the other machines.
see: http://www.faqs.org/docs/linux_network/x-087-2-nfs.mountd.html
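[Editor's note: as a concrete illustration, a shared-edits mount line in
/etc/fstab using the soft option might look like the following; the server
name and paths are placeholders, not taken from the thread.]

```
# "soft" makes NFS I/O fail after its retries instead of hanging forever,
# and "intr" allows a hung operation to be interrupted, so the standby
# NameNode can detect the failure and take over.
nfs-server:/export/shared-edits  /mnt/shared-edits  nfs  soft,intr,rsize=8192,wsize=8192,timeo=14  0  0
```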


>
> If you want to build an unreliable system that might corrupt your data,
> you could set up 'shell(/bin/true)' as a second fencing method. But, it's
> really a bad idea. There are failure scenarios which could cause split
> brain if you do this, and you'd very likely lose data
>

You could do it on a cluster where you were prepared to lose/corrupt all
the data. But in that world: why bother?


Re: HDFS HA IO Fencing

Posted by Todd Lipcon <to...@cloudera.com>.
NFS Locks typically last forever if you disconnect abruptly. So they are
not sufficient -- your standby wouldn't be able to take over without manual
intervention to remove the lock.

If you want to build an unreliable system that might corrupt your data, you
could set up 'shell(/bin/true)' as a second fencing method. But, it's
really a bad idea. There are failure scenarios which could cause split
brain if you do this, and you'd very likely lose data.
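[Editor's note: the fencing methods Todd refers to are configured in
hdfs-site.xml; a sketch combining sshfence with the risky shell(/bin/true)
fallback he warns against might look like this. The property names are from
the HDFS HA configuration; the key path is illustrative.]

```
<!-- hdfs-site.xml: fencing methods are tried in order during failover -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <!-- shell(/bin/true) always "succeeds", so failover proceeds even when
       the old active was never actually fenced: risks split brain -->
  <value>sshfence
shell(/bin/true)</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/hdfs/.ssh/id_rsa</value>
</property>
```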

-Todd

On Fri, Oct 26, 2012 at 1:59 AM, lei liu <li...@gmail.com> wrote:

> We are using NFS for Shared storage,  Can we use linux nfslcok service to
> implement IO Fencing ?
>
>
> 2012/10/26 Steve Loughran <st...@hortonworks.com>
>
>>
>>
>> On 25 October 2012 14:08, Todd Lipcon <to...@cloudera.com> wrote:
>>
>>> Hi Liu,
>>>
>>> Locks are not sufficient, because there is no way to enforce a lock in a
>>> distributed system without unbounded blocking. What you might be referring
>>> to is a lease, but leases are still problematic unless you can put bounds
>>> on the speed with which clocks progress on different machines, _and_ have
>>> strict guarantees on the way each node's scheduler works. With Linux and
>>> Java, the latter is tough.
>>>
>>>
>> on any OS running in any virtual environment, including EC2, time is
>> entirely unpredictable, just to make things worse.
>>
>>
>> On a single machine you can use file locking as the OS will know that the
>> process is dead and closes the file; other programs can attempt to open the
>> same file with exclusive locking -and, by getting the right failures, know
>> that something else has the file, hence the other process is live. Shared
>> NFS storage you need to mount with softlock set precisely to stop file
>> locks lasting until some lease has expired, because the on-host liveness
>> probes detect failure faster and want to react to it.
>>
>>
>> -Steve
>>
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera

Re: HDFS HA IO Fencing

Posted by Todd Lipcon <to...@cloudera.com>.
NFS Locks typically last forever if you disconnect abruptly. So they are
not sufficient -- your standby wouldn't be able to take over without manual
intervention to remove the lock.

If you want to build an unreliable system that might corrupt your data, you
could set up 'shell(/bin/true)' as a second fencing method. But, it's
really a bad idea. There are failure scenarios which could cause split
brain if you do this, and you'd very likely lose data.

-Todd

On Fri, Oct 26, 2012 at 1:59 AM, lei liu <li...@gmail.com> wrote:

> We are using NFS for Shared storage,  Can we use linux nfslcok service to
> implement IO Fencing ?
>
>
> 2012/10/26 Steve Loughran <st...@hortonworks.com>
>
>>
>>
>> On 25 October 2012 14:08, Todd Lipcon <to...@cloudera.com> wrote:
>>
>>> Hi Liu,
>>>
>>> Locks are not sufficient, because there is no way to enforce a lock in a
>>> distributed system without unbounded blocking. What you might be referring
>>> to is a lease, but leases are still problematic unless you can put bounds
>>> on the speed with which clocks progress on different machines, _and_ have
>>> strict guarantees on the way each node's scheduler works. With Linux and
>>> Java, the latter is tough.
>>>
>>>
>> on any OS running in any virtual environment, including EC2, time is
>> entirely unpredictable, just to make things worse.
>>
>>
>> On a single machine you can use file locking as the OS will know that the
>> process is dead and closes the file; other programs can attempt to open the
>> same file with exclusive locking -and, by getting the right failures, know
>> that something else has the file, hence the other process is live. Shared
>> NFS storage you need to mount with softlock set precisely to stop file
>> locks lasting until some lease has expired, because the on-host liveness
>> probes detect failure faster and want to react to it.
>>
>>
>> -Steve
>>
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera

Re: HDFS HA IO Fencing

Posted by lei liu <li...@gmail.com>.
We are using NFS for shared storage. Can we use the Linux nfslock service to
implement IO fencing?

2012/10/26 Steve Loughran <st...@hortonworks.com>

>
>
> On 25 October 2012 14:08, Todd Lipcon <to...@cloudera.com> wrote:
>
>> Hi Liu,
>>
>> Locks are not sufficient, because there is no way to enforce a lock in a
>> distributed system without unbounded blocking. What you might be referring
>> to is a lease, but leases are still problematic unless you can put bounds
>> on the speed with which clocks progress on different machines, _and_ have
>> strict guarantees on the way each node's scheduler works. With Linux and
>> Java, the latter is tough.
>>
>>
> on any OS running in any virtual environment, including EC2, time is
> entirely unpredictable, just to make things worse.
>
>
> On a single machine you can use file locking as the OS will know that the
> process is dead and closes the file; other programs can attempt to open the
> same file with exclusive locking -and, by getting the right failures, know
> that something else has the file, hence the other process is live. Shared
> NFS storage you need to mount with softlock set precisely to stop file
> locks lasting until some lease has expired, because the on-host liveness
> probes detect failure faster and want to react to it.
>
>
> -Steve
>

Re: HDFS HA IO Fencing

Posted by Steve Loughran <st...@hortonworks.com>.
On 25 October 2012 14:08, Todd Lipcon <to...@cloudera.com> wrote:

> Hi Liu,
>
> Locks are not sufficient, because there is no way to enforce a lock in a
> distributed system without unbounded blocking. What you might be referring
> to is a lease, but leases are still problematic unless you can put bounds
> on the speed with which clocks progress on different machines, _and_ have
> strict guarantees on the way each node's scheduler works. With Linux and
> Java, the latter is tough.
>
>
on any OS running in any virtual environment, including EC2, time is
entirely unpredictable, just to make things worse.


On a single machine you can use file locking, because the OS knows when the
process dies and closes the file; other programs can attempt to open the same
file with exclusive locking, and by getting the right failures they know that
something else holds the file, hence that the other process is live. Shared
NFS storage, by contrast, needs to be mounted with soft locking set, precisely
to stop file locks lasting until some lease has expired: the on-host liveness
probes detect failure faster and want to react to it.
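The single-machine behaviour described above can be sketched with POSIX `flock` (an illustration only; the lock-file path is invented, and this says nothing about how HDFS itself fences):

```python
import fcntl
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), "fencing-demo.lock")

# "Process A" takes an exclusive lock. The OS releases it automatically
# when the process dies, which is the single-machine case described.
fd_a = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
fcntl.flock(fd_a, fcntl.LOCK_EX | fcntl.LOCK_NB)

# "Process B" (a separate open file description) probes the lock without
# blocking; the failure tells it the other holder is still alive.
fd_b = os.open(path, os.O_RDWR)
try:
    fcntl.flock(fd_b, fcntl.LOCK_EX | fcntl.LOCK_NB)
    holder_alive = False
except BlockingIOError:
    holder_alive = True

print("holder_alive:", holder_alive)

# Once A closes its descriptor (process exit has the same effect),
# B's retry succeeds.
os.close(fd_a)
fcntl.flock(fd_b, fcntl.LOCK_EX | fcntl.LOCK_NB)
print("lock acquired after holder exit")
os.close(fd_b)
os.remove(path)
```

On NFS this tidy picture breaks down, which is the point of the soft-locking remark above.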


-Steve

Re: HDFS HA IO Fencing

Posted by Todd Lipcon <to...@cloudera.com>.
Hi Liu,

Locks are not sufficient, because there is no way to enforce a lock in a
distributed system without unbounded blocking. What you might be referring
to is a lease, but leases are still problematic unless you can put bounds
on the speed with which clocks progress on different machines, _and_ have
strict guarantees on the way each node's scheduler works. With Linux and
Java, the latter is tough.
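The clock caveat is easy to see with a toy lease (a sketch for illustration; the `Lease` class and the name `namenode-1` are invented, not Hadoop code):

```python
import time

class Lease:
    """Toy lease to illustrate the clock-dependence problem; nothing
    here comes from the Hadoop codebase."""

    def __init__(self, holder, duration_s):
        self.holder = holder
        # Validity is defined against this machine's own clock.
        self.expires_at = time.monotonic() + duration_s

    def is_valid(self):
        # If the holder's clock runs slow, or the process stalls (say,
        # a long GC pause) between checking and acting, the holder can
        # still believe the lease is valid after everyone else has seen
        # it expire -- exactly the split-brain window described above.
        return time.monotonic() < self.expires_at

lease = Lease("namenode-1", duration_s=0.05)
fresh = lease.is_valid()        # lease is live just after creation
time.sleep(0.1)
expired = not lease.is_valid()  # expired once this clock passes expiry
print(fresh, expired)
```

The `is_valid` check is only as trustworthy as the local clock and scheduler, which is why leases alone do not give safe fencing.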

You may want to look into QuorumJournalManager which doesn't require
setting up IO fencing.

-Todd

On Thu, Oct 25, 2012 at 1:27 AM, lei liu <li...@gmail.com> wrote:

> I want to use the HDFS HA function, but I find the IO fencing function is
> complex in Hadoop 2.0. I think we could use a file lock to implement IO
> fencing; that would be simpler.
>
> Thanks,
>
> LiuLei
>



-- 
Todd Lipcon
Software Engineer, Cloudera
