Posted to reviews@ambari.apache.org by Jayush Luniya <jl...@hortonworks.com> on 2016/04/22 00:22:04 UTC

Review Request 46544: AMBARI-16028: Namenode marked as INITIAL standby could potentially never start if other namenode is down

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46544/
-----------------------------------------------------------

Review request for Ambari, Alejandro Fernandez, Robert Nettleton, and Sumit Mohanty.


Bugs: AMBARI-16028
    https://issues.apache.org/jira/browse/AMBARI-16028


Repository: ambari


Description
-------

1. During Namenode HA blueprint deployment, we configure the name nodes to start in active/standby mode based on the following properties:
     {
        "hadoop-env": {
          "properties" : {
            "dfs_ha_initial_namenode_active" : "jay-msft-1.c.pramod-thangali.internal",
            "dfs_ha_initial_namenode_standby" : "jay-msft-2.c.pramod-thangali.internal"
          }
        }
     }
2. The current logic is to always bootstrap the name node marked as standby.
3. As a result, the Namenode marked as standby never starts in the following situation:
- Cluster is deployed successfully
- Both name nodes are stopped
- Start the name node marked as standby. Namenode will never start.
- This is because the standby name node tries to bootstrap again.
- However, bootstrapping a name node requires an active name node: per the HDFS logic, the first step of bootstrapping is to connect to the active Namenode.
- There is also no need to bootstrap here, because the name node has already been bootstrapped and should come back up as "Active".


Fix:
- The fix is to maintain a bootstrap marker file (similar to the marker file we keep for a formatted name node).
- In the INITIAL_START phase (during cluster deployment), we always force a bootstrap, so that the name node marked as standby waits for the active name node to come up, bootstraps, and starts in STANDBY mode.
- Once we are out of the INITIAL_START phase, we bootstrap only if this name node has not been bootstrapped before.
- We do not restrict bootstrapping to the INITIAL_START phase, because both name nodes might fail to start during cluster deployment, in which case bootstrapping after the INITIAL_START phase is still required.
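
The decision described in the Fix list can be sketched roughly as follows. This is only an illustration of the rule, not the code in the diff: the function name, the `command_phase` argument, and the `marker_path` argument are hypothetical and chosen for this example.

```python
import os


def should_bootstrap_standby(command_phase, marker_path):
    """Decide whether the name node marked as standby should bootstrap.

    Both arguments are hypothetical inputs for this sketch:
    command_phase is the phase of the start command (for example
    "INITIAL_START" during cluster deployment), and marker_path is
    the location of the bootstrap marker file.
    """
    if command_phase == "INITIAL_START":
        # During deployment, always bootstrap so the standby waits for
        # the active name node to come up first.
        return True
    # After deployment, bootstrap only if this name node has never
    # been bootstrapped (no marker file present).
    return not os.path.exists(marker_path)
```

With this rule, restarting a previously bootstrapped standby while the other name node is down skips the bootstrap step, so the start no longer blocks on an unreachable active name node.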


Diffs
-----

  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py 8b6c924 
  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/params_linux.py d8ff3c5 
  ambari-server/src/test/python/stacks/2.0.6/HDFS/test_namenode.py 1c08d57 

Diff: https://reviews.apache.org/r/46544/diff/


Testing
-------

mvn clean test -DskipSurefireTests


Thanks,

Jayush Luniya


Re: Review Request 46544: AMBARI-16028: Namenode marked as INITIAL standby could potentially never start if other namenode is down

Posted by Robert Nettleton <rn...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46544/#review130117
-----------------------------------------------------------


Ship it!

- Robert Nettleton




Re: Review Request 46544: AMBARI-16028: Namenode marked as INITIAL standby could potentially never start if other namenode is down

Posted by Alejandro Fernandez <af...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46544/#review129997
-----------------------------------------------------------


Ship it!

- Alejandro Fernandez




Re: Review Request 46544: AMBARI-16028: Namenode marked as INITIAL standby could potentially never start if other namenode is down

Posted by Jayush Luniya <jl...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46544/
-----------------------------------------------------------

(Updated April 21, 2016, 10:24 p.m.)


Review request for Ambari, Alejandro Fernandez, Robert Nettleton, and Sumit Mohanty.


Bugs: AMBARI-16028
    https://issues.apache.org/jira/browse/AMBARI-16028


Repository: ambari


Description (updated)
-------

1. During Namenode HA blueprint deployment, we configure the name nodes to start in active/standby mode based on the following properties:
     {
        "hadoop-env": {
          "properties" : {
            "dfs_ha_initial_namenode_active" : "host1",
            "dfs_ha_initial_namenode_standby" : "host2"
          }
        }
     }
2. The current logic is to always bootstrap the name node marked as standby.
3. As a result, the Namenode marked as standby never starts in the following situation:

- Cluster is deployed successfully
- Both name nodes are stopped
- Start the name node marked as standby. Namenode will never start.
- This is because the standby name node tries to bootstrap again.
- However, bootstrapping a name node requires an active name node: per the HDFS logic, the first step of bootstrapping is to connect to the active Namenode.
- There is also no need to bootstrap here, because the name node has already been bootstrapped and should come back up as "Active".


Fix:
- The fix is to maintain a bootstrap marker file (similar to the marker file we keep for a formatted name node).
- In the INITIAL_START phase (during cluster deployment), we always force a bootstrap, so that the name node marked as standby waits for the active name node to come up, bootstraps, and starts in STANDBY mode.
- Once we are out of the INITIAL_START phase, we bootstrap only if this name node has not been bootstrapped before.
- We do not restrict bootstrapping to the INITIAL_START phase, because both name nodes might fail to start during cluster deployment, in which case bootstrapping after the INITIAL_START phase is still required.
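
For the marker file itself, a minimal sketch of the bookkeeping might look like the following. The helper name, the `marker_dir` layout, the marker file name, and the `run_bootstrap` callable are all assumptions made for this example; in practice `run_bootstrap` would wrap something like `hdfs namenode -bootstrapStandby`. The actual change in the diff may be structured differently.

```python
import os


def bootstrap_standby_once(marker_dir, run_bootstrap, force=False):
    """Run run_bootstrap() unless a marker shows it already succeeded.

    marker_dir and run_bootstrap are hypothetical: marker_dir is where
    the marker file is kept, and run_bootstrap is a callable that
    performs the bootstrap and raises on failure.
    Returns True if the bootstrap ran, False if it was skipped.
    """
    marker = os.path.join(marker_dir, "bootstrapped")
    if os.path.exists(marker) and not force:
        # Already bootstrapped in the past and not forced: skip.
        return False
    # If this raises, no marker is written, so a later start retries.
    run_bootstrap()
    os.makedirs(marker_dir, exist_ok=True)
    with open(marker, "w"):
        pass  # an empty file is enough to record success
    return True
```

Passing `force=True` corresponds to the INITIAL_START case, where the bootstrap is always attempted regardless of the marker.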


Diffs
-----

  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py 8b6c924 
  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/params_linux.py d8ff3c5 
  ambari-server/src/test/python/stacks/2.0.6/HDFS/test_namenode.py 1c08d57 

Diff: https://reviews.apache.org/r/46544/diff/


Testing (updated)
-------

Verified scenarios on live cluster.

mvn clean test -DskipSurefireTests
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 54.784s
[INFO] Finished at: Thu Apr 21 15:22:52 PDT 2016
[INFO] Final Memory: 64M/1172M
[INFO] ------------------------------------------------------------------------


Thanks,

Jayush Luniya