Posted to dev@ambari.apache.org by Jonathan Hurley <jh...@hortonworks.com> on 2015/10/28 20:31:47 UTC

Re: Review Request 39731: Express Upgrade: ZKFC Cannot Stop Because Newer Configurations Don't Exist

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39731/
-----------------------------------------------------------

(Updated Oct. 28, 2015, 3:31 p.m.)


Review request for Ambari, Alejandro Fernandez, Jayush Luniya, and Nate Cole.


Summary (updated)
-----------------

Express Upgrade: ZKFC Cannot Stop Because Newer Configurations Don't Exist


Bugs: AMBARI-13615
    https://issues.apache.org/jira/browse/AMBARI-13615


Repository: ambari


Description
-------

During an express upgrade, components are stopped ahead of time. Before {{restart}} is invoked, the following task runs, updating all HDP pointers:

{code}
    <group xsi:type="cluster" name="RESTORE_CONFIG_DIRS" title="Restore Configuration Directories">
      <direction>DOWNGRADE</direction>
      <execute-stage title="Restore configuration directories and remove HDP 2.3 symlinks">
        <task xsi:type="execute">
          <script>scripts/ru_set_all.py</script>
          <function>unlink_all_configs</function>
        </task>
      </execute-stage>
    </group>
{code}

After this, all components begin to restart. However, restarting involves both a {{stop}} and a {{start}} command. The components are already stopped, and most of them check the PID first: if the process is not running, they skip the second stop.
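The PID guard described above could look roughly like the sketch below. It is an illustration only: {{read_pid}}, {{is_running}} and {{stop_component}} are hypothetical helper names, not functions from the Ambari scripts, and the code assumes a POSIX host where signal 0 can be used to probe a process.

```python
import errno
import os


def read_pid(pid_file):
    """Return the PID stored in pid_file, or None if it is missing or malformed."""
    try:
        with open(pid_file) as handle:
            return int(handle.read().strip())
    except (IOError, OSError, ValueError):
        return None


def is_running(pid_file):
    """Return True only if pid_file names a process that currently exists."""
    pid = read_pid(pid_file)
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without delivering a signal
    except OSError as err:
        # EPERM means the process exists but belongs to another user.
        return err.errno == errno.EPERM
    return True


def stop_component(pid_file, stop_command):
    """Invoke stop_command only when the component is actually running."""
    if not is_running(pid_file):
        return False  # already stopped, so no second stop is attempted
    stop_command()
    return True
```

A component without such a guard always runs its stop command, which is where the missing configurations become a problem.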

However, some components, such as ZKFC and HBase Master, lack this check and attempt the stop regardless. The problem arises when a JVM is spun up to stop the process:

Initially it was thought that moving the {{hdp-select set all}} call to after the {{restart}} groups would solve the problem. As it turns out, that doesn't work, since {{params.py}} always takes the new version and builds the conf/lib/bin directories with it. Additionally, some components have upgrade bugs that calling {{hdp-select set all}} corrects.
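To illustrate why a stop command built from the new version can fail, here is a minimal sketch. It assumes a versioned layout of the form {{<stack_root>/<version>/<component>/{conf,lib,bin}}}; the helper names and the version strings in the usage note are hypothetical and are not taken from {{params.py}}.

```python
import os


def component_dirs(stack_root, version, component):
    """Build the conf/lib/bin paths for one component under a versioned stack root."""
    base = os.path.join(stack_root, version, component)
    return {name: os.path.join(base, name) for name in ("conf", "lib", "bin")}


def missing_dirs(dirs):
    """Return the sorted names of the directories that do not exist on disk."""
    return sorted(name for name, path in dirs.items() if not os.path.isdir(path))
```

For example, {{missing_dirs(component_dirs("/usr/hdp", "2.3.0.0-0000", "hadoop"))}} would list every directory that a stop command built from the new version expects but cannot find. When the list is non-empty, the JVM launched for the stop has no usable configuration.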


Diffs
-----

  ambari-common/src/main/python/resource_management/libraries/functions/hdp_select.py 5f05777 
  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py 97ad424 
  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/zkfc_slave.py e9037d8 
  ambari-server/src/main/resources/stacks/HDP/2.1/upgrades/nonrolling-upgrade-2.3.xml 25fd6ab 
  ambari-server/src/main/resources/stacks/HDP/2.2/upgrades/nonrolling-upgrade-2.2.xml 63f9f8d 
  ambari-server/src/main/resources/stacks/HDP/2.2/upgrades/nonrolling-upgrade-2.3.xml 44413d3 
  ambari-server/src/main/resources/stacks/HDP/2.3/upgrades/nonrolling-upgrade-2.3.xml 407b22b 
  ambari-server/src/test/python/stacks/2.1/HIVE/test_hive_metastore.py 2800224 
  ambari-server/src/test/python/stacks/utils/RMFTestCase.py ab4eed4 

Diff: https://reviews.apache.org/r/39731/diff/


Testing
-------

Upgrades from 2.2 to 2.3, and 2.3 to 2.3+ with ZKFC. 

mvn clean test


Thanks,

Jonathan Hurley


Re: Review Request 39731: Express Upgrade: ZKFC Cannot Stop Because Newer Configurations Don't Exist

Posted by Alejandro Fernandez <af...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39731/#review104378
-----------------------------------------------------------

Ship it!



- Alejandro Fernandez


Re: Review Request 39731: Express Upgrade: ZKFC Cannot Stop Because Newer Configurations Don't Exist

Posted by Jayush Luniya <jl...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39731/#review104393
-----------------------------------------------------------



ambari-common/src/main/python/resource_management/libraries/functions/hdp_select.py (line 72)
<https://reviews.apache.org/r/39731/#comment162661>

    Minor: Can you add a comment explaining why we use hadoop-hdfs-namenode for ZKFC?


- Jayush Luniya


Re: Review Request 39731: Express Upgrade: ZKFC Cannot Stop Because Newer Configurations Don't Exist

Posted by Jayush Luniya <jl...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39731/#review104394
-----------------------------------------------------------

Ship it!



- Jayush Luniya


Re: Review Request 39731: Express Upgrade: ZKFC Cannot Stop Because Newer Configurations Don't Exist

Posted by Jonathan Hurley <jh...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39731/
-----------------------------------------------------------

(Updated Oct. 28, 2015, 6:49 p.m.)


Review request for Ambari, Alejandro Fernandez, Jayush Luniya, and Nate Cole.


Bugs: AMBARI-13615
    https://issues.apache.org/jira/browse/AMBARI-13615


Repository: ambari


Description
-------

During an express upgrade, components are stopped ahead of time. Before {{restart}} is invoked, the following task runs, updating all HDP pointers:

{code}
    <group xsi:type="cluster" name="RESTORE_CONFIG_DIRS" title="Restore Configuration Directories">
      <direction>DOWNGRADE</direction>
      <execute-stage title="Restore configuration directories and remove HDP 2.3 symlinks">
        <task xsi:type="execute">
          <script>scripts/ru_set_all.py</script>
          <function>unlink_all_configs</function>
        </task>
      </execute-stage>
    </group>
{code}

After this, all components begin to restart. However, restarting involves both a {{stop}} and a {{start}} command. The components are already stopped, and most of them check the PID first: if the process is not running, they skip the second stop.

However, some components, such as ZKFC and HBase Master, lack this check and attempt the stop regardless. The problem arises when a JVM is spun up to stop the process:

Initially it was thought that moving the {{hdp-select set all}} call to after the {{restart}} groups would solve the problem. As it turns out, that doesn't work, since {{params.py}} always takes the new version and builds the conf/lib/bin directories with it. Additionally, some components have upgrade bugs that calling {{hdp-select set all}} corrects.


Diffs (updated)
-----

  ambari-common/src/main/python/resource_management/libraries/functions/hdp_select.py 5f05777 
  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py 97ad424 
  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/zkfc_slave.py e9037d8 
  ambari-server/src/main/resources/stacks/HDP/2.1/upgrades/nonrolling-upgrade-2.3.xml 25fd6ab 
  ambari-server/src/main/resources/stacks/HDP/2.2/upgrades/nonrolling-upgrade-2.2.xml 63f9f8d 
  ambari-server/src/main/resources/stacks/HDP/2.2/upgrades/nonrolling-upgrade-2.3.xml 44413d3 
  ambari-server/src/main/resources/stacks/HDP/2.3/upgrades/nonrolling-upgrade-2.3.xml 407b22b 
  ambari-server/src/test/python/stacks/2.0.6/HDFS/test_datanode.py 263eeb2 
  ambari-server/src/test/python/stacks/2.0.6/HDFS/test_journalnode.py a6cd740 
  ambari-server/src/test/python/stacks/2.0.6/HDFS/test_namenode.py 21b71f9 
  ambari-server/src/test/python/stacks/2.0.6/HDFS/test_nfsgateway.py 5852eaf 
  ambari-server/src/test/python/stacks/2.0.6/HDFS/test_snamenode.py dfbd887 
  ambari-server/src/test/python/stacks/2.0.6/HDFS/test_zkfc.py 744d3ba 
  ambari-server/src/test/python/stacks/2.1/HIVE/test_hive_metastore.py 2800224 
  ambari-server/src/test/python/stacks/utils/RMFTestCase.py ab4eed4 

Diff: https://reviews.apache.org/r/39731/diff/


Testing
-------

Upgrades from 2.2 to 2.3, and 2.3 to 2.3+ with ZKFC. 

mvn clean test


Thanks,

Jonathan Hurley


Re: Review Request 39731: Express Upgrade: ZKFC Cannot Stop Because Newer Configurations Don't Exist

Posted by Jonathan Hurley <jh...@hortonworks.com>.

> On Oct. 28, 2015, 4:52 p.m., Alejandro Fernandez wrote:
> > ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/zkfc_slave.py, line 72
> > <https://reviews.apache.org/r/39731/diff/1/?file=1111687#file1111687line72>
> >
> >     I thought we started ZKFC using the same component in hdp-select as namenode.
> >     
> >     Otherwise, it may be risky to change hadoop-client this early on during RU.

I think ZKFC never upgrades in a rolling upgrade. Take a look at the workflow in RU:
    <group name="CORE_MASTER" title="Core Masters">
      <service-check>false</service-check>
      <service name="HDFS">
        <component>JOURNALNODE</component>
        <component>ZKFC</component>
        <component>NAMENODE</component>
      </service>
      
ZKFC comes after JN but before NN. Since ZKFC wasn't bound to JN in our mappings, it must have always started on the old version. I can change this, but binding it to NameNode could still be a problem.
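The mappings in question can be pictured as a role-to-package table, as in the hedged sketch below. The table and {{select_command}} are hypothetical and are not the contents of {{hdp_select.py}}; only the pairing of ZKFC with {{hadoop-hdfs-namenode}} comes from the review comments on this request, and the version string in the usage note is made up.

```python
# Hypothetical role-to-package table; the ZKFC entry reflects the review
# discussion, the other entries are illustrative assumptions.
ROLE_TO_PACKAGE = {
    "JOURNALNODE": "hadoop-hdfs-journalnode",
    "NAMENODE": "hadoop-hdfs-namenode",
    # ZKFC has no package of its own here, so it follows the NameNode pointer.
    "ZKFC": "hadoop-hdfs-namenode",
}


def select_command(role, version):
    """Return the hdp-select argument list that points a role's package at version."""
    package = ROLE_TO_PACKAGE[role]
    return ["hdp-select", "set", package, version]
```

With this table, {{select_command("ZKFC", "2.3.0.0-0000")}} and {{select_command("NAMENODE", "2.3.0.0-0000")}} move the same pointer, so the order in which the two roles restart matters.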


- Jonathan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39731/#review104345
-----------------------------------------------------------


Re: Review Request 39731: Express Upgrade: ZKFC Cannot Stop Because Newer Configurations Don't Exist

Posted by Alejandro Fernandez <af...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39731/#review104345
-----------------------------------------------------------



ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/zkfc_slave.py (line 67)
<https://reviews.apache.org/r/39731/#comment162594>

    I thought we started ZKFC using the same component in hdp-select as namenode.
    
    Otherwise, it may be risky to change hadoop-client this early on during RU.


- Alejandro Fernandez

