Posted to reviews@ambari.apache.org by Sebastian Toader <st...@hortonworks.com> on 2016/04/11 16:38:40 UTC

Review Request 46032: Restarting ambari-server after successful blueprint deploy of large cluster makes it unresponsive

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46032/
-----------------------------------------------------------

Review request for Ambari, Daniel Gergely, Laszlo Puskas, Robert Levas, Sandor Magyari, Srimanth Gunturi, and Sid Wagle.


Bugs: AMBARI-15803
    https://issues.apache.org/jira/browse/AMBARI-15803


Repository: ambari


Description
-------

After a restart, Ambari lazily loads the persisted cluster state from the database to determine whether anything is still pending to finalize a cluster creation
that uses Blueprints. For persisted host requests that don't have a host assigned yet (pending host requests), the server has to assign hosts as they register with the server.
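
To illustrate the distinction between assigned and pending host requests, here is a minimal sketch. It is not the code under review: `PendingHostSketch`, `HostRequest`, and the rule that any registering host may take any pending request are assumptions made for the example (the real server also has to honor host predicates).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from the Ambari code base.
class PendingHostSketch {

    /** A persisted host request; hostName stays null until a host is assigned. */
    static final class HostRequest {
        final long id;
        final String hostName;

        HostRequest(long id, String hostName) {
            this.id = id;
            this.hostName = hostName;
        }
    }

    // Hosts that already had an assignment before the restart.
    final Set<String> expectedHosts = new HashSet<>();
    // Requests that still wait for a registering host.
    final List<HostRequest> pending = new ArrayList<>();

    /** Splits the persisted requests into assigned and pending ones. */
    void load(List<HostRequest> persisted) {
        for (HostRequest request : persisted) {
            if (request.hostName == null) {
                pending.add(request);
            } else {
                expectedHosts.add(request.hostName);
            }
        }
    }

    /** Returns true if the registering host was matched to a request. */
    boolean onHostRegistered(String hostName) {
        if (expectedHosts.remove(hostName)) {
            return true;
        }
        if (!pending.isEmpty()) {
            pending.remove(0); // assumption: any host may take any pending request
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        PendingHostSketch sketch = new PendingHostSketch();
        sketch.load(Arrays.asList(
            new HostRequest(1L, "host1.example.com"),
            new HostRequest(2L, null)));
        System.out.println(sketch.onHostRegistered("host2.example.com")); // true
        System.out.println(sketch.onHostRegistered("host3.example.com")); // false
        System.out.println(sketch.onHostRegistered("host1.example.com")); // true
    }
}
```

The point of the sketch is only that requests without a host must be tracked separately from hosts that are awaited by name; mixing the two up is the kind of mistake that can surface later as a null host name.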

Due to a bug, the server was incorrectly tracking which hosts to wait for in order to assign them to the pending persisted host requests.
This led to NPEs later in the process of initializing state from the database. Each host registration first checks whether initialization from
persisted state has completed; if not, it triggers the initialization. Since the initialization was continuously failing, it was re-triggered on each host
registration, making the server unresponsive.
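
The retry behavior described above can be sketched as follows. This is a simplified model, not the Ambari implementation: the class `ReplaySketch`, its methods, and the assumption that the NPE is caught at the registration boundary are all hypothetical. It only shows why an initialization flag that is set on success alone causes the expensive replay to run once per registering host when the replay always fails.

```java
// Hypothetical sketch; none of these names come from the Ambari code base.
class ReplaySketch {

    private final boolean replayThrows; // simulates the faulty host tracking
    private boolean initialized = false;
    private int replayAttempts = 0;

    ReplaySketch(boolean replayThrows) {
        this.replayThrows = replayThrows;
    }

    /** Called once for every host that registers with the server. */
    void onHostRegistered(String hostName) {
        try {
            ensureInitialized();
        } catch (NullPointerException e) {
            // Assumption: the failure is swallowed here. The flag stays false,
            // so the next registration repeats the whole replay.
        }
    }

    private void ensureInitialized() {
        if (initialized) {
            return;
        }
        replayAttempts++;
        replayPersistedState(); // expensive on a large cluster
        initialized = true;     // reached only when the replay succeeds
    }

    private void replayPersistedState() {
        if (replayThrows) {
            throw new NullPointerException("pending host request has no host");
        }
    }

    int getReplayAttempts() {
        return replayAttempts;
    }

    public static void main(String[] args) {
        ReplaySketch broken = new ReplaySketch(true);
        ReplaySketch healthy = new ReplaySketch(false);
        for (int i = 0; i < 1000; i++) {
            broken.onHostRegistered("host" + i);
            healthy.onHostRegistered("host" + i);
        }
        System.out.println("broken replays:  " + broken.getReplayAttempts());  // 1000
        System.out.println("healthy replays: " + healthy.getReplayAttempts()); // 1
    }
}
```

With 1000 registering hosts the failing variant replays the persisted state 1000 times, while the healthy one does it once, which matches the symptom of a large cluster making the server unresponsive after a restart.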


Diffs
-----

  ambari-server/src/main/java/org/apache/ambari/server/topology/LogicalRequest.java 82edbcf 
  ambari-server/src/test/java/org/apache/ambari/server/topology/LogicalRequestTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/46032/diff/


Testing
-------

Manual testing using cluster creation templates whose host groups contain multiple hosts, specified both through a fixed FQDN list and through host predicates.


Unit tests:
Results :

Tests run: 3550, Failures: 0, Errors: 0, Skipped: 36


Thanks,

Sebastian Toader


Re: Review Request 46032: Restarting ambari-server after successful blueprint deploy of large cluster makes it unresponsive

Posted by Robert Levas <rl...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46032/#review128129
-----------------------------------------------------------


Ship it!

- Robert Levas




Re: Review Request 46032: Restarting ambari-server after successful blueprint deploy of large cluster makes it unresponsive

Posted by Robert Nettleton <rn...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46032/#review128128
-----------------------------------------------------------


Ship it!

- Robert Nettleton




Re: Review Request 46032: Restarting ambari-server after successful blueprint deploy of large cluster makes it unresponsive

Posted by Daniel Gergely <dg...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46032/#review128125
-----------------------------------------------------------


Fix it, then Ship it!





ambari-server/src/main/java/org/apache/ambari/server/topology/LogicalRequest.java (line 467)
<https://reviews.apache.org/r/46032/#comment191500>

    wrong indentation


- Daniel Gergely




Re: Review Request 46032: Restarting ambari-server after successful blueprint deploy of large cluster makes it unresponsive

Posted by Sebastian Toader <st...@hortonworks.com>.

> On April 11, 2016, 6:06 p.m., Sid Wagle wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/topology/LogicalRequest.java, line 412
> > <https://reviews.apache.org/r/46032/diff/2/?file=1339362#file1339362line412>
> >
> >     Any chance of an NPE here ?

Added checks for NPE there.


- Sebastian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46032/#review128142
-----------------------------------------------------------




Re: Review Request 46032: Restarting ambari-server after successful blueprint deploy of large cluster makes it unresponsive

Posted by Sid Wagle <sw...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46032/#review128142
-----------------------------------------------------------




ambari-server/src/main/java/org/apache/ambari/server/topology/LogicalRequest.java (line 412)
<https://reviews.apache.org/r/46032/#comment191509>

    Any chance of an NPE here ?


- Sid Wagle




Re: Review Request 46032: Restarting ambari-server after successful blueprint deploy of large cluster makes it unresponsive

Posted by Sandor Magyari <sm...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46032/#review128157
-----------------------------------------------------------


Ship it!

- Sandor Magyari




Re: Review Request 46032: Restarting ambari-server after successful blueprint deploy of large cluster makes it unresponsive

Posted by Sebastian Toader <st...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46032/
-----------------------------------------------------------

(Updated April 11, 2016, 6:43 p.m.)


Review request for Ambari, Daniel Gergely, Laszlo Puskas, Robert Levas, Sandor Magyari, Srimanth Gunturi, and Sid Wagle.


Changes
-------

Added checks for NPE


Bugs: AMBARI-15803
    https://issues.apache.org/jira/browse/AMBARI-15803


Repository: ambari


Description
-------

After a restart, Ambari lazily loads the persisted cluster state from the database to determine whether anything is still pending to finalize a cluster creation
that uses Blueprints. For persisted host requests that don't have a host assigned yet (pending host requests), the server has to assign hosts as they register with the server.

Due to a bug, the server was incorrectly tracking which hosts to wait for in order to assign them to the pending persisted host requests.
This led to NPEs later in the process of initializing state from the database. Each host registration first checks whether initialization from
persisted state has completed; if not, it triggers the initialization. Since the initialization was continuously failing, it was re-triggered on each host
registration, making the server unresponsive.


Diffs (updated)
-----

  ambari-server/src/main/java/org/apache/ambari/server/topology/LogicalRequest.java 82edbcf 
  ambari-server/src/test/java/org/apache/ambari/server/topology/LogicalRequestTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/46032/diff/


Testing
-------

Manual testing using cluster creation templates whose host groups contain multiple hosts, specified both through a fixed FQDN list and through host predicates.


Unit tests:
Results :

Tests run: 3550, Failures: 0, Errors: 0, Skipped: 36


Thanks,

Sebastian Toader


Re: Review Request 46032: Restarting ambari-server after successful blueprint deploy of large cluster makes it unresponsive

Posted by Sebastian Toader <st...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46032/
-----------------------------------------------------------

(Updated April 11, 2016, 5:15 p.m.)


Review request for Ambari, Daniel Gergely, Laszlo Puskas, Robert Levas, Sandor Magyari, Srimanth Gunturi, and Sid Wagle.


Changes
-------

Fix wrong indentation.


Bugs: AMBARI-15803
    https://issues.apache.org/jira/browse/AMBARI-15803


Repository: ambari


Description
-------

After a restart, Ambari lazily loads the persisted cluster state from the database to determine whether anything is still pending to finalize a cluster creation
that uses Blueprints. For persisted host requests that don't have a host assigned yet (pending host requests), the server has to assign hosts as they register with the server.

Due to a bug, the server was incorrectly tracking which hosts to wait for in order to assign them to the pending persisted host requests.
This led to NPEs later in the process of initializing state from the database. Each host registration first checks whether initialization from
persisted state has completed; if not, it triggers the initialization. Since the initialization was continuously failing, it was re-triggered on each host
registration, making the server unresponsive.


Diffs (updated)
-----

  ambari-server/src/main/java/org/apache/ambari/server/topology/LogicalRequest.java 82edbcf 
  ambari-server/src/test/java/org/apache/ambari/server/topology/LogicalRequestTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/46032/diff/


Testing
-------

Manual testing using cluster creation templates whose host groups contain multiple hosts, specified both through a fixed FQDN list and through host predicates.


Unit tests:
Results :

Tests run: 3550, Failures: 0, Errors: 0, Skipped: 36


Thanks,

Sebastian Toader