Posted to dev@ambari.apache.org by John Speidel <js...@hortonworks.com> on 2015/05/29 21:00:48 UTC

Review Request 34821: Occasional database deadlock detected when provisioning cluster via blueprint api

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34821/
-----------------------------------------------------------

Review request for Ambari, Robert Nettleton, Sumit Mohanty, Sid Wagle, and Tom Beerbower.


Bugs: AMBARI-11542
    https://issues.apache.org/jira/browse/AMBARI-11542


Repository: ambari


Description
-------

When provisioning a cluster via the blueprint api, occasionally a database deadlock is detected. There is retry logic around this code so it doesn't affect the creation of the cluster and a user wouldn't notice this unless they looked at the logs. That being said, this issue involves incorrect transaction demarcation and synchronization and is potentially serious depending on how it is manifested.

The fix involves changing the scope of the database transaction as well as synchronization.
There are currently many transaction/synchronization issues in the state layer that need to be addressed; this patch deals only with this exact use case.

Also, this patch strictly deals with correctness and I didn't make an effort to optimize this path.  If this results in a performance regression, there are several approaches that we could take.
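
Illustrative note: the patch itself is not reproduced in this message, so the following is only a generic sketch of what "changing the scope of the database transaction as well as synchronization" can look like. It assumes the general pattern in which the lock fully encloses the transaction, so the commit happens before the lock is released. Every name in it (LockedTransactionSketch, RecordingTx, runLocked) is hypothetical and does not come from the Ambari code base; RecordingTx is a stand-in that only records the order of events instead of talking to a database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical illustration only; none of these names come from Ambari. */
class LockedTransactionSketch {

    /** Stand-in for a real transaction manager; it only records events. */
    static final class RecordingTx {
        final List<String> events = new ArrayList<>();
        void begin()    { events.add("begin"); }
        void commit()   { events.add("commit"); }
        void rollback() { events.add("rollback"); }
    }

    /**
     * Runs the work inside a transaction that is fully enclosed by the lock:
     * lock -> begin -> work -> commit (or rollback) -> unlock.
     * Another thread using the same lock cannot start its own transaction
     * until this one has finished.
     */
    static <T> T runLocked(Lock lock, RecordingTx tx, Supplier<T> work) {
        lock.lock();
        tx.events.add("lock");
        try {
            tx.begin();
            try {
                T result = work.get();
                tx.commit();
                return result;
            } catch (RuntimeException e) {
                // Undo partial work before the lock is released.
                tx.rollback();
                throw e;
            }
        } finally {
            tx.events.add("unlock");
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        RecordingTx tx = new RecordingTx();
        String result = runLocked(new ReentrantLock(), tx, () -> "provisioned");
        System.out.println(result + " " + tx.events);
    }
}
```

The design point of the sketch is ordering: if a transaction were instead started before the lock was taken, or committed after it was released, two threads could each hold database row locks while waiting on the other, which is one common way a database deadlock arises. Whether that is the exact cause addressed by this review is an assumption.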


Diffs
-----

  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java 792b6fe 

Diff: https://reviews.apache.org/r/34821/diff/


Testing
-------

Provisioned clusters many times via the blueprint api, looking for a reported database deadlock.  Without this patch, I was able to reproduce the deadlock fairly consistently; with the patch, no deadlock occurred across many installs.

Unit Tests:
- tx/synchronization change only so no new unit test
- currently running full unit test suite and will update review with result summary when completed


Thanks,

John Speidel


Re: Review Request 34821: Occasional database deadlock detected when provisioning cluster via blueprint api

Posted by Sumit Mohanty <sm...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34821/#review85787
-----------------------------------------------------------

Ship it!

- Sumit Mohanty




Re: Review Request 34821: Occasional database deadlock detected when provisioning cluster via blueprint api

Posted by Tom Beerbower <tb...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34821/#review85784
-----------------------------------------------------------

Ship it!

- Tom Beerbower




Re: Review Request 34821: Occasional database deadlock detected when provisioning cluster via blueprint api

Posted by Robert Nettleton <rn...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34821/#review85780
-----------------------------------------------------------

Ship it!

- Robert Nettleton




Re: Review Request 34821: Occasional database deadlock detected when provisioning cluster via blueprint api

Posted by John Speidel <js...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34821/
-----------------------------------------------------------

(Updated May 29, 2015, 8:06 p.m.)


Review request for Ambari, Robert Nettleton, Sumit Mohanty, Sid Wagle, and Tom Beerbower.


Bugs: AMBARI-11542
    https://issues.apache.org/jira/browse/AMBARI-11542


Repository: ambari


Description
-------

When provisioning a cluster via the blueprint api, occasionally a database deadlock is detected. There is retry logic around this code so it doesn't affect the creation of the cluster and a user wouldn't notice this unless they looked at the logs. That being said, this issue involves incorrect transaction demarcation and synchronization and is potentially serious depending on how it is manifested.

The fix involves changing the scope of the database transaction as well as synchronization.
There are currently many transaction/synchronization issues in the state layer that need to be addressed; this patch deals only with this exact use case.

Also, this patch strictly deals with correctness and I didn't make an effort to optimize this path.  If this results in a performance regression, there are several approaches that we could take.


Diffs
-----

  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java 792b6fe 

Diff: https://reviews.apache.org/r/34821/diff/


Testing (updated)
-------

Provisioned clusters many times via the blueprint api, looking for a reported database deadlock.  Without this patch, I was able to reproduce the deadlock fairly consistently; with the patch, no deadlock occurred across many installs.

Unit Tests:
- tx/synchronization change only so no new unit test
- currently running full unit test suite and will update review with result summary when completed


Results :

Tests run: 3020, Failures: 0, Errors: 0, Skipped: 21
...
Total run:744
Total errors:0
Total failures:0


Thanks,

John Speidel