Posted to dev@ambari.apache.org by Dmitro Lisnichenko <dl...@hortonworks.com> on 2015/05/06 20:09:11 UTC

Re: Review Request 33663: Hit re-install when performing an RU - UI seems to have stuck at installing even though the request has completed

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33663/
-----------------------------------------------------------

(Updated May 6, 2015, 6:09 p.m.)


Review request for Ambari, Alejandro Fernandez, Jonathan Hurley, and Nate Cole.


Changes
-------

Attached a patch that has been proven to work.


Bugs: AMBARI-10818
    https://issues.apache.org/jira/browse/AMBARI-10818


Repository: ambari


Description (updated)
-------

The problem is that DistributeRepositoriesActionListener is executed in a separate thread, so we have to use UnitOfWork just as org.apache.ambari.server.actionmanager.ActionScheduler#doWork does; otherwise the EntityManager cache is not refreshed after DB updates. That is, the RepositoryVersion state in the DB is INSTALLING and the API shows INSTALLING, but the RepositoryVersion state seen by DistributeRepositoriesActionListener is still INSTALLED, so the cluster state transition is not performed.
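
To illustrate the pattern described above, here is a minimal sketch of wrapping a listener callback in a unit of work. It is not the attached patch. The UnitOfWork interface below is a local stand-in that mirrors the begin()/end() shape of Guice Persist's UnitOfWork so the example compiles without Guice; RecordingUnitOfWork, ListenerSketch, and UnitOfWorkSketch are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in for com.google.inject.persist.UnitOfWork (assumption: only
// begin() and end() matter for this sketch).
interface UnitOfWork {
    void begin();
    void end();
}

// Test double that records the order of calls instead of touching a database.
final class RecordingUnitOfWork implements UnitOfWork {
    final List<String> calls = new ArrayList<>();

    @Override public void begin() { calls.add("begin"); }
    @Override public void end() { calls.add("end"); }
}

// Hypothetical listener; it is NOT the real DistributeRepositoriesActionListener.
final class ListenerSketch {
    private final UnitOfWork unitOfWork;
    private final Runnable handler;

    ListenerSketch(UnitOfWork unitOfWork, Runnable handler) {
        this.unitOfWork = unitOfWork;
        this.handler = handler;
    }

    // Called on a background thread: open a unit of work for the duration of
    // the callback and always close it, even if the handler throws.
    void onEvent() {
        unitOfWork.begin();
        try {
            handler.run();
        } finally {
            unitOfWork.end();
        }
    }
}

final class UnitOfWorkSketch {
    public static void main(String[] args) {
        RecordingUnitOfWork uow = new RecordingUnitOfWork();
        new ListenerSketch(uow, () -> uow.calls.add("handle")).onEvent();
        System.out.println(uow.calls); // prints [begin, handle, end]
    }
}
```

The try/finally matters here: if the handler throws, the unit of work is still ended, so the background thread does not keep a session open across events.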


Diffs (updated)
-----

  ambari-server/src/main/java/org/apache/ambari/server/events/listeners/upgrade/DistributeRepositoriesActionListener.java 5600ef1 
  ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java 39192c4 

Diff: https://reviews.apache.org/r/33663/diff/


Testing (updated)
-------

The same tests are failing on trunk.

Tests in error: 
  test220Cardinality(org.apache.ambari.server.api.services.KerberosServiceMetaInfoTest): Guice provision errors:(..)
  test220AutoDeploy(org.apache.ambari.server.api.services.KerberosServiceMetaInfoTest): Guice provision errors:(..)
  test220Dependencies(org.apache.ambari.server.api.services.KerberosServiceMetaInfoTest): Guice provision errors:(..)
  testCommonOozieServiceDescriptor(org.apache.ambari.server.stack.KerberosDescriptorTest): /media/plextor/review_ambari/ambari-server/target/classes/common-services/OOZIE/5.0.0.2.3/kerberos.json is not a readable file

Tests run: 2951, Failures: 0, Errors: 4, Skipped: 17

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Ambari Views ...................................... SUCCESS [2.900s]
[INFO] Ambari Metrics Common ............................. SUCCESS [1.570s]
[INFO] Ambari Server ..................................... FAILURE [43:59.418s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 44:04.603s
[INFO] Finished at: Wed May 06 20:25:17 EEST 2015
[INFO] Final Memory: 32M/268M
[INFO] ----------------------------------------------------------------------


Thanks,

Dmitro Lisnichenko


Re: Review Request 33663: Hit re-install when performing an RU - UI seems to have stuck at installing even though the request has completed

Posted by Dmitro Lisnichenko <dl...@hortonworks.com>.

> On May 6, 2015, 8:32 p.m., Jonathan Hurley wrote:
> > Very interesting that the problem arises because this is executed in a thread separate from the normal request workflow, which leaves the EntityManager with a stale cache.
> > 
> > Could we actually extract this into an annotation so that we can simply decorate methods that need to be inside units of work? Something like @UnitOfWork and then doc why it's needed ... as Ambari becomes more and more asynchronous this will probably be useful.

I created a separate JIRA for that, since the current issue is urgent.


- Dmitro


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33663/#review82726
-----------------------------------------------------------




Re: Review Request 33663: Hit re-install when performing an RU - UI seems to have stuck at installing even though the request has completed

Posted by Jonathan Hurley <jh...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33663/#review82726
-----------------------------------------------------------

Ship it!


Very interesting that the problem arises because this is executed in a thread separate from the normal request workflow, which leaves the EntityManager with a stale cache.

Could we actually extract this into an annotation so that we can simply decorate methods that need to be inside units of work? Something like @UnitOfWork and then doc why it's needed ... as Ambari becomes more and more asynchronous this will probably be useful.
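
One way the annotation idea could look, as a rough sketch only: a marker annotation plus a JDK dynamic proxy that opens and closes a unit of work around annotated methods. Everything here is hypothetical (InUnitOfWork, UnitOfWorkScope, RepositoryListener, UnitOfWorkProxies); a real version would more likely hook into Guice method interception, and the annotation is named InUnitOfWork only to avoid clashing with the existing UnitOfWork interface.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker: methods carrying it must run inside a unit of work
// because they may be invoked outside the normal request thread.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface InUnitOfWork {}

// Local stand-in with the begin()/end() shape of a unit of work.
interface UnitOfWorkScope {
    void begin();
    void end();
}

// Hypothetical listener interface used only for this example.
interface RepositoryListener {
    @InUnitOfWork
    void onActionFinished(String event);

    String name();
}

final class UnitOfWorkProxies {
    // Wraps target so that annotated interface methods run between
    // scope.begin() and scope.end(); other methods pass straight through.
    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> iface, T target, UnitOfWorkScope scope) {
        return (T) Proxy.newProxyInstance(
            iface.getClassLoader(),
            new Class<?>[] {iface},
            (proxy, method, args) -> {
                boolean scoped = method.isAnnotationPresent(InUnitOfWork.class);
                if (scoped) {
                    scope.begin();
                }
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // rethrow the target's own exception
                } finally {
                    if (scoped) {
                        scope.end();
                    }
                }
            });
    }
}

final class AnnotationSketch {
    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        UnitOfWorkScope scope = new UnitOfWorkScope() {
            @Override public void begin() { calls.add("begin"); }
            @Override public void end() { calls.add("end"); }
        };
        RepositoryListener target = new RepositoryListener() {
            @Override public void onActionFinished(String event) { calls.add("handle " + event); }
            @Override public String name() { calls.add("name"); return "sketch"; }
        };
        RepositoryListener listener =
            UnitOfWorkProxies.wrap(RepositoryListener.class, target, scope);
        listener.onActionFinished("finished");
        listener.name();
        System.out.println(calls); // prints [begin, handle finished, end, name]
    }
}
```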

- Jonathan Hurley




Re: Review Request 33663: Hit re-install when performing an RU - UI seems to have stuck at installing even though the request has completed

Posted by Dmitro Lisnichenko <dl...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33663/
-----------------------------------------------------------

(Updated May 8, 2015, 12:08 p.m.)


Review request for Ambari, Alejandro Fernandez, Jonathan Hurley, and Nate Cole.


Changes
-------

This patch adjusts annotations as Myroslav proposed.
A live test on a cluster shows that it does not solve the issue, so it is posted for discussion purposes only.


Bugs: AMBARI-10818
    https://issues.apache.org/jira/browse/AMBARI-10818


Repository: ambari


Description
-------

The problem is that DistributeRepositoriesActionListener is executed in a separate thread, so we have to use UnitOfWork just as org.apache.ambari.server.actionmanager.ActionScheduler#doWork does; otherwise the EntityManager cache is not refreshed after DB updates. That is, the RepositoryVersion state in the DB is INSTALLING and the API shows INSTALLING, but the RepositoryVersion state seen by DistributeRepositoriesActionListener is still INSTALLED, so the cluster state transition is not performed.


Diffs (updated)
-----

  ambari-server/src/main/resources/stacks/HDP/2.2/upgrades/upgrade-2.2.xml 0cf8ff2 
  ambari-server/src/main/resources/stacks/HDP/2.2/upgrades/upgrade-2.3.xml abd95aa 
  ambari-server/src/main/resources/stacks/HDP/2.3/upgrades/upgrade-2.3.xml 65549dd 

Diff: https://reviews.apache.org/r/33663/diff/


Testing
-------

The same tests are failing on trunk.

Tests in error: 
  test220Cardinality(org.apache.ambari.server.api.services.KerberosServiceMetaInfoTest): Guice provision errors:(..)
  test220AutoDeploy(org.apache.ambari.server.api.services.KerberosServiceMetaInfoTest): Guice provision errors:(..)
  test220Dependencies(org.apache.ambari.server.api.services.KerberosServiceMetaInfoTest): Guice provision errors:(..)
  testCommonOozieServiceDescriptor(org.apache.ambari.server.stack.KerberosDescriptorTest): /media/plextor/review_ambari/ambari-server/target/classes/common-services/OOZIE/5.0.0.2.3/kerberos.json is not a readable file

Tests run: 2951, Failures: 0, Errors: 4, Skipped: 17

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Ambari Views ...................................... SUCCESS [2.900s]
[INFO] Ambari Metrics Common ............................. SUCCESS [1.570s]
[INFO] Ambari Server ..................................... FAILURE [43:59.418s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 44:04.603s
[INFO] Finished at: Wed May 06 20:25:17 EEST 2015
[INFO] Final Memory: 32M/268M
[INFO] ----------------------------------------------------------------------


Thanks,

Dmitro Lisnichenko