Posted to dev@brooklyn.apache.org by "Aled Sage (JIRA)" <ji...@apache.org> on 2016/11/15 09:56:58 UTC

[jira] [Commented] (BROOKLYN-386) NPE on rebind calling CreateUserPolicy.addUser

    [ https://issues.apache.org/jira/browse/BROOKLYN-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15666704#comment-15666704 ] 

Aled Sage commented on BROOKLYN-386:
------------------------------------

On rebind, the following happens:
* Entity is instantiated, but not yet managed; it therefore has a {{QueueingSubscriptionManager}}
* Entity's locations are re-added via AbstractEntity.addLocations(); this calls {{sensors().emit(AbstractEntity.LOCATION_ADDED, loc)}}
* These sensor-event publications are queued inside the {{QueueingSubscriptionManager}}
* The Entity's policies are added; {{CreateUserPolicy.setEntity}} subscribes to {{LOCATION_ADDED}} events (the subscription is registered in the {{QueueingSubscriptionManager}})
* The entity becomes managed, and the {{QueueingSubscriptionManager}} is drained in two stages:
* In {{EntityManagementSupport.onManagementStarting}}, the {{QueueingSubscriptionManager.queuedSubscriptions}} are replayed (so now the policy's subscription is active)
* In {{EntityManagementSupport.onManagementStarted}}, the {{QueueingSubscriptionManager.queuedSensorEvents}} are replayed (including the locationAdded event, which is received by the policy)
* The policy executes {{onEvent}}, which triggers (asynchronously) adding the user to the machine.
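
To make the sequence above concrete, here is a toy model of the replay ordering. It is only a sketch: {{QueueingBusSketch}} and {{RebindReplayDemo}} are hypothetical names, events are plain strings, and nothing here is Brooklyn's real API. It shows that when subscriptions are replayed before events, a listener receives an event that was published before it subscribed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Toy model only: hypothetical classes, not Brooklyn's real subscription manager. */
class QueueingBusSketch {
    private final List<Consumer<String>> queuedSubscriptions = new ArrayList<>();
    private final List<String> queuedEvents = new ArrayList<>();

    /** Called while the entity is unmanaged: the event is queued, not delivered. */
    void publish(String event) {
        queuedEvents.add(event);
    }

    /** Called while the entity is unmanaged: the subscription is queued, not activated. */
    void subscribe(Consumer<String> listener) {
        queuedSubscriptions.add(listener);
    }

    /** Activates all subscriptions first, then delivers all events; the original interleaving is lost. */
    void drain() {
        List<Consumer<String>> active = new ArrayList<>(queuedSubscriptions);
        List<String> events = new ArrayList<>(queuedEvents);
        queuedSubscriptions.clear();
        queuedEvents.clear();
        for (String event : events) {
            for (Consumer<String> listener : active) {
                listener.accept(event);
            }
        }
    }
}

class RebindReplayDemo {
    /** Returns the events seen by a listener that subscribed after the publish. */
    static List<String> run() {
        QueueingBusSketch bus = new QueueingBusSketch();
        List<String> received = new ArrayList<>();
        bus.publish("LOCATION_ADDED"); // locations re-added on rebind
        bus.subscribe(received::add);  // policy subscribes afterwards
        bus.drain();                   // entity becomes managed
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this model {{run()}} returns a list containing {{LOCATION_ADDED}}, even though the listener subscribed after the event was published.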

Four ways I see to fix this are:
1. Avoid publishing location-changed events when just re-adding the locations on rebind.
2. Replay the queued subscribe/publish in a smarter way, so they correctly interleave (i.e. events 'published' before the 'subscribe' would not be received)
3. Change when {{policy.setEntity()}} is called, so that it is only done when the entity is managed.
4. Guard against duplicate events inside {{CreateUserPolicy}}.

For (1), that makes sense to me. We already do something similar when re-setting the attributes on rebind (calling {{entity.setAttributeWithoutPublishing()}}).
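
As a rough illustration of (1), the sketch below separates a publishing and a non-publishing way of adding locations. The class and method names ({{LocationHolderSketch}}, {{addLocationsWithoutPublishing}}) are made up for this example and are not existing Brooklyn methods; locations and events are plain strings.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Hypothetical entity model; names are illustrative, not Brooklyn's API. */
class LocationHolderSketch {
    private final List<String> locations = new ArrayList<>();
    private final List<String> emittedEvents = new ArrayList<>();

    /** Normal path: record each new location and emit an event for it. */
    void addLocations(Collection<String> locs) {
        for (String loc : locs) {
            if (addIfAbsent(loc)) {
                emittedEvents.add("LOCATION_ADDED:" + loc);
            }
        }
    }

    /** Rebind path: restore the locations without emitting any event. */
    void addLocationsWithoutPublishing(Collection<String> locs) {
        for (String loc : locs) {
            addIfAbsent(loc);
        }
    }

    /** Returns true only if the location was not already present. */
    private boolean addIfAbsent(String loc) {
        if (locations.contains(loc)) {
            return false;
        }
        locations.add(loc);
        return true;
    }

    List<String> locations() {
        return new ArrayList<>(locations);
    }

    List<String> emittedEvents() {
        return new ArrayList<>(emittedEvents);
    }
}
```

The rebind code would call the non-publishing variant, so no queued event exists for a policy to receive later.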

For (2), it becomes less of an issue if we've done (3), but still would be nice to have as it feels like a more general problem. However, this is fiddly! The subscribe/publish replays are called from different methods. Changing the order may have subtle consequences on other entity/policy implementations!
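
One possible shape for (2), sketched under the assumption that subscribes and publishes could be recorded in a single ordered queue and replayed from one place (which is not how the two replay methods work today). {{OrderedReplaySketch}} is a hypothetical name and events are plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: one ordered log of operations instead of two separate queues. */
class OrderedReplaySketch {
    private final List<Runnable> queuedOperations = new ArrayList<>();
    private final List<Consumer<String>> activeListeners = new ArrayList<>();

    /** Queues the activation of a listener at its original position in the sequence. */
    void subscribe(Consumer<String> listener) {
        queuedOperations.add(() -> activeListeners.add(listener));
    }

    /** Queues a delivery to whichever listeners are active when this operation is replayed. */
    void publish(String event) {
        queuedOperations.add(() -> {
            // Copy so that a listener subscribing during delivery cannot break the iteration.
            for (Consumer<String> listener : new ArrayList<>(activeListeners)) {
                listener.accept(event);
            }
        });
    }

    /** Replays the queued operations in the order they were originally requested. */
    void drain() {
        List<Runnable> operations = new ArrayList<>(queuedOperations);
        queuedOperations.clear();
        for (Runnable operation : operations) {
            operation.run();
        }
    }
}
```

With this ordering, an event published before a subscribe is replayed before the listener is active, so the listener only sees events published after it subscribed.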

For (3), I'll discuss that on the dev@brooklyn mailing list.

For (4), that is the easiest thing to do. But other policies/enrichers could still hit the same problem.
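
A minimal sketch of what the guard in (4) might look like, assuming the policy can remember (and restore on rebind) which machines it has already handled. {{IdempotentUserCreatorSketch}} is hypothetical and is not the real {{CreateUserPolicy}}; machines are identified by plain string ids and the user creation is replaced by a counter.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical duplicate-event guard; not the real CreateUserPolicy. */
class IdempotentUserCreatorSketch {
    // Assumption: these ids would have to be persisted and restored on rebind.
    private final Set<String> handledMachineIds = ConcurrentHashMap.newKeySet();
    private final AtomicInteger usersCreated = new AtomicInteger();

    IdempotentUserCreatorSketch(Set<String> restoredMachineIds) {
        handledMachineIds.addAll(restoredMachineIds);
    }

    /** Handles a location-added event, skipping machines that were already handled. */
    void onLocationAdded(String machineId) {
        // Set.add returns false when the id is already present.
        if (!handledMachineIds.add(machineId)) {
            return;
        }
        usersCreated.incrementAndGet(); // stand-in for adding the user to the machine
    }

    int usersCreated() {
        return usersCreated.get();
    }
}
```

The guard only helps across rebind if the set of handled machines survives persistence; each policy or enricher with the same exposure would need its own equivalent.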


> NPE on rebind calling CreateUserPolicy.addUser
> ----------------------------------------------
>
>                 Key: BROOKLYN-386
>                 URL: https://issues.apache.org/jira/browse/BROOKLYN-386
>             Project: Brooklyn
>          Issue Type: Bug
>            Reporter: Aled Sage
>
> I found this NullPointerException in the log:
> {noformat}
> 2016-09-07 13:50:40,633 INFO  o.a.b.c.m.r.RebindIteration [brooklyn-execmanager-EVQzoN78-0]: Rebind complete (MASTER) in 41.0s: 6 apps, 16 entities, 56 locations, 2 policies, 88 enrichers, 0 feeds, 162 catalog items
> 2016-09-07 13:50:40,633 DEBUG o.a.b.c.m.r.RebindIteration [brooklyn-execmanager-EVQzoN78-0]: RebindManager complete; apps: [fxky5xbx0z, vt864wmzpn, u3ohrxr21o, X0UTBSWZ, sJslLEBo, eb95zYiG]
> 2016-09-07 13:50:40,634 INFO  o.a.b.p.j.os.CreateUserPolicy [brooklyn-execmanager-EVQzoN78-0]: Adding auto-generated user myname @ 1.2.3.4:11071
> 2016-09-07 13:50:40,667 DEBUG o.a.b.c.m.r.RebindManagerImpl [main]: Starting persistence (org.apache.brooklyn.core.mgmt.rebind.RebindManagerImpl@19d095d5[mgmt=EVQzoN78]), mgmt EVQzoN78
> 2016-09-07 13:50:40,668 DEBUG o.a.b.l.j.JcloudsSshMachineLocation [brooklyn-execmanager-EVQzoN78-0]: Problem getting node-metadata for SshMachineLocation[MyVcloudDirector(Test):amp@1.1.1.1/1.1.1.1:11071(id=N1UFSoVb)], node id urn:vcloud:vm:be3270fd-698f-4be3-b855-d379505ac95a (continuing)
> java.lang.NullPointerException: null
>         at org.apache.brooklyn.location.jclouds.JcloudsSshMachineLocation.getOptionalNode(JcloudsSshMachineLocation.java:225) [brooklyn-locations-jclouds-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
>         at org.apache.brooklyn.location.jclouds.JcloudsSshMachineLocation.getOptionalOperatingSystem(JcloudsSshMachineLocation.java:519) [brooklyn-locations-jclouds-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
>         at org.apache.brooklyn.location.jclouds.JcloudsSshMachineLocation.inferMachineDetails(JcloudsSshMachineLocation.java:543) [brooklyn-locations-jclouds-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
>         at org.apache.brooklyn.location.ssh.SshMachineLocation.getMachineDetails(SshMachineLocation.java:1058) [brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
>         at org.apache.brooklyn.policy.jclouds.os.CreateUserPolicy.addUser(CreateUserPolicy.java:145) [brooklyn-locations-jclouds-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
>         at org.apache.brooklyn.policy.jclouds.os.CreateUserPolicy$1.run(CreateUserPolicy.java:114) [brooklyn-locations-jclouds-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
>         at org.apache.brooklyn.util.concurrent.CallableFromRunnable.call(CallableFromRunnable.java:43) [brooklyn-utils-common-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
>         at org.apache.brooklyn.util.core.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:519) [brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_95]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_95]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_95]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_95]
> {noformat}
> It shouldn't try to create the user again on rebind. And we should check to avoid the NPE as well.
> But this is benign, given that we don't want it to be executing the create-user code again anyway.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)