Posted to issues@karaf.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/04/22 20:49:59 UTC

[jira] [Commented] (KARAF-1842) cellar will show a distributed service as available, after the Node exposing it has shutdown unexpectedly

    [ https://issues.apache.org/jira/browse/KARAF-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507648#comment-14507648 ] 

ASF GitHub Bot commented on KARAF-1842:
---------------------------------------

GitHub user FlavioF opened a pull request:

    https://github.com/apache/karaf-cellar/pull/6

    [KARAF-1842] Implemented Service Tracker to remove unavailable distributed service.

    This PR implements a Service Tracker that removes unavailable distributed services from the list of available endpoints. It was implemented using DiscoveryTask as a reference.
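
The pruning idea described above can be sketched as follows. This is a minimal illustration and not the code from the pull request: the class `StaleEndpointPruner`, the `prune` method, and the use of plain `String` node IDs are all hypothetical stand-ins for whatever Cellar uses for `endpointDescription.getNodes()` and `clusterManager.listNodes()`.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only; not part of Cellar.
class StaleEndpointPruner {

    /**
     * Removes node IDs that are no longer in the cluster from every endpoint,
     * and drops endpoints that are left without any node.
     *
     * @param endpoints   endpoint ID mapped to the IDs of the nodes exposing it (modified in place)
     * @param liveNodeIds IDs of the nodes currently known to the cluster
     */
    static void prune(Map<String, Set<String>> endpoints, Set<String> liveNodeIds) {
        Iterator<Map.Entry<String, Set<String>>> it = endpoints.entrySet().iterator();
        while (it.hasNext()) {
            Set<String> nodes = it.next().getValue();
            // Keep only the nodes that are still alive.
            nodes.retainAll(liveNodeIds);
            if (nodes.isEmpty()) {
                // No live node exposes this service any more: stop advertising it.
                it.remove();
            }
        }
    }
}
```

A tracker task would presumably call something like `prune` periodically. Note that the comparison only works if both sides build node IDs the same way.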
    
    ###### Modifications in `org/apache/karaf/cellar/hazelcast/HazelcastClusterManager`, `org/apache/karaf/cellar/hazelcast/HazelcastGroupManager` and `org/apache/karaf/cellar/hazelcast/HazelcastInstanceAware`
    The Hazelcast Cellar bundle now uses `getSocketAddress().getHostString()` instead of `getInetSocketAddress().getHostName()` when populating Cellar nodes, since `getHostName()` "may trigger a name service reverse lookup if the address was created with a literal IP address". This led to different node IDs depending on whether the node was running locally (a reverse lookup happens) or not (no reverse lookup happens). As a result, it was impossible to compare remote endpoint node IDs (`endpointDescription.getNodes()`) with cluster manager node IDs (`clusterManager.listNodes()`).
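
The difference between the two accessors can be seen with the standard `java.net.InetSocketAddress` class alone. The sketch below is not Cellar code: the class `NodeIdDemo`, the `host:port` ID format, and the port `5701` are assumptions chosen for illustration.

```java
import java.net.InetSocketAddress;

// Hypothetical demo class, for illustration only; not part of Cellar.
class NodeIdDemo {

    // Builds an ID in an assumed "host:port" format without any name lookup.
    static String nodeId(InetSocketAddress address) {
        // getHostString() returns the hostname, or the literal IP the address
        // was created with, and never attempts a reverse lookup.
        return address.getHostString() + ":" + address.getPort();
    }

    public static void main(String[] args) {
        InetSocketAddress literal = new InetSocketAddress("127.0.0.1", 5701);

        // Stable: always the literal the address was created with.
        System.out.println("getHostString: " + literal.getHostString());

        // May trigger a reverse lookup, so the result depends on the
        // name service configuration of the machine running this code.
        System.out.println("getHostName:   " + literal.getHostName());

        System.out.println("node ID:       " + nodeId(literal));
    }
}
```

Because `getHostString()` does not depend on the local name service, every node derives the same ID from the same address.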
    
    ###### Modifications in `org/apache/karaf/cellar/hazelcast/HazelcastClusterManager`
    `org/apache/karaf/cellar/hazelcast/QueueConsumer.run` could throw exceptions such as OperationTimeoutException. This could kill the QueueConsumer task and leave the Karaf container useless, since it could no longer communicate. Therefore, the call to `getQueue().poll(10, TimeUnit.SECONDS)` now catches every Exception to ensure that the QueueConsumer task never dies.
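
The pattern of guarding the poll so the consumer loop survives a failure might look like the sketch below. It is not the actual `QueueConsumer`: the class `ResilientConsumer`, its handler callback, the short poll timeout, and the simulated failure are all hypothetical, and a plain `java.util.concurrent.BlockingQueue` stands in for the Hazelcast queue. Unlike the change described above, this sketch treats `InterruptedException` separately so the thread can still be shut down cleanly.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical consumer, for illustration only; not the Cellar QueueConsumer.
class ResilientConsumer implements Runnable {
    private final BlockingQueue<String> queue;
    private final Consumer<String> handler;
    private volatile boolean running = true;

    ResilientConsumer(BlockingQueue<String> queue, Consumer<String> handler) {
        this.queue = queue;
        this.handler = handler;
    }

    void stop() {
        running = false;
    }

    @Override
    public void run() {
        while (running) {
            try {
                String event = queue.poll(100, TimeUnit.MILLISECONDS);
                if (event != null) {
                    handler.accept(event);
                }
            } catch (InterruptedException e) {
                // Restore the interrupt flag and exit: someone asked us to stop.
                Thread.currentThread().interrupt();
                return;
            } catch (Exception e) {
                // Any other failure is logged and the loop keeps polling.
                System.err.println("poll failed, retrying: " + e);
            }
        }
    }

    // Runs the consumer against a queue whose first poll fails, and returns
    // the events that were handled anyway.
    static List<String> demo() {
        BlockingQueue<String> flaky = new LinkedBlockingQueue<String>() {
            private boolean failedOnce;

            @Override
            public String poll(long timeout, TimeUnit unit) throws InterruptedException {
                if (!failedOnce) {
                    failedOnce = true;
                    throw new IllegalStateException("simulated timeout");
                }
                return super.poll(timeout, unit);
            }
        };
        flaky.add("event-1");

        List<String> handled = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(1);
        ResilientConsumer consumer = new ResilientConsumer(flaky, event -> {
            handled.add(event);
            done.countDown();
        });
        Thread worker = new Thread(consumer);
        worker.start();
        try {
            done.await(5, TimeUnit.SECONDS);
            consumer.stop();
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handled;
    }
}
```

In `demo()`, the first poll throws, the loop logs the failure and polls again, and the queued event is still delivered to the handler.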
    
    All of these changes are needed for the Service Tracker to work.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/FlavioF/karaf-cellar feature_add_service_tracker_to_unregist_remove_service

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/karaf-cellar/pull/6.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #6
    
----
commit de8162fba4542a53050eccf16908981c54131ab6
Author: Flávio Ferreira <ff...@bikeemotion.com>
Date:   2015-04-22T17:46:42Z

    [KARAF-1842] Implemented Service Tracker to remove unavailable distributed service.

----


> cellar will show a distributed service as available, after the Node exposing it has shutdown unexpectedly 
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: KARAF-1842
>                 URL: https://issues.apache.org/jira/browse/KARAF-1842
>             Project: Karaf
>          Issue Type: Bug
>          Components: cellar-dosgi
>    Affects Versions: cellar-2.2.4
>            Reporter: Alexandre Barbosa
>            Assignee: Jean-Baptiste Onofré
>            Priority: Minor
>
> In Cellar, if a Node exposing a distributed OSGi service shuts down unexpectedly, the exposed service is still shown as available, as can be seen by running the command "cluster:service-list" on the other Nodes of the Cluster.
> So if there are no other implementations of the service exposed in the Cluster, calls to this "dead" service will still be made, resulting in a NullPointerException in the bundles invoking it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)