Posted to user@karaf.apache.org by Gareth <ga...@gmail.com> on 2011/09/03 00:48:25 UTC
Configuring Cellar For TCP Instead Of Multicast
Hello all,
I am finally trying out Cellar on two real Linux machines (Red Hat EL 5 VMs).
I have the following software features installed on each instance:
[installed ] [1.9.3          ] hazelcast             repo-0          In memory data grid
[installed ] [2.2.2          ] cellar                repo-0          Karaf clustering
[installed ] [2.2.2          ] cellar-webconsole     repo-0          Karaf Cellar Webconsole Plugin
[installed ] [1.0.0          ] jclouds               repo-0          JClouds
[installed ] [3.0            ] guice                 repo-0          Google Guice
[installed ] [3.0.6.RELEASE  ] spring                karaf-2.2.3
[installed ] [1.2.1          ] spring-dm             karaf-2.2.3
[installed ] [2.2.3          ] config                karaf-2.2.3
[installed ] [7.4.5.v20110725] jetty                 karaf-2.2.3
[installed ] [2.2.3          ] http                  karaf-2.2.3
[installed ] [2.2.3          ] war                   karaf-2.2.3
[installed ] [2.2.3          ] webconsole-base       karaf-2.2.3
[installed ] [2.2.3          ] webconsole            karaf-2.2.3
[installed ] [2.2.3          ] ssh                   karaf-2.2.3
[installed ] [2.2.3          ] management            karaf-2.2.3
[installed ] [5.5.0          ] activemq              activemq-5.5.0
[installed ] [5.5.0          ] activemq-blueprint    activemq-5.5.0
[installed ] [5.5.0          ] activemq-web-console  activemq-5.5.0
I installed the webconsole, then activemq, then cellar on both machines. I
started both machines up (they are on the same subnet, 192.168.204.123
and 192.168.204.124), but they couldn't find each other automatically using
multicast. It must be blocked by the router: ifconfig shows that multicast is
available on the network interfaces, and two Karaf Cellar instances
on the same machine do find each other.
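One way to confirm that multicast traffic really is being dropped between the two hosts, independently of Cellar and Hazelcast, is a small sender/listener pair. The following is only a minimal Python sketch and not part of Karaf or Cellar. The group 224.2.2.3 and port 54327 are the multicastGroup and multicastPort values from the Cellar instance configuration quoted in this thread; the function names are made up for illustration.

```python
import socket
import struct

GROUP = "224.2.2.3"  # multicastGroup value from the Cellar config in this thread
PORT = 54327         # multicastPort value from the same config


def membership_request(group, interface="0.0.0.0"):
    """Build the ip_mreq structure used to join a multicast group."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def listen(group=GROUP, port=PORT, timeout=10.0):
    """Wait for one datagram sent to the group; return (payload, sender) or None."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    sock.settimeout(timeout)
    try:
        return sock.recvfrom(1024)
    except socket.timeout:
        return None
    finally:
        sock.close()


def send(message=b"ping", group=GROUP, port=PORT, ttl=1):
    """Send one datagram to the group; ttl=1 keeps it on the local subnet."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    sock.sendto(message, (group, port))
    sock.close()
```

Running listen() on one machine and send() on the other, preferably while Karaf is stopped so nothing else is using the port, gives a rough answer: if listen() returns None across hosts but succeeds when both calls run on the same machine, that is consistent with multicast being filtered somewhere on the network.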
Anyway, I changed the org.apache.karaf.cellar.instance configuration to
TCP via the Karaf web console and added the IP addresses (I tried hostnames
as well) for the two machines:
multicastGroup = 224.2.2.3
tcpIpMembers = 192.168.204.123,192.168.204.124
username = cellar
password = pass
felix.fileinstall.filename = file:/home/osgi/apache-karaf-2.2.3-felix/etc/org.apache.karaf.cellar.instance.cfg
multicastEnabled = false
multicastPort = 54327
multicastTimeoutSeconds = 2
tcpIpEnabled = true
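With settings like these, TCP discovery only has something to connect to while tcpIpMembers is non-empty, so it can help to check the stored configuration mechanically. The following is a rough Python sketch under the assumption that the configuration is plain "key = value" text using the property names shown above; parse_cfg and discovery_problems are hypothetical helper names, not Cellar APIs.

```python
def parse_cfg(text):
    """Parse simple 'key = value' lines; skip blanks, comments and wrapped lines."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def discovery_problems(props):
    """Return human-readable problems with the discovery settings, if any."""
    problems = []
    tcp = props.get("tcpIpEnabled", "false").lower() == "true"
    mcast = props.get("multicastEnabled", "false").lower() == "true"
    members = [m.strip() for m in props.get("tcpIpMembers", "").split(",")
               if m.strip()]
    if tcp and not members:
        problems.append("tcpIpEnabled is true but tcpIpMembers is empty")
    if not tcp and not mcast:
        problems.append("neither multicast nor TCP/IP discovery is enabled")
    return problems
```

Feeding it the contents of etc/org.apache.karaf.cellar.instance.cfg before and after Karaf has been running for a minute would show whether the member list is being emptied behind one's back.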
Cellar still couldn't find its partner. When I looked at the configuration
again a minute later, the tcpIpMembers value had been deleted. When I look in
the log I see the following:
2011-09-02 18:36:12,335 | INFO | .1.ServiceThread | ClusterManager | dardLoggerFactory$StandardLogger 62 | - - | [cellar]
Members [2] {
Member [192.168.204.123:5701] this
Member [192.168.204.124:5701]
}
2011-09-02 18:36:14,345 | DEBUG | hz.1.InThread | Connection | dardLoggerFactory$StandardLogger 62 | - - | [cellar] Connection lost /192.168.204.124:54468
2011-09-02 18:36:14,346 | WARN | hz.1.InThread | ReadHandler | dardLoggerFactory$StandardLogger 62 | - - | [cellar] hz.1.InThread Closing socket to endpoint Address[192.168.204.124:5701], Cause:java.io.EOFException
2011-09-02 18:36:14,346 | INFO | .1.ServiceThread | ClusterManager | dardLoggerFactory$StandardLogger 62 | - - | [cellar] Removing Address Address[192.168.204.124:5701]
2011-09-02 18:36:14,350 | INFO | .1.ServiceThread | ClusterManager | dardLoggerFactory$StandardLogger 62 | - - | [cellar]
Members [1] {
Member [192.168.204.123:5701] this
}
2011-09-02 18:36:14,358 | INFO | hz.1.InThread | InSelector | dardLoggerFactory$StandardLogger 62 | - - | [cellar] 5701 is accepting socket connection from /192.168.204.124:34112
2011-09-02 18:36:14,359 | INFO | hz.1.InThread | InSelector | dardLoggerFactory$StandardLogger 62 | - - | [cellar] 5701 is accepted socket connection from /192.168.204.124:34112
2011-09-02 18:36:14,471 | DEBUG | Timer-0 | MessageListenerServlet | ageListenerServlet$ClientCleaner 479 | 98 - org.apache.activemq.activemq-web-console - 5.5.0 | Cleaning up expired web clients.
2011-09-02 18:36:17,238 | DEBUG | heckpoint Worker | MessageDatabase | emq.store.kahadb.MessageDatabase 1161 | 81 - org.apache.activemq.activemq-core - 5.5.0 | Checkpoint started.
2011-09-02 18:36:17,243 | DEBUG | heckpoint Worker | MessageDatabase | emq.store.kahadb.MessageDatabase 1280 | 81 - org.apache.activemq.activemq-core - 5.5.0 | Checkpoint done.
2011-09-02 18:36:20,371 | INFO | .1.ServiceThread | ClusterManager | dardLoggerFactory$StandardLogger 62 | - - | [cellar]
Members [2] {
Member [192.168.204.123:5701] this
Member [192.168.204.124:5701]
}
2011-09-02 18:36:22,264 | DEBUG | heckpoint Worker | MessageDatabase | emq.store.kahadb.MessageDatabase 1161 | 81 - org.apache.activemq.activemq-core - 5.5.0 | Checkpoint started.
2011-09-02 18:36:22,268 | DEBUG | heckpoint Worker | MessageDatabase | emq.store.kahadb.MessageDatabase 1280 | 81 - org.apache.activemq.activemq-core - 5.5.0 | Checkpoint done.
2011-09-02 18:36:22,377 | DEBUG | hz.1.InThread | Connection | dardLoggerFactory$StandardLogger 62 | - - | [cellar] Connection lost /192.168.204.124:34112
2011-09-02 18:36:22,377 | WARN | hz.1.InThread | ReadHandler | dardLoggerFactory$StandardLogger 62 | - - | [cellar] hz.1.InThread Closing socket to endpoint Address[192.168.204.124:5701], Cause:java.io.EOFException
2011-09-02 18:36:22,377 | INFO | .1.ServiceThread | ClusterManager | dardLoggerFactory$StandardLogger 62 | - - | [cellar] Removing Address Address[192.168.204.124:5701]
2011-09-02 18:36:22,382 | INFO | .1.ServiceThread | ClusterManager | dardLoggerFactory$StandardLogger 62 | - - | [cellar]
Members [1] {
Member [192.168.204.123:5701] this
}
2011-09-02 18:36:22,390 | INFO | hz.1.InThread | InSelector | dardLoggerFactory$StandardLogger 62 | - - | [cellar] 5701 is accepting socket connection from /192.168.204.124:38994
2011-09-02 18:36:22,391 | INFO | hz.1.InThread | InSelector | dardLoggerFactory$StandardLogger 62 | - - | [cellar] 5701 is accepted socket connection from /192.168.204.124:38994
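The pattern in this log is easier to see when it is reduced to the membership counts and the addresses being removed. Below is a small Python sketch for doing that, assuming log text in the form shown above ("Members [N] {" blocks and "Removing Address Address[...]" messages); the function names are invented for illustration and nothing here is a Hazelcast or Cellar API.

```python
import re

# "Members [2] {" marks each cluster view the ClusterManager logs.
MEMBERS_RE = re.compile(r"Members \[(\d+)\] \{")
# The address may be wrapped onto the next line, so allow any whitespace.
REMOVED_RE = re.compile(r"Removing Address\s+Address\[([^\]]+)\]")


def member_counts(log_text):
    """Return the sequence of cluster sizes reported in the log."""
    return [int(n) for n in MEMBERS_RE.findall(log_text)]


def removed_addresses(log_text):
    """Return every member address the cluster manager removed, in order."""
    return REMOVED_RE.findall(log_text)
```

Applied to the excerpt above, member_counts should give [2, 1, 2, 1] and removed_addresses should list 192.168.204.124:5701 twice, i.e. the second node joins, is dropped about two seconds later, and then reconnects.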
I am sure I have missed something obvious here (though I haven't found it
yet in the docs). Anything obvious I have messed up?
thanks in advance,
Gareth
--
View this message in context: http://karaf.922171.n3.nabble.com/Configuring-Cellar-For-TCP-Instead-Of-Multicast-tp3305497p3305497.html
Sent from the Karaf - User mailing list archive at Nabble.com.
Re: Configuring Cellar For TCP Instead Of Multicast
Posted by Ioannis Canellos <io...@gmail.com>.
This was a bug that was introduced after implementing cloud discovery.
I fixed it on both trunk and the 2.2.x branch.
Here is the Jira:
https://issues.apache.org/jira/browse/KARAF-856
--
Ioannis Canellos
http://iocanel.blogspot.com
Apache Karaf <http://karaf.apache.org/> Committer & PMC
Apache ServiceMix <http://servicemix.apache.org/> Committer
Apache Gora <http://incubator.apache.org/gora/> Committer
Re: Configuring Cellar For TCP Instead Of Multicast
Posted by Gareth <ga...@gmail.com>.
Hello Ioannis,
I tried again with an even more basic configuration (base karaf +
features:install cellar):
State         Version           Name        Repository   Description
[installed ] [1.9.3          ] hazelcast   repo-0       In memory data grid
[installed ] [2.2.2          ] cellar      repo-0       Karaf clustering
[installed ] [1.0.0          ] jclouds     repo-0       JClouds
[installed ] [3.0            ] guice       repo-0       Google Guice
[installed ] [3.0.6.RELEASE  ] spring      karaf-2.2.3
[installed ] [1.2.1          ] spring-dm   karaf-2.2.3
[installed ] [2.2.3          ] config      karaf-2.2.3
[installed ] [2.2.3          ] ssh         karaf-2.2.3
[installed ] [2.2.3          ] management  karaf-2.2.3
I still had the same problem. In fact, the problem got even worse. As
soon as you turn multicast off in the file and restart Karaf, the Cellar
features bundle hangs on startup:
[ 67] [Active ] [Created    ] [       ] [ 60] Apache Karaf :: Cellar :: Core (2.2.2)
[ 68] [Active ] [Created    ] [       ] [ 60] Apache Karaf :: Cellar :: Config (2.2.2)
[ 69] [Active ] [GracePeriod] [       ] [ 60] Apache Karaf :: Cellar :: Features (2.2.2)
[ 70] [Active ] [Created    ] [       ] [ 60] Apache Karaf :: Cellar :: Bundle (2.2.2)
[ 71] [Active ] [Created    ] [       ] [ 60] Apache Karaf :: Cellar :: Utils (2.2.2)
[ 72] [Active ] [Created    ] [       ] [ 60] Apache Karaf :: Cellar :: Shell (2.2.2)
[ 73] [Active ] [           ] [Started] [ 60] Apache Karaf :: Cellar :: Hazelcast (2.2.2)
[ 74] [Active ] [Created    ] [       ] [ 60] Apache Karaf :: Cellar :: Management (2.2.2)
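A Blueprint state of GracePeriod generally means the Blueprint container for that bundle is still waiting for a mandatory dependency (typically a service reference) to become available. On an instance with many bundles, a quick filter over the listing makes such bundles easy to spot. This is only a Python sketch; it assumes the column order ID, State, Blueprint, Spring, Level, Name, which matches the listing above but is not confirmed by a header in this thread, and waiting_bundles is a hypothetical helper name.

```python
import re

# One row of the bundle listing: [ID] [State] [Blueprint] [Spring] [Level] Name
ROW_RE = re.compile(
    r"\[\s*(\d+)\]\s*"      # bundle id
    r"\[\s*(\w+)\s*\]\s*"   # OSGi state, e.g. Active
    r"\[\s*(\w*)\s*\]\s*"   # Blueprint state, may be blank
    r"\[\s*(\w*)\s*\]\s*"   # Spring state, may be blank
    r"\[\s*(\d+)\]\s*"      # start level
    r"(.*)"                 # bundle name (rest of the line)
)


def waiting_bundles(listing):
    """Return (id, name) for every bundle whose Blueprint state is GracePeriod."""
    result = []
    for match in ROW_RE.finditer(listing):
        bundle_id, _state, blueprint, _spring, _level, name = match.groups()
        if blueprint == "GracePeriod":
            result.append((int(bundle_id), name.strip()))
    return result
```

If the listing is pasted with wrapped lines, the name will be cut at the line break, but the bundle id is still reported, which is enough to look the bundle up in the console.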
I have to shut Karaf down with "kill -9" (as shutdown doesn't work). Even if
I now turn multicast back on in the instance config file and restart, the
"Cellar :: Features" bundle continues to hang. I don't see anything in the
logs that points to the issue. I only see this on shutdown:
2011-09-06 15:29:23,259 | DEBUG | lixDispatchQueue | framework | ? ? | 0 - org.apache.felix.framework - 3.0.9 | BundleEvent STOPPED
2011-09-06 15:29:23,259 | DEBUG | nt Dispatcher: 1 | BlueprintListener | raf.shell.osgi.BlueprintListener 85 | 30 - org.apache.karaf.shell.osgi - 2.2.3 | Blueprint app state changed to Destroying for bundle 74
2011-09-06 15:29:23,259 | WARN | hz.UDP.Sender | ManagementCenterService | dardLoggerFactory$StandardLogger 62 | - - | [cellar] sleep interrupted
java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)[:1.6.0_27]
at com.hazelcast.impl.management.ManagementCenterService$UDPSender.run(ManagementCenterService.java:212)[42:hazelcast:1.9.3]
2011-09-06 15:29:23,260 | DEBUG | FelixStartLevel | management | ? ? | 74 - org.apache.karaf.cellar.management - 2.2.2 | ServiceEvent UNREGISTERING
2011-09-06 15:29:23,261 | DEBUG | FelixStartLevel | ReferenceRecipe | eprint.container.ReferenceRecipe 152 | 9 - org.apache.aries.blueprint - 0.3.1 | Unbinding reference mbeanServer
If you could let me know what else I can try, it would be much appreciated.
thanks in advance,
Gareth
--
View this message in context: http://karaf.922171.n3.nabble.com/Configuring-Cellar-For-TCP-Instead-Of-Multicast-tp3305497p3314614.html
Re: Configuring Cellar For TCP Instead Of Multicast
Posted by Gareth <ga...@gmail.com>.
Hello Ioannis,
I did try Jean-Baptiste's suggestion. I shut both instances down, added the
tcpIpMembers to etc/org.apache.karaf.cellar.instance.cfg, started both
instances up and it still didn't work. When I checked the
etc/org.apache.karaf.cellar.instance.cfg configuration files, the
tcpIpMembers field was empty:
tcpIpMembers =
Just to make sure I didn't miss something, I repeated the test with quotes
around the tcpIpMembers:
tcpIpMembers="192.168.204.123,192.168.204.124"
with the same result.
I haven't installed the cellar-cloud feature. I do notice another weird
thing though, which I am not sure is related: lots of duplicate
fileinstall instances build up (see attached screenshot).
One more minor point. I will want to set up an Apache
ActiveMQ instance for each Karaf instance, and each ActiveMQ instance will need
a different configuration. By default, when I create an
ActiveMQ instance via activemq:create-broker, a broker blueprint
configuration file is dropped in the deploy directory. Cellar
automatically tries to replicate anything in the deploy directory, doesn't
it? What is the correct way of stopping it from doing this?
Anyway, I guess I will try again with as few features installed as possible
to see if I can get it to work. Any suggestions on what to try next would be
much appreciated.
thanks again,
Gareth
http://karaf.922171.n3.nabble.com/file/n3314332/Screen_shot_2011-09-06_at_1.53.40_PM.png
--
View this message in context: http://karaf.922171.n3.nabble.com/Configuring-Cellar-For-TCP-Instead-Of-Multicast-tp3305497p3314332.html
Re: Configuring Cellar For TCP Instead Of Multicast
Posted by Ioannis Canellos <io...@gmail.com>.
>
> did you restart your Karaf instance after changing the Cellar configuration
> in etc file ?
This is not necessary. Cellar should automatically get the configuration
change event and restart the Hazelcast instance.
Gareth, have you also installed the cellar-cloud module? It's the only
module that could add or remove tcpIpMembers, and what you are describing sounds
like a bug in that module.
--
Ioannis Canellos
http://iocanel.blogspot.com
Apache Karaf <http://karaf.apache.org/> Committer & PMC
Apache ServiceMix <http://servicemix.apache.org/> Committer
Apache Gora <http://incubator.apache.org/gora/> Committer
Re: Configuring Cellar For TCP Instead Of Multicast
Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi Gareth,
Did you restart your Karaf instance after changing the Cellar
configuration in the etc file?
Regards
JB
--
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com