Posted to user@karaf.apache.org by Gareth <ga...@gmail.com> on 2011/09/03 00:48:25 UTC

Configuring Cellar For TCP Instead Of Multicast

Hello all,

I am finally trying out Cellar on two real Linux machines (Red Hat EL 5 VMs).
I have the following features installed on each instance:

[installed  ] [1.9.3          ] hazelcast             repo-0          In memory data grid
[installed  ] [2.2.2          ] cellar                repo-0          Karaf clustering
[installed  ] [2.2.2          ] cellar-webconsole     repo-0          Karaf Cellar Webconsole Plugin
[installed  ] [1.0.0          ] jclouds               repo-0          JClouds
[installed  ] [3.0            ] guice                 repo-0          Google Guice
[installed  ] [3.0.6.RELEASE  ] spring                karaf-2.2.3
[installed  ] [1.2.1          ] spring-dm             karaf-2.2.3
[installed  ] [2.2.3          ] config                karaf-2.2.3
[installed  ] [7.4.5.v20110725] jetty                 karaf-2.2.3
[installed  ] [2.2.3          ] http                  karaf-2.2.3
[installed  ] [2.2.3          ] war                   karaf-2.2.3
[installed  ] [2.2.3          ] webconsole-base       karaf-2.2.3
[installed  ] [2.2.3          ] webconsole            karaf-2.2.3
[installed  ] [2.2.3          ] ssh                   karaf-2.2.3
[installed  ] [2.2.3          ] management            karaf-2.2.3
[installed  ] [5.5.0          ] activemq              activemq-5.5.0
[installed  ] [5.5.0          ] activemq-blueprint    activemq-5.5.0
[installed  ] [5.5.0          ] activemq-web-console  activemq-5.5.0

I installed the webconsole, then activemq, then cellar on both machines. I
started both machines (they are on the same subnet, at 192.168.204.123 and
192.168.204.124), but they could not find each other automatically using
multicast. I suspect a router is blocking it: ifconfig shows that multicast
is available on the network interfaces, and two Karaf Cellar instances on
the same machine do find each other.
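
Before switching discovery modes, it can help to confirm that plain TCP
works between the two hosts at all. The following is only a minimal sketch
of such a check, not part of Cellar or Hazelcast. The function name
can_connect is made up for illustration, and port 5701 is taken from the
Hazelcast log further down in this message.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only proves that the port is
    reachable, not that a Hazelcast node is answering on it.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run on 192.168.204.123, can_connect("192.168.204.124", 5701) would show
whether the other node's port is reachable, and the same call from the
other machine checks the reverse direction.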

Anyway, I changed the org.apache.karaf.cellar.instance configuration to
TCP via the Karaf web console and added the IP addresses (I tried hostnames
as well) of the two machines:

multicastGroup =  224.2.2.3
tcpIpMembers =  192.168.204.123,192.168.204.124
username =  cellar
password =  pass
felix.fileinstall.filename = file:/home/osgi/apache-karaf-2.2.3-felix/etc/org.apache.karaf.cellar.instance.cfg
multicastEnabled =  false
multicastPort =  54327
multicastTimeoutSeconds =  2
tcpIpEnabled =  true
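
A quick way to see whether the file on disk still holds the values shown
above is to parse it and check the discovery settings against each other.
This is a rough sketch under a few assumptions: the file uses simple
"key = value" lines, the property names are the ones listed above, and the
helper names parse_cfg and check_discovery are made up here. The checks
are heuristics, not rules enforced by Cellar.

```python
def parse_cfg(text):
    """Parse simple 'key = value' lines into a dict (comments are skipped)."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_discovery(props):
    """Return a list of likely problems with the discovery settings."""
    problems = []
    multicast = props.get("multicastEnabled", "").lower() == "true"
    tcpip = props.get("tcpIpEnabled", "").lower() == "true"
    members = [m.strip() for m in props.get("tcpIpMembers", "").split(",")
               if m.strip()]
    if not multicast and not tcpip:
        problems.append("no discovery mechanism is enabled")
    if tcpip and not members:
        problems.append("tcpIpEnabled is true but tcpIpMembers is empty")
    return problems
```

Feeding it the contents of etc/org.apache.karaf.cellar.instance.cfg on each
node, right after editing and again a minute later, would show whether the
member list has been emptied in the meantime.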

Cellar still couldn't find its partner. When I looked at the configuration
again a minute later, the tcpIpMembers value had been deleted. When I look
in the log I see the following:

2011-09-02 18:36:12,335 | INFO  | .1.ServiceThread | ClusterManager | dardLoggerFactory$StandardLogger   62 |  -  -  | [cellar]

Members [2] {
	Member [192.168.204.123:5701] this
	Member [192.168.204.124:5701]
}

2011-09-02 18:36:14,345 | DEBUG | hz.1.InThread    | Connection | dardLoggerFactory$StandardLogger   62 |  -  -  | [cellar] Connection lost /192.168.204.124:54468
2011-09-02 18:36:14,346 | WARN  | hz.1.InThread    | ReadHandler | dardLoggerFactory$StandardLogger   62 |  -  -  | [cellar] hz.1.InThread Closing socket to endpoint Address[192.168.204.124:5701], Cause:java.io.EOFException
2011-09-02 18:36:14,346 | INFO  | .1.ServiceThread | ClusterManager | dardLoggerFactory$StandardLogger   62 |  -  -  | [cellar] Removing Address Address[192.168.204.124:5701]
2011-09-02 18:36:14,350 | INFO  | .1.ServiceThread | ClusterManager | dardLoggerFactory$StandardLogger   62 |  -  -  | [cellar]

Members [1] {
	Member [192.168.204.123:5701] this
}

2011-09-02 18:36:14,358 | INFO  | hz.1.InThread    | InSelector | dardLoggerFactory$StandardLogger   62 |  -  -  | [cellar] 5701 is accepting socket connection from /192.168.204.124:34112
2011-09-02 18:36:14,359 | INFO  | hz.1.InThread    | InSelector | dardLoggerFactory$StandardLogger   62 |  -  -  | [cellar] 5701 is accepted socket connection from /192.168.204.124:34112
2011-09-02 18:36:14,471 | DEBUG | Timer-0          | MessageListenerServlet | ageListenerServlet$ClientCleaner  479 | 98 - org.apache.activemq.activemq-web-console - 5.5.0 | Cleaning up expired web clients.
2011-09-02 18:36:17,238 | DEBUG | heckpoint Worker | MessageDatabase | emq.store.kahadb.MessageDatabase 1161 | 81 - org.apache.activemq.activemq-core - 5.5.0 | Checkpoint started.
2011-09-02 18:36:17,243 | DEBUG | heckpoint Worker | MessageDatabase | emq.store.kahadb.MessageDatabase 1280 | 81 - org.apache.activemq.activemq-core - 5.5.0 | Checkpoint done.
2011-09-02 18:36:20,371 | INFO  | .1.ServiceThread | ClusterManager | dardLoggerFactory$StandardLogger   62 |  -  -  | [cellar]

Members [2] {
	Member [192.168.204.123:5701] this
	Member [192.168.204.124:5701]
}

2011-09-02 18:36:22,264 | DEBUG | heckpoint Worker | MessageDatabase | emq.store.kahadb.MessageDatabase 1161 | 81 - org.apache.activemq.activemq-core - 5.5.0 | Checkpoint started.
2011-09-02 18:36:22,268 | DEBUG | heckpoint Worker | MessageDatabase | emq.store.kahadb.MessageDatabase 1280 | 81 - org.apache.activemq.activemq-core - 5.5.0 | Checkpoint done.
2011-09-02 18:36:22,377 | DEBUG | hz.1.InThread    | Connection | dardLoggerFactory$StandardLogger   62 |  -  -  | [cellar] Connection lost /192.168.204.124:34112
2011-09-02 18:36:22,377 | WARN  | hz.1.InThread    | ReadHandler | dardLoggerFactory$StandardLogger   62 |  -  -  | [cellar] hz.1.InThread Closing socket to endpoint Address[192.168.204.124:5701], Cause:java.io.EOFException
2011-09-02 18:36:22,377 | INFO  | .1.ServiceThread | ClusterManager | dardLoggerFactory$StandardLogger   62 |  -  -  | [cellar] Removing Address Address[192.168.204.124:5701]
2011-09-02 18:36:22,382 | INFO  | .1.ServiceThread | ClusterManager | dardLoggerFactory$StandardLogger   62 |  -  -  | [cellar]

Members [1] {
	Member [192.168.204.123:5701] this
}

2011-09-02 18:36:22,390 | INFO  | hz.1.InThread    | InSelector | dardLoggerFactory$StandardLogger   62 |  -  -  | [cellar] 5701 is accepting socket connection from /192.168.204.124:38994
2011-09-02 18:36:22,391 | INFO  | hz.1.InThread    | InSelector | dardLoggerFactory$StandardLogger   62 |  -  -  | [cellar] 5701 is accepted socket connection from /192.168.204.124:38994
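
The join/drop pattern in this log is easier to follow when reduced to a
timeline of cluster sizes. Below is a small sketch that does this. It
assumes the log layout shown above (each entry starts with a timestamp
followed by " |", and membership dumps start with "Members [N] {"). The
function name membership_changes is made up for illustration.

```python
import re

# Timestamp at the start of a log entry, e.g. "2011-09-02 18:36:12,335 |"
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \|")
# Header of a membership dump, e.g. "Members [2] {"
MEMBERS = re.compile(r"^Members \[(\d+)\] \{")


def membership_changes(log_text):
    """Return (timestamp, member_count) for every membership dump."""
    changes = []
    last_ts = None
    for line in log_text.splitlines():
        line = line.strip()
        ts = TIMESTAMP.match(line)
        if ts:
            # Remember the most recent entry timestamp.
            last_ts = ts.group(1)
            continue
        members = MEMBERS.match(line)
        if members:
            changes.append((last_ts, int(members.group(1))))
    return changes
```

Applied to the log above, this gives cluster sizes 2, 1, 2, 1 at 18:36:12,
18:36:14, 18:36:20 and 18:36:22: the second node joins and is dropped
again about two seconds later, twice in a row.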

I am sure I have missed something obvious here (though I haven't found it
yet in the docs). Anything obvious I have messed up?

thanks in advance,
Gareth

--
View this message in context: http://karaf.922171.n3.nabble.com/Configuring-Cellar-For-TCP-Instead-Of-Multicast-tp3305497p3305497.html
Sent from the Karaf - User mailing list archive at Nabble.com.

Re: Configuring Cellar For TCP Instead Of Multicast

Posted by Ioannis Canellos <io...@gmail.com>.
This was a bug that was introduced after implementing cloud discovery.
I fixed it both on trunk and on the 2.2.x branch.

Here is the Jira:
https://issues.apache.org/jira/browse/KARAF-856


-- 
Ioannis Canellos
http://iocanel.blogspot.com

Apache Karaf <http://karaf.apache.org/> Committer & PMC
Apache ServiceMix <http://servicemix.apache.org/> Committer
Apache Gora <http://incubator.apache.org/gora/> Committer

Re: Configuring Cellar For TCP Instead Of Multicast

Posted by Gareth <ga...@gmail.com>.
Hello Ioannis,

I tried again with an even more basic configuration (base karaf +
features:install cellar):

State         Version           Name       Repository   Description
[installed  ] [1.9.3          ] hazelcast  repo-0       In memory data grid
[installed  ] [2.2.2          ] cellar     repo-0       Karaf clustering
[installed  ] [1.0.0          ] jclouds    repo-0       JClouds
[installed  ] [3.0            ] guice      repo-0       Google Guice
[installed  ] [3.0.6.RELEASE  ] spring     karaf-2.2.3
[installed  ] [1.2.1          ] spring-dm  karaf-2.2.3
[installed  ] [2.2.3          ] config     karaf-2.2.3
[installed  ] [2.2.3          ] ssh        karaf-2.2.3
[installed  ] [2.2.3          ] management karaf-2.2.3

I still had the same problem. In fact, the problem got even worse. As soon
as I turn multicast off in the file and restart Karaf, the Cellar Features
bundle hangs on startup:

[  67] [Active     ] [Created     ] [       ] [   60] Apache Karaf :: Cellar :: Core (2.2.2)
[  68] [Active     ] [Created     ] [       ] [   60] Apache Karaf :: Cellar :: Config (2.2.2)
[  69] [Active     ] [GracePeriod ] [       ] [   60] Apache Karaf :: Cellar :: Features (2.2.2)
[  70] [Active     ] [Created     ] [       ] [   60] Apache Karaf :: Cellar :: Bundle (2.2.2)
[  71] [Active     ] [Created     ] [       ] [   60] Apache Karaf :: Cellar :: Utils (2.2.2)
[  72] [Active     ] [Created     ] [       ] [   60] Apache Karaf :: Cellar :: Shell (2.2.2)
[  73] [Active     ] [            ] [Started] [   60] Apache Karaf :: Cellar :: Hazelcast (2.2.2)
[  74] [Active     ] [Created     ] [       ] [   60] Apache Karaf :: Cellar :: Management (2.2.2)
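
For reference, a Blueprint state of GracePeriod means the bundle's
Blueprint container is still waiting for a mandatory service reference to
become available. On a longer bundle list, such rows can be picked out
with a short script. This is only a sketch: it assumes each row sits on a
single line in the bracketed layout shown above, that the third bracketed
column is the Blueprint state, and the name waiting_bundles is made up.

```python
import re

# One row of the bundle list: [id] [state] [blueprint] [spring] [level] name
ROW = re.compile(
    r"^\[\s*(\d+)\] \[([^\]]*)\] \[([^\]]*)\] \[([^\]]*)\] \[\s*(\d+)\] (.+)$"
)


def waiting_bundles(listing):
    """Return (bundle_id, name) for rows whose Blueprint column is GracePeriod."""
    waiting = []
    for line in listing.splitlines():
        row = ROW.match(line.strip())
        if row and row.group(3).strip() == "GracePeriod":
            waiting.append((int(row.group(1)), row.group(6).strip()))
    return waiting
```

On the listing above, it would report only bundle 69, the Cellar Features
bundle.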

I have to shut Karaf down with "kill -9" (as shutdown doesn't work). Even if
I now turn multicast back on in the instance config file and restart, the
"Cellar :: Features" bundle continues to hang. I don't see anything in the
logs that points to the issue. I only see this on shutdown:

2011-09-06 15:29:23,259 | DEBUG | lixDispatchQueue | framework | ? ? | 0 - org.apache.felix.framework - 3.0.9 | BundleEvent STOPPED
2011-09-06 15:29:23,259 | DEBUG | nt Dispatcher: 1 | BlueprintListener | raf.shell.osgi.BlueprintListener   85 | 30 - org.apache.karaf.shell.osgi - 2.2.3 | Blueprint app state changed to Destroying for bundle 74
2011-09-06 15:29:23,259 | WARN  | hz.UDP.Sender    | ManagementCenterService | dardLoggerFactory$StandardLogger   62 |  -  -  | [cellar] sleep interrupted
java.lang.InterruptedException: sleep interrupted
        at java.lang.Thread.sleep(Native Method)[:1.6.0_27]
        at com.hazelcast.impl.management.ManagementCenterService$UDPSender.run(ManagementCenterService.java:212)[42:hazelcast:1.9.3]
2011-09-06 15:29:23,260 | DEBUG | FelixStartLevel  | management | ? ? | 74 - org.apache.karaf.cellar.management - 2.2.2 | ServiceEvent UNREGISTERING
2011-09-06 15:29:23,261 | DEBUG | FelixStartLevel  | ReferenceRecipe | eprint.container.ReferenceRecipe  152 | 9 - org.apache.aries.blueprint - 0.3.1 | Unbinding reference mbeanServer

If you could let me know what else I can try, it would be much appreciated.

thanks in advance,
Gareth





Re: Configuring Cellar For TCP Instead Of Multicast

Posted by Gareth <ga...@gmail.com>.
Hello Ioannis,

I did try Jean-Baptiste's suggestion. I shut both instances down, added the
tcpIpMembers to etc/org.apache.karaf.cellar.instance.cfg, started both
instances up and it still didn't work. When I checked the 
etc/org.apache.karaf.cellar.instance.cfg configuration files, the
tcpIpMembers field was empty:

tcpIpMembers = 

Just to make sure I didn't miss something, I repeated the test with quotes
around the tcpIpMembers:

tcpIpMembers="192.168.204.123,192.168.204.124"

with the same result.

I haven't installed the cellar-cloud feature. I do notice another weird
thing though...which I am not sure is related. I do see lots of duplicate
fileinstall instances build up (see attached screenshot).

One more minor point: I will want to set up an Apache ActiveMQ broker for
each Karaf instance, and each broker will need a different configuration.
By default, when I create a broker via activemq:create-broker, a broker
blueprint configuration file is dropped into the deploy directory. Cellar
automatically tries to replicate anything in the deploy directory, doesn't
it? What is the correct way to stop it from doing this?

Anyway, I guess I will try again with as few features installed as possible
to see if I can get it to work. Any suggestions on what to try next would be
much appreciated.

thanks again,
Gareth

http://karaf.922171.n3.nabble.com/file/n3314332/Screen_shot_2011-09-06_at_1.53.40_PM.png 


Re: Configuring Cellar For TCP Instead Of Multicast

Posted by Ioannis Canellos <io...@gmail.com>.
>
> did you restart your Karaf instance after changing the Cellar configuration
> in etc file ?


This is not necessary. Cellar should automatically receive the configuration
change event and restart the Hazelcast instance.

Gareth, have you also installed the cellar-cloud module? It's the only
module that could add or remove tcpIpMembers, and what you are describing
sounds like a bug in that module.

-- 
Ioannis Canellos
http://iocanel.blogspot.com

Apache Karaf <http://karaf.apache.org/> Committer & PMC
Apache ServiceMix <http://servicemix.apache.org/> Committer
Apache Gora <http://incubator.apache.org/gora/> Committer

Re: Configuring Cellar For TCP Instead Of Multicast

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi Gareth,

did you restart your Karaf instance after changing the Cellar 
configuration in etc file ?

Regards
JB


-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com