Posted to dev@servicemix.apache.org by "Guillaume Nodet (JIRA)" <ji...@apache.org> on 2009/01/09 18:44:59 UTC

[jira] Created: (SMX4KNL-169) Use the start level to implement the container level locking

Use the start level to implement the container level locking
------------------------------------------------------------

                 Key: SMX4KNL-169
                 URL: https://issues.apache.org/activemq/browse/SMX4KNL-169
             Project: ServiceMix Kernel
          Issue Type: Improvement
            Reporter: Guillaume Nodet
             Fix For: 1.1.0


This should allow hot fail-over with all bundles being already installed and ready to be started.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SMX4KNL-169) Use the start level to implement the container level locking

Posted by "Jamie Goodyear (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/SMX4KNL-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=48704#action_48704 ] 

Jamie Goodyear commented on SMX4KNL-169:
----------------------------------------


h5. Test Case:

# Install Kernel.
# Start Kernel.
# Execute: servicemix admin > create test1
# Execute: servicemix admin > create test2
# Exit from initial Kernel installation.
# Edit the $HOME/etc/system.properties files of test1 and test2 to include lock=true, lock.level=50, and lock.dir=/path/to/lock
# From terminal start test1 instance.
# From terminal start test2 instance.

h6. Observations:

* JMX bind exception on the second instance started (test2).
* The second instance started (test2) logged a warning that the lock was not obtained, yet it still provided a full console (run level is 50 on both instances).
* When the first instance is stopped, the second instance gains the lock (test2 is now master).
* Reconfigured test1 to start at level 0 and started the instance. On startup it waits until the lock is acquired before fully starting SMX.

h6. Notes:

* After exiting both test1 and test2 instances, I started both via the original installation's admin console. Both are reported as "started"; each has lock level 50 set.

* If I set one instance to lock level 0 and the other to level 50, then start both instances via the console, one instance becomes "started" and the other stays "starting". In this case the starting instance is the one with level 0. When I stop the currently started instance, I would expect the other instance to move from "starting" to "started"; however, the console does not display this state change.
** The second instance should have been fully started and reported as such.
** When locking is in use, should we report a waiting instance as "Standby" instead of "Starting"? This may be more instructive to administrators.
  



[jira] Commented: (SMX4KNL-169) Use the start level to implement the container level locking

Posted by "Jamie Goodyear (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/SMX4KNL-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=48880#action_48880 ] 

Jamie Goodyear commented on SMX4KNL-169:
----------------------------------------

Will do.



[jira] Commented: (SMX4KNL-169) Use the start level to implement the container level locking

Posted by "Guillaume Nodet (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/SMX4KNL-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=48632#action_48632 ] 

Guillaume Nodet commented on SMX4KNL-169:
-----------------------------------------

To test this patch, the SMX4KNL-153 patch needs to be applied first.
Then, edit the {{etc/system.properties}} file and add the {{servicemix.lock.level=50}} property.

This solution may work, but not when starting two processes from the same location.  You need to copy the installation and use the JDBC lock, I suppose.  The problem is that Felix does not support using the same data/cache folder for two different instances.  A possible solution would be to change the Felix config to use two different folders, but that may be a bit misleading, as the two instances may not have the same configuration (installed bundles, etc.).
One possible enhancement would be to allow the lock file (when using the file-based locking mechanism) to be specified in an alternate directory, so that it could be shared by the two instances.
The RMI / SSH port conflict has to be solved somehow too.
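
The alternate lock directory idea can be sketched roughly as follows. This is only an illustration, not the code in the patch: the class SharedLockFileSketch, the property name servicemix.lock.dir, the file name "lock" and both helper methods are assumptions made for the example. Only the java.nio file locking calls are real API.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Properties;

class SharedLockFileSketch {

    // Hypothetical property name and file name, for illustration only.
    static final String LOCK_DIR_PROPERTY = "servicemix.lock.dir";
    static final String LOCK_FILE_NAME = "lock";

    /** Uses the configured shared directory if set, else the instance's own home. */
    static Path resolveLockFile(Properties props, Path instanceHome) {
        String dir = props.getProperty(LOCK_DIR_PROPERTY);
        Path base = (dir != null) ? Paths.get(dir) : instanceHome;
        return base.resolve(LOCK_FILE_NAME);
    }

    /** Non-blocking attempt at an exclusive lock; returns null when someone else holds it. */
    static FileLock tryAcquire(Path lockFile) throws IOException {
        FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock lock = null;
        try {
            lock = channel.tryLock(); // null if another process holds the lock
        } catch (OverlappingFileLockException e) {
            // the lock is already held elsewhere in this same JVM
        } finally {
            if (lock == null) {
                channel.close();
            }
        }
        return lock;
    }

    public static void main(String[] args) throws IOException {
        Path shared = Files.createTempDirectory("smx-lock");
        Properties props = new Properties();
        props.setProperty(LOCK_DIR_PROPERTY, shared.toString());

        // Two instances with different homes resolve to the same lock file.
        Path lockFile1 = resolveLockFile(props, Paths.get("instances", "test1"));
        Path lockFile2 = resolveLockFile(props, Paths.get("instances", "test2"));

        FileLock master = tryAcquire(lockFile1);
        System.out.println("test1 holds lock: " + (master != null));
        System.out.println("test2 holds lock: " + (tryAcquire(lockFile2) != null));
        if (master != null) {
            master.release();
            master.channel().close();
        }
    }
}
```

With a shared directory, each instance can keep its own data/cache folder while still competing for a single lock, which is the point of the enhancement.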





[jira] Updated: (SMX4KNL-169) Use the start level to implement the container level locking

Posted by "Guillaume Nodet (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/SMX4KNL-169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guillaume Nodet updated SMX4KNL-169:
------------------------------------

    Attachment: SMX4KNL-169.patch

Work in progress.  This patch should make it possible to configure a start level > 0 so that the OSGi framework stays at this start level until the lock can be acquired.  Once the lock is acquired, the usual start level is set, so everything will be started.
Need to check potential problems with JMX and SSH port conflicts.
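
The behaviour described above can be sketched as a small polling loop. This is a simplified illustration and not the contents of SMX4KNL-169.patch: the class and method names, the polling parameters and the "full" start level of 100 are all assumptions. In a real container the IntConsumer would presumably delegate to the OSGi start level service, and the BooleanSupplier to the file or JDBC lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.IntConsumer;

class StartLevelLockSketch {

    /**
     * Holds the framework at lockLevel until tryLock succeeds, then raises it to fullLevel.
     * Returns the number of lock attempts made, or -1 if maxPolls attempts all failed.
     */
    static int awaitLockThenStart(BooleanSupplier tryLock, IntConsumer setStartLevel,
                                  int lockLevel, int fullLevel,
                                  long pollMillis, int maxPolls) throws InterruptedException {
        // Standby: only bundles at or below lockLevel are started.
        setStartLevel.accept(lockLevel);
        for (int polls = 1; polls <= maxPolls; polls++) {
            if (tryLock.getAsBoolean()) {
                // This instance is now master: start everything else.
                setStartLevel.accept(fullLevel);
                return polls;
            }
            Thread.sleep(pollMillis);
        }
        return -1;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] attempts = {0};
        List<Integer> levels = new ArrayList<>();
        // Simulated lock that becomes available on the third attempt.
        int polls = awaitLockThenStart(() -> ++attempts[0] >= 3, levels::add, 50, 100, 10L, 10);
        System.out.println("polls=" + polls + " levels=" + levels);
    }
}
```

Bounding the number of attempts is only there to keep the example finite; a standby instance would normally keep polling for as long as it runs.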



[jira] Commented: (SMX4KNL-169) Use the start level to implement the container level locking

Posted by "Jamie Goodyear (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/SMX4KNL-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=48881#action_48881 ] 

Jamie Goodyear commented on SMX4KNL-169:
----------------------------------------

Updated: http://servicemix.apache.org/SMX4KNL/67-configuring-failover-deployments-available-in-110.html



[jira] Commented: (SMX4KNL-169) Use the start level to implement the container level locking

Posted by "Jamie Goodyear (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/SMX4KNL-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=48856#action_48856 ] 

Jamie Goodyear commented on SMX4KNL-169:
----------------------------------------

h5. JMX port bind exception.

After experimenting with various ways of setting the JMX port, I believe that we may have to configure the JMX port separately for each SMX instance on a single host.

* The option that enables JMX is passed to the JVM at startup time, so we would have to delay JVM startup to avoid a port conflict if instances were to share a common port.
** Separate full installations would only require a manual update of the port allocation.
** Instances created from the admin console may have their ports pre-allocated. Currently each instance references the parent's 'servicemix' run script; these scripts will have to be copied and filtered for each created instance (similar to how SSH ports are currently allocated). The JMX port for an instance would need to be reported via the console for ease of use.
* Authentication would default to false.
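
Pre-allocating a distinct port per created instance could look something like the sketch below. It is an assumption-laden illustration, not how the admin console actually assigns ports: the class InstancePortSketch, its methods and the base port used in main are hypothetical, and probing with a ServerSocket only shows that a port is free at the moment of the check.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.HashSet;
import java.util.Set;

class InstancePortSketch {

    /** Returns true if the port can currently be bound on this host. */
    static boolean isFree(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /**
     * Picks the first port at or above basePort that is neither assigned to
     * another instance nor currently in use, and records it as assigned.
     */
    static int allocate(int basePort, Set<Integer> assigned) {
        for (int port = basePort; port <= 65535; port++) {
            if (!assigned.contains(port) && isFree(port)) {
                assigned.add(port);
                return port;
            }
        }
        throw new IllegalStateException("no free port at or above " + basePort);
    }

    public static void main(String[] args) {
        Set<Integer> assigned = new HashSet<>();
        // 1099 is used here only as an example base port.
        System.out.println("test1 JMX port: " + allocate(1099, assigned));
        System.out.println("test2 JMX port: " + allocate(1099, assigned));
    }
}
```

Tracking the assigned ports separately from the "is it free right now" probe matters because a standby instance may not have bound its port yet.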

h6. SSH port

We already possess the ability to configure the SSH port via org.apache.servicemix.shell.cfg.

* When instances are created from the admin console a port is automatically assigned.
* Separate full installations already require manual ssh port configuration in shell.cfg.

h6. Run Level combinations.

As noted above, different runtime behaviors will be observed depending on bundle start levels. A best practices document may be required in the user's guide to help determine base settings for 'servicemix.lock.level'.



[jira] Commented: (SMX4KNL-169) Use the start level to implement the container level locking

Posted by "Jamie Goodyear (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/SMX4KNL-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=48771#action_48771 ] 

Jamie Goodyear commented on SMX4KNL-169:
----------------------------------------


h6. JMX port bind exception.

When SMX creates instances on a single host, each instance reuses the 'servicemix' startup script of the parent. The parent's startup script simply appends the JMX remote JVM argument to the command line. We can control when org.apache.servicemix.kernel.management-1.1.0-SNAPSHOT.jar is loaded via the $SMX_HOME/etc/startup.properties file, so that the management components are not loaded until the SMX run level is reached.

* In the case where the management component is above the current start level no bind error will occur.
** management.jar=30   lock.level=1
* In the case where the management component is below the start level we will see bind exceptions.
** management.jar=30   lock.level=50
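
The rule behind these two cases can be sketched as below. This is an illustration only: the class and method names are hypothetical, and the bundle name and levels are just the ones quoted in the bullets. It relies on the standard OSGi rule that a bundle is started once the framework start level reaches the bundle's own start level.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class StandbyBundlesSketch {

    /**
     * Returns the bundles that are already started while an instance waits
     * for the lock, i.e. those whose start level is at or below lockLevel.
     */
    static List<String> startedWhileWaiting(Map<String, Integer> bundleLevels, int lockLevel) {
        List<String> started = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : bundleLevels.entrySet()) {
            if (entry.getValue() <= lockLevel) {
                started.add(entry.getKey());
            }
        }
        return started;
    }

    public static void main(String[] args) {
        Map<String, Integer> bundleLevels = new LinkedHashMap<>();
        bundleLevels.put("management.jar", 30);

        // lock.level=1: management is not started on the standby, so no bind conflict.
        System.out.println("lock.level=1  -> " + startedWhileWaiting(bundleLevels, 1));
        // lock.level=50: management is started on both instances, so the ports clash.
        System.out.println("lock.level=50 -> " + startedWhileWaiting(bundleLevels, 50));
    }
}
```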

h5. Possible solutions.

One solution would be to alter each instance of SMX to have its own JMX port, thereby removing contention for the JMX port (much the same way that each instance has its own port upon creation).

Another solution would be to tie the startup of the management components to whichever instance is currently master in the deployment (if the instance has the lock, then it may load the management jar). This should allow JMX remote to remain on one default port.

Is there any preference as to how this should be addressed?
  



[jira] Commented: (SMX4KNL-169) Use the start level to implement the container level locking

Posted by "Guillaume Nodet (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/SMX4KNL-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=48859#action_48859 ] 

Guillaume Nodet commented on SMX4KNL-169:
-----------------------------------------

Thanks for the heads up.  Having to configure a different port when running multiple kernels on the same host sounds good to me.
As for run level combinations, I did not really envision that, and I would think a good practice would be to set the run level to the same value on both the master and the slave.  I don't really see a good use case for having two different lock levels ...



[jira] Resolved: (SMX4KNL-169) Use the start level to implement the container level locking

Posted by "Guillaume Nodet (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/SMX4KNL-169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guillaume Nodet resolved SMX4KNL-169.
-------------------------------------

    Resolution: Fixed

Patch applied.

Sending        main/src/main/java/org/apache/servicemix/kernel/main/DefaultJDBCLock.java
Sending        main/src/main/java/org/apache/servicemix/kernel/main/LockMonitor.java
Sending        main/src/main/java/org/apache/servicemix/kernel/main/Main.java
Transmitting file data ...
Committed revision 736256.

Would you mind spending a few minutes to enhance the Kernel's Users' Guide?



[jira] Commented: (SMX4KNL-169) Use the start level to implement the container level locking

Posted by "Jamie Goodyear (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/SMX4KNL-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=48862#action_48862 ] 

Jamie Goodyear commented on SMX4KNL-169:
----------------------------------------

I agree that the members of each master / slave set should most definitely have the same value for 'servicemix.lock.level'. The value for that setting, however, will need some guidelines.

* Level 1 would indicate a nearly 'cold' standby status of the core bundles.
* Level 50 would indicate a 'hot' standby status of the core bundles.
* Any level above 50 would have various behaviors based upon installed user bundles, and as such would not be recommended. 



[jira] Issue Comment Edited: (SMX4KNL-169) Use the start level to implement the container level locking

Posted by "Jamie Goodyear (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/SMX4KNL-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=48704#action_48704 ] 

jgoodyear edited comment on SMX4KNL-169 at 1/14/09 7:16 AM:
-----------------------------------------------------------------

h5. Test Case:

# Install Kernel.
# Start Kernel.
# Execute: servicemix admin > create test1
# Execute: servicemix admin > create test2
# Exit from initial Kernel installation.
# Edit the $HOME/etc/system.properties files of test1 and test2 to include lock=true, lock.level=50, and lock.dir=/path/to/lock
# From terminal start test1 instance.
# From terminal start test2 instance.

h6. Observations:

* JMX bind exception on the second instance started (test2).
* The second instance started (test2) logged a warning that the lock was not obtained, yet it still provided a full console (run level is 50 on both instances).
* When the first instance is stopped, the second instance gains the lock (test2 is now master).
* Reconfigured test1 to start at level 0 and started the instance. On startup it waits until the lock is acquired before fully starting SMX.

h6. Notes:

* After exiting both test1 and test2 instances, I started both via the original installation's admin console. Both are reported as "started"; each has lock level 50 set.

* If I set one instance to lock level 0 and the other to level 50, then start both instances via the console, one instance becomes "started" and the other stays "starting". In this case the starting instance is the one with level 0. When I stop the currently started instance, I would expect the other instance to move from "starting" to "started"; however, the console does not display this state change.
** The second instance should have been fully started and reported as such.
** When locking is in use, should we report a waiting instance as "Standby" instead of "Starting"? This may be more instructive to administrators.
  



[jira] Commented: (SMX4KNL-169) Use the start level to implement the container level locking

Posted by "Guillaume Nodet (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/SMX4KNL-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=48656#action_48656 ] 

Guillaume Nodet commented on SMX4KNL-169:
-----------------------------------------

Another possible idea would be to implement the locking at the JBI level, where all SAs would stay in a stopped state until the master crashes, at which point the slave would start the SAs.  However, I think the same problem as with the start level lock mechanism would occur: you can't really start two instances from the same folder, and you have to configure different ports.  Anyway, locking at the JBI level may provide a faster way to bring the slave to a fully operational state, though I'm not sure it's worth it compared to the start level lock mechanism.
I guess we need to experiment a bit and see what the delay is for a typical ServiceMix 4 installation (with one SA).  My guess is that the start level mechanism should be sufficient.
