Posted to dev@uima.apache.org by "Jerry Cwiklik (JIRA)" <ui...@incubator.apache.org> on 2010/01/18 18:20:54 UTC

[jira] Created: (UIMA-1726) Long GC causes UIMA AS service to lose broker connection and marks the client as dead

Long GC causes UIMA AS service to lose broker connection and marks the client as dead
-------------------------------------------------------------------------------------

                 Key: UIMA-1726
                 URL: https://issues.apache.org/jira/browse/UIMA-1726
             Project: UIMA
          Issue Type: Bug
          Components: Async Scaleout
            Reporter: Jerry Cwiklik
            Assignee: Jerry Cwiklik


When the JVM's GC takes a long time, the UIMA AS service fails to validate the broker connection and proceeds to close it as if the broker had died. It seems that a long GC pause freezes the JVM and prevents the low-level socket pinging from working correctly. The low-level AMQ code relies on this pinging to detect a broker failure. In this case, the broker is actually fine. As a side effect, the UIMA AS service falsely adds the client to the DoNotProcess list. This list is a recently added optimization that avoids wasting processing cycles on CASes known to have come from clients that have gone away. Each CAS origin is checked against the list, and if there is a match the CAS is thrown away. It seems that we need a better mechanism to detect broker failure.
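
A minimal sketch of the check described above, to make the failure mode concrete. The class and method names (DoNotProcessSketch, markClientDead, shouldProcess) are hypothetical and are not taken from the UIMA AS code base. The only thing carried over from the report is that each CAS origin is matched against a set of endpoints believed to be dead.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; not the UIMA AS implementation.
class DoNotProcessSketch {
    // Endpoints (client reply queue names) believed to belong to dead clients.
    private final Set<String> doNotProcess = ConcurrentHashMap.newKeySet();

    // Called when the service concludes that a client has gone away.
    void markClientDead(String clientEndpoint) {
        doNotProcess.add(clientEndpoint);
    }

    // A CAS is processed only if its origin is not on the list.
    boolean shouldProcess(String casOrigin) {
        return !doNotProcess.contains(casOrigin);
    }
}
```

If a long GC pause leads to markClientDead being called for a client that is in fact still alive, every later CAS from that client is silently dropped, which is the behavior reported here.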

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (UIMA-1726) Long GC causes UIMA AS service to lose broker connection and marks the client as dead

Posted by "Jerry Cwiklik (JIRA)" <ui...@incubator.apache.org>.
    [ https://issues.apache.org/jira/browse/UIMA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806383#action_12806383 ] 

Jerry Cwiklik commented on UIMA-1726:
-------------------------------------

There should be a message in the log like this:

Controller: {0} Adding Endpoint: {1} to the Do Not Process List.

where {0} is the UIMA AS service name and {1} is the name of a temp queue.
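
To look for this message in an existing log, the template can be turned into a pattern. The sketch below assumes the line appears in the log exactly as quoted, with {0} and {1} substituted. The class name, the regular expression and the sample values are illustrative and not part of UIMA AS.

```java
import java.text.MessageFormat;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for scanning a service log; not part of UIMA AS.
class DoNotProcessLogSketch {
    // Template as quoted in the comment above.
    static final String TEMPLATE =
            "Controller: {0} Adding Endpoint: {1} to the Do Not Process List.";

    // Same text with the two placeholders replaced by capture groups.
    static final Pattern LINE = Pattern.compile(
            "Controller: (.+?) Adding Endpoint: (.+?) to the Do Not Process List\\.");

    // Fills the template the way java.text.MessageFormat does.
    static String render(String serviceName, String tempQueueName) {
        return MessageFormat.format(TEMPLATE, serviceName, tempQueueName);
    }

    // Returns {serviceName, tempQueueName}, or null if the line does not match.
    static String[] parse(String logLine) {
        Matcher m = LINE.matcher(logLine);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }
}
```

Calling parse on each line of the service log and collecting the non-null results gives the list of endpoints that the service marked as dead.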



[jira] Commented: (UIMA-1726) Long GC causes UIMA AS service to lose broker connection and marks the client as dead

Posted by "Jörn Kottmann (JIRA)" <ui...@incubator.apache.org>.
    [ https://issues.apache.org/jira/browse/UIMA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806270#action_12806270 ] 

Jörn Kottmann commented on UIMA-1726:
-------------------------------------

How can I find out whether my service failed because of this issue? Is a message logged when a client is added to the DoNotProcess list?



[jira] Reopened: (UIMA-1726) Long GC causes UIMA AS service to lose broker connection and marks the client as dead

Posted by "Jerry Cwiklik (JIRA)" <ui...@incubator.apache.org>.
     [ https://issues.apache.org/jira/browse/UIMA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jerry Cwiklik reopened UIMA-1726:
---------------------------------


When the broker is started with no JMX support, the UIMA AS service keeps logging a message like this:

4/20/10 4:20:56 PM - 30: org.apache.uima.adapter.jms.activemq.JmsInputChannel.ackMessage: WARNING:
java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: 10.45.2.100; nested exception is:
       java.net.ConnectException: Connection refused]
       at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:323)
       at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:248)
       at org.apache.uima.aae.jmx.RemoteJMXServer.initialize(RemoteJMXServer.java:76)
       at org.apache.uima.adapter.jms.activemq.JmsInputChannel.attachToRemoteBrokerJMXServer(JmsInputChannel.java:837)
       at org.apache.uima.adapter.jms.activemq.JmsInputChannel.onMessage(JmsInputChannel.java:605)
       at org.springframework.jms.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:518)
       at org.springframework.jms.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:479)
       at org.springframework.jms.listener.AbstractMessageListenerContainer.doExecuteListener(AbstractMessageListenerContainer.java:451)
       at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.doReceiveAndExecute(AbstractPollingMessageListenerContainer.java:323)
       at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:261)
       at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:982)
       at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.executeOngoingLoop(DefaultMessageListenerContainer.java:974)
       at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:876)
       at java.lang.Thread.run(Thread.java:619)
Caused by: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: 10.45.2.100; nested exception is:
       java.net.ConnectException: Connection refused]
       at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:101)
       at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:185)
       at javax.naming.InitialContext.lookup(InitialContext.java:392)
       at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1871)
       at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1841)
       at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:257)
       ... 13 more

The service should log this exception only once and then make no further attempts to connect to the broker's JMX MBeanServer.
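
One way to get the "try once, log once" behavior is a guard flag around the connection attempt. This is only a sketch under that assumption: JmxAttachOnceSketch and its Connector interface are hypothetical stand-ins and are not the code in JmsInputChannel or RemoteJMXServer.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; not the UIMA AS implementation.
class JmxAttachOnceSketch {
    // Stand-in for whatever opens the JMX connection to the broker.
    interface Connector {
        void connect() throws IOException;
    }

    private static final Logger LOG =
            Logger.getLogger(JmxAttachOnceSketch.class.getName());

    private final AtomicBoolean attempted = new AtomicBoolean(false);
    private final AtomicInteger failureCount = new AtomicInteger();
    private volatile boolean attached = false;

    // Tries to connect on the first call only; later calls return the cached result.
    // A concurrent caller that arrives while the first attempt is still in flight
    // sees "not attached", which is the conservative answer.
    boolean attachOnce(Connector connector) {
        if (!attempted.compareAndSet(false, true)) {
            return attached;
        }
        try {
            connector.connect();
            attached = true;
        } catch (IOException e) {
            failureCount.incrementAndGet();
            LOG.log(Level.WARNING,
                    "Broker JMX server unavailable; no further attempts will be made", e);
        }
        return attached;
    }

    // Number of connection failures that were logged (at most one).
    int failuresLogged() {
        return failureCount.get();
    }
}
```

With this guard, a message listener can call attachOnce for every incoming message and the warning appears in the log a single time.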




[jira] Closed: (UIMA-1726) Long GC causes UIMA AS service to lose broker connection and marks the client as dead

Posted by "Jerry Cwiklik (JIRA)" <ui...@incubator.apache.org>.
     [ https://issues.apache.org/jira/browse/UIMA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jerry Cwiklik closed UIMA-1726.
-------------------------------

       Resolution: Fixed
    Fix Version/s: 2.3AS

Removed the DoNotProcess list as the means of tracking disconnected clients. To detect dead clients and to optimize processing, the code now checks for the existence of the temp reply queue in the broker's JMX MBeanServer registry. If the temp queue exists, the message is processed. If the queue lookup fails, the message is dropped. A temp reply queue exists in the same broker that manages the service input queue. The code creates a connection to the JMX MBeanServer on the initial request message and caches it for subsequent lookups. If the broker is configured not to use JMX, the optimization is not performed and every message is processed. In that case, a request from a terminated client fails only when the reply is attempted, because the temp queue no longer exists. The code supports tcp, http, and failover in the broker URL.
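
The decision logic can be sketched with the standard JMX API. MBeanServerConnection.isRegistered is a real JDK call; everything else here is an assumption: the class name, the helper methods, and the choice to process the message when the lookup itself throws an IOException. The ObjectName under which ActiveMQ registers a temp queue depends on the broker version and is deliberately not spelled out.

```java
import java.io.IOException;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical sketch of the check; not the UIMA AS implementation.
class TempQueueCheckSketch {
    // Convenience wrapper that turns the checked exception into an unchecked one.
    static ObjectName name(String objectName) {
        try {
            return new ObjectName(objectName);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // jmx is the cached connection to the broker's MBeanServer,
    // or null when the broker runs without JMX.
    static boolean shouldProcess(MBeanServerConnection jmx, ObjectName replyQueue) {
        if (jmx == null) {
            return true; // optimization disabled: process every message
        }
        try {
            // Queue still registered: the client is alive. Otherwise drop the message.
            return jmx.isRegistered(replyQueue);
        } catch (IOException e) {
            // Assumption of this sketch: a broken JMX link is no proof of a dead client.
            return true;
        }
    }
}
```

Keying the check on the existence of the reply queue means a long GC pause in the service no longer affects which clients are considered alive.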
The default JMX port used for connecting to the broker's MBeanServer is 1099. To override it, add the property -Dactivemq.broker.jmx.port=XXX to the service startup command.
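
A sketch of how such a property might be read and turned into a JMX service URL. The property name and the default port of 1099 come from the comment above. The class name, the method names and the /jmxrmi connector path are assumptions of this sketch and should be checked against the broker's management configuration.

```java
// Hypothetical sketch; not the UIMA AS implementation.
class BrokerJmxUrlSketch {
    static final String PORT_PROPERTY = "activemq.broker.jmx.port";
    static final int DEFAULT_PORT = 1099;

    // Reads -Dactivemq.broker.jmx.port=XXX, falling back to 1099.
    static int jmxPort() {
        return Integer.getInteger(PORT_PROPERTY, DEFAULT_PORT);
    }

    // Builds an RMI-based JMX service URL for the given broker host.
    // The "/jmxrmi" path is an assumption of this sketch.
    static String jmxUrl(String brokerHost) {
        return "service:jmx:rmi:///jndi/rmi://" + brokerHost + ":" + jmxPort() + "/jmxrmi";
    }
}
```

For example, starting the service JVM with -Dactivemq.broker.jmx.port=1199 makes jmxPort() return 1199 instead of the default.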
 
