Posted to dev@activemq.apache.org by "python (JIRA)" <ji...@apache.org> on 2008/06/18 18:03:00 UTC

[jira] Commented: (AMQCPP-165) Core Dump on reconnect/open queue

    [ https://issues.apache.org/activemq/browse/AMQCPP-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=43562#action_43562 ] 

python commented on AMQCPP-165:
-------------------------------

I have produced (and reproduced) a similar error using activemq-cpp-2.1.3 on Windows XP and Windows Server 2003.

Versions:
ActiveMQ-cpp-2.1.3
ActiveMQ Broker 5.1

Backtrace (VS2005):
activemq::connector::openwire::OpenWireConnector::closeResource(activemq::connector::ConnectorResource * resource=0x029f0678)  Line 1195 + 0x25 bytes C++
activemq::connector::BaseConnectorResource::close()  Line 64     C++
activemq::connector::openwire::OpenWireSessionInfo::~OpenWireSessionInfo()  Line 57    C++
activemq::connector::openwire::OpenWireSessionInfo::`scalar deleting destructor'()  + 0xf bytes    C++
activemq::connector::openwire::OpenWireConnector::createSession(cms::Session::AcknowledgeMode ackMode=AUTO_ACKNOWLEDGE)  Line 282 + 0x32 bytes        C++
activemq::core::ActiveMQConnection::createSession(cms::Session::AcknowledgeMode ackMode=AUTO_ACKNOWLEDGE)  Line 98 + 0x8b bytes          C++

A null pointer dereference occurs on this line:
            dataStructure = session->getSessionInfo()->getSessionId();

The session object is fine, but the getSessionInfo() call returns NULL.


Steps to reproduce:
1. Three activemq-cpp clients (within the same process) connect to a broker.
2. Three queues are used to send many messages per second to the broker.
3. While the connection is active, the broker runs out of disk space and then memory.
4. Continuous attempts to reconnect to the broker fail and may eventually produce the above error. This can take several hours, depending on the frequency of reconnect attempts.

Since producing this error can be difficult, it's easier to just look at the code:

1: In OpenWireConnector::createSession(), syncRequest(info) throws an exception. Note that session->setSessionInfo( info ); has not been called at this point.
2: The exception handler calls: delete session;
3: The OpenWireSessionInfo destructor runs and calls the BaseConnectorResource::close() method.
4: close() then calls connector->closeResource( this ); (OpenWireConnector::closeResource()), which tries to access the resource's sessionInfo. Since the sessionInfo has not been set yet, this dereferences a NULL pointer and we have a crash.

We fixed this by returning immediately from closeResource() if getSessionInfo() returns NULL. Perhaps it could be fixed by updating OpenWireConnector::state instead, but I am not sure.
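
To illustrate the failure sequence and the guard, here is a minimal standalone sketch. It is not the activemq-cpp source: SessionInfoData, SessionResource, Connector, brokerReachable and the two counters are hypothetical stand-ins I made up to model the roles of the session info command, OpenWireSessionInfo and OpenWireConnector. The thrown std::runtime_error stands in for syncRequest(info) failing.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the session info command object.
struct SessionInfoData {
    long long sessionId;
};

class Connector;

// Hypothetical stand-in for OpenWireSessionInfo: closes itself on destruction.
class SessionResource {
public:
    explicit SessionResource(Connector* c) : connector(c), info(nullptr) {}
    ~SessionResource();
    void setSessionInfo(SessionInfoData* value) { info = value; }
    SessionInfoData* getSessionInfo() const { return info; }
private:
    Connector* connector;
    SessionInfoData* info;
};

// Hypothetical stand-in for OpenWireConnector.
class Connector {
public:
    Connector() : brokerReachable(true), removesSent(0), closesSkipped(0) {}

    void closeResource(SessionResource* resource) {
        // The guard: a half-constructed session has no info yet, so there
        // is nothing to remove on the broker side. Return immediately.
        if (resource == nullptr || resource->getSessionInfo() == nullptr) {
            ++closesSkipped;
            return;
        }
        // Without the guard, this dereference is where the crash happens.
        long long id = resource->getSessionInfo()->sessionId;
        (void)id;
        ++removesSent;
    }

    SessionResource* createSession(SessionInfoData* info) {
        SessionResource* session = new SessionResource(this);
        try {
            if (!brokerReachable) {
                // Models the request to the broker failing.
                throw std::runtime_error("no response from broker");
            }
            session->setSessionInfo(info);
            return session;
        } catch (...) {
            // Destructor runs here while the session info is still unset.
            delete session;
            throw;
        }
    }

    bool brokerReachable;
    int removesSent;
    int closesSkipped;
};

inline SessionResource::~SessionResource() {
    connector->closeResource(this);
}
```

With brokerReachable set to false, createSession() rethrows the exception and closeResource() takes the early return instead of dereferencing NULL. With a reachable broker, deleting the returned session goes through the normal close path.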

Also, from reading the code, this does not appear to be fixed in 2.2. I have not tested it though.



> Core Dump on reconnect/open queue
> ---------------------------------
>
>                 Key: AMQCPP-165
>                 URL: https://issues.apache.org/activemq/browse/AMQCPP-165
>             Project: ActiveMQ C++ Client
>          Issue Type: Bug
>    Affects Versions: 2.1.1
>         Environment: Red Hat Linux 2.4.x
>            Reporter: pfid
>            Assignee: Nathan Mittler
>             Fix For: 2.2
>
>         Attachments: sample.tar.gz
>
>
> our activemq application core dumped several times during the last days when the connection to the broker was lost. each time it was either caused by the broker being restarted or write attempts failing (see exception below).
> in both cases the application catches a CMS exception, closes all queues and tries to re-open them after 60s. all activemq objects are destroyed after closing (see cleanup() from web example).
> the core dumps seemed to happen when the application tries to re-open the connection, but fails because the broker is still unreachable. here is the backtrace:
> <quote>
> #0  activemq::connector::openwire::OpenWireConnector::closeResource (this=0x8b4a268, resource=0x8b4dde0) at activemq/connector/openwire/OpenWireConnector.cpp:1200
> #1  0x080da6fc in activemq::connector::BaseConnectorResource::close (this=0x8b4dde0) at activemq/connector/BaseConnectorResource.cpp:59
> #2  0x0812ff50 in ~OpenWireSessionInfo (this=0x8b4dde0) at OpenWireSessionInfo.h:56
> #3  0x0812d0c4 in activemq::connector::openwire::OpenWireConnector::createSession (this=0x8b4dde0, ackMode=cms::Session::AUTO_ACKNOWLEDGE)
>     at activemq/connector/openwire/OpenWireConnector.cpp:281
> #4  0x080e86c1 in activemq::core::ActiveMQConnection::createSession (this=0x8b4ded0, ackMode=137247624) at activemq/core/ActiveMQConnection.cpp:98
> #5  0x08059c19 in ActiveMqQueue::open (this=0x8b1d6b0, aQueueName=0x8ab925c "outqueue", aMode=ActiveMqQueue::modeWrite, aListenMode=0) at activemqqueue.cc:335
> </quote>
> Debugging shows that at activemq/connector/openwire/OpenWireConnector.cpp:1200
> 1200:  dataStructure = session->getSessionInfo()->getSessionId();
> the session object is null, the previously dyn-casted resource object however is not null:
> <quote>
> (gdb) p session
> $1 = (activemq::connector::openwire::OpenWireSessionInfo *) 0x0
> (gdb) p resource
> $2 = (class activemq::connector::ConnectorResource *) 0x8b4dde0</quote>
> (corrupt memory?)
> Exception when write attempts fail:
> <quote>No valid response received for command: Begin Class = ActiveMQTextMessage Begin Class = ActiveMQMessageBase  Value of ackHandler = 0  Value of redeliveryCount = 0  Value of properties = Begin Class PrimitiveMap: Begin Class PrimitiveMap:  Begin Class = Message  Value of Message::ID_MESSAGE = 0  Value of ProducerId is Below: Begin Class = ProducerId  Value of ProducerId::ID_PRODUCERID = 123  Value of ConnectionId = 0c00f32b-2269-4e0f-ace1-13fd0414b4b5  Value of Value = 0  Value of SessionId = 0 No Data for Class BaseDataStructure End Class = ProducerId   Value of Destination is Below: Begin Class = ActiveMQQueue Begin Class = ActiveMQDestination  Value of exclusive = false  Value of ordered = false  Value of advisory = false  Value of orderedTarget = coordinator  Value of physicalName = ffs_out  Value of options = Begin Class activemq::util::Properties: End Class activemq::util::Properties:  No Data for Class BaseDataStructure End Class = ActiveMQDestination End Class = ActiveMQQueue   Value of TransactionId is Below:    Object is NULL  Value of OriginalDestination is Below:    Object is NULL  Value of MessageId is Below: Begin Class = MessageId  Value of MessageId::ID_MESSAGEID = 110  Value of ProducerId is Below: Begin Class = ProducerId  Value of ProducerId::ID_PRODUCERID = 123  Value of ConnectionId = 0c00f32b-2269-4e0f-ace1-13fd0414b4b5  Value of Value = 0  Value of SessionId = 0 No Data for Class BaseDataStructure End Class = ProducerId   Value of ProducerSequenceId = 4  Value of BrokerSequenceId = 0 No Data for Class BaseDataStructure End Class = MessageId   Value of OriginalTransactionId is Below:    Object is NULL  Value of GroupID =   Value of GroupSequence = 0  Value of CorrelationId =   Value of Persistent = 1  Value of Expiration = 1201683817204  Value of Priority = 4  Value of ReplyTo is Below:    Object is NULL  Value of Timestamp = 1201676617204  Value of Type =   Value of Content[0] = , check broker.</quote>
> Versions:
> Activemq-cpp-2.1.1
> ActiveMq Broker 4.1.1
> the application handles 17 write-mode queues, with a rather low messages/second rate.
> Using 5.0.0 broker instead of 4.1.1 would most likely solve this problem, since the failed write attempts problem only occurs with 4.1.1 broker (i reported this bug before, but it seemed like no one was interested in taking care of it). however, the broker 5.0.0 won't start with preconfigured JAAS queues, so it's not an option and we have to stick with 4.1.1. i will try the latest snapshot these days, however i don't feel good about using a snapshot server in a production environment.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.