Posted to users@activemq.apache.org by sgurusam <gu...@gmail.com> on 2011/04/22 12:30:12 UTC

Senders blocked when temp store 100% and memory leak in temp store

Hi,

We were stress testing ActiveMQ 5.4.2 with a few persistent queues, a few
non-persistent queues with a TimeToLive (TTL) of 2 minutes, and a few
non-persistent queues without any TTL. We came across two issues from which we
couldn't recover to a normal state without cleaning the tmp-store and restarting
ActiveMQ.
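
For reference, the senders for the non-persistent queues with TTL set the
delivery mode and TTL on the producer, roughly like this (a minimal sketch, not
our exact code; the broker URL and queue name are placeholders):

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.DeliveryMode;
import javax.jms.MessageProducer;
import javax.jms.Session;
import org.apache.activemq.ActiveMQConnectionFactory;

public class NonPersistentTtlSender {
    public static void main(String[] args) throws Exception {
        // Broker URL and queue name are placeholders, not our actual values.
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageProducer producer =
                session.createProducer(session.createQueue("some.nonpersistent.queue"));

        // Non-persistent delivery with a 2-minute TTL, as in our stress setup.
        producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);
        producer.setTimeToLive(2 * 60 * 1000L);

        producer.send(session.createTextMessage("payload"));
        connection.close();
    }
}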

Stress Details:
1. Usecase
a. Senders send messages to one master queue.
b. The consumer for the master queue is embedded in ActiveMQ through Camel
integration over the vm protocol. This consumer, which we call the router,
routes each message to other sub-queues based on routing information derived
from the event type, so the same message can be dropped onto more than one
sub-queue. We had 4 sub-queues, and on average one message to the master queue
results in about 2.3 messages across the sub-queues because of this fan-out
(a rough sketch of the route is shown after this list).
c. Consumers for the sub-queues run in different JVMs and drain the messages.
2. Stats
a. 2000-4000 msgs per second sent to the master queue
b. 2000-3300 msgs per second processed at the router
c. 30% CPU on the ActiveMQ node
3. Hardware
a. ActiveMQ: one node, a 6-CPU machine with 24 virtual CPUs due to multiple
cores and hyper-threading.
b. 4 senders: each node has 2 CPUs with 4 virtual CPUs (dual core). Each sender
was set to send 1000 msgs per second.
c. 4 receivers: each node has 2 CPUs with 4 virtual CPUs (dual core).
1 receiver per sub-queue.
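
The router mentioned in 1b is, roughly, a Camel route that consumes from the
master queue inside the broker JVM (the activemq component pointed at a vm://
broker URL) and fans messages out to the sub-queues. A minimal, hypothetical
sketch; the queue names, the eventType header, and the routing conditions are
placeholders, not our actual routing rules:

import org.apache.camel.builder.RouteBuilder;

public class RouterRoute extends RouteBuilder {
    @Override
    public void configure() {
        // Consume from the master queue and copy each message to one or more
        // sub-queues depending on an event-type header. Queue names, the
        // header name, and the conditions below are placeholders.
        from("activemq:queue:masterEventQueue")
            .choice()
                .when(header("eventType").isEqualTo("typeA"))
                    .to("activemq:queue:sub.queue.1", "activemq:queue:sub.queue.2")
                .when(header("eventType").isEqualTo("typeB"))
                    .to("activemq:queue:sub.queue.2", "activemq:queue:sub.queue.3")
                .otherwise()
                    .to("activemq:queue:sub.queue.4");
    }
}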


Issue 1: Temp store usage hits more than 100%
After temp store usage went above 100%, all senders and consumers were blocked.
There were pending messages in both the non-persistent queues with TTL and the
ones without TTL. We tried purging a non-persistent queue, thinking that purging
its messages might free up the temp store, but all purge requests from the
ActiveMQ console and JMX were also blocked with no response.
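
For completeness, the JMX purge was attempted roughly like this (a sketch; the
JMX URL, broker name, and queue name are placeholders for our environment):

import javax.management.MBeanServerConnection;
import javax.management.MBeanServerInvocationHandler;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;
import org.apache.activemq.broker.jmx.QueueViewMBean;

public class PurgeQueueOverJmx {
    public static void main(String[] args) throws Exception {
        // JMX URL, broker name and queue name are placeholders.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        MBeanServerConnection conn = connector.getMBeanServerConnection();
        ObjectName queueName = new ObjectName(
                "org.apache.activemq:BrokerName=localhost,Type=Queue,Destination=some.nonpersistent.queue");
        QueueViewMBean queue = MBeanServerInvocationHandler.newProxyInstance(
                conn, queueName, QueueViewMBean.class, true);
        // This call never returned while the temp store was full.
        queue.purge();
        connector.close();
    }
}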

The only way to recover was to clean the tmp store and restart ActiveMQ. We had
the tmp store limit configured to 2 GB.
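
The 2 GB limit is the broker's tempUsage limit. A minimal programmatic sketch of
the equivalent setting (the connector URL is a placeholder; the same limit can
also be set via systemUsage in activemq.xml):

import org.apache.activemq.broker.BrokerService;

public class TempLimitBroker {
    public static void main(String[] args) throws Exception {
        BrokerService broker = new BrokerService();
        // 2 GB temp store limit; with default producer flow control,
        // senders block once this limit is reached.
        broker.getSystemUsage().getTempUsage().setLimit(2L * 1024 * 1024 * 1024);
        broker.addConnector("tcp://0.0.0.0:61616");
        broker.start();
    }
}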

Issue 2: Leak in temp store
Since Issue 1 blocked the stress run and we couldn't proceed, we stopped the
stress when the temp store reached 96% (time frame 1). All messages were drained
within 4-5 minutes and there were no pending messages in any queue, but the temp
store usage didn't drop back to 0; it only dropped to 79% (time frame 2). We
started the stress again, messages piled up, and the temp store went above 90%
again, so we stopped the stress once more.

Below is the chain of events with the temp store stats, the queue stats, and the
temp store file names and sizes.

Please let me know any pointers on how to recover from this kind of issue
without cleaning the temp store and restarting ActiveMQ. Is there any
configuration through which we can avoid the memory leak in the temp store?
Also let me know if you need any other details on the stress use case.
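
In case it helps interpret the stats below: the MEMU/STOREU/TEMPU columns
correspond to the broker MBean's MemoryPercentUsage, StorePercentUsage, and
TempPercentUsage attributes, polled over JMX roughly like this (a sketch; the
JMX URL and broker object name are placeholders for our setup):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class TempUsagePoller {
    public static void main(String[] args) throws Exception {
        // JMX URL and broker object name are placeholders.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        MBeanServerConnection conn = connector.getMBeanServerConnection();
        ObjectName broker = new ObjectName(
                "org.apache.activemq:BrokerName=localhost,Type=Broker");
        System.out.println("MEMU=" + conn.getAttribute(broker, "MemoryPercentUsage")
                + " STOREU=" + conn.getAttribute(broker, "StorePercentUsage")
                + " TEMPU=" + conn.getAttribute(broker, "TempPercentUsage"));
        connector.close();
    }
}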

Thanks
siva


TempStore stats
Timestamp                  MEMU  STOREU  TEMPU  ENQ#      DEQ#      TOT_MSG#  Notes
2011-04-21 07:18:08,779    0     0       0      6934      6480      454       Started one sender
2011-04-21 07:52:10,539    0     1       0      13370361  13370361  0
2011-04-21 07:54:10,737    0     0       11     14155760  14115898  39861     Started three more senders
2011-04-21 07:56:10,829    0     1       26     15484214  15375457  108759
2011-04-21 07:58:10,902    0     1       41     16822317  16653849  168468
2011-04-21 08:00:10,999    0     2       59     18097270  17851082  246192
2011-04-21 08:02:11,111    0     2       75     19411640  19097534  314109
2011-04-21 08:04:11,188    0     0       96     20630853  20223204  407652    Stopped stress; consumers still draining ~400k msgs
2011-04-21 08:06:11,244    0     1       87     21618914  21436362  182551
2011-04-21 08:08:11,301    0     0       79     22002217  22002217  0         Dropped to 79 in 6 mins; 0 pending msgs
2011-04-21 08:18:11,734    0     1       79     25445840  25445574  269       Started only one sender
2011-04-21 08:20:11,806    0     1       81     26328252  26311965  16285
2011-04-21 08:22:11,901    0     2       82     27674442  27597605  76846     Started three more senders
2011-04-21 08:24:11,974    0     0       85     29008172  28864086  144093
2011-04-21 08:26:12,034    0     1       88     30325245  30124132  201114
2011-04-21 08:28:12,097    0     1       91     31664800  31398125  266677
2011-04-21 08:30:12,173    0     2       93     32999706  32671816  327890
2011-04-21 08:32:12,232    0     2       95     34307528  33946200  361331    Stopped stress again
2011-04-21 08:34:12,296    0     0       79     35082703  35082702  1         Dropped to 79 within 2 mins

Master Queue stats (masterEventQueue)
Timestamp                  PEND#   ENQ#      Chng_ENQ#  DEQ#      Chng_DEQ#  EXP#  MEM%  DSPTCH#   Notes
2011-04-21 07:18:08,872    0       2237      18         2237      18         0     0     2237      Stress start; started one sender
2011-04-21 07:52:10,564    0       4313020   2185       4313020   2185       0     0     4313020
2011-04-21 07:54:10,767    39769   4593381   2336       4553610   2004       0     0     4555338
2011-04-21 07:56:10,845    108697  5068561   3959       4959863   3385       0     0     4961116   Started 3 more senders; enqueue rate spiked from 2k to 4k and then stayed constant at 4k
2011-04-21 07:58:10,926    168461  5540674   3934       5372213   3436       0     0     5373512
2011-04-21 08:00:11,028    246185  6004699   3866       5758517   3219       0     0     5760118
2011-04-21 08:02:11,137    314148  6474728   3916       6160580   3350       0     0     6162105
2011-04-21 08:04:11,205    407616  6931316   3804       6523701   3026       0     0     6525303   Stopped stress; senders were sending 4k msgs/sec but the router could only process 3.3-3.4k msgs/sec
2011-04-21 08:06:11,261    182463  7097490   1384       6915027   3261       0     0     6920115
2011-04-21 08:08:11,368    0       7097490   0          7097490   1520       0     0     7097490   Pending msgs 0, but temp store dropped only to 79%
2011-04-21 08:18:11,758    16      8208393   2084       8208378   2084       0     0     8208397   Started only one sender
2011-04-21 08:20:11,836    15958   8503901   2462       8487948   2329       0     0     8490179
2011-04-21 08:22:11,929    76455   8979141   3960       8902691   3456       0     0     8903985   Started three more senders
2011-04-21 08:24:11,993    143969  9455091   3966       9311128   3403       0     0     9312780
2011-04-21 08:26:12,053    200591  9918299   3860       9717712   3388       0     0     9719987
2011-04-21 08:28:12,124    266382  10395026  3972       10128647  3424       0     0     10130777
2011-04-21 08:30:12,192    327391  10866887  3932       10539495  3423       0     0     10541609
2011-04-21 08:32:12,252    360928  11311497  3705       10950569  3425       0     0     10952774  Stopped senders again
2011-04-21 08:34:12,320    0       11317002  45         11317002  3053       0     0     11317002  Pending dropped again to zero

Content of temp store
ls -la tmp-data/

33 mb Apr 21 08:56 db-253.log
33 mb Apr 21 08:57 db-254.log
33 mb Apr 21 08:57 db-255.log
33 mb Apr 21 08:57 db-256.log
0 Apr 21 07:53 lock
1.67 gb Apr 21 08:57 tmpDB.data
3 mb Apr 21 08:57 tmpDB.redo

