Posted to users@activemq.apache.org by mark angrish 2 <ma...@gmail.com> on 2006/11/27 07:59:19 UTC

starting a consumer on a queue with lots of messages

Hi,

I've had a search around but couldn't find anything that could help.

I currently have around 2 million messages persisted in my queue.
The situation I'm in is that the queue must hold one month's worth of data
(say 200 GB) before I can put consumers on to process the backlog.  I
currently have only the default config file set up.  My consumer is set up to
use XA between ActiveMQ 4.0.2 and Oracle 10g using Atomikos transaction
manager 3 integrated with Spring 2.0, which allows me to use message-driven
POJOs!  I am using the marathon fix (if you search the bug fix area you will
find the patch) for ActiveMQ to allow it to work with XA.

My questions:

1. My problem is that when I start ActiveMQ with the small 2 million record
scenario, it takes ages to start up.  Is there any way to overcome this poor
startup time?

2. This is actually my big pain point at the moment.  When ActiveMQ does
eventually start up with the 2 million messages, I can't use Hermes to see
the messages (it seems to take hours and still no result), and I can't see
any processing happening on the XA consumer side to the database.  Under
normal use, with a steady stream of tens or hundreds of messages,
everything is processed to the database just fine.  I tried tinkering with the
prefetch limit but that didn't seem to help either.  Something that could
say "right, let the consumer take x messages off the queue and
process them with XA" would be awesome here.

3. We have about 6 queues on 6 machines with 24 applications acting as
producers into them (so 4 producers going to each queue).  We have a SAN set
up so that each queue can write to its own storage area on the SAN (i.e. 6
ActiveMQ data directories on the SAN).  If I wanted to set up a store and
forward kind of setup to 2 'central queues' where the consumers of the queue
lie, do I just configure a network connector on the 6 queues to point to the
central queue locations and get those central queues to read from all 6 SAN
data directories?  Or do I actually need 6 consumer queues?  In either case
is it OK for them to point to the SAN to read the message stores,
considering the producers are writing and the consumers are removing?

Any help would be massively appreciated!

cheers,

::mark
-- 
View this message in context: http://www.nabble.com/starting-a-consumer-on-a-queue-with-lots-of-messages-tf2710121.html#a7555664
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
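
On question 2, the knob Mark mentions (the prefetch limit) can be set on the client connection URI through the `jms.prefetchPolicy.queuePrefetch` option. The sketch below only assembles such a URI with the JDK. The host, port and the value 10 are placeholders rather than values from this thread, and nothing here confirms that a small prefetch cures the stall on 4.0.2 with a 2 million message backlog.

```java
import java.net.URI;

// Builds an ActiveMQ client connection URI that caps the queue prefetch.
// Host, port and prefetch value are illustrative assumptions, not values from the thread.
final class PrefetchUri {

    static URI withQueuePrefetch(String host, int port, int prefetch) {
        if (prefetch < 0) {
            throw new IllegalArgumentException("prefetch must be >= 0");
        }
        // jms.prefetchPolicy.queuePrefetch is the connection-URI form of the queue prefetch limit.
        return URI.create("tcp://" + host + ":" + port
                + "?jms.prefetchPolicy.queuePrefetch=" + prefetch);
    }

    public static void main(String[] args) {
        // Hypothetical local broker on the default OpenWire port.
        System.out.println(withQueuePrefetch("localhost", 61616, 10));
    }
}
```

The resulting string would be handed to the connection factory as its broker URL. A low value limits how many messages the broker pushes to one consumer ahead of acknowledgement, which is the closest existing setting to "let the consumer take x messages off the queue".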


Re: starting a consumer on a queue with lots of messages

Posted by Rob Davies <ra...@gmail.com>.
I would look at either of these - let us know how you get on!

On 27 Nov 2006, at 13:42, mark angrish 2 wrote:

>
> Also, like in the other thread you recommended, do you think I should look at
> another storage mechanism other than Derby (like a Kaha store, or Postgres)?
>
>
>
> rajdavies wrote:
>>
>> Hi Mark,
>> answers inlined:
>>
>> On 27 Nov 2006, at 06:59, mark angrish 2 wrote:
>>
>>>
>>> Hi,
>>>
>>> I've had a search around but couldn't find anything that could help.
>>>
>>> I currently have around 2 million messages persisted in my queue.
>>> The situation I'm in is that the queue must hold one month's worth of data
>>> (say 200 GB) before I can put consumers on to process the backlog.  I
>>> currently have only the default config file set up.  My consumer is set up to
>>> use XA between ActiveMQ 4.0.2 and Oracle 10g using Atomikos transaction
>>> manager 3 integrated with Spring 2.0, which allows me to use message-driven
>>> POJOs!  I am using the marathon fix (if you search the bug fix area you will
>>> find the patch) for ActiveMQ to allow it to work with XA.
>>>
>>> My questions:
>>>
>>> 1. My problem is that when I start ActiveMQ with the small 2 million record
>>> scenario, it takes ages to start up.  Is there any way to overcome this poor
>>> startup time?
>> There's new functionality in trunk at the moment, due to be released in
>> ActiveMQ 4.2, that solves this. In the current release, all messages are
>> loaded into the broker and a message reference is then held in memory that
>> has a pointer to the message in the store. This can optionally be replaced
>> by a paging mechanism that pages messages into the broker when required.
>> However, there may be some work required to optimize some of the queries
>> for Oracle - see
>> http://www.nabble.com/Statements.java%2C-etc.-Performance-tf2662372.html#a7425760
>>>
>>> 2. This is actually my big pain point at the moment.  When ActiveMQ does
>>> eventually start up with the 2 million messages, I can't use Hermes to see
>>> the messages (it seems to take hours and still no result), and I can't see
>>> any processing happening on the XA consumer side to the database.  Under
>>> normal use, with a steady stream of tens or hundreds of messages,
>>> everything is processed to the database just fine.  I tried tinkering with the
>>> prefetch limit but that didn't seem to help either.  Something that could
>>> say "right, let the consumer take x messages off the queue and
>>> process them with XA" would be awesome here.
>> Hopefully the paging mechanism will help here too.
>>
>>>
>>> 3. We have about 6 queues on 6 machines with 24 applications acting as
>>> producers into them (so 4 producers going to each queue).  We have a SAN set
>>> up so that each queue can write to its own storage area on the SAN (i.e. 6
>>> ActiveMQ data directories on the SAN).  If I wanted to set up a store and
>>> forward kind of setup to 2 'central queues' where the consumers of the queue
>>> lie, do I just configure a network connector on the 6 queues to point to the
>>> central queue locations and get those central queues to read from all 6 SAN
>>> data directories?  Or do I actually need 6 consumer queues?  In either case
>>> is it OK for them to point to the SAN to read the message stores,
>>> considering the producers are writing and the consumers are removing?
>>>
>>> Any help would be massively appreciated!
>>>
>> There isn't a problem with either 2 central queues or 6 consumer
>> queues - but I'd like to understand a little more about your setup -
>> do you really need store and forward, or could you just have a
>> centralized hub around your SAN ?
>>
>>> cheers,
>>>
>>> ::mark
>>>
>>
>> cheers,
>>
>> rob
>>
>>


Re: starting a consumer on a queue with lots of messages

Posted by mark angrish 2 <ma...@gmail.com>.
Also, like in the other thread you recommended, do you think I should look at
another storage mechanism other than Derby (like a Kaha store, or Postgres)?





Re: starting a consumer on a queue with lots of messages

Posted by mark angrish 2 <ma...@gmail.com>.

Hi Rob,

Thanks again for the quick reply.

Last question: do you have any documentation or javadocs that I might be
able to look at to configure a 4.2 queue with paging from the existing
snapshots?

Also, will 4.2 be backward compatible with 4.0?  We want to
store several million messages on the queue while we wait for the queue
consumer hardware to arrive.  This is currently deployed on 4.0.  Once the
hardware arrives and we deploy our 'central queue' and queue consumers on
it, will a 4.2 ActiveMQ be able to read the journal and database of the
4.0 ActiveMQ off the NAS?

cheers,

::mark




Re: starting a consumer on a queue with lots of messages

Posted by Rob Davies <ra...@gmail.com>.
Hi Mark,


On 27 Nov 2006, at 13:40, mark angrish 2 wrote:

>
>
> Hi Rob,
>
> Thanks for getting back to me so quickly :)
>
> Do you happen to know the planned release month for 4.2 by any chance?
probably early next year (I doubt we'll have time to release it by  
end of December)

> Seems like the paging mechanism is what I am after.  If you can point me to
> the relevant classes I guess I can look at patching the 4.0.2 release so
> that it can use paging (unless I am going to run into another plethora of
> issues!).
I'm not sure that's going to be an option - a lot had to change to  
support paging.
> Since ActiveMQ is using Derby under the covers, is it possible to
> write a batch job to process directly from Derby, or will that get ActiveMQ
> out of kilter?  If so, is there a workaround?  It is imperative that we be
> able to process these messages as soon as we can so we can get back to the
> normal message production and consumption numbers.
>
> A concern I have is what would happen if, say, our consuming component's
> database went down and ActiveMQ had to hold onto the messages.  If the
> database came back after, say, 20 GB of data was written to the queue, how can
> we be sure, unless we wait a long time, that it is processing, or can
> process, that data?
The paging mechanism should be able to handle that amount of data
>
> With the XA in ActiveMQ, we have had problems with JOTM and Atomikos,
> although Atomikos was by far the superior product.  I saw a bug on the
> Atomikos boards with ActiveMQ.  Do you know if that bug has also been looked
> at and/or fixed, and if so in what release?  Do you recommend running
> production on a 4.1 or 4.2 milestone drop *gulp*? hehe
Well, unfortunately - paging for Queues is only in 4.2 ...  ;)
It might be worth getting some production support just in case ;)
>
> Onto point 3:
> The setup is 6 multiprocessor Unix machines hosting 24 instances of the same
> application, load balanced (4 on each machine).  Since the application must
> be high performance we are using a broker local to the machine (so each
> machine's config points to localhost first).  This saves the need for a
> network call.  We have a NAS mount on each machine, and each of the 6 queue
> instances writes to its own directory on the NAS.  At this stage this is
> all that will be deployed to our production environment due to hardware
> unavailability.
> We are then procuring 2 machines that will also have a NAS mount, and the
> 'queue to database' processor will also reside on these machines.  We basically
> get 10-100s of messages a second which then need to find their way, via a
> persisted mechanism, to the consumer and into the database.  My idea was to
> have the 6 queues write to the SAN, then use a network connector to forward
> to the central broker.
> Is this an approach you would recommend, or do you think a different setup
> would be more appropriate?
This looks reasonable to me - because NAS can be a lot faster than TCP/IP.
However, you're going to have to hit the network using TCP/IP at some point,
so as long as your applications aren't going full pelt 24x7 your central
broker will be able to catch up at some point.
>
> Your advice would be most appreciated :)
NP!
>
> cheers,
>
> ::mark

cheers,

Rob
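
Rob's remark about the central broker catching up can be put into rough numbers: a backlog only shrinks while consumers remove messages faster than producers add them, and the drain time is backlog / (consume rate - produce rate). The sketch below is just that arithmetic. The 2 million message backlog and the 100 messages a second inflow are figures from the thread; the 500 messages a second consume rate is an invented number for illustration only.

```java
// Back-of-envelope drain time for a queue backlog.
final class CatchUp {

    /** Seconds needed to drain the backlog, or -1 if consumers never catch up. */
    static double secondsToDrain(long backlog, double consumePerSec, double producePerSec) {
        double net = consumePerSec - producePerSec; // net messages removed per second
        if (net <= 0) {
            return -1; // producers are at least as fast: the backlog never shrinks
        }
        return backlog / net;
    }

    public static void main(String[] args) {
        // 2,000,000 queued and 100 msg/s arriving (from the thread), 500 msg/s consumed (assumed).
        double seconds = secondsToDrain(2_000_000L, 500.0, 100.0);
        System.out.printf("drains in about %.0f s (%.1f h)%n", seconds, seconds / 3600.0);
    }
}
```

With these inputs the net rate is 400 messages a second, so the backlog clears in 5000 seconds, a little under an hour and a half. If producers run flat out around the clock and match the consume rate, the function returns -1, which is the case Rob warns about.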


Re: starting a consumer on a queue with lots of messages

Posted by mark angrish 2 <ma...@gmail.com>.

Hi Rob,

Thanks for getting back to me so quickly :)

Do you happen to know the planned release month for 4.2 by any chance?
Seems like the paging mechanism is what I am after.  If you can point me to
the relevant classes I guess I can look at patching the 4.0.2 release so
that it can use paging (unless I am going to run into another plethora of
issues!).  Since ActiveMQ is using Derby under the covers, is it possible to
write a batch job to process directly from Derby, or will that get ActiveMQ
out of kilter?  If so, is there a workaround?  It is imperative that we be
able to process these messages as soon as we can so we can get back to the
normal message production and consumption numbers.

A concern I have is what would happen if, say, our consuming component's
database went down and ActiveMQ had to hold onto the messages.  If the
database came back after, say, 20 GB of data was written to the queue, how can
we be sure, unless we wait a long time, that it is processing, or can
process, that data?

With the XA in ActiveMQ, we have had problems with JOTM and Atomikos,
although Atomikos was by far the superior product.  I saw a bug on the
Atomikos boards with ActiveMQ.  Do you know if that bug has also been looked
at and/or fixed, and if so in what release?  Do you recommend running
production on a 4.1 or 4.2 milestone drop *gulp*? hehe

Onto point 3:
The setup is 6 multiprocessor Unix machines hosting 24 instances of the same
application, load balanced (4 on each machine).  Since the application must
be high performance we are using a broker local to the machine (so each
machine's config points to localhost first).  This saves the need for a
network call.  We have a NAS mount on each machine, and each of the 6 queue
instances writes to its own directory on the NAS.  At this stage this is
all that will be deployed to our production environment due to hardware
unavailability.
We are then procuring 2 machines that will also have a NAS mount, and the
'queue to database' processor will also reside on these machines.  We basically
get 10-100s of messages a second which then need to find their way, via a
persisted mechanism, to the consumer and into the database.  My idea was to
have the 6 queues write to the SAN, then use a network connector to forward
to the central broker.
Is this an approach you would recommend, or do you think a different setup
would be more appropriate?

Your advice would be most appreciated :)

cheers,

::mark
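
For the store-and-forward layout in point 3, each local broker would carry a network connector whose URI lists the central brokers, typically in the `static:(...)` form. The sketch below only builds that URI string with the JDK. The host names `central1` and `central2` and the port are hypothetical, and the thread does not confirm how well this layout behaves on 4.0.2 with the default config.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Builds the "static:(...)" URI that a network connector on a local broker could use
// to forward to the central brokers. Host names and port are hypothetical placeholders.
final class ForwardingUri {

    static String staticUri(List<String> centralHosts, int port) {
        return centralHosts.stream()
                .map(host -> "tcp://" + host + ":" + port)
                .collect(Collectors.joining(",", "static:(", ")"));
    }

    public static void main(String[] args) {
        // Two central brokers, as in the planned deployment.
        System.out.println(staticUri(Arrays.asList("central1", "central2"), 61616));
    }
}
```

The same value would be set on all 6 local brokers, so the central machines never need to read the local brokers' data directories directly; messages travel over the connector instead.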


rajdavies wrote:
> 
> Hi Mark,
> answers inlined:
> 
> On 27 Nov 2006, at 06:59, mark angrish 2 wrote:
> 
>>
>> Hi,
>>
>> I've had a search around but couldn't find anything that could help.
>>
>> I currently have around 2 million messages that are persisted in my  
>> queue.
>> The situation I'm in is that the queue must hold onto 1 months (say  
>> 200Gb)
>> worth of data before I can put consumers on to process the backlog.  I
>> currently only have the default config file setup.  My consumer is  
>> set up to
>> use XA between activemq 4.0.2 and oracle 10g using atomikos  
>> transaction
>> manager 3 integrated with spring 2.0, which allows me to use  
>> message driven
>> pojos!  I am using the marathon fix (if you search the bug fix area  
>> you will
>> find the patch) for activemq to allow it to work with XA.
>>
>> My questions:
>>
>> 1. My problem is that when i start active mq with the small 2  
>> million record
>> scenario, it takes ages to start up.  Is there any way to overcome  
>> this poor
>> startup time?
>   There's new functionality that is in trunk at the moment, but is  
> due to be released in ActiveMQ 4.2 that solves this. In the current  
> release, all messages are loaded into the broker and a message  
> reference is then held in memory that has a pointer to the message in  
> the store. This can be optionally replaced by a paging mechanism that  
> pages messages into the broker when required.
> However, there may be some work required to optimize some of the  
> queries for oracle - see http://www.nabble.com/Statements.java%2C- 
> etc.-Performance-tf2662372.html#a7425760
>>
>> 2. This is actually my big pain point at the moment. When ActiveMQ does
>> eventually start up with the 2 million messages, I can't use Hermes to see
>> the messages (it seems to take hours and still gives no result), and I
>> can't see any processing happening on the XA consumer side to the database.
>> When used normally, with a steady stream of 10s or 100s of messages,
>> everything processes to the database just fine. I tried tinkering with the
>> prefetch limit but that didn't seem to help either. Something that could
>> say "right, let the consumer take x amount of messages off the queue and
>> process them with XA" would be awesome here.
> Hopefully the paging mechanism will help here too.
> 
>>
>> 3. We have about 6 queues on 6 machines with 24 applications acting as
>> producers into them (so 4 producers going to each queue). We have a SAN
>> set up so that each queue can write to its own storage area on the SAN
>> (i.e. 6 ActiveMQ data directories on the SAN). If I wanted to set up a
>> store-and-forward kind of setup to 2 'central queues' where the consumers
>> of the queue lie, do I just configure a network connector on the 6 queues
>> to point to the central queue locations and get those central queues to
>> read from all 6 SAN data directories? Or do I actually need 6 consumer
>> queues? In either case, is it OK for them to point to the SAN to read the
>> message stores, considering the producers are writing and the consumers
>> are removing?
>>
>> Any help would be massively appreciated!
>>
> There isn't a problem with either 2 central queues or 6 consumer queues,
> but I'd like to understand a little more about your setup: do you really
> need store and forward, or could you just have a centralized hub around
> your SAN?
> 
>> cheers,
>>
>> ::mark
>> -- 
>> View this message in context: http://www.nabble.com/starting-a-consumer-on-a-queue-with-lots-of-messages-tf2710121.html#a7555664
>> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>>
> 
> cheers,
> 
> rob
> 
> 

-- 
View this message in context: http://www.nabble.com/starting-a-consumer-on-a-queue-with-lots-of-messages-tf2710121.html#a7560125
Sent from the ActiveMQ - User mailing list archive at Nabble.com.


Re: starting a consumer on a queue with lots of messages

Posted by Rob Davies <ra...@gmail.com>.
Hi Mark,
answers inlined:

On 27 Nov 2006, at 06:59, mark angrish 2 wrote:

>
> Hi,
>
> I've had a search around but couldn't find anything that could help.
>
> I currently have around 2 million messages persisted in my queue. The
> situation I'm in is that the queue must hold on to 1 month's (say 200 GB)
> worth of data before I can put consumers on to process the backlog. I
> currently only have the default config file set up. My consumer is set up
> to use XA between ActiveMQ 4.0.2 and Oracle 10g using Atomikos transaction
> manager 3 integrated with Spring 2.0, which allows me to use message-driven
> POJOs! I am using the marathon fix (if you search the bug fix area you will
> find the patch) for ActiveMQ to allow it to work with XA.
>
> My questions:
>
> 1. My problem is that when I start ActiveMQ with the small 2 million record
> scenario, it takes ages to start up. Is there any way to overcome this poor
> startup time?
There's new functionality in trunk at the moment, due to be released in
ActiveMQ 4.2, that solves this. In the current release, all messages are
loaded into the broker and a message reference is then held in memory that
has a pointer to the message in the store. This can optionally be replaced
by a paging mechanism that pages messages into the broker when required.
However, there may be some work required to optimize some of the queries
for Oracle - see
http://www.nabble.com/Statements.java%2C-etc.-Performance-tf2662372.html#a7425760
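
To make the difference between the two approaches a bit more concrete,
here is a toy sketch in Java. It is not ActiveMQ code: Store, PagedCursor,
eagerStartup and the page size of 100 are made-up names and numbers, and
the real broker internals may well look quite different. It only shows why
paging keeps the startup cost bounded by the page size rather than by the
queue depth.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model only: none of these names are ActiveMQ classes. */
public class PagingSketch {

    /** Stand-in for a persistent store holding message ids 0..size-1. */
    static final class Store {
        private final int size;
        int reads = 0; // number of records handed to the broker so far

        Store(int size) {
            this.size = size;
        }

        /** Returns up to max ids starting at offset, like a bounded query. */
        List<Integer> readPage(int offset, int max) {
            long end = Math.min((long) size, (long) offset + max);
            List<Integer> page = new ArrayList<>();
            for (int id = offset; id < end; id++) {
                page.add(id);
                reads++;
            }
            return page;
        }
    }

    /** Eager approach: one in-memory reference per stored message at startup. */
    static int eagerStartup(Store store) {
        return store.readPage(0, Integer.MAX_VALUE).size();
    }

    /** Paged approach: holds at most pageSize references in memory at a time. */
    static final class PagedCursor {
        private final Store store;
        private final int pageSize;
        private final Deque<Integer> inMemory = new ArrayDeque<>();
        private int nextOffset = 0;

        PagedCursor(Store store, int pageSize) {
            this.store = store;
            this.pageSize = pageSize;
        }

        /** Returns the next message id, or null once the store is drained. */
        Integer next() {
            if (inMemory.isEmpty()) {
                // Only touch the store when the current page is used up.
                List<Integer> page = store.readPage(nextOffset, pageSize);
                nextOffset += page.size();
                inMemory.addAll(page);
            }
            return inMemory.poll();
        }
    }

    public static void main(String[] args) {
        Store eager = new Store(2_000_000);
        System.out.println("eager: references loaded at startup = "
                + eagerStartup(eager));

        Store paged = new Store(2_000_000);
        PagedCursor cursor = new PagedCursor(paged, 100);
        cursor.next(); // first dispatch pulls in a single page
        System.out.println("paged: records read for first dispatch = "
                + paged.reads);
    }
}
```

In the eager case the work done before the first message can be dispatched
grows with the backlog; in the paged case it is one page, whatever the
backlog size.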
>
> 2. This is actually my big pain point at the moment. When ActiveMQ does
> eventually start up with the 2 million messages, I can't use Hermes to see
> the messages (it seems to take hours and still gives no result), and I
> can't see any processing happening on the XA consumer side to the database.
> When used normally, with a steady stream of 10s or 100s of messages,
> everything processes to the database just fine. I tried tinkering with the
> prefetch limit but that didn't seem to help either. Something that could
> say "right, let the consumer take x amount of messages off the queue and
> process them with XA" would be awesome here.
Hopefully the paging mechanism will help here too.

>
> 3. We have about 6 queues on 6 machines with 24 applications acting as
> producers into them (so 4 producers going to each queue). We have a SAN
> set up so that each queue can write to its own storage area on the SAN
> (i.e. 6 ActiveMQ data directories on the SAN). If I wanted to set up a
> store-and-forward kind of setup to 2 'central queues' where the consumers
> of the queue lie, do I just configure a network connector on the 6 queues
> to point to the central queue locations and get those central queues to
> read from all 6 SAN data directories? Or do I actually need 6 consumer
> queues? In either case, is it OK for them to point to the SAN to read the
> message stores, considering the producers are writing and the consumers
> are removing?
>
> Any help would be massively appreciated!
>
There isn't a problem with either 2 central queues or 6 consumer queues,
but I'd like to understand a little more about your setup: do you really
need store and forward, or could you just have a centralized hub around
your SAN?

> cheers,
>
> ::mark
> -- 
> View this message in context: http://www.nabble.com/starting-a-consumer-on-a-queue-with-lots-of-messages-tf2710121.html#a7555664
> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>

cheers,

rob