Posted to users@tomcat.apache.org by Mott Leroy <mo...@acadaca.com> on 2005/11/17 20:40:54 UTC

robust Failover, mod_jk

I was wondering if I could get some advice on better failover for my 
current setup. I'm using mod_jk 1.2.14 with Tomcat 5.0.28.

One issue that we occasionally run across is that an instance of Tomcat 
will become unresponsive (due to out-of-memory errors, for example) but 
mod_jk will still route requests to it. I realize the longer-term 
solution here is to fix the applications that cause Tomcat problems 
(out-of-memory errors especially), but the reality is that these things 
happen, and they don't trigger a failover.

Does anyone have any suggestions on how to handle situations like this?

- Mott



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org

Re: robust Failover, mod_jk

Posted by Vidya TR <vi...@hotmail.com>.

Our application runs on Apache HTTP Server 2.0.54 and Tomcat 5.0.28 with
the mod_jk 1.2.15 connector. In production, a thread deadlock in the
application hung every user trying to connect, so I started investigating
the failover setup in mod_jk. I have two Tomcat instances, nodeA (active
node) and nodeB (failover node). When I run my test scenario of a thread
deadlock on nodeA, mod_jk does not seem to fail over to nodeB. Instead the
whole site becomes inaccessible and, once the reply_timeout I have set
expires, I receive a "Service Temporarily Unavailable" message. It is
critical for me to find an answer to this. Is mod_jk the best option, or
should I be looking at other high-availability clustering solutions such
as Heartbeat? Please help. My worker.properties follows.



worker.list=loadbalancer

# ----------------------
# Load Balancer worker
# ----------------------

# type is one of ajp13, ajp14, jni, lb or status (lb = load balancer)
worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=nodeA,nodeB
worker.loadbalancer.sticky_session=1
worker.loadbalancer.prepost_timeout=5

# ----------------
# First worker
# ----------------

worker.nodeA.port=38009                                    # connector port
worker.nodeA.host=localhost                                # ipaddress
worker.nodeA.type=ajp13                                    # connector type
worker.nodeA.cachesize=1
worker.nodeA.retries=3
# below are in seconds
worker.nodeA.cache_timeout=60
worker.nodeA.socket_timeout=30
worker.nodeA.recycle_timeout=60
# Advanced ping-pong options
# 0 (full recovery)
worker.nodeA.recovery_options=0
# below are in milliseconds
worker.nodeA.prepost_timeout=5000
worker.nodeA.connect_timeout=5000
# under load it can take a while to get a reply when doing an initial JSP
# compile
worker.nodeA.reply_timeout=60000

worker.nodeA.redirect=nodeB


# ----------------
# Second worker
# ----------------

worker.nodeB.port=48009
worker.nodeB.host=localhost
worker.nodeB.type=ajp13
worker.nodeB.cachesize=1
worker.nodeB.retries=3
# below are in seconds
worker.nodeB.cache_timeout=60
worker.nodeB.socket_timeout=30
worker.nodeB.recycle_timeout=60
# Advanced ping-pong options
# 0 (full recovery)
worker.nodeB.recovery_options=0
# below are in milliseconds
worker.nodeB.prepost_timeout=5000
worker.nodeB.connect_timeout=5000
# under load it can take a while to get a reply when doing an initial JSP
# compile
worker.nodeB.reply_timeout=60000

worker.nodeB.disabled=true


-- tr123



Re: robust Failover, mod_jk

Posted by Mladen Turk <ml...@jboss.com>.

Mott Leroy wrote:
> Mladen Turk wrote:
> 
>> Yes, use the 'Advanced worker directives'
>> http://tomcat.apache.org/connectors-doc/config/workers.html
>>
>> The connect_timeout, prepost_timeout and reply_timeout are meant to
>> be used with hung or very busy backend (Tomcat) servers.
> 
> I'm having trouble understanding the difference between a 
> prepost_timeout and a connect_timeout ... both seem to happen before a 
> request is forwarded.

1. connect_timeout applies when the physical connection between
mod_jk and Tomcat is established. Tomcat might accept the
physical connection but then refuse to serve it.

2. prepost_timeout applies before each request on an already open
connection. It is meant to detect hung Tomcat instances, and to
replace the 'socket_timeout' functionality on platforms that do
not have a full BSD socket implementation (Solaris).
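
As a rough sketch, the two timeouts sit side by side on the same worker in
the properties file. The worker name nodeA, the port and the 5000 ms values
are borrowed from the configuration posted elsewhere in this thread; they
are illustrative assumptions, not recommended settings.

```properties
# Sketch only: worker name, port and values are illustrative assumptions.
worker.nodeA.type=ajp13
worker.nodeA.host=localhost
worker.nodeA.port=38009

# Checked once per new connection: wait up to 5000 ms for Tomcat to
# answer the ping sent right after the connection is opened.
worker.nodeA.connect_timeout=5000

# Checked before every request on an already open connection: wait up
# to 5000 ms for Tomcat to answer the ping before forwarding the request.
worker.nodeA.prepost_timeout=5000
```

Both values are in milliseconds. Setting only one of them leaves the other
check switched off, so a worker can pass the connect check and still hang
later on a reused connection.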

>  Are these pings happening for every request 
> (doubling the load?) ?

Yes, they happen on every request, but the packet size is 8 bytes, so
the added load is negligible compared with the gain in robustness.

> Also, what happens to the request if reply_timeout is exceeded? Does the 
> request get dropped? Or is it somehow retried on another server in the 
> cluster? [not saying it should .. it's ambiguous what should happen]
>

It depends on the 'recovery_options' setting for the worker.
By default the request will be forwarded to another worker in the lb.
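
A minimal sketch of how reply_timeout and recovery_options might be
combined, assuming a two-node load balancer like the one posted elsewhere
in this thread. The worker names and the 60000 ms value are illustrative
assumptions, and the meaning of the non-zero recovery_options values is
taken from my reading of the workers.properties reference, so check it
against the documentation for your mod_jk version.

```properties
# Sketch only: names and values are illustrative assumptions.
# (host, port and type for nodeA and nodeB are omitted for brevity)
worker.list=loadbalancer
worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=nodeA,nodeB

# Give up waiting for a reply from nodeA after 60000 ms.
worker.nodeA.reply_timeout=60000

# 0 (the default) lets the load balancer retry the request on another
# worker. As I read the reference, recovery_options is a bit mask:
#   1 = do not recover once Tomcat has received the request
#   2 = do not recover once headers have been sent to the client
worker.nodeA.recovery_options=0
```

Leaving recovery_options at 0 means a request that timed out may be
executed a second time on the other node, which matters for requests that
are not safe to repeat.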

Regards,
Mladen.


Re: robust Failover, mod_jk

Posted by Mott Leroy <mo...@acadaca.com>.

Mladen Turk wrote:

> Yes, use the 'Advanced worker directives'
> http://tomcat.apache.org/connectors-doc/config/workers.html
> 
> The connect_timeout, prepost_timeout and reply_timeout are meant to
> be used with hung or very busy backend (Tomcat) servers.

Thanks.

I'm having trouble understanding the difference between a 
prepost_timeout and a connect_timeout ... both seem to happen before a 
request is forwarded. Are these pings happening for every request 
(doubling the load?) ?

Also, what happens to the request if reply_timeout is exceeded? Does the 
request get dropped? Or is it somehow retried on another server in the 
cluster? [not saying it should .. it's ambiguous what should happen]


Re: robust Failover, mod_jk

Posted by Mladen Turk <ml...@jboss.com>.

Mott Leroy wrote:
> I was wondering if I could get some advice on better failover for my 
> current setup. I'm using mod_jk 1.2.14 with Tomcat 5.0.28.
> 
> One issue that we occasionally run across is that an instance of tomcat 
> will become unresponsive (due to out of memory errors for example) but 
> mod_jk will still route requests to it. I realize the longer term 
> solution here is to fix applications which cause tomcat problems (out of 
> memory errors especially), but the reality is, these things happen, and 
> don't cause a failover.
> 
> Does anyone have any suggestions on how to handle situations like this?
>

Yes, use the 'Advanced worker directives'
http://tomcat.apache.org/connectors-doc/config/workers.html

The connect_timeout, prepost_timeout and reply_timeout are meant to
be used with hung or very busy backend (Tomcat) servers.

Regards,
Mladen.
