You are viewing a plain text version of this content. The canonical link for it is here.

Posted to users@tomcat.apache.org by Ingo Düppe <ma...@dueppe.com> on 2007/10/28 11:41:01 UTC

Unrecoverable error 200, request failed

Hello,

I found the following lines in my mod_jk.log every 20 to 30 seconds for 
all of my instances.

[info] jk_ajp_common.c (1511): Writing to client aborted or client 
network problems
[info] jk_ajp_common.c (1996): (plexus12) request failed, because of 
client write error without recovery in send loop attempt=0
[info] jk_lb_worker.c (1092): unrecoverable error 200, request failed. 
Client failed in the middle of request, we can't recover to another 
instance.
[info] mod_jk.c (2270): Aborting connection for worker=plexus

First I thought, this is caused because of some related network traffic 
problem, but also the two Tomcat instances on the same machine appear in 
the log.

My configuration exist of Windows 2003 SP2 Servers, Apache httpd 2.2.6, 
mod_jk 1.2.25, and Tomcat 6.0.14.


Please, can anybody tell me what's wrong with my configuration and what 
is the meaning of these lines.

Thx in advanced

Ingo



My workers.properties:

workers.tomcat_home="C:\Server\tomcat-6.0.14"
workers.java_home="c:\java\jdk-6.0"
ps=\
worker.list=plexus,jkstatus

worker.plexus11.port=8109
worker.plexus11.host=openuss01
worker.plexus11.type=ajp13
worker.plexus11.lbfactor=10
...
worker.plexus52.port=8209
worker.plexus52.host=openuss05
worker.plexus52.type=ajp13
worker.plexus52.lbfactor=10

worker.jkstatus.type=status
worker.plexus.type=lb
worker.plexus.sticky_session=true
worker.plexus.socket_timeout=20
worker.plexus.reply_timeout=3000
worker.plexus.max_reply_timeouts=40000
worker.plexus.balance_workers=plexus11,plexus21,plexus12,plexus22,plexus51,plexus31,plexus52




---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org

Re: Unrecoverable error 200, request failed

Posted by Ingo Düppe <ma...@dueppe.com>.

Hi Rainer,
now I got it what you mean about the steady state.
Thx Ingo

Rainer Jung schrieb:
> Hi Ingo,
>
> Ingo Düppe wrote:
>> Am I'm right, that with the following setting a reply_timeout is 
>> detected when the tomcat doesn't reply within 10 seconds and it will 
>> be set into error if more than 15 timeouts in a minutes are detected?
>>
>> worker.plexus.reply_timeout=10000
>
> Yes. The timeout gets reset every time mod_jk waits for the next 
> package. So it's not an overall response time timeout, but it limits 
> the longest pause allowed between reposnse packets. Usually it will 
> fire, between the point in time the request was forwarded to Tomcat 
> and the first request packet arrives. Most apps don't have huge delays 
> after that point.
>
> See also
>
> http://tomcat.apache.org/connectors-doc/generic_howto/timeouts.html
>
> for a description of all possible timeouts.
>
>> worker.plexus.max_reply_timeouts=30
>
> In case you assume a steady rate of long running requests, yes: 15 per 
> minute. In case there were no long running requests for a long time, 
> our intrnal counter will be zero and we will allo up to 30 long 
> running requests before triggering the error status.
>
> The counter gets divided by 2 every minute, so that older events count 
> less the longer they are away. A worker that worked fast for a long 
> time can use all of the 30 events, if it already had 30 (or 29) 
> timeouts in the minute before, only 15 are free the next minute.
>
> Regards,
>
> Rainer
>
> ---------------------------------------------------------------------
> To start a new topic, e-mail: users@tomcat.apache.org
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>
>
>


---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org

Re: Unrecoverable error 200, request failed

Posted by Rainer Jung <ra...@kippdata.de>.

Hi Ingo,

Ingo Düppe wrote:
> Am I'm right, that with the following setting a reply_timeout is 
> detected when the tomcat doesn't reply within 10 seconds and it will be 
> set into error if more than 15 timeouts in a minutes are detected?
> 
> worker.plexus.reply_timeout=10000

Yes. The timeout gets reset every time mod_jk waits for the next 
package. So it's not an overall response time timeout, but it limits the 
longest pause allowed between reposnse packets. Usually it will fire, 
between the point in time the request was forwarded to Tomcat and the 
first request packet arrives. Most apps don't have huge delays after 
that point.

See also

http://tomcat.apache.org/connectors-doc/generic_howto/timeouts.html

for a description of all possible timeouts.

> worker.plexus.max_reply_timeouts=30

In case you assume a steady rate of long running requests, yes: 15 per 
minute. In case there were no long running requests for a long time, our 
intrnal counter will be zero and we will allo up to 30 long running 
requests before triggering the error status.

The counter gets divided by 2 every minute, so that older events count 
less the longer they are away. A worker that worked fast for a long time 
can use all of the 30 events, if it already had 30 (or 29) timeouts in 
the minute before, only 15 are free the next minute.

Regards,

Rainer

---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org

Re: Unrecoverable error 200, request failed

Posted by Ingo Düppe <ma...@dueppe.com>.

Hi Rainer,

thank you very much for your detailed answer.
Now I can analyse whats going to slow :-)

Am I'm right, that with the following setting a reply_timeout is 
detected when the tomcat doesn't reply within 10 seconds and it will be 
set into error if more than 15 timeouts in a minutes are detected?

worker.plexus.reply_timeout=10000
worker.plexus.max_reply_timeouts=30

Regards
Ingo

Rainer Jung schrieb:
> Hi,
>
> usually those info messages indicate, that the request took to long, 
> so the user in front of the browser did a retry or back or something 
> similar. In this case the browser closes the existing connection to 
> the web server, although parts of the response are still missing. When 
> Tomcat send the next part of the response to the web server (e.G. 
> Apache httpd), and the web server tries to send it back to the client 
> (=browser), it detects a closed connection and this info message gets 
> written into the log.
>
> Depending on your load, the performance of the app and the patience of 
> your users, a message every 20 to 30 seconds seems a bit to much.
>
> You can add %D to your httpd access log, which will log microseconds 
> response time to the access log, and also %{pid}P and %{tid}P, which 
> will give the process id and the thread id of the thread serving the 
> request. mod_jk also logs the process id and thread id automatically 
> for each line, so there is a chance, that you can match those lines 
> and find out, if they come from special requests (e.g. long running 
> ones), from special users (IPs, Browsers) etc.
>
> For the httpd access log configuration see
>
> http://httpd.apache.org/docs/2.2/mod/mod_log_config.html
>
> Caution: the timestamp in the jk log file, is the moment the problem 
> got detected, the time stamp in the access log is the moment the 
> request started (so you need to add the %d value to make them match).
>
> Some more hints about your config included below.
>
> Regards,
>
> Rainer
>
> Ingo Düppe wrote:
>> Hello,
>>
>> I found the following lines in my mod_jk.log every 20 to 30 seconds 
>> for all of my instances.
>>
>> [info] jk_ajp_common.c (1511): Writing to client aborted or client 
>> network problems
>> [info] jk_ajp_common.c (1996): (plexus12) request failed, because of 
>> client write error without recovery in send loop attempt=0
>> [info] jk_lb_worker.c (1092): unrecoverable error 200, request 
>> failed. Client failed in the middle of request, we can't recover to 
>> another instance.
>> [info] mod_jk.c (2270): Aborting connection for worker=plexus
>>
>> First I thought, this is caused because of some related network 
>> traffic problem, but also the two Tomcat instances on the same 
>> machine appear in the log.
>>
>> My configuration exist of Windows 2003 SP2 Servers, Apache httpd 
>> 2.2.6, mod_jk 1.2.25, and Tomcat 6.0.14.
>>
>>
>> Please, can anybody tell me what's wrong with my configuration and 
>> what is the meaning of these lines.
>>
>> Thx in advanced
>>
>> Ingo
>>
>>
>>
>> My workers.properties:
>>
>> workers.tomcat_home="C:\Server\tomcat-6.0.14"
>> workers.java_home="c:\java\jdk-6.0"
>> ps=\
>
> The three above are not used since a long time. Remove them.
>
>> worker.list=plexus,jkstatus
>> worker.plexus11.port=8109
>> worker.plexus11.host=openuss01
>> worker.plexus11.type=ajp13
>> worker.plexus11.lbfactor=10
>> ...
>> worker.plexus52.port=8209
>> worker.plexus52.host=openuss05
>> worker.plexus52.type=ajp13
>> worker.plexus52.lbfactor=10
>>
>> worker.jkstatus.type=status
>> worker.plexus.type=lb
>> worker.plexus.sticky_session=true
>> worker.plexus.socket_timeout=20
>> worker.plexus.reply_timeout=3000
>
> 3 seconds reply timeout sounds a bit ambitious (but I don't know what 
> your app does)
>
>> worker.plexus.max_reply_timeouts=40000
>
> This max_reply_timeouts sounds a bit high. It controls after how many 
> reply_timeouts a single sub worker of an lb gets put into an error 
> state (and thus will not be used for a minute, before it gets tried 
> again).
>
> The counter gets incremented for each reply_timeout missed and gets 
> divided by 2 about every minute. So assuming a steady flow of missed 
> reply_timeouts, the value 40000 corresponds to about 20000 
> reply_timeouts missed per minute.
>
>> worker.plexus.balance_workers=plexus11,plexus21,plexus12,plexus22,plexus51,plexus31,plexus52 
>
>
> ---------------------------------------------------------------------
> To start a new topic, e-mail: users@tomcat.apache.org
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>
>
>


---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org

Re: Unrecoverable error 200, request failed

Posted by Rainer Jung <ra...@kippdata.de>.

Hi,

usually those info messages indicate, that the request took to long, so 
the user in front of the browser did a retry or back or something 
similar. In this case the browser closes the existing connection to the 
web server, although parts of the response are still missing. When 
Tomcat send the next part of the response to the web server (e.G. Apache 
httpd), and the web server tries to send it back to the client 
(=browser), it detects a closed connection and this info message gets 
written into the log.

Depending on your load, the performance of the app and the patience of 
your users, a message every 20 to 30 seconds seems a bit to much.

You can add %D to your httpd access log, which will log microseconds 
response time to the access log, and also %{pid}P and %{tid}P, which 
will give the process id and the thread id of the thread serving the 
request. mod_jk also logs the process id and thread id automatically for 
each line, so there is a chance, that you can match those lines and find 
out, if they come from special requests (e.g. long running ones), from 
special users (IPs, Browsers) etc.

For the httpd access log configuration see

http://httpd.apache.org/docs/2.2/mod/mod_log_config.html

Caution: the timestamp in the jk log file, is the moment the problem got 
detected, the time stamp in the access log is the moment the request 
started (so you need to add the %d value to make them match).

Some more hints about your config included below.

Regards,

Rainer

Ingo Düppe wrote:
> Hello,
> 
> I found the following lines in my mod_jk.log every 20 to 30 seconds for 
> all of my instances.
> 
> [info] jk_ajp_common.c (1511): Writing to client aborted or client 
> network problems
> [info] jk_ajp_common.c (1996): (plexus12) request failed, because of 
> client write error without recovery in send loop attempt=0
> [info] jk_lb_worker.c (1092): unrecoverable error 200, request failed. 
> Client failed in the middle of request, we can't recover to another 
> instance.
> [info] mod_jk.c (2270): Aborting connection for worker=plexus
> 
> First I thought, this is caused because of some related network traffic 
> problem, but also the two Tomcat instances on the same machine appear in 
> the log.
> 
> My configuration exist of Windows 2003 SP2 Servers, Apache httpd 2.2.6, 
> mod_jk 1.2.25, and Tomcat 6.0.14.
> 
> 
> Please, can anybody tell me what's wrong with my configuration and what 
> is the meaning of these lines.
> 
> Thx in advanced
> 
> Ingo
> 
> 
> 
> My workers.properties:
> 
> workers.tomcat_home="C:\Server\tomcat-6.0.14"
> workers.java_home="c:\java\jdk-6.0"
> ps=\

The three above are not used since a long time. Remove them.

> worker.list=plexus,jkstatus
> worker.plexus11.port=8109
> worker.plexus11.host=openuss01
> worker.plexus11.type=ajp13
> worker.plexus11.lbfactor=10
> ...
> worker.plexus52.port=8209
> worker.plexus52.host=openuss05
> worker.plexus52.type=ajp13
> worker.plexus52.lbfactor=10
> 
> worker.jkstatus.type=status
> worker.plexus.type=lb
> worker.plexus.sticky_session=true
> worker.plexus.socket_timeout=20
> worker.plexus.reply_timeout=3000

3 seconds reply timeout sounds a bit ambitious (but I don't know what 
your app does)

> worker.plexus.max_reply_timeouts=40000

This max_reply_timeouts sounds a bit high. It controls after how many 
reply_timeouts a single sub worker of an lb gets put into an error state 
(and thus will not be used for a minute, before it gets tried again).

The counter gets incremented for each reply_timeout missed and gets 
divided by 2 about every minute. So assuming a steady flow of missed 
reply_timeouts, the value 40000 corresponds to about 20000 
reply_timeouts missed per minute.

> worker.plexus.balance_workers=plexus11,plexus21,plexus12,plexus22,plexus51,plexus31,plexus52 

---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org