Posted to users@tomcat.apache.org by Éric Gauthier <EG...@labvolt.ca> on 2008/11/24 15:39:23 UTC

Apache 2.2.3 with mod_jk 1.20 loosing connection with a cluster of two Tomcat 6.0.10

Hello,

My application server runs Apache HTTP Server 2.2.3 as a front end, with mod_jk 1.20 forwarding requests over AJP to a cluster of two Tomcat 6.0.10 instances. At random intervals, "Connecting to tomcat failed." errors appear in the mod_jk logs. The whole cluster runs on Windows Server 2003 Release 2, which is itself a virtual machine running under VMware ESX 3.2.

The cluster runs applications for an intranet, and some specific users access it via the extranet. None of it is exposed to the public internet.

The whole cluster can run properly for a week and then go down as many as three times in the same day. I assume this is not traffic related, since the issue has occurred at times of day when no one is at the office (very early in the morning).

When this happens to the first instance of Tomcat, the second one goes down seconds later. Restarting neither the Apache HTTP Server nor the Tomcat services fixes the issue; I need a complete reboot of the server.

I cannot see anything relevant in the Tomcat logs.

However, when it occurs I can see in JConsole the total thread count of both Tomcat instances rising to 1000 (it normally hovers around 200 to 250) even though the number of active threads stays the same (150). It is as if threads were being created, destroyed, and created again very quickly.

There is one strange behaviour: when it occurs, I obviously cannot access the applications via Apache HTTP Server, since it says "Service Temporarily Unavailable". But if I access both Tomcat instances directly on their HTTP connectors, pages show up with parts missing: sometimes pictures are missing, sometimes the HTML is incomplete.

I assume this is a Tomcat issue since the threads are going berserk, but I am unsure at this point.

Needless to say, my goal is to prevent all of this from happening...

Here is what's dumped in mod_jk.log:

[Tue Nov 04 09:40:09 2008] TC1 applications.labvolt.ca 2.135721
[Tue Nov 04 09:40:09 2008] [1304:2444] [info]  mod_jk.c (2142): Service error=0 for worker=TC1
[Tue Nov 04 09:40:27 2008] [1304:2444] [info]  jk_connect.c (451): connect to 127.0.0.1:8013 failed with errno=61
[Tue Nov 04 09:40:27 2008] [1304:2444] [info]  jk_ajp_common.c (873): Failed opening socket to (127.0.0.1:8013) with (errno=61)
[Tue Nov 04 09:40:27 2008] [1304:2444] [info]  jk_ajp_common.c (1259): (TC1) error connecting to the backend server (errno=61)
[Tue Nov 04 09:40:27 2008] [1304:2444] [info]  jk_ajp_common.c (1916): (TC1) sending request to tomcat failed,  recoverable operation attempt=1
[Tue Nov 04 09:40:28 2008] [1304:2444] [info]  jk_connect.c (451): connect to 127.0.0.1:8013 failed with errno=61
[Tue Nov 04 09:40:28 2008] [1304:2444] [info]  jk_ajp_common.c (873): Failed opening socket to (127.0.0.1:8013) with (errno=61)
[Tue Nov 04 09:40:28 2008] [1304:2444] [info]  jk_ajp_common.c (1259): (TC1) error connecting to the backend server (errno=61)
[Tue Nov 04 09:40:28 2008] [1304:2444] [info]  jk_ajp_common.c (1916): (TC1) sending request to tomcat failed,  recoverable operation attempt=2
[Tue Nov 04 09:40:28 2008] [1304:2444] [error] jk_ajp_common.c (1928): (TC1) Connecting to tomcat failed. Tomcat is probably not started or is listening on the wrong port
[Tue Nov 04 09:41:26 2008] TC1 applications.labvolt.ca 0.061460
[Tue Nov 04 09:41:33 2008] [1304:2444] [info]  jk_connect.c (451): connect to 127.0.0.1:8014 failed with errno=61
[Tue Nov 04 09:41:33 2008] [1304:2444] [info]  jk_ajp_common.c (873): Failed opening socket to (127.0.0.1:8014) with (errno=61)
[Tue Nov 04 09:41:33 2008] [1304:2444] [info]  jk_ajp_common.c (1259): (TC2) error connecting to the backend server (errno=61)
[Tue Nov 04 09:41:33 2008] [1304:2444] [info]  jk_ajp_common.c (1916): (TC2) sending request to tomcat failed,  recoverable operation attempt=1
[Tue Nov 04 09:41:34 2008] [1304:2444] [info]  jk_connect.c (451): connect to 127.0.0.1:8014 failed with errno=61
[Tue Nov 04 09:41:34 2008] [1304:2444] [info]  jk_ajp_common.c (873): Failed opening socket to (127.0.0.1:8014) with (errno=61)
[Tue Nov 04 09:41:34 2008] [1304:2444] [info]  jk_ajp_common.c (1259): (TC2) error connecting to the backend server (errno=61)
[Tue Nov 04 09:41:34 2008] [1304:2444] [info]  jk_ajp_common.c (1916): (TC2) sending request to tomcat failed,  recoverable operation attempt=2
[Tue Nov 04 09:41:34 2008] [1304:2444] [error] jk_ajp_common.c (1928): (TC2) Connecting to tomcat failed. Tomcat is probably not started or is listening on the wrong port
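
To see how often each backend refuses connections over a longer period, a log like the one above can be summarized with a short script. The following is only a minimal sketch and is not part of mod_jk: the function name count_connect_failures and the regular expression are illustrative, and the pattern is based solely on the "connect to ... failed with errno=..." lines shown in this excerpt, so other mod_jk versions may need a different pattern. On Windows builds, errno=61 most likely means the connection was refused (Winsock error 10061 with the base of 10000 subtracted), but that reading is an assumption.

```python
import re
from collections import Counter

# Pattern for lines such as:
# [Tue Nov 04 09:40:27 2008] [1304:2444] [info]  jk_connect.c (451): connect to 127.0.0.1:8013 failed with errno=61
# The layout is taken from the excerpt above and is an assumption for other versions.
CONNECT_FAILED = re.compile(
    r"^\[[^\]]+\] \[\d+:\d+\] \[\w+\]\s+jk_connect\.c \(\d+\): "
    r"connect to (?P<addr>\S+) failed with errno=(?P<errno>\d+)"
)

def count_connect_failures(lines):
    """Count failed connects, keyed by (backend address, errno)."""
    failures = Counter()
    for line in lines:
        match = CONNECT_FAILED.match(line)
        if match:
            failures[(match.group("addr"), int(match.group("errno")))] += 1
    return failures

# A few lines copied from the log above, used as sample input.
SAMPLE_LOG = """\
[Tue Nov 04 09:40:27 2008] [1304:2444] [info]  jk_connect.c (451): connect to 127.0.0.1:8013 failed with errno=61
[Tue Nov 04 09:40:28 2008] [1304:2444] [info]  jk_connect.c (451): connect to 127.0.0.1:8013 failed with errno=61
[Tue Nov 04 09:40:28 2008] [1304:2444] [error] jk_ajp_common.c (1928): (TC1) Connecting to tomcat failed. Tomcat is probably not started or is listening on the wrong port
[Tue Nov 04 09:41:33 2008] [1304:2444] [info]  jk_connect.c (451): connect to 127.0.0.1:8014 failed with errno=61
"""

for (addr, errno), total in sorted(count_connect_failures(SAMPLE_LOG.splitlines()).items()):
    print(f"{addr} errno={errno}: {total} failed connects")
```

To run it against a real log, pass an open file object instead of the sample lines; the function accepts any iterable of lines.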

Here are the workers used in Apache HTTP Server:

worker.TC1.port=8013
worker.TC1.host=localhost
worker.TC1.type=ajp13
worker.TC1.lbfactor=1
worker.TC2.port=8014
worker.TC2.host=localhost
worker.TC2.type=ajp13
worker.TC2.lbfactor=0
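
When the failure occurs, it can help to check from the same machine whether each worker's AJP port still accepts TCP connections at all, independently of mod_jk. The following is a rough sketch under stated assumptions: the helper names parse_workers, port_accepts_connections and probe_workers are hypothetical, and the parser only handles the simple worker.<name>.<key>=<value> lines shown above. A successful connect only shows that something is listening on the port; it does not speak AJP or prove that Tomcat can serve requests.

```python
import socket

def parse_workers(text):
    """Parse 'worker.<name>.<key>=<value>' lines into {name: {key: value}}."""
    workers = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        parts = key.strip().split(".")
        if len(parts) == 3 and parts[0] == "worker":
            workers.setdefault(parts[1], {})[parts[2]] = value.strip()
    return workers

def port_accepts_connections(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False

def probe_workers(text):
    """Return {worker name: True/False} for every worker with host and port."""
    results = {}
    for name, props in parse_workers(text).items():
        if "host" in props and "port" in props:
            results[name] = port_accepts_connections(props["host"], int(props["port"]))
    return results

# The worker definitions quoted above, used as sample input.
WORKERS_TEXT = """\
worker.TC1.port=8013
worker.TC1.host=localhost
worker.TC1.type=ajp13
worker.TC1.lbfactor=1
worker.TC2.port=8014
worker.TC2.host=localhost
worker.TC2.type=ajp13
worker.TC2.lbfactor=0
"""

if __name__ == "__main__":
    for name, ok in probe_workers(WORKERS_TEXT).items():
        print(f"{name}: {'accepting connections' if ok else 'NOT accepting connections'}")
```

Running this periodically (for example from a scheduled task) and recording the timestamps would show whether the ports stop accepting connections before or after the thread count starts climbing.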

Here is one of my AJP connectors (both are the same) in Tomcat:

<Connector port="8013" protocol="AJP/1.3" redirectPort="8443" minSpareThreads="100" maxSpareThreads="300"/>

I am new to mailing lists, so feel free to ask about anything I may have omitted.

Best regards, and thanks in advance!

-Eric Gauthier

RE: Apache 2.2.3 with mod_jk 1.20 loosing connection with a cluster of two Tomcat 6.0.10

Posted by Éric Gauthier <EG...@labvolt.ca>.
Hello Peter,

Thank you for your suggestions.

I forgot to say that even though there are two Tomcat instances running behind AJP connectors, no load balancing has been enabled. Each Tomcat contains its own set of applications.

-----Original Message-----
From: Peter Crowther [mailto:Peter.Crowther@melandra.com] 
Sent: 24 November 2008 09:52
To: 'Tomcat Users List'
Subject: RE: Apache 2.2.3 with mod_jk 1.20 loosing connection with a cluster of two Tomcat 6.0.10

> From: Éric Gauthier [mailto:EGauthier@labvolt.ca]
[...]
> I assume this is a Tomcat issue since the threads are going
> berserk, but I am unsure at this point.
[... good description elided...]

Thanks for a comprehensive description of the problem.

It *may* be an application or environment issue, especially as both app servers fail at nearly the same time.  Common mode failures such as this could indicate:

- Shortage of memory on the server causing OutOfMemoryException, after which anything could happen!

I didn't see any OutOfMemoryException in Tomcat logs.

- If your application uses other resources (such as a database), is the database doing anything unusual at around these times?  Suddenly long response times, down for maintenance, or similar?

Yes, they use the same database resource. I will check whether anything unusual happens with the database at these times.

- If the app is an Internet application, is one of the web crawlers suddenly hitting it with a massive load and saturating it?

I tried to generate a sudden massive load with JMeter, and the only thing it caused (I may have overdone the concurrent requests) was an OutOfMemoryException, after which both Tomcats stopped responding. But this is not what I am actually experiencing...

- Is the virtual machine being starved of resources at these times due to something else happening on the host?

The VM is dedicated to webapp hosting. No other VM has the possibility to "steal" resources from this VM. But to make sure, I will check the resource usage logs for this VM with the people in charge of it.

All just vague ideas at this point, and I suspect you have already considered and rejected them!


                - Peter

---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org




RE: Apache 2.2.3 with mod_jk 1.20 loosing connection with a cluster of two Tomcat 6.0.10

Posted by Peter Crowther <Pe...@melandra.com>.
> From: Éric Gauthier [mailto:EGauthier@labvolt.ca]
[...]
> I assume this is a Tomcat issue since the threads are going
> berserk, but I am unsure at this point.
[... good description elided...]

Thanks for a comprehensive description of the problem.

It *may* be an application or environment issue, especially as both app servers fail at nearly the same time.  Common mode failures such as this could indicate:

- Shortage of memory on the server causing OutOfMemoryException, after which anything could happen!

- If your application uses other resources (such as a database), is the database doing anything unusual at around these times?  Suddenly long response times, down for maintenance, or similar?

- If the app is an Internet application, is one of the web crawlers suddenly hitting it with a massive load and saturating it?

- Is the virtual machine being starved of resources at these times due to something else happening on the host?

All just vague ideas at this point, and I suspect you have already considered and rejected them!

                - Peter
