Posted to users@tomcat.apache.org by Yogesh Prajapati <yk...@gmail.com> on 2005/12/16 20:46:35 UTC

Tomcat 5.5.12 clustering - messages lost under high load

The details of the Tomcat clustering load-test environment:

Application: a web portal; a pure JSP/Servlet implementation using JDBC
(Oracle 10g RAC), OLTP in nature.

Load Test Tool: JMeter

Clustering Setup: 4 nodes

OS: SUSE Enterprise 9 (SP2) on all nodes (kernel: 2.6.5-7.97)

Software: JDK 1.5.0_05, Tomcat 5.5.12

Hardware configuration:
Node #1:  Dual Pentium III (Coppermine)  1 GHz, 1 GB RAM
Node #2:  Single Intel(R) XEON(TM) CPU 2.00GHz, 1 GB RAM
Node #3:  Dual Pentium III (Coppermine)  1 GHz, 1 GB RAM
Node #4:  Single Intel(R) XEON(TM) CPU 2.00GHz, 1 GB RAM

Network Configuration: All nodes are behind an Alteon load balancer
(response-time based load balancing). Each node has two NICs: subnet
10.1.13.0 for the load-balancing network and 10.1.11.0 for the private LAN.
The private NIC has multicast enabled, and all private NICs are connected
to a 10/100 Fast Ethernet switch.

Tomcat cluster configuration (same on all nodes):
        <Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
                 managerClassName="org.apache.catalina.cluster.session.DeltaManager"
                 expireSessionsOnShutdown="false"
                 useDirtyFlag="true"
                 notifyListenersOnReplication="true">

            <Membership
                className="org.apache.catalina.cluster.mcast.McastService"
                mcastAddr="228.0.0.4"
                mcastPort="45564"
                mcastFrequency="1000"
                mcastDropTime="35000"
                mcastBindAddr="auto"
                />

            <Receiver
                className="org.apache.catalina.cluster.tcp.ReplicationListener"
                tcpListenAddress="auto"
                tcpListenPort="4001"
                tcpThreadCount="24"/>

            <Sender
                className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
                replicationMode="pooled"
                autoConnect="true"
                keepAliveTimeout="-1"
                maxPoolSocketLimit="600"
                doTransmitterProcessingStats="true"
                />

            <Valve className="org.apache.catalina.cluster.tcp.ReplicationValve"
                   filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>

            <Deployer className="org.apache.catalina.cluster.deploy.FarmWarDeployer"
                      tempDir="/tmp/war-temp/"
                      deployDir="/tmp/war-deploy/"
                      watchDir="/tmp/war-listen/"
                      watchEnabled="false"/>

            <ClusterListener className="org.apache.catalina.cluster.session.ClusterSessionListener"/>
        </Cluster>
     Note: session availability on all nodes is a must for this
application, hence the "pooled" mode.

Tomcat JVM parameters (additional switches for VM tuning):
-XX:+AggressiveHeap -Xms832m -Xmx832m -XX:+UseParallelGC -XX:+PrintGCDetails
-XX:MaxGCPauseMillis=200 -XX:GCTimeRatio=9
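
For reference, my reading of what these switches ask for (from the HotSpot
1.5 option documentation; the annotations are mine):

    -XX:+AggressiveHeap       // size the heap and GC aggressively for the machine
    -Xms832m -Xmx832m         // fix the heap at 832 MB so it never resizes
    -XX:+UseParallelGC        // parallel (throughput) young-generation collector
    -XX:+PrintGCDetails       // log GC activity
    -XX:MaxGCPauseMillis=200  // hint: keep GC pauses under 200 ms
    -XX:GCTimeRatio=9         // hint: spend at most 1/(1+9) = 10% of time in GC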

After starting Tomcat on all the nodes, running the JMeter scripts with
20-70 concurrent user threads works fine across the entire cluster (almost
0% errors), but above roughly 200 concurrent user threads session
replication starts failing consistently and replication messages get lost.
Here is what I get in the Tomcat logs on all the nodes (many times over):

WARNING: Message lost: [10.1.11.95:4,001] type=[org.apache.catalina.cluster.session.SessionMessageImpl], id=[40FC741DB987BF5161C3AEEB32570A8E-1134732225260]
java.net.SocketException: Broken pipe
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:124)
        at org.apache.catalina.cluster.tcp.DataSender.writeData(DataSender.java:858)
        at org.apache.catalina.cluster.tcp.DataSender.pushMessage(DataSender.java:799)
        at org.apache.catalina.cluster.tcp.DataSender.sendMessage(DataSender.java:623)
        at org.apache.catalina.cluster.tcp.PooledSocketSender.sendMessage(PooledSocketSender.java:128)
        at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageData(ReplicationTransmitter.java:867)
        at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageClusterDomain(ReplicationTransmitter.java:460)
        at org.apache.catalina.cluster.tcp.SimpleTcpCluster.sendClusterDomain(SimpleTcpCluster.java:1012)
        at org.apache.catalina.cluster.session.DeltaManager.send(DeltaManager.java:629)
        at org.apache.catalina.cluster.session.DeltaManager.sendCreateSession(DeltaManager.java:617)
        at org.apache.catalina.cluster.session.DeltaManager.createSession(DeltaManager.java:593)
        at org.apache.catalina.cluster.session.DeltaManager.createSession(DeltaManager.java:572)
.............................

Less frequently, I have also seen the following error on two of the nodes
(#3 and #4):

SEVERE: TCP Worker thread in cluster caught 'java.lang.ArrayIndexOutOfBoundsException: 1025' closing channel
java.lang.ArrayIndexOutOfBoundsException: 1025
        at org.apache.catalina.cluster.io.XByteBuffer.toInt(XByteBuffer.java:231)
        at org.apache.catalina.cluster.io.XByteBuffer.countPackages(XByteBuffer.java:164)
        at org.apache.catalina.cluster.io.ObjectReader.append(ObjectReader.java:87)
        at org.apache.catalina.cluster.tcp.TcpReplicationThread.drainChannel(TcpReplicationThread.java:127)
        at org.apache.catalina.cluster.tcp.TcpReplicationThread.run(TcpReplicationThread.java:69)

With all the above warnings/exceptions I get the following JMeter results
(script run at 200 concurrent threads, 5 iterations, 0 sec ramp-up period):

Rate: 28 req/sec
Error: 9.07 %

The rate is acceptable, but the error rate is very high, and it climbs
further as the number of user threads increases. I have run the JMeter
script several times while tweaking the cluster configuration, but I cannot
figure out what I am doing wrong.

Is "Broken pipe" a serious failure and a blocker, or can it safely be
ignored?

The "ArrayIndexOutOfBoundsException" looks like a bug to me; it may already
have been reported, but I have not found it yet.

In the current scenario memory usage stays below 600 MB. My target is to
reach 2000 concurrent user threads while keeping errors within 3% and
maintaining the same req/sec. Does this mean I have to add more memory
(making it 2 GB on each node)?

Is there something else I am missing that I need to look at?

Any suggestions, ideas, or tips are most welcome and appreciated.

Thanks

Yogi

Re: Tomcat 5.5.12 clustering - messages lost under high load

Posted by Yogesh Prajapati <yk...@gmail.com>.
On 12/19/05, Peter Rossbach <pr...@objektpark.de> wrote:
>
> Hey,
> On 20.12.2005 at 01:09, Yogesh Prajapati wrote:
>
> > On 12/18/05, Peter Rossbach <pr@objektpark.de> wrote:
> >>
> >> Hey,
> >>
> >> a)   The Servlet Spec says: you must have sticky sessions when you use
> >> distributable web apps. Session replication is only used when the
> >> primary node crashes!!
> >
> >
> > I looked into the servlet spec (v2.4). I did not find anything like
> > "sticky session" or "session replication". I do see this sentence in
> > the spec: "Within an application marked as distributable, all requests
> > that are part of a session must be handled by one Java Virtual Machine
> > ("JVM") at a time." Is this what you meant by "sticky session"? If
> > yes, it makes sense to me.
> >
> YES!
> >


It is a great relief to say goodbye to such a big problem! Many thanks to
you! BTW, in the meantime I noticed, after running a couple of load tests,
that the "Broken pipe" exception has gone away (it is fixed).

> >> b)   When your app doesn't send a new request before the first is
> >> complete: use pooled mode with waitForAck="true"!
> >>        It can work, but....
> >
> > Can you please elaborate more on this? I did not follow what you were
> > trying to communicate.
> >
> Pooled mode is a synchronous mode. With waitForAck the response thread
> waits until all backups have received the replication message.


It is interesting to try this option. From your "It can work, but...." I
take it that setting waitForAck="true" will greatly reduce throughput and
is therefore not a recommended clustering approach. Am I right?

> >> c)   The reported exception has nothing to do with clustering. Seems
> >> that your app sends the response before opening the session. That
> >> violates the spec!
> >
> >
> > I think you are right, it is not a cluster exception. But I don't see
> > the exception for every request; it appears randomly, and I am not sure
> > what is causing it. I also don't know how my JSP-based app violates the
> > spec. All my JSPs do is use "<jsp:include..../>" with two levels of
> > nesting. E.g.
> >
>
> Why do you set JspWriter out = null?? That's wrong!
> Implementing the response header setting inside a servlet filter is better!


I debated this with myself some time ago, but since the JSP pre-compiler
never gave any warning or error, it never caught my attention. At this
point I have gone through so much that I cannot keep track of every single
little thing. Thanks very much for pointing that out.
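
For the archives, my understanding of why that declaration is dangerous; a
simplified sketch of the kind of servlet Jasper generates for the page (the
class name is illustrative, and real generated code looks different):

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.jsp.JspWriter;

// Roughly what a page containing <%! JspWriter out = null; %> turns into.
public class JavaImportJspSketch extends HttpServlet {

    JspWriter out = null; // the <%! %> declaration becomes a null instance field

    protected void service(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        // In the real _jspService the implicit 'out' is a *local* variable
        // obtained from the PageContext, so it shadows the field above and
        // the page body still renders -- which is why the pre-compiler never
        // complained. But any helper method declared with <%! %> that touches
        // 'out' silently sees the null field and fails at runtime.
    }
}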

Re: Tomcat 5.5.12 clustering - messages lost under high load

Posted by Peter Rossbach <pr...@objektpark.de>.
Hey,
On 20.12.2005 at 01:09, Yogesh Prajapati wrote:

> On 12/18/05, Peter Rossbach <pr@objektpark.de> wrote:
>>
>> Hey,
>>
>> a)   The Servlet Spec says: you must have sticky sessions when you use
>> distributable web apps. Session replication is only used when the
>> primary node crashes!!
>
>
> I looked into the servlet spec (v2.4). I did not find anything like
> "sticky session" or "session replication". I do see this sentence in the
> spec: "Within an application marked as distributable, all requests that
> are part of a session must be handled by one Java Virtual Machine ("JVM")
> at a time." Is this what you meant by "sticky session"? If yes, it makes
> sense to me.
>
YES!
>
>> b)   When your app doesn't send a new request before the first is
>> complete: use pooled mode with waitForAck="true"!
>>        It can work, but....
>
> Can you please elaborate more on this? I did not follow what you were
> trying to communicate.
>
Pooled mode is a synchronous mode. With waitForAck the response thread
waits until all backups have received the replication message.
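
That is, keep your Sender element and only add the acknowledgement flag; a
sketch based on the attributes from your posted config:

            <Sender
                className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
                replicationMode="pooled"
                waitForAck="true"
                keepAliveTimeout="-1"
                maxPoolSocketLimit="600"
                doTransmitterProcessingStats="true"
                />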

>> c)   The reported exception has nothing to do with clustering. Seems
>> that your app sends the response before opening the session. That
>> violates the spec!
>
>
> I think you are right, it is not a cluster exception. But I don't see
> the exception for every request; it appears randomly, and I am not sure
> what is causing it. I also don't know how my JSP-based app violates the
> spec. All my JSPs do is use "<jsp:include..../>" with two levels of
> nesting. E.g.
>
> level_1.jsp:
>     <%@ include file="javaImport.jsp"%>
>      ...........
>      ...........
>       <jsp:include page="level_2.jsp"/>
>      ...........
>      ...........
>
> level_2.jsp:
>     <%@ include file="javaImport.jsp"%>
>      ...........
>      ...........
>       <jsp:include page="level_3.jsp"/>
>      ...........
>      ...........
>
> level_3.jsp:
>     <%@ include file="javaImport.jsp"%>
>      ...........
>      ...........
>      ...........
>
> javaImport.jsp:
>
> <%@ page import="
>         , java.lang.*
>         , java.sql.*
>         , java.util.*
>         , java.io.*
>         , java.text.*
>         , javax.mail.*
>         , javax.mail.internet.*
>         , org.apache.commons.fileupload.* " %>
>
> <%
>         response.setHeader("Cache-Control","no-cache"); //HTTP 1.1
>         response.setHeader("Pragma","no-cache"); //HTTP 1.0
>         response.setDateHeader("Expires", -1); //prevents caching at the proxy server
> %>
>
> <%! JspWriter out = null; %>

Why do you set JspWriter out = null?? That's wrong!
Implementing the response header setting inside a servlet filter is better!
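
A minimal sketch of such a filter, assuming the Servlet 2.4 API (the class
name is illustrative; the header values are the ones from your
javaImport.jsp):

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

// Sets the no-cache headers exactly once, before any JSP writes to the response.
public class NoCacheFilter implements Filter {

    public void init(FilterConfig config) throws ServletException {
        // no configuration needed
    }

    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain) throws IOException, ServletException {
        HttpServletResponse res = (HttpServletResponse) response;
        res.setHeader("Cache-Control", "no-cache"); // HTTP 1.1
        res.setHeader("Pragma", "no-cache");        // HTTP 1.0
        res.setDateHeader("Expires", -1);           // prevents caching at the proxy
        chain.doFilter(request, response);          // then run the JSPs
    }

    public void destroy() {
        // nothing to clean up
    }
}

Mapped in web.xml with a <filter> and a <filter-mapping> on the URL pattern
/*, the headers are then set once per request before any include runs.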

Peter




Re: Tomcat 5.5.12 clustering - messages lost under high load

Posted by Yogesh Prajapati <yk...@gmail.com>.
Please ignore the first "," in the "import" attribute; it should be:

javaImport.jsp:

<%@ page import="
         java.lang.*
        , java.sql.*
        , java.util.*
        , java.io.*
        , java.text.*
        , javax.mail.*
        , javax.mail.internet.*
        , org.apache.commons.fileupload.* " %>

Re: Tomcat 5.5.12 clustering - messages lost under high load

Posted by Yogesh Prajapati <yk...@gmail.com>.
On 12/18/05, Peter Rossbach <pr@objektpark.de> wrote:
>
> Hey,
>
> a)   The Servlet Spec says: you must have sticky sessions when you use
> distributable web apps. Session replication is only used when the primary
> node crashes!!


I looked into the servlet spec (v2.4). I did not find anything like "sticky
session" or "session replication". I do see this sentence in the spec:
"Within an application marked as distributable, all requests that are part
of a session must be handled by one Java Virtual Machine ("JVM") at a
time." Is this what you meant by "sticky session"? If yes, it makes sense
to me.
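
If that is the requirement, I assume the usual way to get it with an Apache
httpd/mod_jk front end is a jvmRoute per node plus a sticky lb worker; a
minimal sketch (node names and IPs are placeholders, and our Alteon
balancer would instead use its own session-persistence feature):

    <!-- server.xml on node 1: tag session IDs with this node's route -->
    <Engine name="Catalina" defaultHost="localhost" jvmRoute="node1">

    # workers.properties for mod_jk on the front-end httpd
    worker.list=loadbalancer
    worker.node1.type=ajp13
    worker.node1.host=10.1.11.95
    worker.node1.port=8009
    worker.node2.type=ajp13
    worker.node2.host=10.1.11.96
    worker.node2.port=8009
    worker.loadbalancer.type=lb
    worker.loadbalancer.balance_workers=node1,node2
    worker.loadbalancer.sticky_session=1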


> b)   When your app doesn't send a new request before the first is
> complete: use pooled mode with waitForAck="true"!
>        It can work, but....

Can you please elaborate more on this? I did not follow what you were
trying to communicate.

> c)   The reported exception has nothing to do with clustering. Seems that
> your app sends the response before opening the session. That violates the
> spec!


I think you are right, it is not a cluster exception. But I don't see the
exception for every request; it appears randomly, and I am not sure what is
causing it. I also don't know how my JSP-based app violates the spec. All
my JSPs do is use "<jsp:include..../>" with two levels of nesting. E.g.

level_1.jsp:
    <%@ include file="javaImport.jsp"%>
     ...........
     ...........
      <jsp:include page="level_2.jsp"/>
     ...........
     ...........

level_2.jsp:
    <%@ include file="javaImport.jsp"%>
     ...........
     ...........
      <jsp:include page="level_3.jsp"/>
     ...........
     ...........

level_3.jsp:
    <%@ include file="javaImport.jsp"%>
     ...........
     ...........
     ...........

javaImport.jsp:

<%@ page import="
        , java.lang.*
        , java.sql.*
        , java.util.*
        , java.io.*
        , java.text.*
        , javax.mail.*
        , javax.mail.internet.*
        , org.apache.commons.fileupload.* " %>

<%
        response.setHeader("Cache-Control","no-cache"); //HTTP 1.1
        response.setHeader("Pragma","no-cache"); //HTTP 1.0
        response.setDateHeader("Expires", -1); //prevents caching at the proxy server
%>

<%! JspWriter out = null; %>

Re: Tomcat 5.5.12 clustering - messages lost under high load

Posted by Peter Rossbach <pr...@objektpark.de>.
Hey,

a)   The Servlet Spec says: you must have sticky sessions when you use
distributable web apps. Session replication is only used when the primary
node crashes!!
b)   When your app doesn't send a new request before the first is complete:
use pooled mode with waitForAck="true"!
       It can work, but....
c)   The reported exception has nothing to do with clustering. Seems that
your app sends the response before opening the session. That violates the
spec!


Peter



Re: Tomcat 5.5.12 clustering - messages lost under high load

Posted by Yogesh Prajapati <yk...@gmail.com>.
Peter,

I tried the latest Tomcat source (I believe it is the 5.5.15 head, as
stated in bug #37896). As you suggested, I used "fastasyncqueue". Here is
the config for "fastasyncqueue":
            <Sender
                className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
                replicationMode="fastasyncqueue"
                keepAliveTimeout="320000"
                threadPriority="10"
                queueTimeWait="true"
                queueDoStats="true"
                waitForAck="false"
                autoConnect="false"
                doTransmitterProcessingStats="true"
                />

But it did not work (I am not able to use sticky-session load balancing at
the moment); the error % was very high (> 34%), so I reverted to "pooled"
mode but removed the "autoConnect" attribute. I still saw the "Broken pipe"
exceptions, so I wondered whether the problem was really fixed. I further
tried tweaking the listener threads and the sender socket pool limit:

            <Receiver
                className="org.apache.catalina.cluster.tcp.ReplicationListener"
                tcpListenAddress="auto"
                tcpListenPort="4001"
                tcpThreadCount="50"/>

            <Sender
                className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
                replicationMode="pooled"
                keepAliveTimeout="-1"
                maxPoolSocketLimit="200"
                doTransmitterProcessingStats="true"
                />
With the new configuration I am getting a lot of the following "SEVERE"
errors:

SEVERE: Exception initializing page context
java.lang.IllegalStateException: Cannot create a session after the response has been committed
        at org.apache.catalina.connector.Request.doGetSession(Request.java:2214)
        at org.apache.catalina.connector.Request.getSession(Request.java:2024)
        at org.apache.catalina.connector.RequestFacade.getSession(RequestFacade.java:831)
        at javax.servlet.http.HttpServletRequestWrapper.getSession(HttpServletRequestWrapper.java:215)
        at org.apache.catalina.core.ApplicationHttpRequest.getSession(ApplicationHttpRequest.java:544)
        at org.apache.catalina.core.ApplicationHttpRequest.getSession(ApplicationHttpRequest.java:493)
        at org.apache.jasper.runtime.PageContextImpl._initialize(PageContextImpl.java:148)
        at org.apache.jasper.runtime.PageContextImpl.initialize(PageContextImpl.java:123)
        at org.apache.jasper.runtime.JspFactoryImpl.internalGetPageContext(JspFactoryImpl.java:104)
        at org.apache.jasper.runtime.JspFactoryImpl.getPageContext(JspFactoryImpl.java:61)
        at org.apache.jsp.dynaLeftMenuItems_jsp._jspService(org.apache.jsp.dynaLeftMenuItems_jsp:50)


Having said all that: since the JMeter script fails at the initial steps,
the successive steps fail too, so the overall error % went up to 30% and
the rate was 19 req/sec. I am confused trying to analyze the situation: why
would the "Broken pipe" exception occur at all (when it was supposed to be
fixed), and why does it then disappear after increasing the listener
threads and the sender socket limit? Is it some kind of timing issue, a
balance between the number of listener threads and the number of sender
sockets in the pool? I did not find anything in the documentation about the
effect of changing those parameters, or any recommendation.
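
One back-of-the-envelope check on that balance (my own arithmetic, not from
any documentation): in pooled mode each of the other 3 nodes can open up to
maxPoolSocketLimit sockets to a given node, so with maxPoolSocketLimit="200"
a receiver could face up to 3 x 200 = 600 concurrent replication
connections, while tcpThreadCount="50" gives it only 50 worker threads to
drain them. If the listener side is undersized relative to the sender
pools, stalled writes and broken pipes would be the expected symptom.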

Thanks
Yogesh

On 12/16/05, Peter Rossbach <pr...@objektpark.de> wrote:
>
> Hey Yogesh,
>
> please update to the current svn head.
>
> See the following bug, which is now fixed:
>
> http://issues.apache.org/bugzilla/show_bug.cgi?id=37896
>
> See the 5.5.14 changelog to see that more bugs exist inside 5.5.12.
>
> Please report whether it works!
>
>
> Peter
>
> Tip: For high load the fastasyncqueue sender mode is better.
> Also you don't need autoConnect!

Re: Tomcat 5.5.12 clustering - messages lost under high load

Posted by Peter Rossbach <pr...@objektpark.de>.
Hey Yogesh,

please update to the current svn head.

See the following bug, which is now fixed:

http://issues.apache.org/bugzilla/show_bug.cgi?id=37896

See the 5.5.14 changelog to see that more bugs exist inside 5.5.12.

Please report whether it works!


Peter

Tip: For high load the fastasyncqueue sender mode is better.
Also you don't need autoConnect!


