Posted to users@tomcat.apache.org by João Sávio <jo...@gmail.com> on 2014/07/02 15:37:45 UTC

Re: Help with Tomcat 7 clustering using BIO receiver

?


2014-06-30 18:38 GMT-03:00 João Sávio <jo...@gmail.com>:

> Hello people
>
> This is my first message on this group! I'm trying to set up Tomcat
> clustering using the BIO receiver, but I get the following error
> when I start two nodes and try to access my application:
>
> Jun 30, 2014 9:25:12 PM org.apache.catalina.tribes.transport.bio.BioReplicationTask run
> SEVERE: Unable to service bio socket
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:152)
>         at java.net.SocketInputStream.read(SocketInputStream.java:122)
>         at java.net.SocketInputStream.read(SocketInputStream.java:108)
>         at org.apache.catalina.tribes.transport.bio.BioReplicationTask.drainSocket(BioReplicationTask.java:146)
>         at org.apache.catalina.tribes.transport.bio.BioReplicationTask.run(BioReplicationTask.java:64)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
>
> Jun 30, 2014 9:25:15 PM org.apache.catalina.ha.session.DeltaManager startInternal
> INFO: Register manager localhost#/myApp to cluster element Engine with name Catalina
> Jun 30, 2014 9:25:15 PM org.apache.catalina.ha.session.DeltaManager startInternal
> INFO: Starting clustering manager at localhost#/myApp
> Jun 30, 2014 9:25:15 PM org.apache.catalina.ha.session.DeltaManager getAllClusterSessions
> INFO: Manager [localhost#/myApp], requesting session state from
> org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, 212, 191, 209}:4000,{10, 212, 191, 209},4000,
> alive=32096, securePort=-1, UDP Port=-1, id={-58 95 -116 89 86 -12 714 -75 -2 74 -45 -54 -82 40 -114 },
> payload={}, command={}, domain={}, ]. This operation will timeout if no session state has been
> received within 60 seconds.
>
>
> Attached are my two server.xml configurations. I took the default
> configuration (which worked on my computer) and replaced the following:
>
>
> <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
> by
> <Receiver className="org.apache.catalina.tribes.transport.bio.BioReceiver"
>
> and
>
> <Transport
> className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
> by
> <Transport
> className="org.apache.catalina.tribes.transport.bio.PooledMultiSender"/>
>
>
> I don't know if I'm missing something. I'd be very thankful if someone
> could take a look at my server.xml files or send me an example configuration
> using the BIO receiver (I couldn't find one).
>
> Thanks
> João
> --
> http://joaosavio.wordpress.com
>
>
>
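
For reference, the two substitutions described in the quoted message would sit inside the cluster's <Channel> element roughly as sketched below. This is an untested sketch, not the poster's actual file: only the two BIO class names come from the message. The Membership values, address="auto", port="4000", autoBind and maxThreads are assumptions modelled on the stock example in the Tomcat 7 clustering documentation.

```xml
<!-- Sketch only: attribute values are assumptions, not taken from the poster's files -->
<Channel className="org.apache.catalina.tribes.group.GroupChannel">
  <Membership className="org.apache.catalina.tribes.membership.McastService"
              address="228.0.0.4"
              port="45564"
              frequency="500"
              dropTime="3000"/>
  <!-- Blocking receiver; maxThreads bounds how many sockets it can service at once -->
  <Receiver className="org.apache.catalina.tribes.transport.bio.BioReceiver"
            address="auto"
            port="4000"
            autoBind="100"
            maxThreads="6"/>
  <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
    <!-- Blocking counterpart of nio.PooledParallelSender -->
    <Transport className="org.apache.catalina.tribes.transport.bio.PooledMultiSender"/>
  </Sender>
</Channel>
```

The NIO-specific selectorTimeout attribute is left out here because it is assumed to have no meaning for a blocking receiver.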


-- 
http://joaosavio.wordpress.com

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by Daniel Mikusa <dm...@gopivotal.com>.

On Thu, Jul 3, 2014 at 12:04 PM, João Sávio <jo...@gmail.com> wrote:

> Hello!
>
> Using NIO (with channelSendOptions="4", i.e., synchronous), under light
> load, my tests pass 100%. But under heavy load, not all sessions are
> replicated on time,
>

Define "on time".

> and I have about 20% of errors.
>

Can you explain the errors more?  Stack trace?

> If I increase maxThreads to 400, I have about 15% of errors.
>
> More information:
> * I am not performing parallel requests with same session
> * my cluster has 4 nodes (all in one machine - for test purpose only)
> * Java 7 64 bits, Tomcat 7.0.52, windows 7 64 bits
> * using default NIO configuration, but with maxThreads=400
> * VM options:
> -Xms512M   - on real environment this value is 1024
> -Xmx512M  - on real environment this value is 1024
> -XX:NewSize=450M
> -XX:MaxNewSize=450M
> -XX:PermSize=128M
> -XX:MaxPermSize=245M
> -XX:SurvivorRatio=8
> -XX:TargetSurvivorRatio=90
> -XX:MaxTenuringThreshold=15
> -XX:+UseBiasedLocking
> -XX:CMSInitiatingOccupancyFraction=60
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:+CMSClassUnloadingEnabled
> -XX:+UseConcMarkSweepGC
> -XX:+CMSIncrementalMode
> -XX:+UseParNewGC
>

You have quite a few JVM settings configured here.  No criticism intended; I
assume that you haven't copied and pasted these from the internet and have
thoroughly tested each one.  I'm just mentioning this because I often see
people do this and it usually backfires.  Just curious if you've tried
things without all of these customizations, perhaps with just some basics
like -Xms, -Xmx and GC logging?

> -XX:+DisableExplicitGC
> -XX:+PrintGCDateStamps
> -XX:+PrintGCDetails
> -Xloggc:%CATALINA_BASE%/logs/tomcat-gc.log
>
> Moreover, I'm trying to attach the logs again.
>

The list doesn't like attachments.  On some rare occasions it'll accept
them, but it's generally better to inline the logs.  Also some people,
myself included, are paranoid and don't open attachments from people they
do not know.

Dan

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by João Sávio <jo...@gmail.com>.

Hello Filip

You solved my issue! Thank you very much!

Thanks everyone
João

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by Filip Hanik <fi...@hanik.com>.

Joao,
try channelSendOptions="6"
This will mean that:
1. You wish to use ACKs (option 2)
2. You wish the ACK to be synchronous (option 4)

If you don't have the 0x0002 option enabled, it won't use ACKs at all.

Filip
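
As a sketch of where this setting lives: channelSendOptions is an attribute of the <Cluster> element in server.xml, and its value is a sum of Tribes bit flags. The fragment below assumes the stock SimpleTcpCluster class and omits the nested elements. The flag descriptions in the comment are a summary of the Tribes documentation as best understood, not a quote from it.

```xml
<!-- channelSendOptions is a bit mask:
       2 (0x0002) = use ACK
       4 (0x0004) = synchronized ACK
       8 (0x0008) = asynchronous
     6 = 2 + 4: the sender waits for an ACK that is returned
     once the receiving node has processed the message -->
<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
         channelSendOptions="6">
  <!-- Manager, Channel, Valve and ClusterListener elements omitted -->
</Cluster>
```

With a value of 4 alone, the ACK bit (2) is not set, which matches the point above that no ACKs are used in that case.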

On Thu, Jul 3, 2014 at 4:44 PM, João Sávio <jo...@gmail.com> wrote:

> If I set channelSendOptions="8" (default value = asynchronous), the
> percentage of errors increases (as expected)
>
> Regards
> João
>
>
> 2014-07-03 19:43 GMT-03:00 João Sávio <jo...@gmail.com>:
>
> > I don't think so.
> >
> > Here it is: http://pastebin.com/qYCzmECb  (server.xml - node1)
> >
> > Regards
> > João
> >
>
>
>
> --
> http://joaosavio.wordpress.com
>

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by João Sávio <jo...@gmail.com>.

If I set channelSendOptions="8" (default value = asynchronous), the
percentage of errors increases (as expected)

Regards
João


2014-07-03 19:43 GMT-03:00 João Sávio <jo...@gmail.com>:

> I don't think so.
>
> Here it is: http://pastebin.com/qYCzmECb  (server.xml - node1)
>
> Regards
> João
>



-- 
http://joaosavio.wordpress.com

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by João Sávio <jo...@gmail.com>.

I don't think so.

Here it is: http://pastebin.com/qYCzmECb  (server.xml - node1)

Regards
João

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by Filip Hanik <fi...@hanik.com>.

Did you post your server.xml? I can't find it.


On Thu, Jul 3, 2014 at 4:25 PM, João Sávio <jo...@gmail.com> wrote:

> Hello Filip
>
> I'm using channelSendOptions="4", which is supposed to be synchronous
>
> Regards
> João
>

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by João Sávio <jo...@gmail.com>.

Hello Filip

I'm using channelSendOptions="4", which is supposed to be synchronous

Regards
João

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by Filip Hanik <fi...@hanik.com>.

A race condition could happen if you set replication to happen async. But I
do have a memory of the configuration specifying synchronous replication,
which would guarantee that the replication changes have happened before the
request is complete.




On Thu, Jul 3, 2014 at 3:51 PM, Christopher Schultz <
chris@christopherschultz.net> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> Mark,
>
> On 7/3/14, 2:19 PM, Mark Eggers wrote:
> > João,
> >
> > This list has a convention of posting either inline or at the end
> > of the message you're replying to.
> >
> > See here for mailing list notes:
> >
> > http://tomcat.apache.org/lists.html#tomcat-users
> >
> > On 7/3/2014 10:24 AM, João Sávio wrote:
> >> Hello
> >
> >> Some points below:
> >
> >> ** What is "on time"?* In my application, a group of users should
> >> always hit the same node after the first request. So, in the first
> >> request each group of users will receive a specific cookie, and the
> >> LB will perform the load balancing based on this cookie. In the first
> >> request, a user can hit any node, but from the second onward, he or she
> >> should hit the same node.
> >
> > Hmm, so 'on time' really means that subsequent requests should hit
> > the same server.
> >
> > If you're using sessions, Tomcat has an attribute on the Engine
> > element called jvmRoute. So depending on your load balancer (and
> > if you use AJP), you can use Tomcat and AJP to route traffic. In
> > that case, there's no need to write a special cookie.
> >
> > At any rate, this doesn't sound like a clustering error per se.
>
>
> I wonder if the real problem is a race condition: the cluster can't
> replicate fast enough to stabilize before the second request comes in,
> plus the lb configuration might not be correct.
>
> João, can you confirm that request #1 and #2 are definitely hitting
> the same Tomcat instance?
>
> If you connect to TomcatA, set a session attribute, then reconnect to
> TomcatA and get that session attribute, then it should be the same
> unless something is awfully wrong. You don't even need to have
> clustering enabled to test the above.
>
> However if you hit TomcatA, set a session attribute, then connect to
> TomcatB and try to get that session attribute, your request may have
> arrived too early for the cluster to have pushed-out the session
> attribute change.
>
> So, if you can prove that both requests have gone to TomcatA and you
> are getting "errors", then there are several possibilities:
>
> 1. Tomcat has a huge bug and no web applications would work worldwide.
> 2. You are mishandling the "setting" of the session attribute
> 3. You are wrong about which server the client is hitting
>
> - -chris
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1
> Comment: GPGTools - http://gpgtools.org
> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>
> iQIcBAEBCAAGBQJTtdBTAAoJEBzwKT+lPKRYtiwP/1dW2qplyepTgDTNixNw0viZ
> 29XFywsYAmDdMxzWcgkcl7Nrw3kVUcJVf+jLpxNCUxRJq7z4+zuyOLkImn2XW4a+
> ygG1op69FSwsVEfQyHIH8OVjdYDUj6WPpP8bu2KbbkR0jtAiHO569+869WOvPyuA
> z+oBhBhWB5w7e41qmQnLr6y3+hU19hGuayxkR61tqmZCPp6kpwRH2yN3IbhId2In
> 8DLoR5z6077jxPeXR6o3goB6Y9LbrPoYFUwdfQTpzrF8AvQ2wDl/CRuM2n9wB/ez
> Oclnz0bw4JNegtarEJeiu4G1Qqf7WCqhVv4a8GfWYtr0ISk8GBBcCRjYZcoyU5IU
> hSnNBGn586AhZ3BK5t1ySwrC6RiKH6MIR8fdBOSw1eZnTycPBSK6avZ4E8ahQDIp
> uA93W+cME58gtmzdl2q7iLjRbwGdgebw++yfR4G42Tb4rUYsmOzsCPGx/nIqxB5E
> FBea8xGwb802rFpYMxgMp8SzRy078RrDx2aptNfrb5oP9YeQ/pGrX9tVVtTlxNTk
> 8DKA8GHL4fiONAJB48iD2sTSv4jAhFInHnF4ykl0zjN7t3f0phMmSExeoH7HbFUI
> G589M4KAs5X00xCSFt9gXdU+tpuFL+/x6kBAGrNmT5IySIvm+BfxTXjvg2daAjcC
> +FAocYeosZumP5g2tICv
> =1Si4
> -----END PGP SIGNATURE-----
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>
>

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by Christopher Schultz <ch...@christopherschultz.net>.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Mark,

On 7/3/14, 2:19 PM, Mark Eggers wrote:
> João,
> 
> This list has a convention of posting either inline or at the end
> of the message you're replying to.
> 
> See here for mailing list notes:
> 
> http://tomcat.apache.org/lists.html#tomcat-users
> 
> On 7/3/2014 10:24 AM, João Sávio wrote:
>> Hello
> 
>> Some points below:
> 
>> ** What is "on time"?* In my application, a group of users should
>> always hit the same node after the first request. So, in the first
>> request each group of users will receive a specific cookie, and the
>> LB will perform the load balancing based on this cookie. In the first
>> request, a user can hit any node, but from the second onward, he or she
>> should hit the same node.
> 
> Hmm, so 'on time' really means that subsequent requests should hit
> the same server.
> 
> If you're using sessions, Tomcat has an attribute on the Engine 
> element called jvmRoute. So depending on your load balancer (and
> if you use AJP), you can use Tomcat and AJP to route traffic. In
> that case, there's no need to write a special cookie.
> 
> At any rate, this doesn't sound like a clustering error per se.
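
A minimal sketch of the jvmRoute setting mentioned in the quoted text: the attribute goes on the <Engine> element of each node's server.xml, and Tomcat appends its value to the session ID (in the form `<session id>.<jvmRoute>`) so that a load balancer can send later requests back to the same node. The route name "node1" is hypothetical. The Engine name and default host are taken from the log lines earlier in this thread.

```xml
<!-- node1's server.xml; every other node needs its own distinct value, e.g. jvmRoute="node2" -->
<Engine name="Catalina" defaultHost="localhost" jvmRoute="node1">
  <!-- Cluster and Host elements omitted -->
</Engine>
```

Whether the load balancer honours the route suffix depends on its own sticky-session configuration, which is outside this fragment.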

I wonder if the real problem is a race condition: the cluster can't
replicate fast enough to stabilize before the second request comes in,
plus the lb configuration might not be correct.

João, can you confirm that request #1 and #2 are definitely hitting
the same Tomcat instance?

If you connect to TomcatA, set a session attribute, then reconnect to
TomcatA and get that session attribute, then it should be the same
unless something is awfully wrong. You don't even need to have
clustering enabled to test the above.

However if you hit TomcatA, set a session attribute, then connect to
TomcatB and try to get that session attribute, your request may have
arrived too early for the cluster to have pushed-out the session
attribute change.

So, if you can prove that both requests have gone to TomcatA and you
are getting "errors", then there are several possibilities:

1. Tomcat has a huge bug and no web applications would work worldwide.
2. You are mishandling the "setting" of the session attribute
3. You are wrong about which server the client is hitting

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBCAAGBQJTtdBTAAoJEBzwKT+lPKRYtiwP/1dW2qplyepTgDTNixNw0viZ
29XFywsYAmDdMxzWcgkcl7Nrw3kVUcJVf+jLpxNCUxRJq7z4+zuyOLkImn2XW4a+
ygG1op69FSwsVEfQyHIH8OVjdYDUj6WPpP8bu2KbbkR0jtAiHO569+869WOvPyuA
z+oBhBhWB5w7e41qmQnLr6y3+hU19hGuayxkR61tqmZCPp6kpwRH2yN3IbhId2In
8DLoR5z6077jxPeXR6o3goB6Y9LbrPoYFUwdfQTpzrF8AvQ2wDl/CRuM2n9wB/ez
Oclnz0bw4JNegtarEJeiu4G1Qqf7WCqhVv4a8GfWYtr0ISk8GBBcCRjYZcoyU5IU
hSnNBGn586AhZ3BK5t1ySwrC6RiKH6MIR8fdBOSw1eZnTycPBSK6avZ4E8ahQDIp
uA93W+cME58gtmzdl2q7iLjRbwGdgebw++yfR4G42Tb4rUYsmOzsCPGx/nIqxB5E
FBea8xGwb802rFpYMxgMp8SzRy078RrDx2aptNfrb5oP9YeQ/pGrX9tVVtTlxNTk
8DKA8GHL4fiONAJB48iD2sTSv4jAhFInHnF4ykl0zjN7t3f0phMmSExeoH7HbFUI
G589M4KAs5X00xCSFt9gXdU+tpuFL+/x6kBAGrNmT5IySIvm+BfxTXjvg2daAjcC
+FAocYeosZumP5g2tICv
=1Si4
-----END PGP SIGNATURE-----


Re: Help with Tomcat 7 clustering using BIO receiver

Posted by Filip Hanik <fi...@hanik.com>.

Ok, at least the stack trace is clear. The session has been invalidated
somehow.
We would need to figure out when and how this happens. Is it possible that
you are doing a clean shutdown of a Tomcat instance and that instance
expires all the sessions? If that is the case, kill Tomcat with 'kill
-9' to simulate a failure. There is a flag called
'expireSessionsOnShutdown'; do you have that set?
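
For orientation, a sketch of where that flag is set: expireSessionsOnShutdown is an attribute of the cluster <Manager> element. The fragment assumes the DeltaManager seen in the logs earlier in the thread. The values shown are the ones used in the stock Tomcat 7 clustering example, not values confirmed from the poster's configuration.

```xml
<!-- false: a clean shutdown of one node should not expire its sessions on the other nodes -->
<Manager className="org.apache.catalina.ha.session.DeltaManager"
         expireSessionsOnShutdown="false"
         notifyListenersOnReplication="true"/>
```

If the attribute were set to true, stopping one node cleanly during a test run could invalidate sessions cluster-wide, which would be one way to end up with the "already been invalidated" error.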





On Thu, Jul 3, 2014 at 2:05 PM, João Sávio <jo...@gmail.com> wrote:

> Hello Mark
>
> In fact, I'm not explicitly invalidating the session in these two requests.
>
> Regards
> João
>

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by João Sávio <jo...@gmail.com>.

Hello Mark

In fact, I'm not explicitly invalidating the session in these two requests.

Regards
João

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by Mark Eggers <it...@yahoo.com.INVALID>.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Filip,

On 7/3/2014 12:11 PM, Filip Hanik wrote:
> Are your machines in time sync? If they are not, a session can
> get timed out.
>
> SEVERE: Manager [localhost#/myApp]: Unable to receive message through TCP channel
> java.lang.IllegalStateException: setAttribute: Session [DEC3612CF763194E7953DB3FD2C433E0] has already been invalidated
>         at org.apache.catalina.session.StandardSession.setAttribute(StandardSession.java:1437)
>         at org.apache.catalina.ha.session.DeltaSession.setAttribute(DeltaSession.java:695)
>         at org.apache.catalina.ha.session.DeltaRequest.execute(DeltaRequest.java:168)
>         at org.apache.catalina.ha.session.DeltaManager.handleSESSION_DELTA(DeltaManager.java:1337)
>         at org.apache.catalina.ha.session.DeltaManager.messageReceived(DeltaManager.java:1283)
>         at org.apache.catalina.ha.session.DeltaManager.messageDataReceived(DeltaManager.java:1001)
>         at org.apache.catalina.ha.session.ClusterSessionListener.messageReceived(ClusterSessionListener.java:91)
>         at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:943)
>         at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:924)
>         at org.apache.catalina.tribes.group.GroupChannel.messageReceived(GroupChannel.java:278)
>

I think he's running the Tomcat cluster on the same physical machine
for testing.

So is the web application invalidating the session before the
attribute is replicated across the cluster?

/mde/

> 
> 
> On Thu, Jul 3, 2014 at 1:07 PM, João Sávio <jo...@gmail.com>
> wrote:
> 
>> Hi everyone
>> 
>> I ran my test (total of 1k requests, total of 100 threads)
>> against two nodes with default VM settings. I've just set heap
>> size. I had about 15% of errors.
>> 
>> cluster.log - node1 - http://pastebin.com/cpX900Qw
>> cluster.log - node2 - http://pastebin.com/qCSzMaU6
>>
>> Running for a long time (total of 500k requests, total of 100
>> threads) I had about 11% of errors. In this case we can see the
>> statistics:
>>
>> Jul 03, 2014 5:53:28 PM org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor report
>> INFO: ThroughputInterceptor Report[
>> Tx Msg:10000 messages
>> Sent:12.36 MB (total)
>> Sent:12.36 MB (application)
>> Time:7.82 seconds
>> Tx Speed:1.58 MB/sec (total)
>> TxSpeed:1.58 MB/sec (application)
>> Error Msg:0
>> Rx Msg:10198 messages
>> Rx Speed:0.08 MB/sec (since 1st msg)
>> Received:12.38 MB]
>> 
>> 
>> All session attributes are Serializable, and it's a session
>> replication issue because if I ran my test with just one node, I
>> had 0% of errors.
>> 
>> Regarding "on time", just a correction:
>> 
>> 1. first request, pick a random server and store a session object
>> 2. second request, pick *ANY* server (chosen by the LB based on
>> the cookie - it can be the same, but not necessarily) and ask for
>> the session object.
>>
>> To be clearer, I've been working with a conference system.
>> Each conference should occur in one node. So, the first request
>> can hit any server, and from the second request onward should hit
>> the node where the conference is.
>>
>>
>> Thanks a lot
>> João
>> 
>> 
>> 2014-07-03 15:40 GMT-03:00 Filip Hanik <fi...@hanik.com>:
>> 
>>> you mention NIO and say maxThreads, that sounds like the
>>> <Connector> configuration, but the BIO receiver is on the
>>> cluster, and it is a completely different component that
>>> also has an applicable NIO configuration.
>>> 
>>> are you confusing the two? I'm saying that you should use the
>>> NIO receiver on the cluster component, and if you do, what kind
>>> of errors do you get?
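
To make the distinction above concrete, here is a sketch of the two places where a maxThreads attribute can appear in server.xml. The port numbers, the protocol class and the thread counts are assumptions for illustration (the value 400 echoes the figure mentioned in this thread), and unrelated elements are omitted.

```xml
<Service name="Catalina">
  <!-- HTTP connector: this maxThreads sizes the pool that serves web requests -->
  <Connector port="8080"
             protocol="org.apache.coyote.http11.Http11NioProtocol"
             maxThreads="400"
             connectionTimeout="20000"/>

  <Engine name="Catalina" defaultHost="localhost">
    <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster">
      <Channel className="org.apache.catalina.tribes.group.GroupChannel">
        <!-- Cluster receiver: a separate pool that handles replication messages only -->
        <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                  address="auto"
                  port="4000"
                  autoBind="100"
                  selectorTimeout="5000"
                  maxThreads="6"/>
      </Channel>
    </Cluster>
    <!-- Host elements omitted -->
  </Engine>
</Service>
```

Changing the connector's maxThreads leaves the receiver's pool untouched, and switching the connector between BIO and NIO does not change which receiver class the cluster uses.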
>>> 
>>> 
>>> On Thu, Jul 3, 2014 at 12:19 PM, Mark Eggers <its_toasted@yahoo.com.invalid> wrote:
>>> 
> João,
> 
> This list has a convention of posting either inline or at the end
> of the message you're replying to.
> 
> See here for mailing list notes:
> 
> http://tomcat.apache.org/lists.html#tomcat-users
> 
> On 7/3/2014 10:24 AM, João Sávio wrote:
>>>>>> Hello
>>>>>> 
>>>>>> Some points below:
>>>>>> 
>>>>>> ** What is "on time"?* In my application, a group of
>>>>>> users should always hit the same node after the first
>>>>>> request. So, in the first request each group of users will
>>>>>> receive a specific cookie, and the LB will perform the load
>>>>>> balancing based on this cookie. In the first request, a user
>>>>>> can hit any node, but from the second onward, he or she should
>>>>>> hit the same node.
> 
> Hmm, so 'on time' really means that subsequent requests should hit
> the same server.
> 
> If you're using sessions, Tomcat has an attribute on the Engine 
> element called jvmRoute. So depending on your load balancer (and
> if you use AJP), you can use Tomcat and AJP to route traffic. In
> that case, there's no need to write a special cookie.
> 
> At any rate, this doesn't sound like a clustering error per se.
> 
>>>>>> 
>>>>>> ** What are the errors? Test result errors?* For this
>>>>>> test, I simplified the code of my application:
>>>>>> - first request: store one object in session
>>>>>> - second request: verify if the object is in session. If it's not -> ERROR
>>>>>> 
> 
> So looking at the information from 'on time', the scenario should
> be:
> 
> 1. first request, pick a random server and store a session object 
> 2. second request, pick the SAME server and ask for the session
> object
> 
> Again, I'm not seeing where this is a clustering issue per se.
> 
>>>>>> ** How big are the sessions that you're trying to
>>>>>> replicate?* - I'm using Spring MVC, and I have 3
>>>>>> additional objects in session. They are not big (15
>>>>>> attributes each)
>>>>>> 
> 
> And all attributes are serializable? The objects are also marked
> as serializable?
> 
>>>>>> ** What's the load like on the box when you're running
>>>>>> the tests that you get errors on?* - I've been experiencing
>>>>>> this issue on BIO even without load
>>>>>> 
> 
> I may have not phrased my question carefully. What is the CPU and 
> memory situation on your test box while running the 4 Tomcat
> servers?
> 
> I know you've trimmed down your Xms and Xmx (presumably to fit in
> your test environment), but in combination with your other JVM
> parameters could this be causing some issues?
> 
> I would follow Dan's recommendation of maybe just setting Xms, Xmx,
> GC logging to see what happens. Ah, I see you're going to do that
> below.
> 
>>>>>> ** It is preferred to use the non blocking receiver to be
>>>>>> able to grow your cluster without running into thread
>>>>>> starvation.* - That's why I've tried NIO first, but I'd
>>>>>> like to see if BIO solves my issue and whether my system
>>>>>> gets too slow when using BIO.
> 
> I don't think speed is so much an issue here, but scalability is.
> NIO can handle multiple requests per thread, BIO cannot.
> 
>>>>>> 
>>>>>> 
>>>>>> Now, I'll try to run my tests using NIO, default VM
>>>>>> configuration and FINER logs.
> 
> Post the results when you get them. If the logs are relatively
> small, just cut and paste into the mail message.
> 
> I suspect FINER is going to generate LOTS of logging and slow down 
> your application.
> 
>>>>>> 
>>>>>> Thanks a lot João
> 
> . . . . just my two cents /mde/
> 
>>>>>> 
>>>>>> 
>>>>>> 2014-07-03 14:07 GMT-03:00 Mark Eggers 
>>>>>> <it...@yahoo.com.invalid>:
>>>>>> 
>>>>>> On 7/3/2014 9:12 AM, João Sávio wrote:
>>>>>>>>> cluster.log -> http://pastebin.com/c98WhnmG
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 2014-07-03 13:04 GMT-03:00 João Sávio
>>>>>>>>> <jo...@gmail.com>:
>>>>>>>>> 
>>>>>>>>>> Hello!
>>>>>>>>>> 
>>>>>>>>>> Using NIO (with channelSendOptions="4", i.e.,
>>>>>>>>>> synchronous), under light load, my tests pass
>>>>>>>>>> 100%. But under heavy load, not all sessions are
>>>>>>>>>> replicated on time, and I have about 20% of
>>>>>>>>>> errors. If I increase maxThreads to 400, I have
>>>>>>>>>> about 15% of errors.
>>>>>>>>>>
>>>>>>>>>> More information:
>>>>>>>>>> * I am not performing parallel requests with same session
>>>>>>>>>> * my cluster has 4 nodes (all in one machine - for test purpose only)
>>>>>>>>>> * Java 7 64 bits, Tomcat 7.0.52, windows 7 64 bits
>>>>>>>>>> * using default NIO configuration, but with maxThreads=400
>>>>>>>>>> * VM options:
>>>>>>>>>> -Xms512M   - on real environment this value is 1024
>>>>>>>>>> -Xmx512M  - on real environment this value is 1024
>>>>>>>>>> -XX:NewSize=450M
>>>>>>>>>> -XX:MaxNewSize=450M
>>>>>>>>>> -XX:PermSize=128M
>>>>>>>>>> -XX:MaxPermSize=245M
>>>>>>>>>> -XX:SurvivorRatio=8
>>>>>>>>>> -XX:TargetSurvivorRatio=90
>>>>>>>>>> -XX:MaxTenuringThreshold=15
>>>>>>>>>> -XX:+UseBiasedLocking
>>>>>>>>>> -XX:CMSInitiatingOccupancyFraction=60
>>>>>>>>>> -XX:+UseCMSInitiatingOccupancyOnly
>>>>>>>>>> -XX:+CMSClassUnloadingEnabled
>>>>>>>>>> -XX:+UseConcMarkSweepGC
>>>>>>>>>> -XX:+CMSIncrementalMode
>>>>>>>>>> -XX:+UseParNewGC
>>>>>>>>>> -XX:+DisableExplicitGC
>>>>>>>>>> -XX:+PrintGCDateStamps
>>>>>>>>>> -XX:+PrintGCDetails
>>>>>>>>>> -Xloggc:%CATALINA_BASE%/logs/tomcat-gc.log
>>>>>>>>>> 
>>>>>>>>>> Moreover, I'm trying to attach the logs again.
>>>>>>>>>> 
>>>>>>>>>> Thanks João
>>>>>> 
>>>>>> João,
>>>>>> 
>>>>>> I took a look at the log. This is the BIO attempt and you
>>>>>> do run out of threads. See the following:
>>>>>> 
>>>>>> Jul 03, 2014 11:41:21 AM org.apache.catalina.tribes.transport.bio.BioReceiver listen
>>>>>> WARNING: All BIO server replication threads are busy,
>>>>>> unable to handle more requests until a thread is freed up.
>>>>>> 
>>>>>> What's the load like on the box when you're running the
>>>>>> tests that you get errors on?
>>>>>> 
>>>>>> As Dan asks in his message:
>>>>>> 
>>>>>> What is "on time"? What are the errors? Test result
>>>>>> errors?
>>>>>> 
>>>>>> How big are the sessions that you're trying to
>>>>>> replicate?
>>>>>> 
>>>>>> My guess is that something else is going on, since the
>>>>>> following log entry doesn't show much in the way of
>>>>>> cluster traffic.
>>>>>> 
>>>>>> INFO: ThroughputInterceptor Report[
>>>>>> Tx Msg:1 messages
>>>>>> Sent:0.00 MB (total)
>>>>>> Sent:0.00 MB (application)
>>>>>> Time:0.01 seconds
>>>>>> Tx Speed:0.04 MB/sec (total)
>>>>>> TxSpeed:0.04 MB/sec (application)
>>>>>> Error Msg:0
>>>>>> Rx Msg:13 messages
>>>>>> Rx Speed:0.00 MB/sec (since 1st msg)
>>>>>> Received:0.00 MB]
>>>>>> 
>>>>>> It would also be interesting to see the logs when you use
>>>>>> the NIO connector. According to the documentation:
>>>>>> 
>>>>>> It is preferred to use the non blocking receiver to be
>>>>>> able to grow your cluster without running into thread
>>>>>> starvation.
>>>>>> 
>>>>>> Also from the documentation:
>>>>>> 
>>>>>> Usually the rule is to use 1 thread per node in the
>>>>>> cluster for small clusters, and then depending on your
>>>>>> message frequency and your hardware, you'll find an
>>>>>> optimal number of threads peak out at a certain number.
>>>>>> 
>>>>>> We might need a little more background on your
>>>>>> application and your test environment to figure out why
>>>>>> clustering is not behaving for you.
>>>>>> 
>>>>>> . . . just my two cents /mde/
>>>>>> 
>>>>>> PS - you have some errors in your server.xml (see the
>>>>>> log). While they won't impact this problem, it might be a
>>>>>> good idea to address them.
>>>>>> 
>>>>>> /mde/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJTtbGoAAoJEEFGbsYNeTwtOMoH/1WP4Le5CRiJvB3VwUSYuh/P
GciCZcYs8KaV/0Ff7kpy4pJNJWe5HOATNkY6y8QabQleAqMxooarOxAwP4+DPalw
kiMYGw0ad9a6NlxyABTpN2547Lc5L906s6O7ZwT4+qPCtGFYbmu9fKq8qK/XoPgW
5MvTLc9JAGsZtlfSLmkyi8F4NiDR0syqIWZlTb+pIOA8AF+LxFOlfqqZE6d6DeSy
I7pHqmv/BHjk3Jl3Pu92KMBMu13yclCBMHO5rlquhCtHZ+fAVh1wh92sMEEv79Ow
xnUihnxTEpAhIX9jC+MsO10vJXXqDqHD732YUz3l0gTDk9aGeWeDKOwa/B4aA1g=
=TfQq
-----END PGP SIGNATURE-----


Re: Help with Tomcat 7 clustering using BIO receiver

Posted by João Sávio <jo...@gmail.com>.

Hello Filip

The nodes are in the same machine!

Regards
João

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by Filip Hanik <fi...@hanik.com>.

Are your machines in time sync? If they are not, a session can get
timed out.

SEVERE: Manager [localhost#/myApp]: Unable to receive message through TCP channel
java.lang.IllegalStateException: setAttribute: Session [DEC3612CF763194E7953DB3FD2C433E0] has already been invalidated
        at org.apache.catalina.session.StandardSession.setAttribute(StandardSession.java:1437)
        at org.apache.catalina.ha.session.DeltaSession.setAttribute(DeltaSession.java:695)
        at org.apache.catalina.ha.session.DeltaRequest.execute(DeltaRequest.java:168)
        at org.apache.catalina.ha.session.DeltaManager.handleSESSION_DELTA(DeltaManager.java:1337)
        at org.apache.catalina.ha.session.DeltaManager.messageReceived(DeltaManager.java:1283)
        at org.apache.catalina.ha.session.DeltaManager.messageDataReceived(DeltaManager.java:1001)
        at org.apache.catalina.ha.session.ClusterSessionListener.messageReceived(ClusterSessionListener.java:91)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:943)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:924)
        at org.apache.catalina.tribes.group.GroupChannel.messageReceived(GroupChannel.java:278)




On Thu, Jul 3, 2014 at 1:07 PM, João Sávio <jo...@gmail.com> wrote:

> Hi everyone
>
> I ran my test (total of 1k requests, total of 100 threads) against two
> nodes with default VM settings. I've just set heap size. I had about 15% of
> errors.
>
> cluster.log - node1 - http://pastebin.com/cpX900Qw
> cluster.log - node2 - http://pastebin.com/qCSzMaU6
>
> Running for a long time (total of 500k requests, total of 100 threads) I
> had about 11% of errors. In this case we can see the statistics:
>
> Jul 03, 2014 5:53:28 PM
> org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor report
> INFO: ThroughputInterceptor Report[
> Tx Msg:10000 messages
> Sent:12.36 MB (total)
>  Sent:12.36 MB (application)
> Time:7.82 seconds
> Tx Speed:1.58 MB/sec (total)
>  TxSpeed:1.58 MB/sec (application)
> Error Msg:0
> Rx Msg:10198 messages
>  Rx Speed:0.08 MB/sec (since 1st msg)
> Received:12.38 MB]
>
>
> All session attributes are Serializable, and it's a session replication
> issue because if I ran my test with just one node, I had 0% of errors.
>
> Regarding "on time", just a correction:
>
> 1. first request, pick a random server and store a session object
> 2. second request, pick *ANY* server (chosen by the LB based on the cookie - it
> can be the same, but not necessarily) and ask for the session object.
>
> To be clearer, I've been working with a conference system. Each
> conference should occur in one node. So, the first request can hit any
> server, and from the second request onward should hit the node where the
> conference is.
>
>
> Thanks a lot
> João
>
>
> 2014-07-03 15:40 GMT-03:00 Filip Hanik <fi...@hanik.com>:
>
> > you mention NIO and say maxThreads, that sounds like the <Connector>
> > configuration, but the BIO receiver is on the cluster, and it is a
> > completely different component that also has an applicable NIO
> > configuration.
> >
> > are you confusing the two?
> > I'm saying that you should use the NIO receiver on the cluster component,
> > and if you do, what kind of errors do you get?
> >
> >
> > On Thu, Jul 3, 2014 at 12:19 PM, Mark Eggers <its_toasted@yahoo.com.invalid> wrote:
> >
> > > -----BEGIN PGP SIGNED MESSAGE-----
> > > Hash: SHA1
> > >
> > > João,
> > >
> > > This list has a convention of posting either inline or at the end of
> > > the message you're replying to.
> > >
> > > See here for mailing list notes:
> > >
> > > http://tomcat.apache.org/lists.html#tomcat-users
> > >
> > > On 7/3/2014 10:24 AM, João Sávio wrote:
> > > > Hello
> > > >
> > > > Some points below:
> > > >
> > > > ** What is "on time"?* In my application, a group of users should
> > > > always hit the same node after the first request. So, in the first
> > > > request each group of users will receive a specific cookie, and the
> > > > LB will perform the load balancing based on this cookie. In the first
> > > > request, a user can hit any node, but from the second onward, he or she
> > > > should hit the same node.
> > >
> > > Hmm, so 'on time' really means that subsequent requests should hit the
> > > same server.
> > >
> > > If you're using sessions, Tomcat has an attribute on the Engine
> > > element called jvmRoute. So depending on your load balancer (and if
> > > you use AJP), you can use Tomcat and AJP to route traffic. In that
> > > case, there's no need to write a special cookie.
> > >
> > > At any rate, this doesn't sound like a clustering error per se.
> > >
> > > >
> > > > ** What are the errors? Test result errors?* For this test, I
> > > > simplified the code of my application: - first request: store one
> > > > object in session - second request: verify if the object is in
> > > > session. If it's not -> ERROR
> > > >
> > >
> > > So looking at the information from 'on time', the scenario should be:
> > >
> > > 1. first request, pick a random server and store a session object
> > > 2. second request, pick the SAME server and ask for the session object
> > >
> > > Again, I'm not seeing where this is a clustering issue per se.
> > >
> > > > ** How big are are the sessions that you're trying to replicate?*
> > > > - I'm using Spring MVC, and I have 3 additional objects in
> > > > session. They are not big (15 attributes each one)
> > > >
> > >
> > > And all attributes are serializable? The objects are also marked as
> > > serializable?
> > >
> > > > ** What's the load like on the box when you're running the tests
> > > > that you get errors on?* - I've experiencing this issue on BIO
> > > > even without load
> > > >
> > >
> > > I may have not phrased my question carefully. What is the CPU and
> > > memory situation on your test box while running the 4 Tomcat servers?
> > >
> > > I know you've trimmed down your Xms and Xmx (presumably to fit in your
> > > test environment), but in combination with your other JVM parameters
> > > could this be causing some issues?
> > >
> > > I would follow Dan's recommendation of maybe just setting Xms, Xmx, GC
> > > logging to see what happens. Ah, I see you're going to do that below.
> > >
> > > > ** It is preferred to use the non blocking receiver to be able to
> > > > grow your cluster without running into thread starvation.* -
> > > > That's why I've tried NIO first, but I'd like to see if BIO solve
> > > > my issue and if using BIO my system doesn't get too slow.
> > >
> > > I don't think speed is so much an issue here, but scalability is. NIO
> > > can handle multiple requests per thread, BIO cannot.
> > >
> > > >
> > > >
> > > > Now, I'll try to run my tests using NIO, default VM configuration
> > > > and FINER logs.
> > >
> > > Post the results when you get them. If the logs are relatively small,
> > > just cut and paste into the mail message.
> > >
> > > I suspect FINER is going to generate LOTS of logging and slow down
> > > your application.
> > >
> > > >
> > > > Thanks a lot João
> > >
> > > . . . . just my two cents
> > > /mde/
> > >
> > > >
> > > >
> > > > 2014-07-03 14:07 GMT-03:00 Mark Eggers
> > > > <it...@yahoo.com.invalid>:
> > > >
> > > > On 7/3/2014 9:12 AM, João Sávio wrote:
> > > >>>> cluster.log -> http://pastebin.com/c98WhnmG
> > > >>>>
> > > >>>>
> > > >>>> 2014-07-03 13:04 GMT-03:00 João Sávio <jo...@gmail.com>:
> > > >>>>
> > > >>>>> Hello!
> > > >>>>>
> > > >>>>> Using NIO (with channelSendOptions="4", i.e.,
> > > >>>>> synchronous), with lightly load, my tests pass 100%. But,
> > > >>>>> on heavy load, not all sessions are replicated on time, and
> > > >>>>> I have about 20% of errors. If I increase maxThreads to
> > > >>>>> 400, I have about 15% of errors.
> > > >>>>>
> > > >>>>> More information: * I am not performing parallel requests
> > > >>>>> with same session * my cluster has 4 nodes (all in one
> > > >>>>> machine - for test purpose only) * Java 7 64 bits, Tomcat
> > > >>>>> 7.0.52, windows 7 64 bits * using default NIO
> > > >>>>> configuration, but with maxThreads=400 * VM options:
> > > >>>>> -Xms512M   - on real environment this value is 1024
> > > >>>>> -Xmx512M  - on real environment this value is 1024
> > > >>>>> -XX:NewSize=450M -XX:MaxNewSize=450M -XX:PermSize=128M
> > > >>>>> -XX:MaxPermSize=245M -XX:SurvivorRatio=8
> > > >>>>> -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=15
> > > >>>>> -XX:+UseBiasedLocking -XX:CMSInitiatingOccupancyFraction=60
> > > >>>>>  -XX:+UseCMSInitiatingOccupancyOnly
> > > >>>>> -XX:+CMSClassUnloadingEnabled -XX:+UseConcMarkSweepGC
> > > >>>>> -XX:+CMSIncrementalMode -XX:+UseParNewGC
> > > >>>>> -XX:+DisableExplicitGC -XX:+PrintGCDateStamps
> > > >>>>> -XX:+PrintGCDetails
> > > >>>>> -Xloggc:%CATALINA_BASE%/logs/tomcat-gc.log
> > > >>>>>
> > > >>>>> Moreover, I'm trying to attach the logs again.
> > > >>>>>
> > > >>>>> Thanks João
> > > >
> > > > João,
> > > >
> > > > I took a look at the log. This is the BIO attempt and you do run
> > > > out of threads. See the following:
> > > >
> > > > Jul 03, 2014 11:41:21 AM
> > > > org.apache.catalina.tribes.transport.bio.BioReceiver listen
> > > > WARNING: All BIO server replication threads are busy, unable to
> > > > handle more requests until a thread is freed up.
> > > >
> > > > What's the load like on the box when you're running the tests that
> > > > you get errors on?
> > > >
> > > > As Dan asks in his message:
> > > >
> > > > What is "on time"? What are the errors? Test result errors?
> > > >
> > > > How big are are the sessions that you're trying to replicate?
> > > >
> > > > My guess is that something else is going on, since the following
> > > > log entry doesn't show much in the way of cluster traffic.
> > > >
> > > > INFO: ThroughputInterceptor Report[ Tx Msg:1 messages Sent:0.00 MB
> > > > (total) Sent:0.00 MB (application) Time:0.01 seconds Tx Speed:0.04
> > > > MB/sec (total) TxSpeed:0.04 MB/sec (application) Error Msg:0 Rx
> > > > Msg:13 messages Rx Speed:0.00 MB/sec (since 1st msg) Received:0.00
> > > > MB]
> > > >
> > > > It would also be interesting to see the logs when you use the NIO
> > > > connector. According to the documentation:
> > > >
> > > > It is preferred to use the non blocking receiver to be able to grow
> > > > your cluster without running into thread starvation.
> > > >
> > > > Also from the documentation:
> > > >
> > > > Usually the rule is to use 1 thread per node in the cluster for
> > > > small clusters, and then depending on your message frequency and
> > > > your hardware, you'll find an optimal number of threads peak out
> > > > at a certain number.
> > > >
> > > > We might need a little more background on your application and your
> > > > test environment to figure out why clustering is not behaving for
> > > > you.
> > > >
> > > > . . . just my two cents /mde/
> > > >
> > > > PS - you have some errors in your server.xml (see the log). While
> > > > they won't impact this problem, it might be a good idea to address
> > > > them.
> > > >
> > > > /mde/
> > > >>
> > > >>
> ---------------------------------------------------------------------
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> > > >> For additional commands, e-mail: users-help@tomcat.apache.org
> > > >>
> > > >>
> > > >
> > > >
> > >
> > > -----BEGIN PGP SIGNATURE-----
> > > Version: GnuPG v1.4.13 (MingW32)
> > > Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
> > >
> > > iQEcBAEBAgAGBQJTtZ6rAAoJEEFGbsYNeTwtAPsH/jqUHnP5Wag9fRLUYQD582/O
> > > 7FRoBv+Iq0/VWs2o9Wv0VrAOOazUhtAG38JgCX+v70u2MJNOIcVVpXCuOjZeSJYB
> > > WRkNIXqRCrVDc3/ZX3nTQoXJheZEfrdvB5cikoARPmBJeb4kOpnxKSs97OSJjHYU
> > > uCCoXocVfDM3JxtEHXNHyy6BuIYdizvH7DwGSts7shggT/LmKmxA16AzChppwSr4
> > > 87p7jCJyxxPJ9MeRNP4uDQpV+Z/1DDhMxzUc9P8VJuSykJ1YUdQOm24AuGezsYyx
> > > ZQrLkioRnxDcOwpKSoI1o0r/2NgS97YR4GZbU6npzD1DjvPjZm4zimbNKM+l0iE=
> > > =ftU9
> > > -----END PGP SIGNATURE-----
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> > > For additional commands, e-mail: users-help@tomcat.apache.org
> > >
> > >
> >
>
>
>
> --
> http://joaosavio.wordpress.com
>

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by João Sávio <jo...@gmail.com>.

Hi everyone

I ran my test (1k requests in total, 100 threads) against two nodes with
default VM settings; I only set the heap size. About 15% of the requests
failed.

cluster.log - node1 - http://pastebin.com/cpX900Qw
cluster.log - node2 - http://pastebin.com/qCSzMaU6

Running for a longer time (500k requests in total, 100 threads), about 11%
of the requests failed. In this case we can see the statistics:

Jul 03, 2014 5:53:28 PM
org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor report
INFO: ThroughputInterceptor Report[
Tx Msg:10000 messages
Sent:12.36 MB (total)
 Sent:12.36 MB (application)
Time:7.82 seconds
Tx Speed:1.58 MB/sec (total)
 TxSpeed:1.58 MB/sec (application)
Error Msg:0
Rx Msg:10198 messages
 Rx Speed:0.08 MB/sec (since 1st msg)
Received:12.38 MB]


All session attributes are Serializable, and I believe this is a session
replication issue, because when I run the test against just one node I get
0% errors.

Regarding "on time", just a correction:

1. first request, pick a random server and store a session object
2. second request, pick *ANY* server (chosen by the LB based on the cookie -
it can be the same one, but not necessarily) and ask for the session object.

To be clearer, I've been working on a conference system. Each conference
should take place on one node. So the first request can hit any server, and
from the second request on, requests should hit the node where the
conference is.


Thanks a lot
João





-- 
http://joaosavio.wordpress.com

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by Filip Hanik <fi...@hanik.com>.

you mention NIO and say maxThreads, that sounds like the <Connector>
configuration, but the BIO receiver is on the cluster, and it is a completely
different component that also has an applicable NIO configuration.

are you confusing the two?
I'm saying that you should use the NIO receiver on the cluster component,
and if you do, what kind of errors do you get?
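
To make the distinction concrete, here is a minimal sketch of where each
setting lives in conf/server.xml. It assumes the stock Tomcat 7 cluster
layout; the ports and thread counts are illustrative placeholders, not values
taken from the poster's configuration, and the Membership, Sender and
Interceptor elements are left out for brevity.

```xml
<!-- Fragment of conf/server.xml; ports and thread counts are illustrative assumptions -->
<Service name="Catalina">
  <!-- HTTP connector: maxThreads here only limits request-processing threads -->
  <Connector port="8080"
             protocol="org.apache.coyote.http11.Http11NioProtocol"
             maxThreads="400" />

  <Engine name="Catalina" defaultHost="localhost">
    <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
             channelSendOptions="4">
      <Channel className="org.apache.catalina.tribes.group.GroupChannel">
        <!-- Cluster receiver: a separate thread pool used for replication messages -->
        <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                  address="auto" port="4000" autoBind="100"
                  selectorTimeout="5000" maxThreads="6" />
      </Channel>
    </Cluster>
    <!-- Host elements omitted -->
  </Engine>
</Service>
```

Raising maxThreads on the Connector does not change the Receiver's pool, so
the two have to be tuned separately.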



Re: Help with Tomcat 7 clustering using BIO receiver

Posted by Mark Eggers <it...@yahoo.com.INVALID>.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

João,

This list has a convention of posting either inline or at the end of
the message you're replying to.

See here for mailing list notes:

http://tomcat.apache.org/lists.html#tomcat-users

On 7/3/2014 10:24 AM, João Sávio wrote:
> Hello
> 
> Some points below:
> 
> ** What is "on time"?* In my application, a group of users should 
> always hit the same node after the first request. So, in first 
> request each group of users will receive an specific cookie, and
> LB will perform the load balancing based on this cookie. In first 
> request, a user can hit any node, but from the second, he or she 
> should hit the same node.

Hmm, so 'on time' really means that subsequent requests should hit the
same server.

If you're using sessions, Tomcat has an attribute on the Engine
element called jvmRoute. So depending on your load balancer (and if
you use AJP), you can use Tomcat and AJP to route traffic. In that
case, there's no need to write a special cookie.
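
As a rough sketch of that setup (the route name node1 is a made-up
placeholder; each node would need its own distinct value), jvmRoute is set on
the Engine element in conf/server.xml:

```xml
<!-- conf/server.xml on the first node; "node1" is a hypothetical route name -->
<Engine name="Catalina" defaultHost="localhost" jvmRoute="node1">
  <!-- Cluster and Host elements omitted -->
</Engine>
```

Tomcat then appends the route to the session ID (for example, an ID ending in
.node1), which a sticky-session-aware balancer such as mod_jk or
mod_proxy_balancer can use to send later requests to the same node.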

At any rate, this doesn't sound like a clustering error per se.

> 
> ** What are the errors? Test result errors?* For this test, I 
> simplified the code of my application: - first request: store one 
> object in session - second request: verify if the object is in 
> session. If it's not -> ERROR
> 

So looking at the information from 'on time', the scenario should be:

1. first request, pick a random server and store a session object
2. second request, pick the SAME server and ask for the session object

Again, I'm not seeing where this is a clustering issue per se.

> ** How big are are the sessions that you're trying to replicate?*
> - I'm using Spring MVC, and I have 3 additional objects in
> session. They are not big (15 attributes each one)
> 

And all attributes are serializable? The objects are also marked as
serializable?

> ** What's the load like on the box when you're running the tests 
> that you get errors on?* - I've experiencing this issue on BIO
> even without load
> 

I may not have phrased my question carefully. What is the CPU and
memory situation on your test box while running the 4 Tomcat servers?

I know you've trimmed down your Xms and Xmx (presumably to fit in your
test environment), but in combination with your other JVM parameters
could this be causing some issues?

I would follow Dan's recommendation of maybe just setting Xms, Xmx, GC
logging to see what happens. Ah, I see you're going to do that below.

> ** It is preferred to use the non blocking receiver to be able to 
> grow your cluster without running into thread starvation.* -
> That's why I've tried NIO first, but I'd like to see if BIO solve
> my issue and if using BIO my system doesn't get too slow.

I don't think speed is so much an issue here, but scalability is. NIO
can handle multiple requests per thread, BIO cannot.

> 
> 
> Now, I'll try to run my tests using NIO, default VM configuration 
> and FINER logs.

Post the results when you get them. If the logs are relatively small,
just cut and paste into the mail message.

I suspect FINER is going to generate LOTS of logging and slow down
your application.

> 
> Thanks a lot João

. . . . just my two cents
/mde/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJTtZ6rAAoJEEFGbsYNeTwtAPsH/jqUHnP5Wag9fRLUYQD582/O
7FRoBv+Iq0/VWs2o9Wv0VrAOOazUhtAG38JgCX+v70u2MJNOIcVVpXCuOjZeSJYB
WRkNIXqRCrVDc3/ZX3nTQoXJheZEfrdvB5cikoARPmBJeb4kOpnxKSs97OSJjHYU
uCCoXocVfDM3JxtEHXNHyy6BuIYdizvH7DwGSts7shggT/LmKmxA16AzChppwSr4
87p7jCJyxxPJ9MeRNP4uDQpV+Z/1DDhMxzUc9P8VJuSykJ1YUdQOm24AuGezsYyx
ZQrLkioRnxDcOwpKSoI1o0r/2NgS97YR4GZbU6npzD1DjvPjZm4zimbNKM+l0iE=
=ftU9
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by João Sávio <jo...@gmail.com>.

Hello

Some points below:

** What is "on time"?*
In my application, a group of users should always hit the same node after
the first request. So, on the first request, each group of users receives a
specific cookie, and the LB performs the load balancing based on this
cookie. On the first request a user can hit any node, but from the second
request on, he or she should hit the same node.

** What are the errors? Test result errors?*
For this test, I simplified the code of my application:
- first request: store one object in session
- second request: verify if the object is in session. If it's not -> ERROR

** How big are the sessions that you're trying to replicate?*
- I'm using Spring MVC, and I have 3 additional objects in session. They
are not big (15 attributes each one)
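
As a side note on the session contents: Tomcat's clustering how-to lists two
prerequisites for session attributes to replicate at all. Every attribute must
implement java.io.Serializable, and the web application must be marked
distributable. A minimal web.xml sketch follows, assuming a Servlet 3.0
descriptor; the descriptor version and everything except <distributable/> are
assumptions, not taken from the application discussed here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical minimal descriptor; only <distributable/> matters here. -->
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
                             http://java.sun.com/xml/ns/javaee/web-app_3_0.xsd"
         version="3.0">
  <!-- Tells the container that sessions of this application may be replicated. -->
  <distributable/>
</web-app>
```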

** What's the load like on the box when you're running the tests that
you get errors on?*
- I've been experiencing this issue with BIO even without load

** It is preferred to use the non blocking receiver to be able to grow your
cluster without running into thread starvation.*
- That's why I tried NIO first, but I'd like to see whether BIO solves my
issue and whether my system gets too slow when using BIO.


Now, I'll try to run my tests using NIO, default VM configuration and FINER
logs.

Thanks a lot
João



-- 
http://joaosavio.wordpress.com

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by Mark Eggers <it...@yahoo.com.INVALID>.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

João,

I took a look at the log. This is the BIO attempt and you do run out
of threads. See the following:

Jul 03, 2014 11:41:21 AM
org.apache.catalina.tribes.transport.bio.BioReceiver listen
WARNING: All BIO server replication threads are busy, unable to handle
more requests until a thread is freed up.

What's the load like on the box when you're running the tests that you
get errors on?

As Dan asks in his message:

What is "on time"?
What are the errors? Test result errors?

How big are the sessions that you're trying to replicate?

My guess is that something else is going on, since the following log
entry doesn't show much in the way of cluster traffic.

INFO: ThroughputInterceptor Report[
        Tx Msg:1 messages
        Sent:0.00 MB (total)
        Sent:0.00 MB (application)
        Time:0.01 seconds
        Tx Speed:0.04 MB/sec (total)
        TxSpeed:0.04 MB/sec (application)
        Error Msg:0
        Rx Msg:13 messages
        Rx Speed:0.00 MB/sec (since 1st msg)
        Received:0.00 MB]

It would also be interesting to see the logs when you use the NIO
connector. According to the documentation:

It is preferred to use the non blocking receiver to be able to grow
your cluster without running into thread starvation.

Also from the documentation:

Usually the rule is to use 1 thread per node in the cluster for small
clusters, and then depending on your message frequency and your
hardware, you'll find an optimal number of threads peak out at a
certain number.
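
The thread rule quoted above corresponds to the maxThreads attribute of the
<Receiver> element in server.xml. A sketch of where it lives, using the
attribute values from the stock Tomcat 7 cluster example; maxThreads="6" is
that example's value and is only an assumption here, not a tuned
recommendation for this four-node test:

```xml
<!-- Goes inside <Channel>, within the <Cluster> element of server.xml. -->
<Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
          address="auto"
          port="4000"
          autoBind="100"
          selectorTimeout="5000"
          maxThreads="6"/>
<!-- With several nodes on one machine, autoBind lets a node move up from
     port 4000 to the next free port instead of failing on a conflict. -->
```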

We might need a little more background on your application and your
test environment to figure out why clustering is not behaving for you.

. . . just my two cents
/mde/

PS - you have some errors in your server.xml (see the log). While they
won't impact this problem, it might be a good idea to address them.

/mde/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJTtY28AAoJEEFGbsYNeTwtbk4H/1ehs00fmOLGfpcDxKkbfJJc
B2T3FEYmW2scV/W3Z0+z4uhBgVwRqPHgEZHotdRFhkadymCKz0d5RjjEgnTMv5vH
eP1u35NjmtteeLg+EcZU9XP1HOR+oxcx9fFic9NULtUb1lQOd9pIV9SWO82vFSI5
0ERzCxMr/ysiOZHPXPwl6SCe9TWGwYAWJh1QrH+3tqaD+EV7mYdZk7P/MOSWnSxn
JzLRkO+nKPXLYv6NQiSzjCoyURIxv8+fIw3vIblx03vfhyKFlb/KR9r8ZfhlSiJ0
i9fKzpRXmCVIHchWCDWKV89l6KzOyIYPVv3LlprGyLtCTbaBqvBQu5iFOhbHHiw=
=jgsE
-----END PGP SIGNATURE-----


Re: Help with Tomcat 7 clustering using BIO receiver

Posted by João Sávio <jo...@gmail.com>.

cluster.log -> http://pastebin.com/c98WhnmG



-- 
http://joaosavio.wordpress.com

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by João Sávio <jo...@gmail.com>.

Hello!

Using NIO (with channelSendOptions="4", i.e., synchronous), my tests pass
100% under light load. Under heavy load, however, not all sessions are
replicated on time, and I see an error rate of about 20%. If I increase
maxThreads to 400, the error rate is about 15%.
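
For reference, channelSendOptions is a bit mask set on the <Cluster> element.
According to the Tomcat cluster configuration reference, 2 is USE_ACK, 4 is
SYNCHRONIZED_ACK and 8 is ASYNCHRONOUS (the default), and flags are combined
by adding them. A sketch of the setting used in this test; the className is
the stock SimpleTcpCluster, and the child elements are omitted:

```xml
<!-- channelSendOptions="4": SEND_OPTIONS_SYNCHRONIZED_ACK -->
<!-- channelSendOptions="6": SYNCHRONIZED_ACK (4) + USE_ACK (2) -->
<!-- channelSendOptions="8": ASYNCHRONOUS, the documented default -->
<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
         channelSendOptions="4">
  <!-- Manager, Channel, Valve and ClusterListener elements go here. -->
</Cluster>
```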

More information:
* I am not performing parallel requests with same session
* my cluster has 4 nodes (all in one machine - for test purpose only)
* Java 7 64 bits, Tomcat 7.0.52, windows 7 64 bits
* using default NIO configuration, but with maxThreads=400
* VM options:
-Xms512M   - on real environment this value is 1024
-Xmx512M  - on real environment this value is 1024
-XX:NewSize=450M
-XX:MaxNewSize=450M
-XX:PermSize=128M
-XX:MaxPermSize=245M
-XX:SurvivorRatio=8
-XX:TargetSurvivorRatio=90
-XX:MaxTenuringThreshold=15
-XX:+UseBiasedLocking
-XX:CMSInitiatingOccupancyFraction=60
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+CMSClassUnloadingEnabled
-XX:+UseConcMarkSweepGC
-XX:+CMSIncrementalMode
-XX:+UseParNewGC
-XX:+DisableExplicitGC
-XX:+PrintGCDateStamps
-XX:+PrintGCDetails
-Xloggc:%CATALINA_BASE%/logs/tomcat-gc.log

Moreover, I'm trying to attach the logs again.

Thanks
João

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by Filip Hanik <fi...@hanik.com>.

I'd be more inclined to continue down the path of the NIO connector; it has
been tested and used more. What are the errors you get when running with
NIO?



Re: Help with Tomcat 7 clustering using BIO receiver

Posted by Konstantin Kolinko <kn...@gmail.com>.

2014-07-03 18:46 GMT+04:00 João Sávio <jo...@gmail.com>:
> Unfortunately it's not working yet
>
> I increased the log level as you suggested. Log attached
>
> Thanks
>

Please read numbers 6. and 7. here:
http://tomcat.apache.org/lists.html#tomcat-users

The attachment was thrown away by the mail server.


Re: Help with Tomcat 7 clustering using BIO receiver

Posted by João Sávio <jo...@gmail.com>.

Unfortunately it's not working yet

I increased the log level as you suggested. Log attached

Thanks




-- 
http://joaosavio.wordpress.com

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by João Sávio <jo...@gmail.com>.

Hello Mark

Thanks for your answer. I put in the new information and the cluster works
(maybe because I rebooted my machine).

Indeed, I'm trying BIO because NIO is causing session replication issues in
load tests. Using NIO, I've already tried increasing maxThreads, and this
reduces the problem but doesn't solve it.

Thanks



-- 
http://joaosavio.wordpress.com

Re: Help with Tomcat 7 clustering using BIO receiver

Posted by Mark Eggers <it...@yahoo.com.INVALID>.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 7/2/2014 6:37 AM, João Sávio wrote:
> ?
> 
> 
> 2014-06-30 18:38 GMT-03:00 João Sávio <jo...@gmail.com>:
> 
>> Hello people
>> 
>> This is my first message on this group! I'm trying to set up a
>> Tomcat clustering using BIO receiver but I've been receiving the
>> following error when I started two nodes and tried to enter on my
>> application:
>> 
>> Jun 30, 2014 9:25:12 PM
>> org.apache.catalina.tribes.transport.bio.BioReplicationTask run
>> SEVERE: Unable to service bio socket
>> java.net.SocketTimeoutException: Read timed out
>>         at java.net.SocketInputStream.socketRead0(Native Method)
>>         at java.net.SocketInputStream.read(SocketInputStream.java:152)
>>         at java.net.SocketInputStream.read(SocketInputStream.java:122)
>>         at java.net.SocketInputStream.read(SocketInputStream.java:108)
>>         at org.apache.catalina.tribes.transport.bio.BioReplicationTask.drainSocket(BioReplicationTask.java:146)
>>         at org.apache.catalina.tribes.transport.bio.BioReplicationTask.run(BioReplicationTask.java:64)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:744)
>>
>> Jun 30, 2014 9:25:15 PM org.apache.catalina.ha.session.DeltaManager startInternal
>> INFO: Register manager localhost#/myApp to cluster element Engine with name Catalina
>> Jun 30, 2014 9:25:15 PM org.apache.catalina.ha.session.DeltaManager startInternal
>> INFO: Starting clustering manager at localhost#/myApp
>> Jun 30, 2014 9:25:15 PM org.apache.catalina.ha.session.DeltaManager getAllClusterSessions
>> INFO: Manager [localhost#/myApp], requesting session state from
>> org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, 212, 191, 209}:4000,{10, 212, 191, 209},4000, alive=32096,
>> securePort=-1, UDP Port=-1, id={-58 95 -116 89 86 -12 714 -75 -2 74 -45 -54 -82 40 -114 }, payload={}, command={}, domain={}, ].
>> This operation will timeout if no session state has been received within 60 seconds.
>> 
>> 
>> Attached are my two server.xml configurations. I've taken the
>> default configuration (which worked on my computer) and replaced
>> the following things:
>> 
>> 
>> <Receiver
>> className="org.apache.catalina.tribes.transport.nio.NioReceiver"
>> by
>> <Receiver
>> className="org.apache.catalina.tribes.transport.bio.BioReceiver"
>>
>> and
>>
>> <Transport
>> className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
>> by
>> <Transport
>> className="org.apache.catalina.tribes.transport.bio.PooledMultiSender"/>
>>
>> I don't know if I'm missing something. I'd be very thankful if someone
>> could take a look at my server.xml files or send me an example
>> configuration using the BIO receiver (I couldn't find one).
>> 
>> Thanks
>> João
>> --
>> http://joaosavio.wordpress.com

João,

Welcome to the list.

I've only used NIO, and it's been a while since I've used clustering.

A few questions to ask:

1. Is there a firewall blocking port 4000 (if the Tomcats are on
   separate machines)?
2. Is multicast enabled and routed?
3. Environment (Tomcat version, Java version, platforms)?
4. Why BIO?

What are you trying to accomplish by using the BIO receiver rather
than the NIO receiver?
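
For question 1, here is a minimal sketch of a reachability check you could
run from one node against the other. It assumes Java is available on both
machines. The class name PortCheck and the default host are made up for
illustration and are not part of Tomcat; 4000 is the receiver port shown in
your log. A "true" result only says that a TCP connection can be opened,
not that replication itself works.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Tomcat: tries to open a TCP connection
// to the cluster receiver port of another node.
class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, dropped by a firewall, or timed out.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: pass the other node's address and receiver port.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 4000;
        System.out.println(host + ":" + port + " reachable: "
                + isReachable(host, port, 3000));
    }
}
```

Run it on each node with the other node's address and port 4000 as
arguments. If it prints false in either direction, look at the firewall
before looking at the BIO configuration.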

You can turn on more logging by adding something like the following to
logging.properties:

# at the end of the handlers line:
,5cluster.org.apache.juli.FileHandler

# in the handler-specific properties section:
5cluster.org.apache.juli.FileHandler.level = FINER
5cluster.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
5cluster.org.apache.juli.FileHandler.prefix = cluster.

# in the facility-specific properties section
# each property must stay on one line (watch for mail line wrapping)
org.apache.catalina.tribes.MESSAGES.level = FINE
org.apache.catalina.tribes.MESSAGES.handlers = 5cluster.org.apache.juli.FileHandler

org.apache.catalina.tribes.level = FINE
org.apache.catalina.tribes.handlers = 5cluster.org.apache.juli.FileHandler

org.apache.catalina.ha.level = FINE
org.apache.catalina.ha.handlers = 5cluster.org.apache.juli.FileHandler

org.apache.catalina.ha.deploy.level = INFO
org.apache.catalina.ha.deploy.handlers = 5cluster.org.apache.juli.FileHandler
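
In case it is not obvious why a level on org.apache.catalina.tribes is
enough to cover the BIO classes: logger names are hierarchical, and a
logger with no level of its own uses the level of its nearest configured
parent. Below is a small sketch of that behaviour using plain
java.util.logging, outside Tomcat. The class name LoggerLevels is made up
for illustration, and I am assuming a default JVM logging configuration
(root logger at INFO).

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical demo class, not part of Tomcat: shows that a child logger
// inherits the level set on a parent logger name.
class LoggerLevels {

    // Keep strong references; java.util.logging only holds loggers weakly.
    static final Logger PARENT =
            Logger.getLogger("org.apache.catalina.tribes");
    static final Logger CHILD =
            Logger.getLogger("org.apache.catalina.tribes.transport.bio");

    // Sets FINE on the parent only, then reports whether the child
    // (which has no level of its own) would log a FINE record.
    static boolean childLogsFineAfterParentSet() {
        PARENT.setLevel(Level.FINE);
        return CHILD.isLoggable(Level.FINE);
    }

    public static void main(String[] args) {
        System.out.println("child logs FINE: " + childLogsFineAfterParentSet());
        System.out.println("child logs FINER: " + CHILD.isLoggable(Level.FINER));
    }
}
```

The handler has its own level as well. In the properties above it is set
to FINER, which is more verbose than FINE, so the handler does not filter
out anything the loggers let through.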

Last note - we're normally a pretty helpful bunch of folks here, but
we are all volunteers. Things like work impact participation :-p. Also
as noted above, I've only used NIO so I don't know why this would not
work with BIO.

. . . just my two cents
/mde/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJTtDM0AAoJEEFGbsYNeTwtkR8H/3JM2LPJBolSWp6kA2K+HJYK
mOn5LahCs6cQcZEKs6JKZMS5aVz9A8CPc4/e/LDfGJ86SH1kB1KAwPOwNtFjijl/
hpQvQYe8WTIc8Mh6QtbA1jfJD67FBqDxl356+QSrmZJnuJTEz8zAy/6L1/Fvo08c
HLYOjfkPj463TnDXJMUNIk1DSuHIdEnaxdXpM55ADTrkrJC06pM/wuU9znfpP/58
xKbkX05ODgdBLvP7sfs4vI4DoVWO81+w/4W3o7DC4y9UmqQhtTtFvSacGh8IimW2
G/+PVCbYUigKaeegqIVo9eCXeHLO7OBX6njWJKdzYtzPQHZ0AFS2miNC6AVH4co=
=bWHh
-----END PGP SIGNATURE-----
