Posted to users@tomee.apache.org by Jlemoine <jl...@bcidaho.com> on 2016/09/29 19:25:45 UTC

HTTP Errors on Load-Balanced TomEE Servers

Hello,

The issue: TomEE starts returning HTTP/1.1 500 errors after a period of time,
roughly once more than 100 concurrent users are processing data.

All servers are Windows 2012 R2 Standard
Front-end=Apache 2.4.12\mod_jk 1.2.40 for load-balancing
Back-end=TomEE 1.7.0 instances for workflow processing using JDK 8.0_91

This setup is for internal end-users only, with no access from the outside.
Also, this setup is not for displaying web pages. The end-users access data
in a Sybase database via an interactive client-side application.  The TomEE
instances process that data workflow, and the results are then saved back to
the database.

There are two back-end servers, each running two TomEE instances. This was
raised to three TomEE instances per server, but that only delayed the HTTP
errors slightly. Both TomEE servers have the same resources, and all TomEE
instances are configured identically. The problem occurs only when Apache is
handling requests from more than 100 concurrent users.

The Apache winnt MPM was configured for 500 ThreadsPerChild. The load
balancing factor was 1, and the load balancing method was at the default,
which is Request. The TomEE AJP 1.3 connectors defined only the port,
protocol and redirect port; all other attributes were at their defaults.

Setting the log level to debug causes the end-users’ data processing to hang
before the 100 concurrent user count is reached.
From the testing that I am able to do with JMeter, it appears that the TomEE
ajp connector default thread limit has been reached, not allowing new
sockets to be opened.  No errors are posted in the catalina logs when this
happens.  The Apache and TomEE access logs only show the errors as HTTP/1.1
500 errors.  The Apache errors are posted 30 seconds after the TomEE errors. 
This behavior can be seen using the JK Status Manager.
This is a 10 minute test run using 225 simulated users requesting the broker
wsdl for each instance.  The first error under the Err column was generated
at a little over the 1 minute mark so it would have been an HTTP 500 error
for an end-user. This is the same, test after test after test although the
problem can show up in any instance, not just the “A” one shown here.
<http://tomee-openejb.979440.n4.nabble.com/file/n4680255/Image1.png> 

Adding these settings to TomEE’s AJP connectors:
acceptCount="150"
maxThreads="300"
minSpareThreads="50"
and raising the JMeter load to 250 users over 10 minutes produced these
results. The connectors still show strange, uneven behavior, but there are no
errors in the Err column.
<http://tomee-openejb.979440.n4.nabble.com/file/n4680255/Image2.png> 
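
For reference, the three attributes listed above would sit on the AJP
Connector element in each instance's conf/server.xml. The following is a
minimal sketch, not the actual file from this setup: the port and
redirectPort values are the stock Tomcat defaults and are assumptions, since
the post only says that those attributes were defined. With several instances
on one server, each instance would need its own port.

```xml
<!-- conf/server.xml (sketch): AJP connector with the tuned thread settings.
     port="8009" and redirectPort="8443" are assumed Tomcat defaults; the
     values actually used on these instances are not given in this thread. -->
<Connector port="8009"
           protocol="AJP/1.3"
           redirectPort="8443"
           acceptCount="150"
           maxThreads="300"
           minSpareThreads="50" />
```

Here maxThreads caps the number of request-processing threads, acceptCount is
the queue length for incoming connections once all of those threads are busy,
and minSpareThreads is the minimum number of threads kept running.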

Next, changing the load-balancer method from the default Request to Busyness
produced these results. Still no errors and far better connector loads. 
Also, note that the load-balancer Value has decreased dramatically. 
<http://tomee-openejb.979440.n4.nabble.com/file/n4680255/Image3.png> 

The positive results from these changes have been consistent over many, many
tests.
Still, since I can’t reproduce in testing the same load that the end-users
generate, I cannot put this TomEE setup back into production without a high
degree of assurance that it will not fail again due to the same problem.
Can anyone suggest further configuration changes that I should research, or
other tuning that I could test?

Thanks,
Jim




--
View this message in context: http://tomee-openejb.979440.n4.nabble.com/HTTP-Errors-on-Load-Balanced-TomEE-Servers-tp4680255.html
Sent from the TomEE Users mailing list archive at Nabble.com.

Re: HTTP Errors on Load-Balanced TomEE Servers

Posted by Fabien R <th...@free.fr>.
On 30/09/2016 01:16, RAMM, David wrote:
> How do I unsubscribe from this email group? Ever since the email group moved, attempts to unsubscribe have never worked.
> 
This may help:

--- Administrative commands for the users list ---

I can handle administrative requests automatically. Please
do not send them to the list address! Instead, send
your message to the correct command address:

To subscribe to the list, send a message to:
   <us...@tomee.apache.org>

To remove your address from the list, send a message to:
   <us...@tomee.apache.org>

Send mail to the following for info and FAQ for this list:
   <us...@tomee.apache.org>
   <us...@tomee.apache.org>

Similar addresses exist for the digest list:
   <us...@tomee.apache.org>
   <us...@tomee.apache.org>

To get messages 123 through 145 (a maximum of 100 per request), mail:
   <us...@tomee.apache.org>

To get an index with subject and author for messages 123-456 , mail:
   <us...@tomee.apache.org>

They are always returned as sets of 100, max 2000 per request,
so you'll actually get 100-499.

To receive all messages with the same subject as message 12345,
send a short message to:
   <us...@tomee.apache.org>

The messages should contain one line or word of text to avoid being
treated as sp@m, but I will ignore their content.
Only the ADDRESS you send to is important.

You can start a subscription for an alternate address,
for example "john@host.domain", just add a hyphen and your
address (with '=' instead of '@') after the command word:
<us...@tomee.apache.org>

To stop subscription for this address, mail:
<us...@tomee.apache.org>

In both cases, I'll send a confirmation message to that address. When
you receive it, simply reply to it to complete your subscription.

If despite following these instructions, you do not get the
desired results, please contact my owner at
users-owner@tomee.apache.org.


RE: HTTP Errors on Load-Balanced TomEE Servers

Posted by "RAMM, David" <da...@baesystems.com>.
How do I unsubscribe from this email group? Ever since the email group moved, attempts to unsubscribe have never worked.

Cheers,

David

Re: HTTP Errors on Load-Balanced TomEE Servers

Posted by Jlemoine <jl...@bcidaho.com>.
Romain Manni-Bucau wrote
> did you check your datasource pool too? default is pretty low and under
> load you can just wait for the DB depending the app

No, I did not check the database connection pool settings.  I will look into
that, thank you Romain.

Re: HTTP Errors on Load-Balanced TomEE Servers

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Did you check your datasource pool too? The default is pretty low, and under
load you can end up just waiting for the DB, depending on the app.
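
As a rough illustration of where the pool size is set: in TomEE a datasource
is usually declared as a Resource in conf/tomee.xml (or in the application's
resources.xml). The sketch below is hypothetical. The id, driver class, URL,
credentials and numbers are placeholders, not values from this thread, and
the property names are the ones I understand TomEE 1.7.x to accept, so they
should be checked against the datasource configuration documentation.

```xml
<!-- conf/tomee.xml (sketch): every name and value here is a placeholder.
     MaxActive: upper bound on connections handed out at the same time.
     MaxWait:   milliseconds a caller waits for a free connection. -->
<Resource id="workflowDS" type="DataSource">
  JdbcDriver = com.sybase.jdbc4.jdbc.SybDriver
  JdbcUrl = jdbc:sybase:Tds:dbhost:5000/workflowdb
  UserName = appuser
  Password = changeme
  MaxActive = 100
  MaxWait = 10000
</Resource>
```

The point of the sketch is only the relationship between the two limits: if
the connector allows far more concurrent request threads than the pool has
connections, the extra threads sit waiting for the database.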


Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://blog-rmannibucau.rhcloud.com> | Old Wordpress Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Tomitriber
<http://www.tomitribe.com> | JavaEE Factory
<https://javaeefactory-rmannibucau.rhcloud.com>

Re: HTTP Errors on Load-Balanced TomEE Servers

Posted by Jlemoine <jl...@bcidaho.com>.
Romain Manni-Bucau wrote
> 2016-10-03 15:43 GMT+02:00 Jlemoine:
>
>> 1.  I'm not sure what a "very small timeout" would be.  I currently have
>> the ajp connector and LB timeouts at 20 seconds.
>
> 1ms?
>
>> 2.  Yes, ajp does seem to keep threads open once created, which is where
>> I think the problem was.  No new sockets available to open.  However, so
>> far this has not happened on the TomEE side after I changed the LB method
>> to Busyness and made the other changes to the ajp connectors that I
>> mentioned in the first post.
>
> this is weird, ajp perf boost comes normally from the binary protocol
> usage and connections staying open. Depends the setup for sure but if not
> used wonder if you have any interest to use ajp.

Yes Romain, from what I've researched what you're saying holds true.  But I
had the ajp attributes at default and this bizarre behavior from the LB
Request method was totally overloading an instance.
<http://tomee-openejb.979440.n4.nabble.com/file/n4680275/Image4.png> 

This isn't really load balancing for us and any end-users attached to that
"a" instance were locked up and unable to process data back to the database.
I'm pretty sure I have this ironed out now but I won't know unless the
Business wants to try out my config changes in Production.  Also, I'm still
researching if there are other tuning options that I may need to try.

Re: HTTP Errors on Load-Balanced TomEE Servers

Posted by Romain Manni-Bucau <rm...@gmail.com>.
2016-10-03 15:43 GMT+02:00 Jlemoine <jl...@bcidaho.com>:

> Romain Manni-Bucau wrote
> > Don't get it wrong:
> >
> > 1. having a very small timeout should reject very fast clients, no perf
> > enhancements excepted you don't have too much waiting clients
> > 2. what do you mean by the socket issue? ajp is connected so keeps
> > sockets by design
>
> 1.  I'm not sure what a "very small timeout" would be.  I currently have
> the ajp connector and LB timeouts at 20 seconds.
>

1ms?


> 2.  Yes, ajp does seem to keep threads open once created, which is where I
> think the problem was.  No new sockets available to open.  However, so far
> this has not happened on the TomEE side after I changed the LB method to
> Busyness and made the other changes to the ajp connectors that I mentioned
> in the first post.
>
>
This is weird: the AJP performance boost normally comes from the binary
protocol and from connections staying open. It depends on the setup, for
sure, but if that is not used I wonder whether you gain anything from AJP.

Re: HTTP Errors on Load-Balanced TomEE Servers

Posted by Jlemoine <jl...@bcidaho.com>.
Romain Manni-Bucau wrote
> Don't get it wrong:
>
> 1. having a very small timeout should reject very fast clients, no perf
> enhancements excepted you don't have too much waiting clients
> 2. what do you mean by the socket issue? ajp is connected so keeps sockets
> by design

1.  I'm not sure what a "very small timeout" would be.  I currently have the
ajp connector and LB timeouts at 20 seconds.
2.  Yes, ajp does seem to keep threads open once created, which is where I
think the problem was.  No new sockets available to open.  However, so far
this has not happened on the TomEE side after I changed the LB method to
Busyness and made the other changes to the ajp connectors that I mentioned
in the first post.
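
To make point 1 concrete: on the TomEE side, a 20-second timeout would
normally be expressed in milliseconds on the AJP connector. This is only a
sketch. Which timeout attribute was actually set is not stated in this
thread, so connectionTimeout is an assumption here, as are the port values.

```xml
<!-- conf/server.xml (sketch): connectionTimeout is in milliseconds, so
     20000 = 20 seconds. The attribute choice and the ports are assumptions;
     for AJP connectors the Tomcat default is -1, meaning no timeout. -->
<Connector port="8009"
           protocol="AJP/1.3"
           redirectPort="8443"
           connectionTimeout="20000" />
```

The mod_jk documentation recommends pairing this with connection_pool_timeout
on the worker, which is given in seconds rather than milliseconds, so the
matching value there would be 20.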

Re: HTTP Errors on Load-Balanced TomEE Servers

Posted by Romain Manni-Bucau <rm...@gmail.com>.
2016-10-03 14:44 GMT+02:00 Jlemoine <jl...@bcidaho.com>:

> Romain Manni-Bucau wrote
> > Hi
> >
> > it would be interesting to tune the timeouts on the connector(s) -
> > potentially on the LB too.
> >
> > If it happens again don't forget to take a thread dump of processes, it
> > often helps to see what the server is doing and if it is just busy.
>
> Thanks Romain.  I have been working on testing the connector timeouts,
> although I haven't seen any performance improvements yet in my test
> environment.
> I can't mirror the same load in test as in prod but by throwing thousands
> of sim users at the web servers I can create the same symptoms.  Doing a
> thread dump with VisualVM does seem to confirm what I suspected... a socket
> issue with the ajp connectors.
>
>
Don't get it wrong:

1. Having a very small timeout should reject clients very fast. There is no
performance gain, except that you don't have too many waiting clients.
2. What do you mean by the socket issue? AJP is connection-oriented, so it
keeps sockets open by design.

Re: HTTP Errors on Load-Balanced TomEE Servers

Posted by Jlemoine <jl...@bcidaho.com>.
Romain Manni-Bucau wrote
> Hi
>
> it would be interesting to tune the timeouts on the connector(s) -
> potentially on the LB too.
>
> If it happens again don't forget to take a thread dump of processes, it
> often helps to see what the server is doing and if it is just busy.

Thanks Romain.  I have been working on testing the connector timeouts,
although I haven't seen any performance improvements yet in my test
environment.
I can't mirror the same load in test as in prod, but by throwing thousands of
simulated users at the web servers I can create the same symptoms.  Doing a
thread dump with VisualVM does seem to confirm what I suspected: a socket
issue with the AJP connectors.

Re: HTTP Errors on Load-Balanced TomEE Servers

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Hi

it would be interesting to tune the timeouts on the connector(s) -
potentially on the LB too.

If it happens again don't forget to take a thread dump of processes, it
often helps to see what the server is doing and if it is just busy.