Posted to users@tomcat.apache.org by Andrew Seales <an...@ed.ac.uk> on 2015/03/26 10:54:34 UTC
Tomcat 8 on Solaris 10/11
Hi,
We are having a problem on our production servers where downloads of
certain files are getting randomly truncated. This includes static
Javascript files, file downloads via servlets, etc, where the file is
more than about 100K. Most of the time the file downloads successfully,
but some randomly get truncated. The truncation doesn't happen in
exactly the same place every time.
I've been able to recreate the issue on our development servers using
Tomcat 8.0.20 with Java 1.8.0_20 and 1.8.0_40. I've tried Solaris 10 and
11 on both SPARC and x64 CPUs, and the same issue occurs. I've tested on a fresh
install of Tomcat 8 and dropped in one of our larger Javascript files
into the webapps/ROOT directory and made no other changes. I'm using a
Perl script to continuously download the file and test an md5 hash
against a known-good value to detect a broken download. The problem also
seems to occur only on a slow network link. I use the following command
to limit the speed of my network interface:
sudo tc qdisc add dev eth0 root handle 1:0 netem rate 128kbit
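The download-and-compare check described above can be sketched roughly as follows. This is not the Perl script linked further down, only a minimal Python stand-in for the same idea. The function names are made up for illustration, and the URL, known-good MD5 and expected length are placeholders the caller has to supply.

```python
import hashlib
import urllib.request
from http.client import IncompleteRead


def classify(body: bytes, expected_md5: str, expected_len: int) -> str:
    """Compare one downloaded body against the known-good file."""
    if hashlib.md5(body).hexdigest() == expected_md5:
        return "ok"
    if len(body) < expected_len:
        return "truncated"  # too short: the symptom described in this thread
    return "corrupt"        # full length (or longer) but different bytes


def fetch(url: str, timeout: float = 60.0) -> bytes:
    """Download url and return whatever arrived, even if the body was cut short."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except IncompleteRead as exc:
        # Fewer bytes than Content-Length announced: keep the partial body.
        return exc.partial


def run(url: str, expected_md5: str, expected_len: int, attempts: int = 100) -> None:
    """Fetch the file repeatedly and print one result line per attempt."""
    for i in range(attempts):
        body = fetch(url)
        print(i, len(body), classify(body, expected_md5, expected_len))
```

Calling `run()` with the file's URL, its known-good MD5 and its size prints the received length and a verdict for each attempt, so truncated downloads stand out and their lengths can be compared across runs.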
I've also tested the same Tomcat on a Red Hat 6 server but that appears
to work fine.
If I revert to Tomcat 7.0.59, then Solaris works fine. The problem
appears to only occur with Tomcat 8 on Solaris. I've tried v8.0.14 and
8.0.20 and they both have the problem.
The Perl script is available from
http://dlib-bauer.ucs.ed.ac.uk/testdata.pl
The Javascript file is available from
http://dlib-bauer.ucs.ed.ac.uk/ext-datadownload-20150323_1157.js
Is anyone else running Tomcat 8 on Solaris 10 or 11 with Java 8, or does
anyone know of any problems on the platform?
Regards,
--
Andrew Seales
EDINA tel: +44 (0) 131 650 3022
Edinburgh University fax: +44 (0) 131 650 3308
Causewayside House url: http://edina.ac.uk
160 Causewayside email: andrew.seales@ed.ac.uk
Edinburgh EH9 1PR
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org
Re: Tomcat 8 on Solaris 10/11
Posted by Andrew Seales <an...@ed.ac.uk>.
On 26/03/15 21:10, Aurélien Terrestris wrote:
> As suggested by Rainer, I would try with the blocking connector and compare.
>
> Otherwise, it could be that your file is using very long lines (only 5
> lines for more than 800k of data). Maybe a tomcat-dev could have a
> look at this.
>
> $ wc ext-datadownload-20150323_1157.js
> 5 7634 838044 ext-datadownload-20150323_1157.js
>
I can try the non-minified version to see if it makes a difference, but
we're getting the same problem with binary files too.
--
Andrew Seales
Re: Tomcat 8 on Solaris 10/11
Posted by Aurélien Terrestris <at...@gmail.com>.
As suggested by Rainer, I would try with the blocking connector and compare.
Otherwise, it could be that your file is using very long lines (only 5
lines for more than 800k of data). Maybe a tomcat-dev could have a
look at this.
$ wc ext-datadownload-20150323_1157.js
5 7634 838044 ext-datadownload-20150323_1157.js
2015-03-26 12:12 GMT+01:00 Rainer Jung <ra...@kippdata.de>:
> Which connector are you using? NIO? APR?
Re: Tomcat 8 on Solaris 10/11
Posted by Rainer Jung <ra...@kippdata.de>.
On 27.03.2015 at 16:15, Rainer Jung wrote:
> I'm thinking about adding client port as a loggable item to
> the Tomcat AccessLog but that won't help you right now.
Done and will be available starting with TC 8.0.22 and 7.0.62. Log
pattern format is %{remote}p like for Apache httpd.
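Once the client port is in the access log, finding the short responses could look roughly like the sketch below. It assumes an access log pattern of `%h %{remote}p %t "%r" %s %b`, which is only one possible layout; the function name is made up for illustration, and the regular expression has to be adjusted to whatever pattern is actually configured.

```python
import re

# Matches lines written with the assumed pattern: %h %{remote}p %t "%r" %s %b
LINE = re.compile(
    r'^(?P<host>\S+) (?P<port>\d+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-)$'
)


def short_responses(lines, path, expected_len):
    """Yield (client port, timestamp, bytes sent) for short 200 responses to path."""
    for line in lines:
        m = LINE.match(line.strip())
        if not m or m.group("status") != "200":
            continue
        parts = m.group("request").split()
        if len(parts) < 2 or parts[1] != path:
            continue
        # %b logs "-" when no body bytes were sent.
        sent = 0 if m.group("bytes") == "-" else int(m.group("bytes"))
        if sent < expected_len:
            yield int(m.group("port")), m.group("time"), sent
```

The client port and timestamp it yields are the two values needed to pick the matching connection out of a packet capture.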
Regards,
Rainer
Re: Tomcat 8 on Solaris 10/11
Posted by Andrew Seales <an...@ed.ac.uk>.
> Not necessarily a problem from the TCP layer point of view. But once
> you can find a request that's truncated in the TCP log, you can check
>
> - whether it was a normal connection shutdown or a reset
> - whether there were unusual pauses between packets triggering timeouts
>
> etc. That's why you would benefit from your test client being able to
> log the client port when a failure arises so you can filter easily in
> Wireshark. I'm thinking about adding client port as a loggable item to
> the Tomcat AccessLog but that won't help you right now.
>
> I vaguely remember problems with TCP checksum offloading, but they
> should be fixed long ago. See e.g.
>
> http://compgroups.net/comp.unix.solaris/disable-e1000-tcp-checksum-offloading-t5220/472801
>
I can't see any RST packets, so I can only conclude it's a normal(ish)
shutdown.
I have found that if I run my test client on a Linux server that's close
to the Solaris servers, I don't have an issue with truncation. Perhaps
there's a switch issue between our Solaris servers and the outside world.
>
> I also once saw a problem at a customer site, not truncation, but the
> last packet of a response was delayed quite noticeably. That was fixed
> by applying the current Solaris patch cluster at that time.
I'm not sure there's a delay but I'm sure the patch level is way out of
date.
Thanks for your help, though. My workaround at the moment is to just run
Tomcat 7. We're planning to migrate off Solaris onto Linux anyway, so we'll
just have to wait until then before upgrading to Tomcat 8.
--
Andrew Seales
Re: Tomcat 8 on Solaris 10/11
Posted by Rainer Jung <ra...@kippdata.de>.
On 27.03.2015 at 15:48, Andrew Seales wrote:
> Thanks, I'll give Wireshark or something like that a go to see if I can
> see any TCP problems.
Not necessarily a problem from the TCP layer point of view. But once you
can find a request that's truncated in the TCP log, you can check
- whether it was a normal connection shutdown or a reset
- whether there were unusual pauses between packets triggering timeouts
etc. That's why you would benefit from your test client being able to
log the client port when a failure arises so you can filter easily in
Wireshark. I'm thinking about adding client port as a loggable item to
the Tomcat AccessLog but that won't help you right now.
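The two checks listed above can also be approximated from the client side, without a packet capture. The following is only a sketch in Python using a raw socket and a plain HTTP/1.0 request; the function names are made up for illustration, and `expected` has to be the size of a known-good full response, headers included, since the byte count covers everything read from the socket.

```python
import socket
import time


def fetch_raw(host: str, port: int, path: str, timeout: float = 60.0):
    """Fetch path over HTTP/1.0 and record how the connection ended.

    Returns (bytes_received, was_reset, longest_gap_seconds).
    """
    received = 0
    was_reset = False
    longest_gap = 0.0
    with socket.create_connection((host, port), timeout=timeout) as sock:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        last = time.monotonic()
        try:
            while True:
                chunk = sock.recv(65536)
                now = time.monotonic()
                # Track the longest pause between two reads.
                longest_gap = max(longest_gap, now - last)
                last = now
                if not chunk:         # orderly shutdown (FIN) from the peer
                    break
                received += len(chunk)
        except ConnectionResetError:  # the peer sent RST
            was_reset = True
    return received, was_reset, longest_gap


def describe(received: int, expected: int, was_reset: bool) -> str:
    """Summarise one attempt: reset, closed early, or complete."""
    if was_reset:
        return "reset"
    return "complete" if received >= expected else "closed early"
```

A "closed early" result with a long gap just before the close would point towards a timeout somewhere on the path, while "reset" would point towards an abort by one of the endpoints or a device in between.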
I vaguely remember problems with TCP checksum offloading, but they
should be fixed long ago. See e.g.
http://compgroups.net/comp.unix.solaris/disable-e1000-tcp-checksum-offloading-t5220/472801
I also once saw a problem at a customer site, not truncation, but the last
packet of a response was delayed quite noticeably. That was fixed by
applying the current Solaris patch cluster at that time.
Regards,
Rainer
Re: Tomcat 8 on Solaris 10/11
Posted by Andrew Seales <an...@ed.ac.uk>.
> Yes, we do, on Solaris 10. I don't know of any such problems, but I
> can't introduce the slow network condition here to test.
Good to know. In case it wasn't clear, the network limiting is done on
the client side, not on the server.
>
> Is the file really truncated, i.e. too short, or is it corrupt? Can
> the truncation also be seen in the Tomcat access log? If so, could you
> replace the curl/md5sum based test with another HTTP client like
> LWP::Simple in perl or "ab" coming with Apache httpd. Just to rule out
> the client side of the picture.
Yes, the file is definitely truncated rather than corrupted. Users of our
services with normal browsers are noticing the problem; it's what
prompted me to use the Perl+curl test script.
>
> Is truncation always happening at the same byte? Any pattern?
>
> Which connector are you using? NIO? APR?
I've tried both the AJP13 and the standard HTTP/1.1 connectors; both have the
same problem. When using AJP, the Apache log shows the file size as being
truncated. I'll check the Tomcat log when using HTTP/1.1 to see if it agrees.
>
> I personally would try the following to provide additional analysis
> data: Find a setup, where you can log the client port. Use this setup
> and snoop network traffic during the test on the client and server
> side. Once the problem happens, use the local port number and
> timestamp to extract the communication pattern on the server and
> client side. That way you can see, which side closed/aborted the
> connection - or whether it is something in between client and server.
Thanks, I'll give Wireshark or something like that a go to see if I can
see any TCP problems.
--
Andrew Seales
Re: Tomcat 8 on Solaris 10/11
Posted by Rainer Jung <ra...@kippdata.de>.
On 26.03.2015 at 10:54, Andrew Seales wrote:
> Is anyone else running Tomcat 8 on Solaris 10 or 11 with Java 8, or know
> of any problems on the platform?
Yes, we do, on Solaris 10. I don't know of any such problems, but I
can't introduce the slow network condition here to test.
Is the file really truncated, i.e. too short, or is it corrupt? Can the
truncation also be seen in the Tomcat access log? If so, could you
replace the curl/md5sum based test with another HTTP client like
LWP::Simple in Perl or "ab", which comes with Apache httpd? That would
rule out the client side of the picture.
Is truncation always happening at the same byte? Any pattern?
Which connector are you using? NIO? APR?
I personally would try the following to provide additional analysis
data: find a setup where you can log the client port. Use this setup
and snoop network traffic during the test on the client and server side.
Once the problem happens, use the local port number and timestamp to
extract the communication pattern on the server and client side. That
way you can see which side closed/aborted the connection, or whether
it is something in between client and server.
Unfortunately, logging the client port is often not trivial to achieve.
On the Tomcat side (access log), currently only the server port is
available, not the remote port, although this would be very simple to
add. In the short term it might work to switch to Perl plus LWP
and try to get the local port from LWP.
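As an illustration of getting the local port out of the test client, here is a minimal sketch. It uses Python's http.client rather than Perl and LWP, so it shows the idea and is not the LWP approach itself. The function name is made up for illustration.

```python
import http.client


def get_with_local_port(host: str, port: int, path: str, timeout: float = 60.0):
    """Return (local_port, status, body) for one GET request."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.connect()
        # conn.sock is the underlying socket once connect() has run.
        local_port = conn.sock.getsockname()[1]
        conn.request("GET", path)
        resp = conn.getresponse()
        try:
            body = resp.read()
        except http.client.IncompleteRead as exc:
            # Body shorter than Content-Length: keep what did arrive.
            body = exc.partial
        return local_port, resp.status, body
    finally:
        conn.close()
```

Logging `local_port` together with a timestamp whenever the body turns out to be short gives the two values needed to locate that connection in the snoop output on both sides.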
Regards,
Rainer