Posted to users@httpd.apache.org by Derek Chen-Becker <db...@cpicorp.com> on 2010/04/17 00:29:55 UTC

[users@httpd] Diagnosing TCP reset on multipart POST

I have a really strange problem on my hands and I've been working on it
all week without any meaningful progress. I'm running httpd 2.2.14 (built
from source, not packaged) on Solaris 10 and generally everything is
working well. From time to time, though, I get a situation where
particular clients are getting TCP resets part-way through a POST of
multipart form data (confirmed with wireshark).

At first I thought it was something in PHP (webmail app), so I created a
small HTML-only page with a file upload form in it, and it exhibits the
same behavior. I also thought that it might be the file itself, but if I
take a file that won't upload from one machine and put it on another
machine it uploads fine. I am not experiencing the issue at all with
GETs or non-multipart POSTs.

I've compared packet captures from clients that work OK and clients that
get the resets and I don't see any significant differences in what's
getting sent to httpd. I've also tried checking the error_log (nothing
notable) and even trussing the processes, as described on the "debugging
httpd" web page, but I don't know exactly what to look for. The
processes appear to be going away, but they don't appear to be exiting
on a SIGSEGV, since the truss doesn't pause when they exit (and the
error log doesn't talk about any SIG errors). I also tried setting up
core dumps (with all of the Solaris 10-specific settings) but I'm not
getting any cores.

The only thing I've noticed is that (due to latency and/or congestion),
there's a 350ms gap between the last ACK from httpd and then the RST
packet (no client response in that time). My understanding, though, is
that the TCP timers on my server should be re-sending the ACK if there's
a timeout, not closing the socket. In any case, prior packets in the
stream exceed 350ms delay between ACK and next packet and there aren't
any issues there.

From what I've read, the only way these clients would be getting a TCP
reset is if the httpd process that serves them closes the socket, but so
far I can find no evidence of any errors in httpd operation. To give you
an idea of what this looks like, here's a snippet of a packet capture
showing the behavior:

Delta    Src           Dst           Summary
0.000005 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=149645
0.000006 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=149948
0.383781 10.2.2.2      192.168.25.10 [Data]
0.000008 10.2.2.2      192.168.25.10 [Data]
0.000008 10.2.2.2      192.168.25.10 [Data]
0.000012 10.2.2.2      192.168.25.10 [Data]
0.000007 10.2.2.2      192.168.25.10 [Data]
0.000007 10.2.2.2      192.168.25.10 [Data]
0.000006 10.2.2.2      192.168.25.10 [Data]
0.000008 10.2.2.2      192.168.25.10 [Data]
0.000007 10.2.2.2      192.168.25.10 [Data]
0.000008 10.2.2.2      192.168.25.10 [Data]
0.000006 10.2.2.2      192.168.25.10 [Data]
0.000006 10.2.2.2      192.168.25.10 [Data]
0.000417 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=151408
0.000006 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=152868
0.000005 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=154328
0.000005 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=155788
0.000006 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=157248
0.000005 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=158708
0.000005 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=160168
0.000005 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=161628
0.000005 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=163088
0.000006 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=164548
0.000005 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=166008
0.000005 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=166311
0.346417 192.168.25.10 10.2.2.2      [RST] Seq=402 Win=8192 Len=0
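
To make the timing in a capture like this easier to compare across clients,
the summary rows can be parsed mechanically. The following is only a rough
Python sketch: the names parse_rows, long_gaps and gap_before_rst are
hypothetical helpers, the 0.3 second threshold is an arbitrary assumption,
and the sample text is a handful of rows copied from the capture above.

```python
# Sketch: find long inter-packet gaps and the gap preceding the RST.
# CAPTURE holds a few rows copied from the capture in this message.
CAPTURE = """\
Delta    Src           Dst           Summary
0.000006 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=149948
0.383781 10.2.2.2      192.168.25.10 [Data]
0.000005 192.168.25.10 10.2.2.2      [ACK] Seq=402 Ack=166311
0.346417 192.168.25.10 10.2.2.2      [RST] Seq=402 Win=8192 Len=0
"""


def parse_rows(text):
    """Return (delta_seconds, src, dst, summary) tuples, skipping the header."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        try:
            delta = float(parts[0])
        except ValueError:
            continue  # header row or anything else that is not a packet
        rows.append((delta, parts[1], parts[2], parts[3]))
    return rows


def long_gaps(rows, threshold=0.3):
    """Rows that arrived at least `threshold` seconds after the previous one."""
    return [row for row in rows if row[0] >= threshold]


def gap_before_rst(rows):
    """(delta, sender) of the first RST row, or None if there is no RST."""
    for delta, src, _dst, summary in rows:
        if summary.startswith("[RST]"):
            return delta, src
    return None


if __name__ == "__main__":
    rows = parse_rows(CAPTURE)
    print("long gaps:", long_gaps(rows))
    print("gap before RST:", gap_before_rst(rows))
```

On the sample rows this reports two long gaps: the 0.383781 second pause
before the client's next burst of data, which caused no problem, and the
0.346417 second pause before the RST sent by 192.168.25.10.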


It's like it's cleanly shutting down the process without any log message...
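
As a point of comparison for the idea that the RST comes from the server
side closing the socket, here is a minimal Python sketch of one way a
server process can produce an RST on purpose: an abortive close using
SO_LINGER with a zero timeout. This is an assumption-laden illustration,
not a claim about what httpd is doing here. The function names are
hypothetical, the loopback address and 16-byte payload are arbitrary, and
the struct layout of two ints for the linger option assumes a POSIX-style
platform.

```python
import socket
import struct
import threading


def abortive_server(listener):
    """Accept one connection, read a little, then close it abortively."""
    conn, _addr = listener.accept()
    conn.recv(16)
    # l_onoff=1, l_linger=0: close() discards pending data and sends RST
    # instead of the normal FIN handshake (POSIX struct linger assumed).
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    conn.close()


def client_sees_reset():
    """Return True if the client observes a connection reset."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    worker = threading.Thread(target=abortive_server, args=(listener,))
    worker.start()

    client = socket.create_connection(("127.0.0.1", port))
    client.settimeout(5)
    try:
        client.sendall(b"x" * 16)
        worker.join()  # the server side has closed by this point
        try:
            client.recv(1)
        except ConnectionResetError:
            return True
        return False
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print("client saw reset:", client_sees_reset())
```

Running a packet capture on the loopback interface while this executes
should show the same shape as the tail of the capture above: ACKs from the
server followed by a bare RST, with nothing logged by the server process.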

I've googled around but I can't find any other messages that appear to
have similar behavior. My next step is to try to put some debug log
messages into httpd and rebuild it, although I'm not even certain where
to start on that front.

Does anyone have any ideas for diagnosing where the problem really lies?
Any advice would be greatly appreciated at this point.

Thanks,

Derek


