You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@perl.apache.org by Niko Tyni <nt...@debian.org> on 2014/08/09 12:41:31 UTC

Re: Bug#697682: libapache2-mod-perl2: Intermittent FTBFS: t/apache/read3.t failure

tag 697682 = patch
thanks

On Thu, Aug 07, 2014 at 04:32:08PM +0300, Niko Tyni wrote:
> On Sat, Mar 30, 2013 at 03:26:55PM +0000, Dominic Hargreaves wrote:

> > This doesn't seem to be limited to s390; I've seen it in on i386 during
> > perl 5.16 rebuilds.
> 
> As a data point, t/apache/read3.t fails for me consistently on the
> kfreebsd-{i386,amd64} porter boxes (fischer and falla) but interestingly
> not on the kfreebsd buildds.
> 
> The test is getting an internal server error, with this in the
> error log:
> 
>   [Thu Aug 07 13:30:28.768239 2014] [perl:error] [pid 81995:tid 34593212160] [client 127.0.0.1:12832] Apache2::RequestIO::read: (70007) The timeout specified has expired at /home/ntyni/libapache2-mod-perl2-2.0.8+httpd24-r1449661/t/response/TestApache/read3.pm line 30

As this was fully reproducible on the Debian/kFreeBSD boxes, I spent
some time debugging it.

The test is about a client sending a 12000 byte POST request to the
server, which reads it in a loop 100 bytes at a time and checks that it
all came through.

Starting the server with
 t/TEST -httpd_conf $(pwd)/debian/apache2.conf -one-process -start-httpd

attaching ktrace to it
 ktrace -p $(pgrep apache2) -f ktrace.apache2.2000

and then running the client part under ktrace as well:
 ktrace -f ktrace.client.2000 -- perl -Iblib/lib -Iblib/arch t/apache/read3.t

gives nice traces to inspect with kdump.

The failure only happens when the POSTed string is large enough that
it doesn't fit in the same write() call with the HTTP headers. So
"foobar"x1300 doesn't exhibit the problem but "foobar"x2000 does.

There seems to be a deadlock here:

 client writes the HTTP headers of the POST request but not the data yet
   server reads HTTP headers
   server writes HTTP OK + part of the body
 client calls select(), gets 2 so there are two descriptors ready (read+write)
 client reads HTTP OK + part of the body
   server tries to read the POST data but gets EAGAIN (no data yet)
   server calls poll() and blocks until there's something to read
 client calls select() again and blocks

The client is using LWP::Protocol::http under the hood, and I see it
indeed stops writing to the server when it gets a full HTTP header (up
to two newlines.) This is around line 386 of lib/LWP/Protocol/http.pm
in libwww-perl_6.08-1. Also, I note that it sends the whole POST request
in one write() unless it's over eight kilobytes long (see line 276.)

I'm unsure if there's a fault with the client (it gets an FD that's
ready for writing from the first select() call but ignores it, calls
select() again and blocks because the server has already called poll()
on the same descriptor), but the deadlock can be fixed/worked around by
making the server not send a response before the full POST request has
been read. See the attached patch, which fixes this completely for me.

I can also reproduce the issue on amd64 by running the test in a loop and
putting some load on the host. The patch makes it go away there too.

Cc'ing the modperl dev list. Please consider applying the patch.
-- 
Niko Tyni   ntyni@debian.org

Re: Bug#697682: libapache2-mod-perl2: Intermittent FTBFS: t/apache/read3.t failure

Posted by Steve Hay <st...@googlemail.com>.
On 5 September 2014 20:06, Niko Tyni <nt...@debian.org> wrote:
> On Wed, Aug 13, 2014 at 11:47:58AM +0300, Niko Tyni wrote:
>> On Wed, Aug 13, 2014 at 09:12:11AM +0100, Steve Hay wrote:
>>
>> > Thanks for the patch. I haven't seen this failure myself on Windows,
>> > but the patch certainly doesn't seem to break anything. I'm a little
>> > uneasy about it, though. It seems odd to structure read3.pm
>> > differently to read2.pm and read4.pm, which still have a plan emitted
>> > before the first read().
>> >
>> > Why does read3.pm deadlock but read2.pm and read4.pm don't? If it's
>> > just luck then perhaps read2.pm and read4.pm should be changed
>> > similarly?
>>
>> It only happens with read3 because the POSTed string in that test is
>> very much larger than the other ones. LWP::Protocol::http sends the
>> whole POST request in one write() unless it's over eight kilobytes long
>> (see lib/LWP/Protocol/http.pm in libwww-perl_6.08 line 276.)
>
> I've updated the attached patch to change read2 too. read4 is quite
> different though, as it reads just a few bytes at a time, so I think
> it's OK to leave it as is.
>
>> > Maybe a comment in the code would be good too, since the structure of
>> > most test files is to get the plan out early. Otherwise there is scope
>> > for re-introducing the bug in the future since it isn't obvious that
>> > the placement of the plan is important.
>
> Done as well.
>

Thanks, applied in revision 1624218.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@perl.apache.org
For additional commands, e-mail: dev-help@perl.apache.org


Re: Bug#697682: libapache2-mod-perl2: Intermittent FTBFS: t/apache/read3.t failure

Posted by Niko Tyni <nt...@debian.org>.
On Wed, Aug 13, 2014 at 11:47:58AM +0300, Niko Tyni wrote:
> On Wed, Aug 13, 2014 at 09:12:11AM +0100, Steve Hay wrote:
> 
> > Thanks for the patch. I haven't seen this failure myself on Windows,
> > but the patch certainly doesn't seem to break anything. I'm a little
> > uneasy about it, though. It seems odd to structure read3.pm
> > differently to read2.pm and read4.pm, which still have a plan emitted
> > before the first read().
> > 
> > Why does read3.pm deadlock but read2.pm and read4.pm don't? If it's
> > just luck then perhaps read2.pm and read4.pm should be changed
> > similarly?
> 
> It only happens with read3 because the POSTed string in that test is
> very much larger than the other ones. LWP::Protocol::http sends the
> whole POST request in one write() unless it's over eight kilobytes long
> (see lib/LWP/Protocol/http.pm in libwww-perl_6.08 line 276.)

I've updated the attached patch to change read2 too. read4 is quite
different though, as it reads just a few bytes at a time, so I think
it's OK to leave it as is.

> > Maybe a comment in the code would be good too, since the structure of
> > most test files is to get the plan out early. Otherwise there is scope
> > for re-introducing the bug in the future since it isn't obvious that
> > the placement of the plan is important.

Done as well.

Thanks again,
-- 
Niko Tyni   ntyni@debian.org

Re: Bug#697682: libapache2-mod-perl2: Intermittent FTBFS: t/apache/read3.t failure

Posted by Niko Tyni <nt...@debian.org>.
Thanks for looking at the patches!

On Wed, Aug 13, 2014 at 09:12:11AM +0100, Steve Hay wrote:

> Thanks for the patch. I haven't seen this failure myself on Windows,
> but the patch certainly doesn't seem to break anything. I'm a little
> uneasy about it, though. It seems odd to structure read3.pm
> differently to read2.pm and read4.pm, which still have a plan emitted
> before the first read().
> 
> Why does read3.pm deadlock but read2.pm and read4.pm don't? If it's
> just luck then perhaps read2.pm and read4.pm should be changed
> similarly?

It only happens with read3 because the POSTed string in that test is
very much larger than the other ones. LWP::Protocol::http sends the
whole POST request in one write() unless it's over eight kilobytes long
(see lib/LWP/Protocol/http.pm in libwww-perl_6.08 line 276.)

>From a correctness POV, I suppose the others should indeed do the same
in the (IMO unlikely) case LWP::Protocol::http ever changes to do two
writes even for such short requests.

> Maybe a comment in the code would be good too, since the structure of
> most test files is to get the plan out early. Otherwise there is scope
> for re-introducing the bug in the future since it isn't obvious that
> the placement of the plan is important.

Yeah, I agree that would be good. 

I can send a revised patch later later.
-- 
Niko Tyni   ntyni@debian.org

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@perl.apache.org
For additional commands, e-mail: dev-help@perl.apache.org


Re: Bug#697682: libapache2-mod-perl2: Intermittent FTBFS: t/apache/read3.t failure

Posted by Steve Hay <st...@googlemail.com>.
On 9 August 2014 11:41, Niko Tyni <nt...@debian.org> wrote:
> tag 697682 = patch
> thanks
>
> On Thu, Aug 07, 2014 at 04:32:08PM +0300, Niko Tyni wrote:
>> On Sat, Mar 30, 2013 at 03:26:55PM +0000, Dominic Hargreaves wrote:
>
>> > This doesn't seem to be limited to s390; I've seen it in on i386 during
>> > perl 5.16 rebuilds.
>>
>> As a data point, t/apache/read3.t fails for me consistently on the
>> kfreebsd-{i386,amd64} porter boxes (fischer and falla) but interestingly
>> not on the kfreebsd buildds.
>>
>> The test is getting an internal server error, with this in the
>> error log:
>>
>>   [Thu Aug 07 13:30:28.768239 2014] [perl:error] [pid 81995:tid 34593212160] [client 127.0.0.1:12832] Apache2::RequestIO::read: (70007) The timeout specified has expired at /home/ntyni/libapache2-mod-perl2-2.0.8+httpd24-r1449661/t/response/TestApache/read3.pm line 30
>
> As this was fully reproducible on the Debian/kFreeBSD boxes, I spent
> some time debugging it.
>
[...]
>
> I'm unsure if there's a fault with the client (it gets an FD that's
> ready for writing from the first select() call but ignores it, calls
> select() again and blocks because the server has already called poll()
> on the same descriptor), but the deadlock can be fixed/worked around by
> making the server not send a response before the full POST request has
> been read. See the attached patch, which fixes this completely for me.
>
> I can also reproduce the issue on amd64 by running the test in a loop and
> putting some load on the host. The patch makes it go away there too.
>
> Cc'ing the modperl dev list. Please consider applying the patch.

Thanks for the patch. I haven't seen this failure myself on Windows,
but the patch certainly doesn't seem to break anything. I'm a little
uneasy about it, though. It seems odd to structure read3.pm
differently to read2.pm and read4.pm, which still have a plan emitted
before the first read().

Why does read3.pm deadlock but read2.pm and read4.pm don't? If it's
just luck then perhaps read2.pm and read4.pm should be changed
similarly?

Maybe a comment in the code would be good too, since the structure of
most test files is to get the plan out early. Otherwise there is scope
for re-introducing the bug in the future since it isn't obvious that
the placement of the plan is important.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@perl.apache.org
For additional commands, e-mail: dev-help@perl.apache.org