Posted to bugs@httpd.apache.org by bu...@apache.org on 2017/10/14 10:09:36 UTC

[Bug 61616] New: mod_proxy_connect: stall and connection loss on bi-directional traffic

https://bz.apache.org/bugzilla/show_bug.cgi?id=61616

            Bug ID: 61616
           Summary: mod_proxy_connect: stall and connection loss on
                    bi-directional traffic
           Product: Apache httpd-2
           Version: 2.4.25
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: mod_proxy_connect
          Assignee: bugs@httpd.apache.org
          Reporter: carsten@wolffcarsten.de
  Target Milestone: ---

Created attachment 35421
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35421&action=edit
default vhost with proxy_connect conf

When proxy_connect is used for traffic that is not strictly request-response in
nature, the send buffer on the backend socket can fill up. In that case, when
the writev() in libapr comes back with EAGAIN, it blocks in poll() on the
backend socket with a 300-second timeout. While it waits there, neither of the
two sockets can be read. Thus, depending on the application protocol tunneled
through CONNECT, the backend application fills up the receive buffer of
proxy_connect's backend socket and then stalls indefinitely trying to send. At
that point proxy_connect and the backend application are deadlocked.
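
For illustration, the pattern boils down to something like the following
(a minimal plain POSIX sketch, not the actual libapr code; the 300000 ms value
mirrors the strace further down in this report):

/* Sketch only: after writev() returns EAGAIN, the sender waits in poll() on
 * the destination socket alone. While it sits here for up to 300 seconds,
 * nothing reads the other socket, so the two peers can deadlock. */
#include <errno.h>
#include <poll.h>
#include <sys/uio.h>

static ssize_t send_blocking(int dst_fd, const struct iovec *vec, int nvec)
{
    ssize_t n = writev(dst_fd, vec, nvec);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        struct pollfd pfd = { .fd = dst_fd, .events = POLLOUT };
        if (poll(&pfd, 1, 300000) <= 0) {  /* the 300-second socket timeout */
            return -1;                     /* timeout: the tunnel gets torn down */
        }
        n = writev(dst_fd, vec, nvec);
    }
    return n;
}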

A concrete setup to reproduce it (using debian stretch):

rsync -------------------------> apache ---------------> rsync --daemon
        CONNECT localhost:873

I will attach configuration and a script to this report and can also provide a
qemu image, if desired.

In this setup the stall looks like this:

----------------------------- rsync ----------------------------------------

# RSYNC_PROXY=localhost:80 rsync -rcP rsync://localhost:873/testsrc/ /testdst/
[..]
5269
              0   0%    0.00kB/s    0:00:00  
[ .. 5 minute delay .. ]
rsync: connection unexpectedly closed (403374822 bytes received so far)
[receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(235)
[receiver=3.1.2]
rsync: connection unexpectedly closed (242925 bytes received so far)
[generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(235)
[generator=3.1.2]

----------------------------- netstat --------------------------------------

# netstat -pnt | grep -v :22
Active Internet connections (w/o servers)
Proto  Recv-Q   Send-Q  Local Address   Foreign Address  State        PID/Program name
tcp6        0        0  ::1:43642       ::1:80           ESTABLISHED  21638/rsync
tcp6   169814        0  ::1:80          ::1:43642        ESTABLISHED  20628/apache2
tcp6  5974823  2608576  ::1:43798       ::1:873          ESTABLISHED  20628/apache2
tcp6   809935  2760413  ::1:873         ::1:43798        ESTABLISHED  21639/rsync

----------------------------- strace --------------------------------------

[..]
read(10, "\24\372\342\1\16\255\223\375G\213k\251\215\360\357}\311\275\202\5jQ>\341>\370jr\357,\4"..., 8000) = 8000    <---- fd 10 is client socket
writev(11, [{iov_base= [..] ., iov_len=7424}], 16) = -1 EAGAIN (Resource temporarily unavailable)    <---- fd 11 is backend socket
poll([{fd=11, events=POLLOUT}], 1, 300000)
 <detached ...>

----------------------------- gdb -----------------------------------------

(gdb) bt
#0  0x00007f37a88c4690 in __poll_nocancel () at
../sysdeps/unix/syscall-template.S:84
#1  0x00007f37a8dc5088 in apr_poll () from
target:/usr/lib/x86_64-linux-gnu/libapr-1.so.0
#2  0x00005572f5f781cd in send_brigade_blocking (c=0x7f37a95ad9b0,
bytes_written=0x7f37a95ade68, bb=0x7f37a95a1068, s=0x7f37a95a6c28) at
core_filters.c:747
#3  ap_core_output_filter (f=0x7f37a95add88, new_bb=0x7f37a95a1068) at
core_filters.c:542
#4  0x00007f37a58fac77 in ap_proxy_transfer_between_connections
(r=r@entry=0x7f37a95a50a0, c_i=0x7f37a95ad290, c_o=0x7f37a95ad9b0,
bb_i=<optimized out>, bb_o=0x7f37a95a1068, name=0x7f37a56e39dd "client",
sent=0x0, 
    bsize=8192, after=1) at proxy_util.c:3836
#5  0x00007f37a56e31c3 in proxy_connect_handler (r=0x7f37a95a50a0,
worker=<optimized out>, conf=<optimized out>, url=<optimized out>,
proxyname=<optimized out>, proxyport=<optimized out>) at
mod_proxy_connect.c:421


--- Comment #49 from Yann Ylavic <yl...@gmail.com> ---
Thanks Mohamed for the nice analysis. Yes, the issue is that there may still be
data in the input filter chain (which we haven't read yet) when the output
filter chain becomes full and we abort the transfer (APR_INCOMPLETE).

I had already figured this out and was working on a new patch (got interrupted
by the $day job...). It does not work yet, but I will link it here once it
passes my testing.

The idea is to "drain" the input chain by not clearing POLLOUT (on the other
side) until done.
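
Purely as an illustration of that idea (the per-connection structure and field
names are assumed here; this is not the actual patch):

/* Illustrative sketch only: when a transfer stops because the output side is
 * full (APR_INCOMPLETE) or data is still buffered in its output filters,
 * keep POLLOUT armed on the destination and stop reading the source; the
 * next POLLOUT event re-enters the transfer and drains what is pending.
 * Only once everything is out do we switch back to POLLIN. */
if (APR_STATUS_IS_INCOMPLETE(rv) || out->c->data_in_output_filters) {
    out->pfd.reqevents |= APR_POLLOUT;
    in->pfd.reqevents &= ~APR_POLLIN;
}
else {
    out->pfd.reqevents &= ~APR_POLLOUT;
    in->pfd.reqevents |= APR_POLLIN;
}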


--- Comment #59 from Mohamed El Housseine <el...@gmail.com> ---
(In reply to Mohamed El Housseine from comment #58)

Sorry,
> For ensuring this, extending the if-condition for handling
> data_in_output_filters with "&& !APR_STATUS_IS_EAGAIN(rv)" must be enough.

I meant "&& !APR_STATUS_IS_INCOMPLETE(rv)"
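
For illustration, the proposed shape of the check (with 'rv' being the transfer
status; names assumed from this thread, the surrounding handler logic is not
shown and this is not the actual patch):

/* condition as corrected above: also require that the last transfer did not
 * end with APR_INCOMPLETE */
if (c_o->data_in_output_filters && !APR_STATUS_IS_INCOMPLETE(rv)) {
    /* ... handle the buffered output data ... */
}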


--- Comment #12 from Carsten Wolff <ca...@wolffcarsten.de> ---
Created attachment 35429
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35429&action=edit
trace6 error log of a single test

(In reply to Yann Ylavic from comment #11)
> Could you run the test with LogLevel trace6 and provide the error_log?

Here you are. It should be a clean log of just one test case.


--- Comment #11 from Yann Ylavic <yl...@gmail.com> ---
Could you run the test with LogLevel trace6 and provide the error_log?


--- Comment #9 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35427
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35427&action=edit
File read buffer and flush threshold tuning

Thanks for testing, Carsten.

I think I see what happens now: whenever send_brigade_blocking() is entered, it
won't return before all the data are flushed, regardless of the remaining
threshold.

This new patch is mostly unrelated (it was mainly written some time ago to
allow for better control of the outgoing traffic), but the changes in the core
output filtering could address this case (see should_block() and how it is
(re)used to bail out without blocking while we remain under the flush max
threshold).

I haven't committed this patch to trunk yet because it was made for 2.4.x, and
some parts differ quite a bit in trunk, which makes the forward-port
non-trivial.

Could you please test it and see if it helps for this case too?
It should be applied on top of attachment 35426 (v3) because proxy_connect
still needs to do the right thing regarding POLLOUT.


Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35424|0                           |1
        is obsolete|                            |

--- Comment #5 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35425
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35425&action=edit
Handle POLLOUT in proxy_connect (v2)

Small update to the patch to keep POLLOUT until all data are out.
Please test this one instead.


--- Comment #60 from Yann Ylavic <yl...@gmail.com> ---
Hi Mohamed,

(In reply to Mohamed El Housseine from comment #58)
> 
> In its second call (for handling remaining data in input filters) it can
> return:
> 
> APR_SUCCESS
> APR_STATUS_IS_INCOMPLETE
> ERROR
> 
> APR_STATUS_IS_EOF can't be returned, because we are handling remaining
> (existing) data.
The issue is that ap_proxy_transfer_between_connections() is a loop, and once
it has consumed the pending data it also consumes the socket (with all of its
possible error codes) until the other side is full.
So I think we need the same logic as in the POLLIN case...

Also, the httpd code in trunk now handles pending data very differently (no
c->data_in_{out,in}put_filters anymore, the whole filter chain is involved),
while our patches so far are 2.4.x (and even 2.4.29) specific.
For anything to be accepted/upstreamed in the httpd project, it must first go
into trunk; that's the next step.

When writing these patches I had that in mind, because I'd like to share as
much code as possible between mod_proxy's _connect, _wstunnel and even _http
(which I'd like to see handling websocket traffic too). So the proxy_connect
loop is (expectedly) meant to become a shared helper (along with common things
in struct proxy_connect_conn).

So the goal (for me) is a wider refactoring, which I wanted to make work in
2.4.x first (outside of the new trunk "concepts") so that I have both something
to compare against and possibly a backport plan (without the full monty ;)

Thanks anyway for your analyses and proposals, which help a lot!


Carsten Wolff <ca...@wolffcarsten.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |elh.mohamed@gmail.com


--- Comment #71 from Carsten Wolff <ca...@wolffcarsten.de> ---
(In reply to Yann Ylavic from comment #70)
> I tried to keep the best of our respective proposals (no buffering nor code
> duplication...); if it works, we should find an agreement, right? :)

Hi Yann,

I apologize for the long silence. We have this patch running together with
https://bz.apache.org/bugzilla/attachment.cgi?id=35620 for a while now and it
seems to work well. Occasional "broken pipe" messages from ap_pass_brigade are
probably caused by real connection losses. Thank you for your persistence in
this matter!


--- Comment #38 from Carsten Wolff <ca...@wolffcarsten.de> ---
(In reply to Carsten Wolff from comment #37)
> (In reply to Yann Ylavic from comment #36)
> > If this is the case, I'd suggest using attachment 35513 [details] only for
> > now, that's less performances (than expected?) but the same as the original
> > proxy_connect ones.
> 
> I did a quick test without attachment 35447 [details] and it seems you're
> right, the regression seems to be in that patch. I will do more thorough
> testing tomorrow.

Still, with just attachment 35513, the rsync transfer through proxy_connect
often stalls at the very end of the transfer. This disappeared when I also used
attachment 35447 and set FlushMinThreshold 1. That's the main reason why I used
both patches.

The end of the log of such a stall:

[Wed Dec 20 00:23:54.200810 2017] [ssl:trace6] [pid 24662:tid 140235413370624]
ssl_engine_io.c(219): [client 192.168.122.12:33064] bio_filter_out_write: 1317
bytes
[Wed Dec 20 00:23:54.200815 2017] [ssl:trace4] [pid 24662:tid 140235413370624]
ssl_engine_io.c(2220): [client 192.168.122.12:33064] OpenSSL: write 1317/1317
bytes to BIO#7f8b080021b0 [mem: 7f8b080025a3] (BIO dump follows)
[Wed Dec 20 00:23:54.200820 2017] [proxy:trace2] [pid 24662:tid
140235413370624] proxy_util.c(3845): [client 192.168.122.12:33064]
ap_proxy_transfer_between_connections complete
[Wed Dec 20 00:23:54.200835 2017] [core:trace6] [pid 24662:tid 140235413370624]
core_filters.c(857): [client 192.168.122.12:33064] writev_nonblocking:
1317/1317 bytes
[Wed Dec 20 00:28:54.301391 2017] [proxy_connect:error] [pid 24662:tid
140235413370624] (70007)The timeout specified has expired: [client
192.168.122.12:33064] AH01023: polling
[Wed Dec 20 00:28:54.301651 2017] [proxy_connect:trace2] [pid 24662:tid
140235413370624] mod_proxy_connect.c(562): [client 192.168.122.12:33064]
finished with poll() - cleaning up


Carsten Wolff <ca...@wolffcarsten.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35442|0                           |1
        is obsolete|                            |

--- Comment #18 from Carsten Wolff <ca...@wolffcarsten.de> ---
Created attachment 35443
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35443&action=edit
suppress poll with long timeout in filter code (v2)

Here is a cleanup of the patch; I hope it is easier to read.


--- Comment #10 from Carsten Wolff <ca...@wolffcarsten.de> ---
(In reply to Yann Ylavic from comment #9)
> Could you please test it and see if it helps for this case too?
> It should be applied in addition to attachment 35426 [details] (v3) because
> proxy_connect needs to do the right thing regarding POLLOUT still.

I have now tested these two patches together. The result is unchanged: strace
and the backtrace look the same, and send_brigade_blocking() calls apr_poll()
with a timeout of 300 seconds. After the timeout it comes back with APR_TIMEUP,
which results in the socket being closed.
So it seems that should_flush() sets flush_upto to a true value and thus has no
influence on the situation.


Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35512|0                           |1
        is obsolete|                            |

--- Comment #32 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35513
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35513&action=edit
Handle POLLOUT in proxy_connect (v8)

Well, same for the backend connection I guess, hence this v8.


--- Comment #62 from Yann Ylavic <yl...@gmail.com> ---
(In reply to Carsten Wolff from comment #61)
> 
> unfortunately not, it creates busy loops again, when I:
> - start 100 transfers "simultaneously"
> - quickly interrupt the rsync clients (SIGINT)

Hmm, I can't reproduce this; can you please provide the error log when it happens?


--- Comment #53 from Carsten Wolff <ca...@wolffcarsten.de> ---
BTW,

(In reply to Mohamed El Housseine from comment #48)
> #b: socket_bucket_read allocates a new bucket and _overwrites_ the requested
> len with APR_BUCKET_BUFF_SIZE, before passing it to apr_socket_recv.

Shouldn't this be considered a bug in APR? I would understand if the function
just clamped to min(len, APR_BUCKET_BUFF_SIZE) and might read less than
requested. But reading more than the caller requested, while returning only as
much as was requested and not telling the caller about that fact, is very
unintuitive, especially if one is used to the behavior of POSIX read().
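
To illustrate the distinction (a simplified sketch, not the actual
apr_buckets_socket.c code; 'buf' is assumed to hold APR_BUCKET_BUFF_SIZE
bytes):

#include "apr_buckets.h"
#include "apr_network_io.h"

/* Observed behaviour: the caller's length is replaced by the full buffer
 * size, so more than 'requested' bytes may be pulled off the socket and
 * kept hidden in the bucket. */
static apr_status_t observed_read(apr_socket_t *sock, char *buf,
                                  apr_size_t requested, apr_size_t *got)
{
    *got = APR_BUCKET_BUFF_SIZE;       /* 'requested' is effectively ignored */
    return apr_socket_recv(sock, buf, got);
}

/* The clamping behaviour described above: may return less than requested,
 * but never reads more than the caller asked for. */
static apr_status_t clamping_read(apr_socket_t *sock, char *buf,
                                  apr_size_t requested, apr_size_t *got)
{
    *got = requested < APR_BUCKET_BUFF_SIZE ? requested : APR_BUCKET_BUFF_SIZE;
    return apr_socket_recv(sock, buf, got);
}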


Carsten Wolff <ca...@dsa.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carsten.wolff@dsa.de

--- Comment #34 from Carsten Wolff <ca...@dsa.de> ---
Hi Yann,

I have hit another problem with these patches: they break mod_jk (ajp13).
The AJP code in mod_jk uses the ap_rwrite() function from server/protocol.c,
which now sometimes breaks its semantics and doesn't finish writing the whole
buffer passed to it. The only other user of ap_rwrite() I could find is
mod_lua, which I haven't tested. Any idea?
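
To illustrate the expectation that gets broken (a hypothetical caller pattern,
not the actual mod_jk code): ap_rwrite() has so far either written the whole
buffer or failed, so callers generally don't loop on a short count.

#include "httpd.h"
#include "http_protocol.h"

/* Hypothetical caller, similar in spirit to mod_jk's AJP body writer:
 * ap_rwrite() is expected to return 'len' (everything written) or -1. */
static int send_chunk(request_rec *r, const char *buf, int len)
{
    if (ap_rwrite(buf, len, r) != len) {
        return -1;   /* with the patches, a short count can now occur here */
    }
    return 0;
}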


--- Comment #63 from Yann Ylavic <yl...@gmail.com> ---
(In reply to Yann Ylavic from comment #62)
> (In reply to Carsten Wolff from comment #61)
> > 
> > unfortunately not, it creates busy loops again, when I:
> > - start 100 transfers "simultaneously"
> > - quickly interrupt the rsync clients (SIGINT)
> 
> Hmm, I can't reproduce, can you please provide the error log when this
> happens?
LogLevel trace8 ;)


--- Comment #25 from Yann Ylavic <yl...@gmail.com> ---
(In reply to Yann Ylavic from comment #24)
> Created attachment 35447 [details]
> Read buffer size and flush threshold tuning

Also note that send_brigade_blocking() is now a bit misnamed; it's rather
send_brigade_maybe_blocking(), since it first attempts a non-blocking write in
any case, and that may avoid the poll() if !should_block() afterward...
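
Roughly, the flow is the following (a condensed sketch using assumed names and
a flat buffer instead of a brigade; not the actual core_filters.c code):

#include "apr_network_io.h"
#include "apr_time.h"

/* Sketch of the "maybe blocking" idea: try a non-blocking send first, and
 * only block (up to the socket timeout) when what is left to flush exceeds
 * the threshold. */
static apr_status_t send_maybe_blocking(apr_socket_t *s, const char *buf,
                                        apr_size_t len,
                                        apr_size_t flush_max_threshold)
{
    apr_size_t sent = len;
    apr_status_t rv;

    apr_socket_timeout_set(s, 0);                  /* non-blocking attempt */
    rv = apr_socket_send(s, buf, &sent);
    if (rv != APR_SUCCESS && !APR_STATUS_IS_EAGAIN(rv)) {
        return rv;                                 /* real error */
    }
    if (len - sent <= flush_max_threshold) {
        return APR_SUCCESS;                        /* keep the rest buffered, don't poll */
    }
    apr_socket_timeout_set(s, apr_time_from_sec(300));  /* blocking retry */
    len -= sent;
    return apr_socket_send(s, buf + sent, &len);
}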


--- Comment #47 from Carsten Wolff <ca...@wolffcarsten.de> ---
(In reply to Yann Ylavic from comment #46)
> The pcap shows rsyncd sending data at the (timeout/premature) end of
> connection which is an error message (on the socket???).
> Something's fishy there...

Nothing fishy with that, AFAIK. rsync tries very hard to get error messages
through to the user (behind the client). From a TCP perspective there's nothing
wrong with sending data after receiving a FIN. The client may have "SHUT_WR"
the socket and would still be able to receive that message and present it to
the user. Of course the proxy has closed the socket at that point and thus
sends an RST. I think this part is actually fine.

(In reply to Yann Ylavic from comment #45)
> By running the rsync loop indefinitely, it happens once in a while that
> rsync and rsyncd stop communicating, but the proxy doesn't look at fault
> here, it poll()s appropriately for reads on both sides but none seems to
> want to send its remaining data.

I can reproduce this, and I see something in my capture which is hard to see in
yours, because you use all zeroes for the file contents (I use /dev/urandom).

First of all, if I try with a much longer timeout than 10 seconds, then, after
minutes, the transfer succeeds without error!

During the "stall" I see :
- in netstat: all involved socket buffers are empty
- in trace8 log: proxy_connect is polling for POLLIN on both sockets
Conclusion: All data sent by rsync server and client has been read by the proxy
at that point.

What I can see from my pcap:
- The last data bytes the rsync server sends _before_ the stall are in packet
6717.
- A tail chunk of these bytes is _not_ being sent to the client before the
stall.
- After minutes, the rsync client sends an empty message (\x00\x00\x00\x07 ==
len 0 MSG_DATA), which the proxy sends to the server, which causes the server
to respond with the same empty message (maybe some sort of keepalive in rsync),
which causes POLLIN on the proxy backend. Only after that, the proxy sends a
data segment (packet 8252) to the client containing the remainder of packet
6717 plus the new message.

Conclusion: Because all socket buffers are empty during the stall and
proxy_connect polls for POLLIN, the delayed data must be sitting in the Apache
process during that time. Maybe under some circumstances
c_o->data_in_output_filters being false is a lie (or simply not enough) and the
data has been set aside somewhere without proxy_connect noticing?

You can find the PCAP here:
https://drive.google.com/file/d/1Rf4fI1MBIVZ8AF0XI49xDiPrryHgKaZz/view?usp=sharing


Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35427|0                           |1
        is obsolete|                            |

--- Comment #24 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35447
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35447&action=edit
Read buffer size and flush threshold tuning

Allows configuring ReadBufferSize, FlushMinThreshold, FlushMaxThreshold and
FlushMaxPipelined (instead of hard-coded values).

For your proxy_connect case, where large files may be rsync-ed, something like
the following seems reasonable:

# Read more at once than the default 8K
ReadBufferSize 65536
# Don't block below this (i.e. more than ReadBufferSize)
FlushMaxThreshold 131072
# Always flush (non-blocking) when asked to write (i.e. don't retain data)
FlushMinThreshold 1


--- Comment #2 from carsten@wolffcarsten.de ---
Created attachment 35423
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35423&action=edit
script for reproducing the bug


Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35738|0                           |1
        is obsolete|                            |
  Attachment #35739|0                           |1
        is obsolete|                            |

--- Comment #56 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35783
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35783&action=edit
Handle POLLOUT in proxy_connect (v12)

This v12 implements the "drain" flag I mentioned earlier.

It's a less intrusive change than attachment 35776 (or attachment 35772) with
regard to the core filters (it is self-contained in the mod_proxy code).

It also keeps the ap_proxy_transfer_between_connections() logic in one place,
to avoid duplicating code (and error handling, which seems to be missing in
attachment 35776).

Probably more importantly, it passes my testing :)
Does it work for you?


Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35431|0                           |1
        is obsolete|                            |
  Attachment #35443|0                           |1
        is obsolete|                            |
  Attachment #35447|0                           |1
        is obsolete|                            |

--- Comment #39 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35620
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35620&action=edit
Read buffer size and flush threshold tuning (v2)

Thanks for the details.

I was able to reproduce it; the issue came from the socket timeout being set to
zero in ap_core_output_filter() and later used for apr_poll() in
send_brigade_blocking() (i.e. poll() was asked for immediate POLLOUT
availability and timed out otherwise, e.g. when not on localhost...).

Fixed in this v2 by passing the (original) timeout to send_brigade_blocking().


--- Comment #50 from Carsten Wolff <ca...@wolffcarsten.de> ---
(In reply to Yann Ylavic from comment #49)
> The idea is to "drain" the input chain by not clearing POLLOUT (on the other
> side) until done.

How would that work? After POLLOUT we only call ap_core_output_filter, which
does not access f->ctx->in_ctx, where the data hides. And we can't extract the
data by accessing ctx from mod_proxy_connect, because the ctx are private
structures. I don't see an elegant solution without modifying the core_filter.


Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35623|0                           |1
        is obsolete|                            |

--- Comment #45 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35738
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35738&action=edit
Handle POLLOUT in proxy_connect (v11)

An update to the patch to better/fully handle half-closed connections and to
improve logging (trace8).

Leaving on shutdown failure is not necessary anymore; I can't reproduce stalls
or busy loops with this patch and your rsync scenario.

Running the rsync loop indefinitely, it happens once in a while that rsync and
rsyncd stop communicating, but the proxy doesn't look at fault here: it poll()s
appropriately for reads on both sides, but neither seems to want to send its
remaining data.

I took a pcap and httpd log of the issue (will attach in the next message), and
the proxy can't do better here.


--- Comment #28 from Carsten Wolff <ca...@wolffcarsten.de> ---
Thanks Yann for your patience and all your work! I have some last observations
to share:

- When combining https://bz.apache.org/bugzilla/attachment.cgi?id=35427 and
https://bz.apache.org/bugzilla/attachment.cgi?id=35446, I still get frequent
stalls at the end of the transfer. Also, I observed higher CPU usage by Apache
compared to e.g. the apache-2.4.27 release. During the transfer, the Apache
process used ~50% of one core in my test setup. For reference, I also ran the
same test with squid3 as a CONNECT proxy and it used ~9% of one core.

- When combining https://bz.apache.org/bugzilla/attachment.cgi?id=35447 and
https://bz.apache.org/bugzilla/attachment.cgi?id=35446, I can no longer
reproduce any stall during or at the end of the transfer. Also, the CPU usage
by Apache is reduced to ~14%.

- When combining https://bz.apache.org/bugzilla/attachment.cgi?id=35447 and
https://bz.apache.org/bugzilla/attachment.cgi?id=35446 and applying the
settings you suggested in comment #24, everything still works and the CPU usage
drops to ~8%.

I will now have to run tests on my target platform, which is a bit more
complicated, so it will take a few days, but I suppose we can assume this bug
is fixed now. Thanks again!


Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35432|0                           |1
        is obsolete|                            |

--- Comment #20 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35446
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35446&action=edit
Handle POLLOUT in proxy_connect (v6)

This new patch gives more control (flags) to the caller of
ap_proxy_transfer_between_connections(), which can now stop its loop when the
output filter chain starts buffering data.

It also refactors proxy_connect_handler() to use a symmetric loop, because the
logic had to be duplicated on both sides (the more logic we add, the less
readable it becomes).

I did more testing with it and it seems to work as expected here.

I also have an update for attachment 35427 but this patch is actually barely
related (the above patch should work on its own for the proxy connect case).

I can provide it here too if you wish, because it allows transferring more than
8K bytes at a time (a current limitation in mod_proxy_connect).


--- Comment #23 from Yann Ylavic <yl...@gmail.com> ---
(In reply to Carsten Wolff from comment #21)
> 
> It gives me a hard time to reproduce the original issue. :-) But, it now
> blocks at the end of the transfer in mod_proxy_connect's poll() for 300
> seconds before finishing successfully. See backtrace and log below.

It looks like the socket is not really closed (before the timeout) on one end
or the other (proxy_connect won't close it on its own).

What's the output if you force "#define DEBUGGING" somewhere before
proxy_connect_handler() in mod_proxy_connect.c?


--- Comment #14 from Carsten Wolff <ca...@wolffcarsten.de> ---
Created attachment 35431
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35431&action=edit
shortening the poll timeout for send_brigade_blocking

I tried this patch on top of your patches, but unfortunately it just leads to
the connection loss happening more quickly:

[Tue Oct 17 09:08:16.947301 2017] [core:trace6] [pid 13571:tid 140023972112128]
core_filters.c(878): (11)Resource temporarily unavailable: [client ::1:44128]
writev_nonblocking: 62656/65536 bytes
[Tue Oct 17 09:08:16.947332 2017] [core:trace6] [pid 13571:tid 140023972112128]
core_filters.c(407): [client ::1:44128] core_output_filter: flushing because of
max threshold
[Tue Oct 17 09:08:16.947339 2017] [core:trace6] [pid 13571:tid 140023972112128]
core_filters.c(878): (11)Resource temporarily unavailable: [client ::1:44128]
writev_nonblocking: 0/68416 bytes
[Tue Oct 17 09:08:16.947344 2017] [core:trace6] [pid 13571:tid 140023972112128]
core_filters.c(407): [client ::1:44128] core_output_filter: flushing because of
max threshold
[Tue Oct 17 09:08:16.947350 2017] [core:trace1] [pid 13571:tid 140023972112128]
core_filters.c(560): (70007)The timeout specified has expired: [client
::1:44128] core_output_filter: writing data to the network
[Tue Oct 17 09:08:16.947356 2017] [proxy:error] [pid 13571:tid 140023972112128]
(70007)The timeout specified has expired: [client ::1:44128] AH03307:
ap_proxy_transfer_between_connections: error on backend - ap_pass_brigade
[Tue Oct 17 09:08:16.947362 2017] [proxy:trace2] [pid 13571:tid
140023972112128] proxy_util.c(3843): (70007)The timeout specified has expired:
[client ::1:44128] ap_proxy_transfer_between_connections complete
[Tue Oct 17 09:08:16.947368 2017] [proxy_connect:trace2] [pid 13571:tid
140023972112128] mod_proxy_connect.c(532): [client ::1:44128] finished with
poll() - cleaning up
[Tue Oct 17 09:08:16.947390 2017] [core:trace6] [pid 13571:tid 140023972112128]
core_filters.c(418): [client ::1:873] core_output_filter: flushing because of
FLUSH bucket


--- Comment #6 from Carsten Wolff <ca...@wolffcarsten.de> ---
First of all, thanks for the very quick turnaround!

Unfortunately, the patch doesn't fix the problem. I have actually played with
similar approaches to changing proxy_connect_handler() before, like doing a
poll() for POLLOUT on the backend upon every POLLIN event from the client
before actually calling ap_proxy_transfer_between_connections(), or like
preferring the "backend to client" direction when both sockets raise POLLIN
events.
But no change to the poll() logic in mod_proxy_connect.c worked, because that
is not where the poll() blocks. The whole situation can build up during a
single call of ap_proxy_transfer_between_connections(): when the rsync daemon
keeps sending long enough without reading while the rsync client is also
sending, Apache will block in apr/network_io/unix/sendrecv.c without a chance
to return to mod_proxy for another poll on the other socket.
Is there a chance to parallelize the two data directions through threading or
something?
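
For reference, a simplified sketch of the kind of loop in question (not the
real proxy_util.c code; the conn_rec pair and the brigade are assumed to be set
up by the caller):

#include "httpd.h"
#include "util_filter.h"

/* Simplified sketch: pump data from c_i to c_o until the input would block.
 * The read side is non-blocking, but ap_pass_brigade() ends up in the core
 * output filter, which can sit in apr_poll() for the full socket timeout
 * when the destination's send buffer is full; during that wait the opposite
 * direction is never serviced. */
static apr_status_t transfer_sketch(conn_rec *c_i, conn_rec *c_o,
                                    apr_bucket_brigade *bb)
{
    apr_status_t rv;

    do {
        rv = ap_get_brigade(c_i->input_filters, bb,
                            AP_MODE_READBYTES, APR_NONBLOCK_READ, 8192);
        if (rv != APR_SUCCESS) {
            break;                 /* EAGAIN or EOF: back to the poll() loop */
        }
        /* may block for up to Timeout (300s) inside the core output filter */
        rv = ap_pass_brigade(c_o->output_filters, bb);
        apr_brigade_cleanup(bb);
    } while (rv == APR_SUCCESS);

    return rv;
}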


--- Comment #3 from carsten@wolffcarsten.de ---
One more remark: I selected version 2.4.25 in the bug header because I used
that (Debian) version for the reproducer. Affected versions are actually all
versions, including the 2.2 branch, the 2.4 branch and trunk.


--- Comment #61 from Carsten Wolff <ca...@wolffcarsten.de> ---
Hi Yann,

(In reply to Yann Ylavic from comment #56)
> Created attachment 35783 [details]
> Handle POLLOUT in proxy_connect (v12)
> 
> Does it work for you?

unfortunately not, it creates busy loops again when I:
- start 100 transfers "simultaneously"
- quickly interrupt the rsync clients (SIGINT)

strace repeats:

epoll_wait(15, [{EPOLLOUT|EPOLLHUP, {u32=4292461128, u64=140737485849160}}], 2,
300000) = 1
epoll_ctl(15, EPOLL_CTL_ADD, 13, {EPOLLOUT, {u32=4292461032,
u64=140737485849064}}) = 0
epoll_ctl(15, EPOLL_CTL_DEL, 14, 0x7fffffd9c0b0) = 0
shutdown(14, SHUT_WR)                   = -1 ENOTCONN (Transport endpoint is
not connected)

Just like in comment #44 (patch v10). Maybe apr_socket_shutdown() != APR_SUCCESS
should be handled after all.
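
For illustration (the struct fields and the 'name' label are assumed; this is
not the actual patch), handling that failure might look like:

#include "httpd.h"
#include "http_log.h"
#include "apr_network_io.h"

/* Hypothetical sketch: treat a failed half-close (e.g. ENOTCONN because the
 * peer is already gone) as the end of that direction, instead of keeping the
 * socket armed for POLLOUT, which is what produces the busy loop above. */
static int shutdown_or_give_up(request_rec *r, apr_socket_t *sock,
                               const char *name)
{
    apr_status_t rv = apr_socket_shutdown(sock, APR_SHUTDOWN_WRITE);
    if (rv != APR_SUCCESS) {
        ap_log_rerror(APLOG_MARK, APLOG_TRACE2, rv, r,
                      "shutdown(%s) failed, stop polling this side", name);
        return 1;   /* caller removes the fd from the pollset and cleans up */
    }
    return 0;
}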


--- Comment #52 from Carsten Wolff <ca...@wolffcarsten.de> ---
Created attachment 35772
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35772&action=edit
try draining input buffers

This patch goes on top of
https://bz.apache.org/bugzilla/attachment.cgi?id=35738 and
https://bz.apache.org/bugzilla/attachment.cgi?id=35620

It's late and more testing is needed, but I wanted to let you know that I seem
to be having good success with this patch.


--- Comment #46 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35739
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35739&action=edit
rsync timeout: pcap and httpd log (with v11)

I used three different (Linux) boxes for rsync, the proxy and rsyncd; the
traces were taken on the proxy.

We can see in the log file that the proxy is poll()ing appropriately on both
sides before Timeout (set to 10).

The pcap shows rsyncd sending data at the (timeout/premature) end of the
connection, which is an error message (on the socket???).
Something's fishy there...


--- Comment #33 from Carsten Wolff <ca...@wolffcarsten.de> ---
Tests with v8 look good with SSL. Thanks! :-)


--- Comment #37 from Carsten Wolff <ca...@wolffcarsten.de> ---
(In reply to Yann Ylavic from comment #36)
> Sorry for the delay Carsten, just to clarify, you are running httpd with
> attachment 35513 [details] [Handle POLLOUT in proxy_connect (v8)] and
> attachment 35447 [details] [ Read buffer size and flush threshold tuning],
> no other patch right?

Almost. I added these two patches to packages from debian sid
(https://packages.debian.org/sid/apache2) and to the SuSE package from obs
(https://build.opensuse.org/package/show/Apache/apache2), so there are some
distribution patches in there, but nothing essential at this time.

> While attachment 35513 [details] alone should work for the proxy_connect
> case, performances are better with tunings from attachment 35447 [details],
> still right?

Yes, the single-transfer test runs for about the same time, but the CPU load is
lower with attachment 35447, which promises better performance with concurrency.

> If this is the case, I'd suggest using attachment 35513 [details] only for
> now, that's less performances (than expected?) but the same as the original
> proxy_connect ones.

I did a quick test without attachment 35447 and it seems you're right, the
regression seems to be in that patch. I will do more thorough testing tomorrow.

> I'm looking at what could cause regressions in attachment 35447 [details],
> but this patch is work in progress anyway and not really related to this PR
> I think.

Running the client (wget) on localhost, I cannot reproduce the error. Running
it from another machine, it's there 100% of the time. The logs of both cases
look exactly the same, except for the last two lines:

GOOD:
[Mon Dec 18 13:36:59.042282 2017] [core:trace6] [pid 3507:tid 140042357954304]
core_filters.c(449): [client 192.168.122.11:43236] core_output_filter: flushing
because of max threshold
[Mon Dec 18 13:36:59.042875 2017] [core:trace6] [pid 3507:tid 140042357954304]
core_filters.c(857): [client 192.168.122.11:43236] writev_nonblocking:
512276/512276 bytes
[Mon Dec 18 13:36:59.042929 2017] [core:trace6] [pid 3507:tid 140042357954304]
core_filters.c(458): [client 192.168.122.11:43236] core_output_filter: flushing
because of FLUSH bucket

BAD:
[Mon Dec 18 13:37:21.164185 2017] [core:trace6] [pid 3537:tid 140042248849152]
core_filters.c(449): [client 192.168.122.12:51428] core_output_filter: flushing
because of max threshold
[Mon Dec 18 13:37:21.164499 2017] [core:trace6] [pid 3537:tid 140042248849152]
core_filters.c(857): (11)Resource temporarily unavailable: [client
192.168.122.12:51428] writev_nonblocking: 86880/512276 bytes
[Mon Dec 18 13:37:21.164548 2017] [core:trace6] [pid 3537:tid 140042248849152]
core_filters.c(449): [client 192.168.122.12:51428] core_output_filter: flushing
because of max threshold
[Mon Dec 18 13:37:21.164566 2017] [core:trace1] [pid 3537:tid 140042248849152]
core_filters.c(571): (70007)The timeout specified has expired: [client
192.168.122.12:51428] core_output_filter: writing data to the network


--- Comment #66 from Mohamed El Housseine <el...@gmail.com> ---
(In reply to Yann Ylavic from comment #60)
Hi Yann again,
thank you very much for your detailed reply.

> The issue is that ap_proxy_transfer_between_connections() is a loop, and
> once it has consumed pending data it also consumes the socket (with all of
> its possible error codes) until the other side is full.
> So I think we need the same logic than in the POLLIN case...

I see; an improved patch will be uploaded that handles the remaining case:
APR_STATUS_IS_EOF.

> Also, the httpd code in trunk now handles pending data very differently (no
> c->data_in_{out,in}put_filters anymore, the whole filter chain is involved),
> while our patches until now are 2.4.x (and even 2.4.29) specific.
> For anything to be accepted/upstreamed in the httpd project, it must first
> go into trunk, that's the next step.

As long as this patch goes on top of proxy_connect (v11) and v11 is still not
merged into trunk, there is unfortunately no way for my patch to get into trunk
:-(


--- Comment #30 from Carsten Wolff <ca...@wolffcarsten.de> ---
(In reply to Carsten Wolff from comment #29)
> Hi, unfortunately, using patch#35446 and patch#35447, there is another
> problem, when I change the scenario by making the client connection through
> SSL. In that case, the transfer stalls very quickly within the proxy_connect
> poll loop

When I comment out the coalescing filter in ssl_io_filter_init(), it works
again. So it seems we need a way for mod_proxy_connect to tell mod_ssl not to
use that coalescing filter. Is there an accepted way of passing flags from
module to module, or would this need to go through the core code?


Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35513|0                           |1
        is obsolete|                            |

--- Comment #40 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35621
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35621&action=edit
Handle POLLOUT in proxy_connect (v9)

While at it, I also updated this patch for a better
ap_proxy_transfer_between_connections() interface regarding the flags
parameter.

Would you mind testing this one with your update rather than v8?
Thanks!


--- Comment #68 from Yann Ylavic <yl...@gmail.com> ---
You don't seem to be using attachment 35620 here (hence FlushMinThreshold).

If this is the case, you at least need to:
  #define THRESHOLD_MIN_WRITE 1
in "server/core_filters.c".

Otherwise, the core output filter is buffering up to 4K which is not suitable
for mod_proxy_connect.


--- Comment #26 from Carsten Wolff <ca...@wolffcarsten.de> ---
I added this just at the end of the poll() loop:
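        /* note: "reqevents & APR_POLLIN && 1" collapses the bitmask test to
         * 0/1 so it can be printed with %d */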
        ap_log_rerror(APLOG_MARK, APLOG_TRACE2, 0, r,
            "@END: "
            "[%s: POLLIN: %d POLLOUT: %d POLLHUP: %d data_in_filters: %d] "
            "[%s: POLLIN: %d POLLOUT: %d POLLHUP: %d data_in_filters: %d]",
            conns[0].name,
            conns[0].pfd.reqevents & APR_POLLIN && 1,
            conns[0].pfd.reqevents & APR_POLLOUT && 1,
            conns[0].pfd.reqevents & APR_POLLHUP && 1,
            conns[0].c->data_in_output_filters,
            conns[1].name,
            conns[1].pfd.reqevents & APR_POLLIN && 1,
            conns[1].pfd.reqevents & APR_POLLOUT && 1,
            conns[1].pfd.reqevents & APR_POLLHUP && 1,
            conns[1].c->data_in_output_filters
        );


The whole long log looks like this sample:

[Sun Oct 22 21:31:00.885835 2017] [proxy_connect:debug] [pid 26472:tid
140206004721408] mod_proxy_connect.c(452): [client ::1:33400] AH01024: woke
from poll(), i=1
[Sun Oct 22 21:31:00.885891 2017] [proxy_connect:debug] [pid 26472:tid
140206004721408] mod_proxy_connect.c(483): [client ::1:33400] AH01025: client
was readable
[Sun Oct 22 21:31:00.885946 2017] [proxy:trace2] [pid 26472:tid
140206004721408] proxy_util.c(3845): [client ::1:33400]
ap_proxy_transfer_between_connections complete
[Sun Oct 22 21:31:00.885960 2017] [proxy_connect:debug] [pid 26472:tid
140206004721408] mod_proxy_connect.c(517): [client ::1:33400] backend wait
writable
[Sun Oct 22 21:31:00.885978 2017] [proxy_connect:trace2] [pid 26472:tid
140206004721408] mod_proxy_connect.c(569): [client ::1:33400] @END: [client:
POLLIN: 0 POLLOUT: 0 POLLHUP: 0 data_in_filters: 0] [backend: POLLIN: 1
POLLOUT: 1 POLLHUP: 0 data_in_filters: 1]
[Sun Oct 22 21:31:00.885992 2017] [proxy_connect:debug] [pid 26472:tid
140206004721408] mod_proxy_connect.c(452): [client ::1:33400] AH01024: woke
from poll(), i=1
[Sun Oct 22 21:31:00.886002 2017] [proxy_connect:debug] [pid 26472:tid
140206004721408] mod_proxy_connect.c(528): [client ::1:33400] backend was
writable
[Sun Oct 22 21:31:00.886068 2017] [core:trace6] [pid 26472:tid 140206004721408]
core_filters.c(878): [remote ::1:873] writev_nonblocking: 8/8 bytes
[Sun Oct 22 21:31:00.886108 2017] [proxy_connect:trace2] [pid 26472:tid
140206004721408] mod_proxy_connect.c(569): [client ::1:33400] @END: [client:
POLLIN: 1 POLLOUT: 0 POLLHUP: 0 data_in_filters: 0] [backend: POLLIN: 1
POLLOUT: 0 POLLHUP: 0 data_in_filters: 0]
[Sun Oct 22 21:31:00.888170 2017] [proxy_connect:debug] [pid 26472:tid
140206004721408] mod_proxy_connect.c(452): [client ::1:33400] AH01024: woke
from poll(), i=1
[Sun Oct 22 21:31:00.888229 2017] [proxy_connect:debug] [pid 26472:tid
140206004721408] mod_proxy_connect.c(483): [client ::1:33400] AH01025: backend
was readable
[Sun Oct 22 21:31:00.888271 2017] [proxy:trace2] [pid 26472:tid
140206004721408] proxy_util.c(3845): [client ::1:33400]
ap_proxy_transfer_between_connections complete
[Sun Oct 22 21:31:00.888307 2017] [proxy_connect:debug] [pid 26472:tid
140206004721408] mod_proxy_connect.c(517): [client ::1:33400] client wait
writable
[Sun Oct 22 21:31:00.888329 2017] [proxy_connect:trace2] [pid 26472:tid
140206004721408] mod_proxy_connect.c(569): [client ::1:33400] @END: [client:
POLLIN: 1 POLLOUT: 1 POLLHUP: 0 data_in_filters: 1] [backend: POLLIN: 0
POLLOUT: 0 POLLHUP: 0 data_in_filters: 0]
[Sun Oct 22 21:31:00.888344 2017] [proxy_connect:debug] [pid 26472:tid
140206004721408] mod_proxy_connect.c(452): [client ::1:33400] AH01024: woke
from poll(), i=1
[Sun Oct 22 21:31:00.888354 2017] [proxy_connect:debug] [pid 26472:tid
140206004721408] mod_proxy_connect.c(528): [client ::1:33400] client was
writable
[Sun Oct 22 21:31:00.888450 2017] [core:trace6] [pid 26472:tid 140206004721408]
core_filters.c(878): [client ::1:33400] writev_nonblocking: 12/12 bytes
[Sun Oct 22 21:31:00.888492 2017] [proxy_connect:trace2] [pid 26472:tid
140206004721408] mod_proxy_connect.c(569): [client ::1:33400] @END: [client:
POLLIN: 1 POLLOUT: 0 POLLHUP: 0 data_in_filters: 0] [backend: POLLIN: 1
POLLOUT: 0 POLLHUP: 0 data_in_filters: 0]


Two things are peculiar:
- we no longer request APR_POLLHUP events. Is that intentional?
- It seems like when ap_proxy_transfer_between_connections returns,
"data_in_output_filters" is always true on the outgoing connection and we're
always entering the "write wait" path.


--- Comment #69 from Yann Ylavic <yl...@gmail.com> ---
(In reply to Yann Ylavic from comment #68)
> You don't seem to be using attachment 35620 [details] here (hence
> FlushMinThreshold).
This relates to comment 65, not Mohamed's patch.


--- Comment #57 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35784
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35784&action=edit
Diff from v11 to v12


--- Comment #27 from Yann Ylavic <yl...@gmail.com> ---
The logs from comment 21 and comment 26 look normal: no timeout triggered, and
proxy_connect_handler() seems to be waiting for sockets to be ready to do its
job.

(In reply to Carsten Wolff from comment #26)
> I added this just at the end of the poll() loop:
>         ap_log_rerror(APLOG_MARK, APLOG_TRACE2, 0, r,
>             "@END: "
>             "[%s: POLLIN: %d POLLOUT: %d POLLHUP: %d data_in_filters: %d] "
>             "[%s: POLLIN: %d POLLOUT: %d POLLHUP: %d data_in_filters: %d]",
>             conns[0].name,
>             conns[0].pfd.reqevents & APR_POLLIN && 1,
>             conns[0].pfd.reqevents & APR_POLLOUT && 1,
>             conns[0].pfd.reqevents & APR_POLLHUP && 1,
>             conns[0].c->data_in_output_filters,
>             conns[1].name,
>             conns[1].pfd.reqevents & APR_POLLIN && 1,
>             conns[1].pfd.reqevents & APR_POLLOUT && 1,
>             conns[1].pfd.reqevents & APR_POLLHUP && 1,
>             conns[1].c->data_in_output_filters
>         );
> 
> 
> The whole long log looks like this sample:

Events are fine here.

> 
> Two things are peculiar:
> - we no longer request APR_POLLHUP events. Is that intentional?

POLLHUP is a returned event only (it is usually ignored when requested). APR
might require it when the pollset is implemented with select() (I will look at
this), but that shouldn't be the case on Linux, hence not for your testing...


> - It seems like when ap_proxy_transfer_between_connections returns,
> "data_in_output_filters" is always true on the outgoing connection and we're
> always entering the "write wait" path.

Yes, ap_proxy_transfer_between_connections() used to forward everything it
reads until EAGAIN (non-blocking on the read side only); that's why the core
output filter ended up blocking when the outgoing socket couldn't keep up with
the incoming rate.
I changed this in attachment 35446 with a flag (AP_PROXY_TRANSFER_UNBUFFERED)
that tells ap_proxy_transfer_between_connections() to also stop when buffering
happens on the output side (i.e. data_in_output_filters), so that in this case
mod_proxy_connect can use POLLOUT before trying again.
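
For illustration, a minimal sketch of how a caller can use such a flag
(hypothetical variable names; the exact argument list is an assumption based
on the backtraces above, not the final patch):

    /* transfer from the readable side to the other side, but stop as
     * soon as the output filter chain starts buffering */
    rv = ap_proxy_transfer_between_connections(r, in->c, out->c,
                                               bb_in, bb_out, in->name,
                                               NULL, AP_IOBUFSIZE,
                                               AP_PROXY_TRANSFER_UNBUFFERED);
    if (out->c->data_in_output_filters) {
        /* output couldn't take everything: stop reading this side and
         * wait for the other socket to become writable */
        in->pfd.reqevents &= ~APR_POLLIN;
        out->pfd.reqevents |= APR_POLLOUT;
    }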


--- Comment #58 from Mohamed El Housseine <el...@gmail.com> ---
Hi Yann,
thank you for your feedback!

> It also keeps the ap_proxy_transfer_between_connections() logic in one
> place, to avoid duplicating code (and error handling, which seems to be
> missing in attachment 35776 [details]).

The error handling is not missing; I analyzed all possible returns of
ap_proxy_transfer_between_connections. It can return:
APR_SUCCESS
APR_STATUS_IS_INCOMPLETE
APR_STATUS_IS_EOF
ERROR

In its second call (for handling remaining data in the input filters) it can
return:

APR_SUCCESS
APR_STATUS_IS_INCOMPLETE
ERROR

APR_STATUS_IS_EOF can't be returned, because we are handling remaining
(already existing) data.

Also, if the second call returns something other than APR_SUCCESS or
APR_STATUS_IS_INCOMPLETE, it must be handled as an error.

To ensure this, extending the if-condition that handles
data_in_output_filters with "&& !APR_STATUS_IS_EAGAIN(rv)" should be enough.


Maybe I missed another case?
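
For reference, a minimal sketch of the status handling being discussed
(hypothetical names, not the actual patch):

    enum tunnel_state { TUNNEL_OK, TUNNEL_WAIT_WRITABLE, TUNNEL_DONE, TUNNEL_ERROR };

    static enum tunnel_state tunnel_check_rv(apr_status_t rv)
    {
        if (rv == APR_SUCCESS)
            return TUNNEL_OK;             /* everything read was also written */
        if (APR_STATUS_IS_INCOMPLETE(rv))
            return TUNNEL_WAIT_WRITABLE;  /* data left in the output filters */
        if (APR_STATUS_IS_EOF(rv))
            return TUNNEL_DONE;           /* the peer closed its side */
        return TUNNEL_ERROR;              /* anything else must be treated as an error */
    }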


--- Comment #43 from Carsten Wolff <ca...@dsa.de> ---
(In reply to Carsten Wolff from comment #42)
> I have a feeling I should repeat Tests 1 and 2 without SSL to see if SSL is
> necessary for all the stalls to be reproduced. I'll let you know.

So I did:

Test5 (only v10 applied, no SSL):
    - rsync through proxy_connect: OK

Test6 (only v10 applied, no SSL, netem):
    - rsync through proxy_connect: OK

Test7 (v10 and v2 applied, no SSL):
    - rsync through proxy_connect: OK

Test8 (v10 and v2 applied, no SSL, netem):
    - rsync through proxy_connect: OK

So it definitely seems to be necessary that the frontend connection uses SSL
for the stalls to be triggered.


Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35430|0                           |1
        is obsolete|                            |

--- Comment #15 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35432
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35432&action=edit
Handle POLLOUT in proxy_connect (v5)

Please use v5 instead; v4 can't work because it does not play well with EAGAIN.


Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35426|0                           |1
        is obsolete|                            |


Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35621|0                           |1
        is obsolete|                            |

--- Comment #41 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35623
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35623&action=edit
Handle POLLOUT in proxy_connect (v10)

Argh, new v10: v9 changed the semantics of
ap_proxy_transfer_between_connections() regarding EAGAIN, which it can't do
since it's public API.


--- Comment #8 from Carsten Wolff <ca...@wolffcarsten.de> ---
(In reply to Yann Ylavic from comment #7)
> We have to stop reading on the other side when writing blocks on one side.
> How does this new version work?

Not better, I'm afraid:

Thread 2 (Thread 0x7f534bfff700 (LWP 31402)):
#0  0x00007f5354d316ad in poll () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007f5355232088 in apr_poll () from
target:/usr/lib/x86_64-linux-gnu/libapr-1.so.0
#2  0x000055df6ab6c17d in send_brigade_blocking (c=0x7f53559b6a98,
bytes_written=0x7f53559b6fe0, bb=0x7f53559b3ee0, s=0x7f53559b3c48) at
core_filters.c:747
#3  ap_core_output_filter (f=0x7f53559b6e70, new_bb=0x7f53559b3ee0) at
core_filters.c:542
#4  0x00007f5351d60da7 in ap_proxy_transfer_between_connections
(r=r@entry=0x7f53559b20a0, c_i=0x7f53559b6338, c_o=0x7f53559b6a98,
bb_i=<optimized out>, bb_o=0x7f53559b3ee0, name=name@entry=0x7f5351b49cee
"client", 
    sent=0x0, bsize=8192, after=1) at proxy_util.c:3839
#5  0x00007f5351b495d9 in proxy_connect_handler (r=0x7f53559b20a0,
worker=<optimized out>, conf=<optimized out>, url=<optimized out>,
proxyname=<optimized out>, proxyport=<optimized out>) at
mod_proxy_connect.c:468


--- Comment #54 from Mohamed El Housseine <el...@gmail.com> ---
Created attachment 35774
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35774&action=edit
patch for fixing the problem of remaining data in input ctx

Hi Yann,
Hi Carsten,

I could fix that problem with the attached patch. There was no need for big
changes: after successfully sending all data in the output_filters, I check
whether there is remaining data in the input_filters and send that too.


--- Comment #17 from Carsten Wolff <ca...@wolffcarsten.de> ---
Created attachment 35442
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35442&action=edit
suppress poll with long timeout in filter code

Hi again,

this patch, ontop of https://bz.apache.org/bugzilla/attachment.cgi?id=35427 and
https://bz.apache.org/bugzilla/attachment.cgi?id=35432 fixes it for me (tm).

I'm not sure if this is an acceptable way of doing it for you, and I am sure
the patch needs cleanup, but I wanted to let you know immediately, since I have
no time for the rest of the day.


--- Comment #13 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35430
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35430&action=edit
Handle POLLOUT in proxy_connect (v4)

Thanks for the traces, I missed that proxy_connect was FLUSHing data.

How about this v4 (to replace attachment 35426 (v3), in addition to attachment
35427 still)?


--- Comment #29 from Carsten Wolff <ca...@wolffcarsten.de> ---
Hi, unfortunately, with patch#35446 and patch#35447 there is another problem
when I change the scenario so that the client connection goes through SSL. In
that case, the transfer stalls very quickly within the proxy_connect poll loop:


[Thu Nov 09 20:34:28.029653 2017] [proxy_connect:debug] [pid 19603:tid
139754421921536] mod_proxy_connect.c(246): [client 127.0.0.1:56996] AH01019:
connecting localhost:873 to localhost:873
[Thu Nov 09 20:34:28.029745 2017] [proxy_connect:trace1] [pid 19603:tid
139754421921536] mod_proxy_connect.c(272): [client 127.0.0.1:56996] connecting
to remote proxy localhost on port 873
[Thu Nov 09 20:34:28.029776 2017] [proxy:trace2] [pid 19603:tid
139754421921536] proxy_util.c(2080): [client 127.0.0.1:56996] CONNECT: fam 10
socket created to connect to localhost
[Thu Nov 09 20:34:28.029877 2017] [proxy_connect:trace3] [pid 19603:tid
139754421921536] mod_proxy_connect.c(355): [client 127.0.0.1:56996] connection
complete to [::1]:873 (localhost)
[Thu Nov 09 20:34:28.029905 2017] [proxy_connect:trace1] [pid 19603:tid
139754421921536] mod_proxy_connect.c(377): [client 127.0.0.1:56996] Returning
200 OK
[Thu Nov 09 20:34:28.029921 2017] [ssl:trace6] [pid 19603:tid 139754421921536]
ssl_engine_io.c(859): [client 127.0.0.1:56996] ssl_filter_write: 76 bytes
[Thu Nov 09 20:34:28.029946 2017] [ssl:trace6] [pid 19603:tid 139754421921536]
ssl_engine_io.c(219): [client 127.0.0.1:56996] bio_filter_out_write: 105 bytes
[Thu Nov 09 20:34:28.029967 2017] [ssl:trace4] [pid 19603:tid 139754421921536]
ssl_engine_io.c(2220): [client 127.0.0.1:56996] OpenSSL: write 105/105 bytes to
BIO#7f1b040021b0 [mem: 7f1b040025a3] (BIO dump follows)
[Thu Nov 09 20:34:28.029982 2017] [core:trace6] [pid 19603:tid 139754421921536]
core_filters.c(458): [client 127.0.0.1:56996] core_output_filter: flushing
because of FLUSH bucket
[Thu Nov 09 20:34:28.030012 2017] [core:trace6] [pid 19603:tid 139754421921536]
core_filters.c(857): [client 127.0.0.1:56996] writev_nonblocking: 105/105 bytes
[Thu Nov 09 20:34:28.030027 2017] [proxy_connect:trace2] [pid 19603:tid
139754421921536] mod_proxy_connect.c(399): [client 127.0.0.1:56996] setting up
poll()
[Thu Nov 09 20:34:28.032931 2017] [proxy_connect:debug] [pid 19603:tid
139754421921536] mod_proxy_connect.c(454): [client 127.0.0.1:56996] AH01024:
woke from poll(), i=1
[Thu Nov 09 20:34:28.032971 2017] [proxy_connect:debug] [pid 19603:tid
139754421921536] mod_proxy_connect.c(485): [client 127.0.0.1:56996] AH01025:
client was readable
[Thu Nov 09 20:34:28.033024 2017] [ssl:trace4] [pid 19603:tid 139754421921536]
ssl_engine_io.c(2220): [client 127.0.0.1:56996] OpenSSL: read 5/5 bytes from
BIO#7f1b04002270 [mem: 7f1b040025a3] (BIO dump follows)
[Thu Nov 09 20:34:28.033041 2017] [ssl:trace4] [pid 19603:tid 139754421921536]
ssl_engine_io.c(2220): [client 127.0.0.1:56996] OpenSSL: read 38/38 bytes from
BIO#7f1b04002270 [mem: 7f1b040025a8] (BIO dump follows)
[Thu Nov 09 20:34:28.033060 2017] [proxy:debug] [pid 19603:tid 139754421921536]
proxy_util.c(3803): [client 127.0.0.1:56996] AH03306:
ap_proxy_transfer_between_connections: read 14 bytes from client
[Thu Nov 09 20:34:28.033089 2017] [proxy:trace2] [pid 19603:tid
139754421921536] proxy_util.c(3847): [client 127.0.0.1:56996]
ap_proxy_transfer_between_connections complete
[Thu Nov 09 20:34:28.033098 2017] [proxy_connect:debug] [pid 19603:tid
139754421921536] mod_proxy_connect.c(519): [client 127.0.0.1:56996] backend
wait writable
[Thu Nov 09 20:34:28.033112 2017] [proxy_connect:debug] [pid 19603:tid
139754421921536] mod_proxy_connect.c(454): [client 127.0.0.1:56996] AH01024:
woke from poll(), i=1
[Thu Nov 09 20:34:28.033120 2017] [proxy_connect:debug] [pid 19603:tid
139754421921536] mod_proxy_connect.c(530): [client 127.0.0.1:56996] backend was
writable
[Thu Nov 09 20:34:28.033159 2017] [core:trace6] [pid 19603:tid 139754421921536]
core_filters.c(857): [remote ::1:873] writev_nonblocking: 14/14 bytes
[Thu Nov 09 20:34:28.033755 2017] [proxy_connect:debug] [pid 19603:tid
139754421921536] mod_proxy_connect.c(454): [client 127.0.0.1:56996] AH01024:
woke from poll(), i=1
[Thu Nov 09 20:34:28.033782 2017] [proxy_connect:debug] [pid 19603:tid
139754421921536] mod_proxy_connect.c(485): [client 127.0.0.1:56996] AH01025:
backend was readable
[Thu Nov 09 20:34:28.033817 2017] [proxy:debug] [pid 19603:tid 139754421921536]
proxy_util.c(3803): [client 127.0.0.1:56996] AH03306:
ap_proxy_transfer_between_connections: read 14 bytes from backend
[Thu Nov 09 20:34:28.033835 2017] [ssl:trace4] [pid 19603:tid 139754421921536]
ssl_engine_io.c(1674): [client 127.0.0.1:56996] coalesce: have 0 bytes, adding
14 more
[Thu Nov 09 20:34:28.033848 2017] [proxy:trace2] [pid 19603:tid
139754421921536] proxy_util.c(3847): [client 127.0.0.1:56996]
ap_proxy_transfer_between_connections complete
[Thu Nov 09 20:34:28.582512 2017] [proxy:debug] [pid 19638:tid 139754523858112]
proxy_util.c(1781): AH00925: initializing worker proxy:forward shared
[Thu Nov 09 20:34:28.582666 2017] [proxy:debug] [pid 19638:tid 139754523858112]
proxy_util.c(1823): AH00927: initializing worker proxy:forward local
[Thu Nov 09 20:34:28.582706 2017] [proxy:debug] [pid 19638:tid 139754523858112]
proxy_util.c(1858): AH00930: initialized pool in child 19638 for (*) min=0
max=25 smax=25
[Thu Nov 09 20:34:28.582732 2017] [proxy:debug] [pid 19638:tid 139754523858112]
proxy_util.c(1781): AH00925: initializing worker proxy:forward shared
[Thu Nov 09 20:34:28.582746 2017] [proxy:debug] [pid 19638:tid 139754523858112]
proxy_util.c(1823): AH00927: initializing worker proxy:forward local
[Thu Nov 09 20:34:28.582772 2017] [proxy:debug] [pid 19638:tid 139754523858112]
proxy_util.c(1858): AH00930: initialized pool in child 19638 for (*) min=0
max=25 smax=25
[Thu Nov 09 20:39:28.134117 2017] [proxy_connect:error] [pid 19603:tid
139754421921536] (70007)The timeout specified has expired: [client
127.0.0.1:56996] AH01023: polling
[Thu Nov 09 20:39:28.134322 2017] [proxy_connect:trace2] [pid 19603:tid
139754421921536] mod_proxy_connect.c(560): [client 127.0.0.1:56996] finished
with poll() - cleaning up
[Thu Nov 09 20:39:28.134870 2017] [ssl:trace4] [pid 19603:tid 139754421921536]
ssl_engine_io.c(1734): [client 127.0.0.1:56996] coalesce: passing on 14 bytes


Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35784|Diff from v11 to v12        |Incremental patch from v11
        description|                            |to v12


Mohamed El Housseine <el...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35774|0                           |1
        is obsolete|                            |

--- Comment #55 from Mohamed El Housseine <el...@gmail.com> ---
Created attachment 35776
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35776&action=edit
patch for fixing the problem of remaining data in input ctx v2

Hi Yann again,

I made some improvement to keep the changes as minimal as possible.


--- Comment #64 from Carsten Wolff <ca...@wolffcarsten.de> ---
(In reply to Yann Ylavic from comment #62)
> Hmm, I can't reproduce, can you please provide the error log when this
> happens?

Will do. In the meantime, another observation: because your patch v12 is
constrained to the module code and thus cannot determine whether there still is
data in the input filter, it just assumes, whenever it cannot write, that there
may still be data in the other side's input filter. This means you call
transfer_between_connections every time this happens, while our patch only does
that when there actually is data in the input filters. Since
transfer_between_connections not only drains data from its input filter but
also reads some more data, we thought you may potentially buffer a lot more
data in application memory and allow the sending endpoint to send a lot more,
even though you cannot be sure you will be able to write it out again. A simple
test with "watch free -m" seems to confirm that: during my test (100 transfers
à 10MiB), the application memory with your patch goes up to about 560MiB on my
machine, while with ours it just reaches about 350MiB.


--- Comment #72 from Tom Drake <dr...@gmail.com> ---
I believe that I am encountering this same issue. 

I had reported my issue on server fault here:
https://serverfault.com/questions/949259/apache-buffers-tomcat-responses-and-closes-connection-to-tomcat-when-request-si

The way I have prevented the connection failure from happening is to eliminate
the bi-directional communication; my proxied web-service consumes all of the
input and processes all of it before writing content back out. While this
prevents the error from occurring when there is too much data, it adds an
undesirable latency to the web-service.

I'm interested in finding out if and when the current patch will be released.


--- Comment #74 from Tom Drake <dr...@gmail.com> ---
Carsten, thanks for your reply.
Do you have any suggestions for me as to how to get around the issue I'm
experiencing?

Thanks.


--- Comment #21 from Carsten Wolff <ca...@wolffcarsten.de> ---
(In reply to Yann Ylavic from comment #19)
> Thanks for the patch Carsten, but there can't be an exception for
> send_brigade_blocking() to block when the conditions are met, the core
> output filter must control the memory buckets in flight (set aside).

Sure, using the timeout value as a criterion in send_brigade_blocking is ugly.
I see there's an APR_NONBLOCK_READ flag one can pass to ap_get_brigade(). Maybe
there should be an APR_NONBLOCK_WRITE flag one could pass to ap_pass_brigade()
(and further down to send_brigade_blocking) to make absolutely sure the poll()
there is never hit?

(In reply to Yann Ylavic from comment #20)
> Created attachment 35446 [details]
> Handle POLLOUT in proxy_connect (v6)
> 
> This new patch gives more control (flags) to the caller of
> ap_proxy_transfer_between_connections(), which can now stop its loop when
> the output filter chain starts buffering data.
>
> It also refactors proxy_connect_handler() to have a symetric loop because
> the logic had to be duplicated on both sides (the more we add logic, the
> less it becomes readable).

Much nicer indeed; you also choose the right socket on POLLOUT now, which was
wrong in v5.

> I made more testing with it and it seems to work as expected here..

It's now hard for me to reproduce the original issue. :-) But it now blocks at
the end of the transfer in mod_proxy_connect's poll() for 300 seconds before
finishing successfully. See backtrace and log below.

> I also have an update for attachment 35427 [details] but this patch is
> actually barely related (the above patch should work on its own for the
> proxy connect case).
> 
> I can provide it here too if you wish because it allows to transfer more
> than 8K bytes at a time (a current limitation in mod_proxy_connect).

This might be interesting for performance of connections on loopback, where the
segment size is 64k.

Current backtrace:
Thread 2 (Thread 0x7ffa5f0b9700 (LWP 5797)):
#0  0x00007ffa63d180f3 in epoll_wait () at
../sysdeps/unix/syscall-template.S:84
#1  0x00007ffa6420e201 in impl_pollset_poll (pollset=0x7ffa64990f78,
timeout=<optimized out>, num=0x7ffa5f0b6a94, descriptors=0x7ffa5f0b6aa0) at
./poll/unix/epoll.c:266
#2  0x00007ffa60b263b7 in proxy_connect_handler (r=0x7ffa6498f0a0,
worker=<optimized out>, conf=<optimized out>, url=<optimized out>,
proxyname=<optimized out>, 
    proxyport=<optimized out>) at mod_proxy_connect.c:442


Log tail of the end of the transfer:
[Sat Oct 21 16:28:29.475904 2017] [proxy:trace2] [pid 5795:tid 140713313146624]
proxy_util.c(3845): [client ::1:33340] ap_proxy_transfer_between_connections
complete
[Sat Oct 21 16:28:29.476140 2017] [core:trace6] [pid 5795:tid 140713313146624]
core_filters.c(878): [client ::1:33340] writev_nonblocking: 8192/8192 bytes
[Sat Oct 21 16:28:29.476241 2017] [proxy:trace2] [pid 5795:tid 140713313146624]
proxy_util.c(3845): [client ::1:33340] ap_proxy_transfer_between_connections
complete
[Sat Oct 21 16:28:29.476510 2017] [core:trace6] [pid 5795:tid 140713313146624]
core_filters.c(878): [client ::1:33340] writev_nonblocking: 8192/8192 bytes
[Sat Oct 21 16:28:29.476685 2017] [proxy:trace2] [pid 5795:tid 140713313146624]
proxy_util.c(3845): [client ::1:33340] ap_proxy_transfer_between_connections
complete
[Sat Oct 21 16:28:29.476985 2017] [core:trace6] [pid 5795:tid 140713313146624]
core_filters.c(878): [client ::1:33340] writev_nonblocking: 8192/8192 bytes
[Sat Oct 21 16:33:46.176810 2017] [proxy:trace2] [pid 5795:tid 140713313146624]
proxy_util.c(3845): [client ::1:33340] ap_proxy_transfer_between_connections
complete
[Sat Oct 21 16:33:46.178932 2017] [proxy:trace2] [pid 5795:tid 140713313146624]
proxy_util.c(3845): [client ::1:33340] ap_proxy_transfer_between_connections
complete
[Sat Oct 21 16:33:46.179110 2017] [core:trace6] [pid 5795:tid 140713313146624]
core_filters.c(878): [remote ::1:873] writev_nonblocking: 4/4 bytes
[Sat Oct 21 16:33:46.186064 2017] [core:trace6] [pid 5795:tid 140713313146624]
core_filters.c(878): [client ::1:33340] writev_nonblocking: 4272/4272 bytes
[Sat Oct 21 16:33:46.199852 2017] [proxy:trace2] [pid 5795:tid 140713313146624]
proxy_util.c(3845): [client ::1:33340] ap_proxy_transfer_between_connections
complete
[Sat Oct 21 16:33:46.200032 2017] [core:trace6] [pid 5795:tid 140713313146624]
core_filters.c(878): [remote ::1:873] writev_nonblocking: 5/5 bytes
[Sat Oct 21 16:33:46.200317 2017] [proxy:trace2] [pid 5795:tid 140713313146624]
proxy_util.c(3845): [client ::1:33340] ap_proxy_transfer_between_connections
complete
[Sat Oct 21 16:33:46.200365 2017] [core:trace6] [pid 5795:tid 140713313146624]
core_filters.c(878): [client ::1:33340] writev_nonblocking: 5/5 bytes
[Sat Oct 21 16:33:46.200793 2017] [proxy:trace2] [pid 5795:tid 140713313146624]
proxy_util.c(3845): [client ::1:33340] ap_proxy_transfer_between_connections
complete
[Sat Oct 21 16:33:46.200898 2017] [core:trace6] [pid 5795:tid 140713313146624]
core_filters.c(878): [remote ::1:873] writev_nonblocking: 7/7 bytes
[Sat Oct 21 16:33:46.200920 2017] [proxy:trace2] [pid 5795:tid 140713313146624]
proxy_util.c(3845): [client ::1:33340] ap_proxy_transfer_between_connections
complete
[Sat Oct 21 16:33:46.200952 2017] [core:trace6] [pid 5795:tid 140713313146624]
core_filters.c(878): [client ::1:33340] writev_nonblocking: 6/6 bytes
[Sat Oct 21 16:33:46.242005 2017] [proxy:trace2] [pid 5795:tid 140713313146624]
proxy_util.c(3845): [client ::1:33340] ap_proxy_transfer_between_connections
complete
[Sat Oct 21 16:33:46.242141 2017] [core:trace6] [pid 5795:tid 140713313146624]
core_filters.c(878): [client ::1:33340] writev_nonblocking: 25/25 bytes
[Sat Oct 21 16:33:46.242524 2017] [proxy:trace2] [pid 5795:tid 140713313146624]
proxy_util.c(3845): [client ::1:33340] ap_proxy_transfer_between_connections
complete
[Sat Oct 21 16:33:46.244044 2017] [core:trace6] [pid 5795:tid 140713313146624]
core_filters.c(878): [remote ::1:873] writev_nonblocking: 5/5 bytes
[Sat Oct 21 16:33:46.256132 2017] [proxy:trace2] [pid 5795:tid 140713313146624]
proxy_util.c(3845): [client ::1:33340] ap_proxy_transfer_between_connections
complete
[Sat Oct 21 16:33:46.256195 2017] [proxy:trace2] [pid 5795:tid 140713313146624]
proxy_util.c(3845): (70014)End of file found: [client ::1:33340]
ap_proxy_transfer_between_connections complete
[Sat Oct 21 16:33:46.256236 2017] [proxy:trace2] [pid 5795:tid 140713313146624]
proxy_util.c(3845): [client ::1:33340] ap_proxy_transfer_between_connections
complete
[Sat Oct 21 16:33:46.256245 2017] [proxy:trace2] [pid 5795:tid 140713313146624]
proxy_util.c(3845): (70014)End of file found: [client ::1:33340]
ap_proxy_transfer_between_connections complete
[Sat Oct 21 16:33:46.256368 2017] [proxy_connect:trace2] [pid 5795:tid
140713313146624] mod_proxy_connect.c(558): [client ::1:33340] finished with
poll() - cleaning up


--- Comment #42 from Carsten Wolff <ca...@wolffcarsten.de> ---
Created attachment 35639
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35639&action=edit
pcaps of a stall in rsync handshake (Test4)

Hello Yann, I wish you a happy new year!

I did some more testing on different combinations of these patches. Here are
the results (I will call patch 35620 "v2" and patch 35623 "v10"):

Test1 (only v10 applied):
    - wget for static contents: OK
    - rsync through SSL+proxy_connect: sometimes still stalls at the very end
of the transfer (just like in comment#38)

Test2 (v10 and v2 applied):
    - wget for static contents: OK
    - rsync through SSL+proxy_connect: often stalls at the beginning of the
transfer (during the rsync protocol handshake)

Test3 (v10 and v2 applied, configuration from comment#24 added):
    - wget for static contents: OK
    - rsync through SSL+proxy_connect: OK (can't reproduce in the scenario I
used so far ... but ...)

Because I was under the impression that timing plays a role in that code, I
decided to repeat Test3 under different network conditions and used netem(1) to
change the characteristics of the virtual network:

Test4 (v10 and v2 applied, configuration from comment#24 added, netem):
    - wget for static contents: OK
    - rsync through SSL+proxy_connect: often stalls(2)(3) at the beginning of
the transfer (during the rsync protocol handshake)

So I think it's safe to say that the bug in attachment 35447 has been fixed by
"v2". But, unfortunately, stalls in SSL+proxy_connect can still be provoked
under certain conditions. Next, I tried without SSL:

Test5 (v10 and v2 applied, configuration from comment#24 added, netem, no SSL): 
    - rsync through proxy_connect: OK (cannot seem to reproduce the stalls)

I have a feeling I should repeat Tests 1 and 2 without SSL to see if SSL is
necessary for all the stalls to be reproduced. I'll let you know.

(1)
tc qdisc add dev vnet0 root handle 1: netem delay 100ms 50ms 25%
tc qdisc add dev vnet0 parent 1:1 pfifo limit 1000

(2) log of stall in Test4
[Thu Dec 28 09:06:01.752182 2017] [proxy_connect:debug] [pid 9674:tid
139811025307392] mod_proxy_connect.c(503): [client 192.168.122.12:33950] client
wait writable
[Thu Dec 28 09:06:01.752210 2017] [proxy_connect:debug] [pid 9674:tid
139811025307392] mod_proxy_connect.c(456): [client 192.168.122.12:33950]
AH01024: woke from poll(), i=1
[Thu Dec 28 09:06:01.752227 2017] [proxy_connect:debug] [pid 9674:tid
139811025307392] mod_proxy_connect.c(531): [client 192.168.122.12:33950] client
was writable
[Thu Dec 28 09:06:01.752316 2017] [core:trace6] [pid 9674:tid 139811025307392]
core_filters.c(740): [client 192.168.122.12:33950] writev_nonblocking: 69/69
bytes
[Thu Dec 28 09:06:01.894674 2017] [proxy_connect:debug] [pid 9674:tid
139811025307392] mod_proxy_connect.c(456): [client 192.168.122.12:33950]
AH01024: woke from poll(), i=1
[Thu Dec 28 09:06:01.894840 2017] [proxy_connect:debug] [pid 9674:tid
139811025307392] mod_proxy_connect.c(487): [client 192.168.122.12:33950]
AH01025: client was readable
[Thu Dec 28 09:06:01.895319 2017] [ssl:trace4] [pid 9674:tid 139811025307392]
ssl_engine_io.c(2220): [client 192.168.122.12:33950] OpenSSL: read 5/5 bytes
from BIO#7f282c002270 [mem: 7f282c002bc3] (BIO dump follows)
[Thu Dec 28 09:06:01.895393 2017] [ssl:trace4] [pid 9674:tid 139811025307392]
ssl_engine_io.c(2220): [client 192.168.122.12:33950] OpenSSL: read 64/64 bytes
from BIO#7f282c002270 [mem: 7f282c002bc8] (BIO dump follows)
[Thu Dec 28 09:06:01.895495 2017] [proxy:trace2] [pid 9674:tid 139811025307392]
proxy_util.c(3845): [client 192.168.122.12:33950]
ap_proxy_transfer_between_connections: output filters full
[Thu Dec 28 09:06:01.895531 2017] [proxy:trace2] [pid 9674:tid 139811025307392]
proxy_util.c(3866): (70008)Partial results are valid but processing is
incomplete: [client 192.168.122.12:33950] ap_proxy_transfer_between_connections
finished
[Thu Dec 28 09:06:01.895557 2017] [proxy_connect:debug] [pid 9674:tid
139811025307392] mod_proxy_connect.c(503): [client 192.168.122.12:33950]
backend wait writable
[Thu Dec 28 09:06:01.895737 2017] [proxy_connect:debug] [pid 9674:tid
139811025307392] mod_proxy_connect.c(456): [client 192.168.122.12:33950]
AH01024: woke from poll(), i=1
[Thu Dec 28 09:06:01.895785 2017] [proxy_connect:debug] [pid 9674:tid
139811025307392] mod_proxy_connect.c(531): [client 192.168.122.12:33950]
backend was writable
[Thu Dec 28 09:06:01.895915 2017] [core:trace6] [pid 9674:tid 139811025307392]
core_filters.c(740): [remote ::1:873] writev_nonblocking: 9/9 bytes

(3) bt of stall in Test4
Thread 3 (Thread 0x7f284bfff700 (LWP 9827)):
#0  0x00007f2855f3f0f3 in epoll_wait () at
../sysdeps/unix/syscall-template.S:84
#1  0x00007f2856435201 in impl_pollset_poll (pollset=0x7f2856b1e028,
timeout=<optimized out>, num=0x7f284bffca34, descriptors=0x7f284bffca40) at
./poll/unix/epoll.c:266
#2  0x00007f2852d4c4b5 in proxy_connect_handler (r=0x7f2856b1c0a0,
worker=<optimized out>, conf=<optimized out>, url=<optimized out>,
proxyname=<optimized out>, proxyport=<optimized out>) at
mod_proxy_connect.c:446
#3  0x00007f2852f5b3fc in proxy_run_scheme_handler (r=r@entry=0x7f2856b1c0a0,
worker=0x7f2856c638d8, conf=conf@entry=0x7f2856c29320, url=0x7f2856b1d706
"localhost:873", proxyhost=proxyhost@entry=0x0, 
    proxyport=proxyport@entry=0) at mod_proxy.c:2888
#4  0x00007f2852f5c371 in proxy_handler (r=0x7f2856b1c0a0) at mod_proxy.c:1230


--- Comment #35 from Carsten Wolff <ca...@wolffcarsten.de> ---
Hum. It seems to be even worse. Now that proxy_connect works for my case with
these patches, everything else seems to be broken; even the normal case of a GET
for static files produces incomplete files.
To test this, I just put a handful of 500k files into the document root and
linked them from index.html. When I now do "wget -r", it detects partial
content when the server closes the connection:

--2017-12-14 23:37:48--  http://192.168.122.11/1.html
Connecting to 192.168.122.11:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 512000 (500K) [text/html]
Saving to: ‘192.168.122.11/1.html’

192.168.122.11/1.html                                  
16%[===================>                                                       
                                               ]  84,57K  --.-KB/s    in 0s     

2017-12-14 23:37:48 (316 MB/s) - Connection closed at byte 86604. Retrying.

--2017-12-14 23:37:49--  (try: 2)  http://192.168.122.11/1.html
Connecting to 192.168.122.11:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 512000 (500K), 425396 (415K) remaining [text/html]
Saving to: ‘192.168.122.11/1.html’

192.168.122.11/1.html                                  
33%[++++++++++++++++++++====================>                                  
                                               ] 169,09K  --.-KB/s    in 0,001s 

2017-12-14 23:37:49 (99,2 MB/s) - Connection closed at byte 173153. Retrying.


Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35446|0                           |1
        is obsolete|                            |

--- Comment #31 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35512
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35512&action=edit
Handle POLLOUT in proxy_connect (v7)

Good catch, it makes no sense to try to coalesce TCP packets!
This v7 simply removes the mod_ssl coalescing filter from the chain before
entering mod_proxy_connect's loop.

The change between v6 and v7 looks like this:

+     /* we are now acting as a tunnel - the input/output filter stacks should
+-     * not contain any non-connection filters.
++     * not contain any non-connection or coalescing filters.
+      */
++    ap_remove_output_filter_byhandle(c->output_filters,
++                                     "SSL/TLS Coalescing Filter");
+     r->output_filters = c->output_filters;

We'll probably want to use a helper from mod_ssl for this in the final patch,
and avoid the magic name, but for now it should work...


--- Comment #19 from Yann Ylavic <yl...@gmail.com> ---
Thanks for the patch Carsten, but there can't be an exception to
send_brigade_blocking() blocking when the conditions are met; the core output
filter must control the memory buckets in flight (set aside).

Please see new proposal below.


--- Comment #1 from carsten@wolffcarsten.de ---
Created attachment 35422
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35422&action=edit
rsync daemon config


--- Comment #16 from Carsten Wolff <ca...@wolffcarsten.de> ---
(In reply to Yann Ylavic from comment #15)
> Created attachment 35432 [details]
> Handle POLLOUT in proxy_connect (v5)
> 
> Please use v5 instead, v4 can't work before it does not play well with
> EAGAIN.

With v5 + "File read buffer and flush threshold tuning", proxy_connect gets
stuck very early on:



[Tue Oct 17 12:55:53.597014 2017] [core:trace5] [pid 9813:tid 140438873556736]
protocol.c(645): [client ::1:44136] Request received from client: CONNECT
localhost:873 HTTP/1.0
[Tue Oct 17 12:55:53.597118 2017] [http:trace4] [pid 9813:tid 140438873556736]
http_request.c(433): [client ::1:44136] Headers received from client:
[Tue Oct 17 12:55:53.597268 2017] [authz_core:debug] [pid 9813:tid
140438873556736] mod_authz_core.c(809): [client ::1:44136] AH01626:
authorization result of Require host localhost: granted
[Tue Oct 17 12:55:53.597278 2017] [authz_core:debug] [pid 9813:tid
140438873556736] mod_authz_core.c(809): [client ::1:44136] AH01626:
authorization result of <RequireAny>: granted
[Tue Oct 17 12:55:53.597282 2017] [core:trace3] [pid 9813:tid 140438873556736]
request.c(304): [client ::1:44136] request authorized without authentication by
access_checker_ex hook: localhost:873
[Tue Oct 17 12:55:53.597291 2017] [proxy_connect:trace1] [pid 9813:tid
140438873556736] mod_proxy_connect.c(141): [client ::1:44136] canonicalising
URL localhost:873
[Tue Oct 17 12:55:53.597312 2017] [proxy:trace2] [pid 9813:tid 140438873556736]
proxy_util.c(1970): [client ::1:44136] *: found forward proxy worker for
localhost:873
[Tue Oct 17 12:55:53.597318 2017] [proxy:debug] [pid 9813:tid 140438873556736]
mod_proxy.c(1227): [client ::1:44136] AH01143: Running scheme localhost handler
(attempt 0)
[Tue Oct 17 12:55:53.597322 2017] [proxy_connect:trace1] [pid 9813:tid
140438873556736] mod_proxy_connect.c(190): [client ::1:44136] serving URL
localhost:873
[Tue Oct 17 12:55:53.597329 2017] [proxy_connect:debug] [pid 9813:tid
140438873556736] mod_proxy_connect.c(207): [client ::1:44136] AH01019:
connecting localhost:873 to localhost:873
[Tue Oct 17 12:55:53.597372 2017] [proxy_connect:trace1] [pid 9813:tid
140438873556736] mod_proxy_connect.c(233): [client ::1:44136] connecting to
remote proxy localhost on port 873
[Tue Oct 17 12:55:53.597384 2017] [proxy:trace2] [pid 9813:tid 140438873556736]
proxy_util.c(2078): [client ::1:44136] CONNECT: fam 10 socket created to
connect to localhost
[Tue Oct 17 12:55:53.597466 2017] [proxy_connect:trace3] [pid 9813:tid
140438873556736] mod_proxy_connect.c(311): [client ::1:44136] connection
complete to [::1]:873 (localhost)
[Tue Oct 17 12:55:53.597483 2017] [proxy_connect:trace1] [pid 9813:tid
140438873556736] mod_proxy_connect.c(333): [client ::1:44136] Returning 200 OK
[Tue Oct 17 12:55:53.597493 2017] [core:trace6] [pid 9813:tid 140438873556736]
core_filters.c(418): [client ::1:44136] core_output_filter: flushing because of
FLUSH bucket
[Tue Oct 17 12:55:53.597522 2017] [core:trace6] [pid 9813:tid 140438873556736]
core_filters.c(878): [client ::1:44136] writev_nonblocking: 76/76 bytes
[Tue Oct 17 12:55:53.597522 2017] [proxy_connect:trace2] [pid 9813:tid
140438873556736] mod_proxy_connect.c(356): [client ::1:44136] setting up poll()
[Tue Oct 17 12:55:53.597651 2017] [proxy:trace2] [pid 9813:tid 140438873556736]
proxy_util.c(3844): [client ::1:44136] ap_proxy_transfer_between_connections
complete
[Tue Oct 17 12:55:53.980879 2017] [proxy:debug] [pid 13857:tid 140438968014016]
proxy_util.c(1779): AH00925: initializing worker proxy:forward shared
[Tue Oct 17 12:55:53.980922 2017] [proxy:debug] [pid 13857:tid 140438968014016]
proxy_util.c(1821): AH00927: initializing worker proxy:forward local
[Tue Oct 17 12:55:53.980933 2017] [proxy:debug] [pid 13857:tid 140438968014016]
proxy_util.c(1856): AH00930: initialized pool in child 13857 for (*) min=0
max=25 smax=25



Thread 2 (Thread 0x7fba792bc700 (LWP 9815)):
#0  0x00007fba7df1b0f3 in epoll_wait () at
../sysdeps/unix/syscall-template.S:84
#1  0x00007fba7e411201 in impl_pollset_poll (pollset=0x7fba7eb93f78,
timeout=<optimized out>, num=0x7fba792b9aa4, descriptors=0x7fba792b9ab0) at
./poll/unix/epoll.c:266
#2  0x00007fba7ad29198 in proxy_connect_handler (r=0x7fba7eb920a0,
worker=<optimized out>, conf=<optimized out>, url=<optimized out>,
proxyname=<optimized out>, proxyport=<optimized out>) at
mod_proxy_connect.c:401



# netstat -pnt | grep -v :22
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
PID/Program name    
tcp6      14      0 ::1:35806               ::1:873                 ESTABLISHED
9813/apache2        
tcp6       0      0 ::1:44136               ::1:80                  ESTABLISHED
13855/rsync         
tcp6       0      0 ::1:873                 ::1:35806               ESTABLISHED
13856/rsync         
tcp6       0      0 ::1:80                  ::1:44136               ESTABLISHED
9813/apache2


--- Comment #51 from Yann Ylavic <yl...@gmail.com> ---
Yes sorry, I made a shortcut. POLLOUT + a "drain" flag to indicate that the
POLLIN code should be entered in this case too...


--- Comment #65 from Carsten Wolff <ca...@wolffcarsten.de> ---
Created attachment 35786
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35786&action=edit
log of a single process entering a busy loop

(In reply to Yann Ylavic from comment #63)
> (In reply to Yann Ylavic from comment #62)
> > (In reply to Carsten Wolff from comment #61)
> > > 
> > > unfortunately not, it creates busy loops again, when I:
> > > - start 100 transfers "simultaniously"
> > > - quickly interrupt the rsync clients (SIGINT)
> > 
> > Hmm, I can't reproduce, can you please provide the error log when this
> > happens?
> LogLevel trace8 ;)

Here's the log.


Mohamed El Housseine <el...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35776|0                           |1
        is obsolete|                            |

--- Comment #67 from Mohamed El Housseine <el...@gmail.com> ---
Created attachment 35787
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35787&action=edit
patch for fixing the problem of remaining data in input ctx v3

As already mentioned, the patch for remaining data in the input filters, v3.


Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35425|0                           |1
        is obsolete|                            |

--- Comment #7 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35426
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35426&action=edit
Handle POLLOUT in proxy_connect (v3)

We have to stop reading on the other side when writing blocks on one side.
How does this new version work?


--- Comment #4 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35424
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35424&action=edit
Handle POLLOUT in proxy_connect

Could you please try this patch on latest 2.4.x?


--- Comment #73 from Carsten Wolff <ca...@wolffcarsten.de> ---
Hi Tom,

(In reply to Tom Drake from comment #72)
> I believe that I am encountering this same issue. 

that is unlikely. You're using mod_proxy_ajp, which does not share much with
mod_proxy_connect. Also, looking at mod_proxy_ajp.c, the behavior you're seeing
seems to be intentional (it only flushes data to the client explicitly if the
backend has no more data to read, i.e. if a poll times out after 10ms [see also
bug#37100]).
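
For illustration, a rough sketch of that flushing strategy (hypothetical
variable names, not the actual mod_proxy_ajp code):

    /* data is only flushed to the client when a short poll on the
     * backend times out, i.e. when the backend has nothing more to
     * send right now */
    rv = apr_poll(&backend_pfd, 1, &nready, apr_time_from_msec(10));
    if (APR_STATUS_IS_TIMEUP(rv)) {
        APR_BRIGADE_INSERT_TAIL(output_brigade,
                                apr_bucket_flush_create(c->bucket_alloc));
        ap_pass_brigade(r->output_filters, output_brigade);
    }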


--- Comment #44 from Carsten Wolff <ca...@wolffcarsten.de> ---
Created attachment 35737
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35737&action=edit
fixes a busy loop trying to shutdown a disconnected socket

(In reply to Carsten Wolff from comment #43)
> So it definitely seems to be necessary that the frontend connection uses SSL
> for the stalls to be triggered.

I need to retract that statement. With "v10 and v2 applied + configuration
from comment#24", another problem arises when many concurrent connections are
made through proxy_connect: now and then a process (prefork MPM) gets stuck in
a busy loop, doing this over and over again:

shutdown(30, 1 /* send */)              = -1 ENOTCONN (Transport endpoint is
not connected)
epoll_ctl(31, EPOLL_CTL_DEL, 30, {0, {u32=0, u64=0}}) = 0
epoll_wait(31, {{EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=359204392,
u64=140724962657832}}}, 2, 60000) = 1
epoll_ctl(31, EPOLL_CTL_DEL, 30, {0, {u32=0, u64=0}}) = -1 ENOENT (No such file
or directory)
epoll_ctl(31, EPOLL_CTL_ADD, 30, {EPOLLOUT, {u32=359204480,
u64=140724962657920}}) = 0
shutdown(29, 1 /* send */)              = -1 ENOTCONN (Transport endpoint is
not connected)
epoll_ctl(31, EPOLL_CTL_DEL, 29, {0, {u32=0, u64=0}}) = 0
epoll_wait(31, {{EPOLLOUT|EPOLLHUP, {u32=359204480, u64=140724962657920}}}, 2,
60000) = 1
epoll_ctl(31, EPOLL_CTL_DEL, 29, {0, {u32=0, u64=0}}) = -1 ENOENT (No such file
or directory)
epoll_ctl(31, EPOLL_CTL_ADD, 29, {EPOLLOUT, {u32=359204392,
u64=140724962657832}}) = 0
shutdown(30, 1 /* send */)

The attached patch avoids the busy loop by setting "done" when
apr_socket_shutdown returns with an error.
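
In essence the change is (a sketch of the idea only; "sock" and "done" are
illustrative names, not a quote from the attachment):

    apr_status_t rv = apr_socket_shutdown(sock, APR_SHUTDOWN_WRITE);
    if (rv != APR_SUCCESS) {
        /* e.g. ENOTCONN: the peer is already gone, so mark this
         * direction as finished instead of re-adding the socket to the
         * pollset and spinning on it forever. */
        done = 1;
    }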

Furthermore, when testing this patch by doing many concurrent rsync
transfers, I noticed that some rsync clients still stall at some point (with
and without SSL!). When these clients die, that's when the busy loop occurs.

The test is very simple:
# cat reproduce-endlesss-poll-loop.sh 
#!/bin/bash
mkdir /testdst
cd /testdst
for i in {1..100}; do
        dd if=/dev/urandom of=$i bs=10M count=1 >/dev/null &
done
wait

for i in {1..100}; do
        # w/ stunnel
        #RSYNC_PROXY=localhost:874 rsync -rcP rsync://localhost:873/testsrc/$i /testdst/$i &
        # no-ssl
        RSYNC_PROXY=192.168.122.11:80 rsync -rcP rsync://localhost:873/testsrc/$i /testdst/$i &
done
wait



[Bug 61616] mod_proxy_connect: stall and connection loss on bi-directional traffic

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=61616

--- Comment #22 from Yann Ylavic <yl...@gmail.com> ---
(In reply to Carsten Wolff from comment #21)
> 
> Maybe there should be a APR_NONBLOCK_WRITE flag one could
> pass to ap_pass_brigade() (and further down to send_brigade_blocking) to
> absolutely make sure, the poll() there is never hit?

When the poll() is hit in send_brigade_blocking(), it means the core output
filter ought to block, because we don't want unbounded buffering.

Suppose a handler keeps sending data out: when the socket becomes full, we
must stop buffering sooner or later (i.e. above THRESHOLD_MAX_BUFFER, or
something configurable, which is the point of attachment 35427, among other
parameters).

Otherwise it's unbounded memory usage.
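
For illustration, the bounded-buffering decision amounts to roughly this
(pseudologic, not the literal core_filters.c code; the names come from the
discussion above):

    if (bytes_in_brigade > THRESHOLD_MAX_BUFFER) {
        /* Too much buffered already: write it out now, blocking in
         * poll() on the client socket if necessary. */
        rv = send_brigade_blocking(sock, bb, &bytes_written, c);
    }
    else {
        /* Below the threshold: try a non-blocking write and keep the
         * remainder buffered for a later pass. */
        rv = send_brigade_nonblocking(sock, bb, &bytes_written, c);
    }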



[Bug 61616] mod_proxy_connect: stall and connection loss on bi-directional traffic

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=61616

Graham Leggett <mi...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #75 from Graham Leggett <mi...@apache.org> ---
Backported to v2.4.47.



[Bug 61616] mod_proxy_connect: stall and connection loss on bi-directional traffic

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=61616

--- Comment #36 from Yann Ylavic <yl...@gmail.com> ---
Sorry for the delay, Carsten. Just to clarify: you are running httpd with
attachment 35513 [Handle POLLOUT in proxy_connect (v8)] and attachment 35447
[Read buffer size and flush threshold tuning], and no other patch, right?

While attachment 35513 alone should work for the proxy_connect case,
performance is better with the tunings from attachment 35447, still right?

If this is the case, I'd suggest using attachment 35513 only for now; that
gives lower performance (than expected?) but the same as the original
proxy_connect code.

I'm looking at what could cause regressions in attachment 35447, but that
patch is work in progress anyway and not really related to this PR, I think.



[Bug 61616] mod_proxy_connect: stall and connection loss on bi-directional traffic

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=61616

--- Comment #48 from Mohamed El Housseine <el...@gmail.com> ---

After running many tests with the data provided by Carsten, I was able to
figure out where the unsent-but-read data sits during the stall. For
simplification I will reduce the proxy flow to the following steps:

1- proxy_connect_handler calls ap_proxy_transfer_between_connections after
every successful POLLIN.

The flow of ap_proxy_transfer_between_connections consists of:

2- call ap_get_brigade to read the incoming data from the backend connection
into bb_i
3- call ap_proxy_buckets_lifetime_transform to copy bb_i into bb_o
4- call ap_pass_brigade to write the data of bb_o to the client connection
5- repeat steps 2-4 as long as step 2 and step 4 return APR_SUCCESS (see the
sketch below)
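
In code, the loop in steps 2-5 looks roughly like this (a simplified sketch
of ap_proxy_transfer_between_connections(), not the literal proxy_util.c
source):

    apr_status_t rv;
    do {
        apr_brigade_cleanup(bb_i);
        rv = ap_get_brigade(c_i->input_filters, bb_i, AP_MODE_READBYTES,
                            APR_NONBLOCK_READ, bsize);            /* step 2 */
        if (rv != APR_SUCCESS)
            break;
        rv = ap_proxy_buckets_lifetime_transform(r, bb_i, bb_o);  /* step 3 */
        if (rv == APR_SUCCESS)
            rv = ap_pass_brigade(c_o->output_filters, bb_o);      /* step 4 */
        apr_brigade_cleanup(bb_o);
    } while (rv == APR_SUCCESS);                                   /* step 5 */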

During this flow, the following situations may occur:

#a: ap_core_input_filter is called from ap_get_brigade in step 2. Its last
parameter is readbytes (the maximum amount of data to read), and it tries to
read the requested amount of data from the socket by calling the APR function
socket_bucket_read through apr_bucket_read in a loop
(http://svn.apache.org/viewvc/httpd/httpd/branches/2.4.x/server/core_filters.c?annotate=1734391#l235
and
http://svn.apache.org/viewvc/httpd/httpd/branches/2.4.x/server/core_filters.c?annotate=1734391#l284).

#b: socket_bucket_read allocates a new bucket and _overwrites_ the requested
len with APR_BUCKET_BUFF_SIZE, before passing it to apr_socket_recv.

#c: Because of this, it is possible that socket_bucket_read reads more data
from the socket into the bucket than ap_core_input_filter was instructed to
get.

#d: In this case the read data will be split into several buckets by
ap_core_input_filter. The first one will be returned through bb_i and the
rest will be kept in in_ctx->b (see the sketch after this list).
(http://svn.apache.org/viewvc/httpd/httpd/branches/2.4.x/server/core_filters.c?annotate=1734391#l305
ff.)

#e: ap_pass_brigade in step 4 then tries to write the data (now in bb_o) to
the target connection. If that write comes back incomplete, the flag
data_in_output_filters will be set to true, in which case the loop in step 5
ends.

#f: if this all happens at the very end of the data transfer, and the socket
doesn't receive any more data to cause a POLLIN, step 1 will never happen again
and the data in in_ctx->b will not get sent before rsync decides to send an
(application-level) keepalive (after minutes).
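
For illustration, the splitting in #c/#d amounts to roughly this (simplified
from the linked core_filters.c code, not the literal source; variable names
are illustrative):

    /* The socket bucket may have read up to APR_BUCKET_BUFF_SIZE bytes,
     * i.e. more than the caller's readbytes. */
    const char *str;
    apr_size_t len;
    apr_bucket_read(e, &str, &len, APR_NONBLOCK_READ);
    if (len > (apr_size_t)readbytes) {
        apr_bucket_split(e, readbytes);   /* keep only what was asked for */
    }
    APR_BUCKET_REMOVE(e);
    APR_BRIGADE_INSERT_TAIL(bb_i, e);     /* this part goes to the caller */
    /* ... the remaining bucket(s) stay in in_ctx->b until the next
     * ap_get_brigade() call, which only happens on the next POLLIN (#f). */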



[Bug 61616] mod_proxy_connect: stall and connection loss on bi-directional traffic

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=61616

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35783|0                           |1
        is obsolete|                            |
  Attachment #35784|0                           |1
        is obsolete|                            |

--- Comment #70 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35789
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35789&action=edit
Handle POLLOUT in proxy_connect (v13)

The previous patch (v12) was indeed bogus.

This new one should work regardless of THRESHOLD_MIN_WRITE (or
FlushMinThreshold from attachment 35620), i.e. it's really self-contained.

It does not use or play with "data_in_input_filters", because the core filter
can't be responsible for clearing it (other filters might retain data too,
e.g. if it's used by third-party modules...).

So I kept my "drain" flag, set according to
ap_proxy_transfer_between_connections(), since a successful return from there
implies that there is no pending data. It allows reducing pressure on
buffering, as you noted, but mainly avoids blocking on the write side of
ap_proxy_transfer_between_connections() when data progressively accumulates
there, should the write side not "follow" the read side (which would defeat
the purpose of checking data_in_output_filters anyway).

I tried to keep the best of our respective proposals (no buffering nor code
duplication...); if it works, we should find an agreement, right? :)
