Posted to dev@trafficserver.apache.org by James Peach <ja...@me.com> on 2011/12/15 07:33:21 UTC

delivering EOS event up to the FetchSM state machine

Hi all,

I've been looking at an issue where TSFetchUrl does not deliver the EOS event to the caller until long after the origin server closes the read socket. I made a small change to read_from_net() so that it notices when read(2) returns 0, even if it returned > 0 on a previous iteration of the loop. This causes the EOS event to be delivered up to the HTTP state machine.

In the end, however, the EOS event is not delivered to the FetchSM. Here's what happens …

We receive the response from the origin server:

[Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_tunnel) [0] producer_handler [http server VC_EVENT_READ_READY]
[Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_tunnel) [0] producer_handler_dechunked [http server VC_EVENT_READ_READY]
[Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_chunk) creating a chunk of size 3296 bytes
[Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_redirect) [HttpTunnel::producer_handler] enable_redirection: [1 0 0] event: 100
[Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: reenable Write
[Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (read_from_net) attempting to read 32768 bytes
[Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (read_from_net) read 0 bytes
[Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (read_from_net) triggering EOS
[Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_tunnel) [0] producer_handler [http server VC_EVENT_EOS]
[Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_tunnel) [0] producer_handler_dechunked [http server VC_EVENT_EOS]
[Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_redirect) [HttpTunnel::producer_handler] enable_redirection: [1 0 0] event: 104
[Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http) [0] [&HttpSM::tunnel_handler_server, VC_EVENT_EOS]

At this point we know that the read socket has closed, and we are trying to deliver the EOS event up the stack. Then this happens:

[Dec 14 20:36:31.338] Server {0x7fff7b5f9960} DEBUG: (http_ss) [0] session closed
[Dec 14 20:36:31.362] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: reenable Write
[Dec 14 20:36:31.362] Server {0x7fff7b5f9960} DEBUG: (pvc_event) [0] Passive: Received event 1
[Dec 14 20:36:31.362] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: process_write_side
[Dec 14 20:36:31.363] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: process_write_side; act_on 3499
[Dec 14 20:36:31.386] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: process_write_side; added 3499
[Dec 14 20:36:31.386] Server {0x7fff7b5f9960} DEBUG: (http_tunnel) [0] consumer_handler [user agent VC_EVENT_WRITE_COMPLETE]
[Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http) [0] [&HttpSM::tunnel_handler_ua, VC_EVENT_WRITE_COMPLETE]
[Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http_cs) [0] session released by sm [0]
[Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http_cs) [0] initiating io for next header
[Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: do_io_read for -1 bytes
[Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http) [0] [HttpSM::main_handler, HTTP_TUNNEL_EVENT_DONE]
[Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http) [0] [&HttpSM::tunnel_handler, HTTP_TUNNEL_EVENT_DONE]
[Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http_redirect) [HttpTunnel::deallocate_postdata_copy_buffers]
[Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (http_seq) [HttpStateMachineGet::update_stats] Logging transaction
[Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (http) [0] dellocating sm

So now the HttpSM considers the transaction over: it is deallocating itself in HttpSM::kill_this().

[Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active: process_read_side
[Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active: process_read_side; act_on 3499
[Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active: process_read_side; added 3499
[Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (FetchSM) [fetch_handler] calling fetch_plugin
[Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (FetchSM) [process_fetch_read] I am here read
[Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (FetchSM) [process_fetch_read] number of bytes in read ready 3499
[Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active: reenable Read
[Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: reenable Write
[Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc_event) [0] Passive: Received event 1
[Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: process_read_side
[Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: process_read_side; act_on 0
[Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: closed? 0 shutdown? 0
[Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive: process_write_side
[Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc_event) [0] Active: Received event 1
[Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active: process_read_side
[Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active: process_read_side; act_on 0
[Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active: closed? 0 shutdown? 0

proxy/PluginVC.cc, around line 624:

  if (act_on <= 0) {
    Debug("pvc", "[%u] %s: closed? %u shutdown? %u", PVC_ID, PVC_TYPE,
          other_side->closed, other_side->write_state.shutdown);
    if (other_side->closed || other_side->write_state.shutdown) {
      read_state.vio._cont->handleEvent(VC_EVENT_EOS, &read_state.vio);
    }
    return;
  }

So the PluginVC does not think its other_side is closed (because no one has called do_io_close() on it?). But eventually the HttpClientSession keep-alive timeout fires, which closes the passive PluginVC and wakes the active side. Only now is the FetchSM kicked and the TSFetchUrl event triggered:

[Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc_event) [0] Passive: Received event 2
[Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (http_cs) [0] [&HttpClientSession::state_keep_alive, VC_EVENT_ACTIVE_TIMEOUT]
[Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc) [0] Passive: do_io_close
[Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (http_cs) [0] session closed
[Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (http_cs) [0] session destroy
[Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc) [0] Passive: process_close
[Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc_event) [0] Active: Received event 1
[Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc) [0] Active: process_read_side
[Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc) [0] Active: process_read_side; act_on 0
[Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc) [0] Active: closed? 1 shutdown? 0
[Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (FetchSM) [fetch_handler] calling fetch_plugin

After beating my head against this for a while, I don't see the right way to deliver the EOS event from the HTTP layer up to the PluginVC. Can anyone hold my hand a little and give me some clues about how this is supposed to work?

thanks,
James


Re: delivering EOS event up to the FetchSM state machine

Posted by James Peach <ja...@me.com>.
On 14/12/2011, at 10:33 PM, James Peach wrote:

> [...]
> After beating my head against this for a while, I don't see what the right way to deliver the EOS event up from the HTTP layer to the PluginVC is. Can anyone hold my hand a little and give me some clues about how this is supposed to work?

This appears to do the trick:

diff --git a/proxy/http/HttpSM.cc b/proxy/http/HttpSM.cc
index 624637b..b450755 100644
--- a/proxy/http/HttpSM.cc
+++ b/proxy/http/HttpSM.cc
@@ -2842,6 +2842,7 @@ HttpSM::tunnel_handler_server(int event, HttpTunnelProducer * 
   if (close_connection) {
     p->vc->do_io_close();
     p->read_vio = NULL;
+    t_state.client_info.keep_alive = HTTP_NO_KEEPALIVE;
   } else {
     server_session->attach_hostname(t_state.current.server->name);
     server_session->server_trans_stat--;

I don't understand the HTTP state machine well enough to know whether this is a generally correct fix, so I'd love it if anyone could give it a thumbs up or down.

J