Posted to issues@trafficserver.apache.org by "Susan Hinrichs (JIRA)" <ji...@apache.org> on 2014/11/03 22:45:34 UTC

[jira] [Comment Edited] (TS-3105) Combination of fixes for TS-3084 and TS-3073 causing asserts and segfaults on 5.1 and beyond

    [ https://issues.apache.org/jira/browse/TS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189535#comment-14189535 ] 

Susan Hinrichs edited comment on TS-3105 at 11/3/14 9:44 PM:
-------------------------------------------------------------

ts-3105-master-6  Trying to prevent tunnel_handler_ua from being called twice.

Update:  This patch has been run in a production environment for 3 hours so far.  

The key difference between this patch and the previous one is in HttpTunnel::consumer_handler.  On a "final" event (e.g. write complete, eos, error, timeout), the original code set the final callback to execute, but this caused assertion failures in HttpSM::tunnel_handler_post_or_put() if p->handler_state was 0.  Earlier versions of the patch avoided setting the callback flag if p->handler_state was 0.  This avoided the assert, but apparently caused the state machine to leak.

This patch updates the logic to set p->handler_state if it is not already set.  It chooses a value based on the event and the c->vc_type.
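
To illustrate the idea only (this is not the code in ts-3105-master-6.patch), here is a minimal, self-contained sketch of "set the producer's handler_state on a final event if it is still 0".  Every type, enum value and function name below is a hypothetical stand-in rather than the real Traffic Server definition, and the mapping from event and vc_type to a state is an assumption made for the example.

```cpp
// Hypothetical stand-ins; these are NOT the real Traffic Server types.
enum Event { EV_WRITE_COMPLETE, EV_EOS, EV_ERROR, EV_TIMEOUT };
enum VcType { VC_CLIENT, VC_SERVER };
enum HandlerState {
  STATE_UNKNOWN = 0, // the value that tripped the assert
  STATE_SUCCESS,
  STATE_CLIENT_FAIL,
  STATE_SERVER_FAIL
};

struct Producer {
  HandlerState handler_state = STATE_UNKNOWN;
};

struct Consumer {
  VcType vc_type;
  Producer *producer;
};

// Assumed mapping: a completed write counts as success; any other
// final event is blamed on whichever side the consumer talks to.
HandlerState pick_state(Event event, VcType vc_type) {
  if (event == EV_WRITE_COMPLETE) {
    return STATE_SUCCESS;
  }
  return vc_type == VC_SERVER ? STATE_SERVER_FAIL : STATE_CLIENT_FAIL;
}

// Called for a "final" event on a consumer. Rather than skipping the
// final callback when the producer state is still unknown, fill in a
// terminal state first so the callback never sees 0.
void on_final_event(Event event, Consumer &c) {
  Producer *p = c.producer;
  if (p->handler_state == STATE_UNKNOWN) {
    p->handler_state = pick_state(event, c.vc_type);
  }
  // The final callback would be scheduled here in either case.
}
```

In this sketch a state that is already set is left alone, so a terminal state recorded earlier by the producer is never overwritten.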



> Combination of fixes for TS-3084 and TS-3073 causing asserts and segfaults on 5.1 and beyond
> --------------------------------------------------------------------------------------------
>
>                 Key: TS-3105
>                 URL: https://issues.apache.org/jira/browse/TS-3105
>             Project: Traffic Server
>          Issue Type: Bug
>            Reporter: Susan Hinrichs
>            Assignee: Susan Hinrichs
>             Fix For: 5.2.0
>
>         Attachments: ts-3073-and-3084-and-3105-against-510.patch, ts-3105-master-6.patch
>
>
> These two patches were run in a production environment on top of 5.0.1 without problem for several weeks.  Now running with these patches on top of 5.1 causes either an assert or a segfault.  Another person has reported the same segfault when running master in a production environment.
> In the assert, the handler_state of the producers is 0 (UNKNOWN) rather than the expected terminal state.  I'm assuming that either we are being directed into the terminal state from a connection that terminates too quickly, or an event has hung around for too long and is being executed against the state machine after it has been recycled.
> The event is HTTP_TUNNEL_EVENT_DONE
> The assert stack trace is
> FATAL: HttpSM.cc:2632: failed assert `0`
> /z/bin/traffic_server - STACK TRACE:
> /z/lib/libtsutil.so.5(+0x25197)[0x2b8bd08dc197]
> /z/lib/libtsutil.so.5(+0x23def)[0x2b8bd08dadef]
> /z/bin/traffic_server(HttpSM::tunnel_handler_post_or_put(HttpTunnelProducer*)+0xcd)[0x5982ad]
> /z/bin/traffic_server(HttpSM::tunnel_handler_post(int, void*)+0x86)[0x5a32d6]
> /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x5a1e18]
> /z/bin/traffic_server(HttpTunnel::main_handler(int, void*)+0xee)[0x5dd6ae]
> /z/bin/traffic_server(write_to_net_io(NetHandler*, UnixNetVConnection*, EThread*)+0x136e)[0x721d1e]
> /z/bin/traffic_server(NetHandler::mainNetEvent(int, Event*)+0x28c)[0x7162fc]
> /z/bin/traffic_server(EThread::process_event(Event*, int)+0x91)[0x744df1]
> /z/bin/traffic_server(EThread::execute()+0x4fc)[0x7458ac]
> /z/bin/traffic_server[0x7440ca]
> /lib64/libpthread.so.0(+0x7034)[0x2b8bd1ee4034]
> /lib64/libc.so.6(clone+0x6d)[0x2b8bd2c2875d]
> The segfault stack trace is 
> /z/bin/traffic_server - STACK TRACE: 
> /lib64/libpthread.so.0(+0xf280)[0x2abccd0d8280]
> /z/bin/traffic_server(HttpSM::tunnel_handler_ua(int, HttpTunnelConsumer*)+0x122)[0x591462]
> /z/bin/traffic_server(HttpTunnel::consumer_handler(int, HttpTunnelConsumer*)+0x9e)[0x5dd15e]
> /z/bin/traffic_server(HttpTunnel::main_handler(int, void*)+0x117)[0x5dd6d7]
> /z/bin/traffic_server(UnixNetVConnection::mainEvent(int, Event*)+0x3f0)[0x725190]
> /z/bin/traffic_server(InactivityCop::check_inactivity(int, Event*)+0x275)[0x716b75]
> /z/bin/traffic_server(EThread::process_event(Event*, int)+0x91)[0x744df1]
> /z/bin/traffic_server(EThread::execute()+0x2fb)[0x7456ab]
> /z/bin/traffic_server[0x7440ca]
> /lib64/libpthread.so.0(+0x7034)[0x2abccd0d0034]
> /lib64/libc.so.6(clone+0x6d)[0x2abccde1475d]


