You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafficserver.apache.org by "Thach Tran (JIRA)" <ji...@apache.org> on 2013/11/25 14:11:35 UTC

[jira] [Commented] (TS-2380) ATS crash in HttpSM::handle_api_return for SEND_RESPONSE_HDR event

    [ https://issues.apache.org/jira/browse/TS-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831438#comment-13831438 ] 

Thach Tran commented on TS-2380:
--------------------------------

I've looked into this a bit further and I'm suspecting this is similar to TS-1335.
In my setup, I got a null transform plugin that hook into some requests/responses for the purpose of logging the message body.
Could someone confirm how TS-1335 was resolved as I couldn't find any code commit that was made in association with that bug.

Did it get resolved by essentially following this comment

{quote}
Since the null-transform not change the content length, so we should use 
data->output_vio = TSVConnWrite(output_conn, contp, data->output_reader, TSVIONBytesGet(input_vio));
instead of 
data->output_vio = TSVConnWrite(output_conn, contp, data->output_reader, INT64_MAX);
{quote}

> ATS crash in HttpSM::handle_api_return for SEND_RESPONSE_HDR event
> ------------------------------------------------------------------
>
>                 Key: TS-2380
>                 URL: https://issues.apache.org/jira/browse/TS-2380
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, HTTP
>            Reporter: Thach Tran
>
> We're shipping Linux servers with ATS 4.0.1 installed and configured to proxy HTTP and I occasionally see this ATS crash appears on our central crash reporting system. I haven't been able to reproduce this but it appears often enough to let me think there could be an issue with ATS in some edge cases. The stack trace looks like the following.
> {noformat}
> The process was terminated with 'signal 11, Segmentation fault'
> Threads
> =======
> Id   Target Id         Frame 
>   24   Thread 0x2b3be3eaf700 (LWP 13058) 0x00002b3be336d683 in epoll_wait () from /lib/x86_64/libc.so.6
>   23   Thread 0x2b3beb1dd700 (LWP 13066) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>   22   Thread 0x2b3be8aeb700 (LWP 13062) 0x00002b3be336d683 in epoll_wait () from /lib/x86_64/libc.so.6
>   21   Thread 0x2b3bebbe7700 (LWP 13071) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>   20   Thread 0x2b3beb5e1700 (LWP 13068) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>   19   Thread 0x2b3bfc55c700 (LWP 13078) 0x00002b3be295bf2d in accept () at ../sysdeps/unix/syscall-template.S:81
>   18   Thread 0x2b3be8bec700 (LWP 13063) 0x00002b3be336d683 in epoll_wait () from /lib/x86_64/libc.so.6
>   17   Thread 0x2b3be3dae700 (LWP 13057) 0x00002b3be336d683 in epoll_wait () from /lib/x86_64/libc.so.6
>   16   Thread 0x2b3be8ced700 (LWP 13064) 0x00002b3be336d683 in epoll_wait () from /lib/x86_64/libc.so.6
>   15   Thread 0x2b3bfc35a700 (LWP 13077) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>   14   Thread 0x2b3beb3df700 (LWP 13067) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>   13   Thread 0x2b3bfc259700 (LWP 13076) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>   12   Thread 0x2b3beb7e3700 (LWP 13069) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>   11   Thread 0x2b3bf0cd2700 (LWP 13075) 0x00002b3be295bf2d in accept () at ../sysdeps/unix/syscall-template.S:81
>   10   Thread 0x2b3bebde9700 (LWP 13072) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>   9    Thread 0x2b3beb9e5700 (LWP 13070) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>   8    Thread 0x2b3bebfeb700 (LWP 13073) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>   7    Thread 0x2b3be960f700 (LWP 13065) pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
>   6    Thread 0x2b3bf0ad0700 (LWP 13074) 0x00002b3be295bf2d in accept () at ../sysdeps/unix/syscall-template.S:81
>   5    Thread 0x2b3be8504700 (LWP 13061) 0x00002b3be333c23d in nanosleep () from /lib/x86_64/libc.so.6
>   4    Thread 0x2b3be8302700 (LWP 13060) 0x00002b3be333c23d in nanosleep () from /lib/x86_64/libc.so.6
>   3    Thread 0x2b3be8100700 (LWP 13059) 0x00002b3be333c23d in nanosleep () from /lib/x86_64/libc.so.6
>   2    Thread 0x2b3be38a8700 (LWP 13056) 0x00002b3be333c23d in nanosleep () from /lib/x86_64/libc.so.6
> * 1    Thread 0x2b3be3635140 (LWP 13055) 0x0000000000562fb9 in HttpSM::handle_api_return (this=0x2b3bf0f7de70) at HttpSM.cc:1562
> Backtrace from signalled thread
> ===============================
> #0  0x0000000000562fb9 in HttpSM::handle_api_return (this=0x2b3bf0f7de70) at HttpSM.cc:1562
> No locals.
> #1  0x0000000000562dc0 in HttpSM::state_api_callout (this=0x2b3bf0f7de70, event=60000, data=0x0) at HttpSM.cc:1486
>         api_next = HttpSM::API_RETURN_CONTINUE
>         __FUNCTION__ =           "state_api_callout"
> #2  0x000000000056255b in HttpSM::state_api_callback (this=0x2b3bf0f7de70, event=60000, data=0x0) at HttpSM.cc:1305
>         __FUNCTION__ =           "state_api_callback"
> #3  0x00000000004f796b in TSHttpTxnReenable (txnp=0x2b3bf0f7de70, event=TS_EVENT_HTTP_CONTINUE) at InkAPI.cc:5558
>         trylock = {
>           m = {
>             m_ptr = 0x2b3bf4164e10
>           }, 
>           lock_acquired = true
>         }
>         sm = 0x2b3bf0f7de70
>         eth = 0x2b3be39ab010
> #4  0x00002b3beadd8b15 in cont_header_filter (event=TS_EVENT_HTTP_SEND_RESPONSE_HDR, edata=0x2b3bf0f7de70) at header_filter.cc:90
>         txnp = 0x2b3bf0f7de70
>         hook = TS_HTTP_SEND_RESPONSE_HDR_HOOK
>         reqp = 0x2b3bf0f7e5a8
>         hdr_loc = 0x2b3bf0026898
> #5  0x00000000004ec383 in INKContInternal::handle_event (this=0x2a357b0, event=60007, edata=0x2b3bf0f7de70) at InkAPI.cc:993
> No locals.
> #6  0x00000000004e3abe in Continuation::handleEvent (this=0x2a357b0, event=60007, data=0x2b3bf0f7de70) at ../iocore/eventsystem/I_Continuation.h:146
> No locals.
> #7  0x00000000004ecb9d in APIHook::invoke (this=0x2a368c0, event=60007, edata=0x2b3bf0f7de70) at InkAPI.cc:1212
> No locals.
> #8  0x0000000000562b4e in HttpSM::state_api_callout (this=0x2b3bf0f7de70, event=0, data=0x0) at HttpSM.cc:1412
>         plugin_lock = false
>         plugin_mutex = {
>           m_ptr = 0x0
>         }
>         hook = 0x2a368c0
>         api_next = HttpSM::API_RETURN_UNKNOWN
>         __FUNCTION__ =           "state_api_callout"
> #9  0x000000000056dab1 in HttpSM::do_api_callout_internal (this=0x2b3bf0f7de70) at HttpSM.cc:4736
> No locals.
> #10 0x000000000057927a in HttpSM::do_api_callout (this=0x2b3bf0f7de70) at HttpSM.cc:504
> No locals.
> #11 0x0000000000574e13 in HttpSM::set_next_state (this=0x2b3bf0f7de70) at HttpSM.cc:7034
>         __FUNCTION__ =           "set_next_state"
> #12 0x0000000000573d6e in HttpSM::call_transact_and_set_next_state (this=0x2b3bf0f7de70, f=0x5830b6 <HttpTransact::HandleApiErrorJump(HttpTransact::State*)>) at HttpSM.cc:6747
>         __FUNCTION__ =           "call_transact_and_set_next_state"
> #13 0x00000000005623c4 in HttpSM::state_common_wait_for_transform_read (this=0x2b3bf0f7de70, t_info=0x2b3bf0f80000, tunnel_handler=(int (HttpSM::*)(HttpSM * const, int, void *)) 0x5660ea <HttpSM::tunnel_handler_post(int, void*)>, event=3, data=0x2b3c1c000990) at HttpSM.cc:1278
>         __FUNCTION__ =           "state_common_wait_for_transform_read"
>         c = 0x2b3bf0f7fa68
> #14 0x0000000000561f2d in HttpSM::state_request_wait_for_transform_read (this=0x2b3bf0f7de70, event=3, data=0x2b3c1c000990) at HttpSM.cc:1174
>         __FUNCTION__ =           "state_request_wait_for_transform_read"
>         size = 9223372036854775807
> #15 0x0000000000565f74 in HttpSM::main_handler (this=0x2b3bf0f7de70, event=2000, data=0x2b3c1c000990) at HttpSM.cc:2503
>         jump_point = NULL
>         __FUNCTION__ =           "main_handler"
>         vc_entry = 0x0
> #16 0x00000000004e3abe in Continuation::handleEvent (this=0x2b3bf0f7de70, event=2000, data=0x2b3c1c000990) at ../iocore/eventsystem/I_Continuation.h:146
> No locals.
> #17 0x0000000000528e1f in TransformTerminus::handle_event (this=0x2b3c1c000908, event=1) at Transform.cc:173
>         val = 1
>         __FUNCTION__ =           "handle_event"
> #18 0x00000000004e3abe in Continuation::handleEvent (this=0x2b3c1c000908, event=1, data=0x2b3bf416c520) at ../iocore/eventsystem/I_Continuation.h:146
> No locals.
> #19 0x00000000006c4def in EThread::process_event (this=0x2b3be39ab010, e=0x2b3bf416c520, calling_code=1) at UnixEThread.cc:141
>         c_temp = 0x2b3c1c000908
>         lock = {
>           m = {
>             m_ptr = 0x2b3bf4164e10
>           }, 
>           lock_acquired = true
>         }
> #20 0x00000000006c4fc3 in EThread::execute (this=0x2b3be39ab010) at UnixEThread.cc:192
>         done_one = false
>         e = 0x2b3bf416c520
>         NegativeQueue = {
>           <DLL<Event, Event::Link_link>> = {
>             head = 0x2915ea0
>           }, 
>           members of Queue<Event, Event::Link_link>: 
>           tail = 0x2915960
>         }
>         next_time = 1384545575927012429
> #21 0x000000000050c7fb in main (argv=0x7fff6c38fae8) at Main.cc:1690
>         admin_user_p = true
>         __FUNCTION__ =           "main"
>         stacksize = 1048576
>         user = 0x28d22a0 "\320\344\326\342;+"
>         main_thread = 0x2b3be38a9010
>         mlock_flags = 0
>         machine_addr = {
>           sa = {
>             sa_family = 0, 
>             sa_data =               '\000' <repeats 13 times>
>           }, 
>           sin = {
>             sin_family = 0, 
>             sin_port = 0, 
>             sin_addr = {
>               s_addr = 0
>             }, 
>             sin_zero =               "\000\000\000\000\000\000\000"
>           }, 
>           sin6 = {
>             sin6_family = 0, 
>             sin6_port = 0, 
>             sin6_flowinfo = 0, 
>             sin6_addr = {
>               __in6_u = {
>                 __u6_addr8 =                   '\000' <repeats 15 times>, 
>                 __u6_addr16 =                   {[0] = 0,
>                   [1] = 0,
>                   [2] = 0,
>                   [3] = 0,
>                   [4] = 0,
>                   [5] = 0,
>                   [6] = 0,
>                   [7] = 0}, 
>                 __u6_addr32 =                   {[0] = 0,
>                   [1] = 0,
>                   [2] = 0,
>                   [3] = 0}
>               }
>             }, 
>             sin6_scope_id = 0
>           }
>         }
>         cluster_type = 3
>         num_remap_threads = 0
> {noformat}
> I looked at {{HttpSM.cc:1562}} and it appears to be this line.
> {noformat}
> if (ua_session) {
>   ua_session->get_netvc()->set_inactivity_timeout(HRTIME_SECONDS(t_state.txn_conf->transaction_no_activity_timeout_in));
> }
> {noformat}
> Could this be the case of client has given up and disconnected which led to {{ua_session->get_netvc()}} returned invalid pointer? Anybody saw this before?



--
This message was sent by Atlassian JIRA
(v6.1#6144)