You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafficserver.apache.org by "Thach Tran (JIRA)" <ji...@apache.org> on 2013/11/21 14:16:36 UTC

[jira] [Created] (TS-2380) ATS crash in HttpSM::handle_api_return for SEND_RESPONSE_HDR event

Thach Tran created TS-2380:
------------------------------

             Summary: ATS crash in HttpSM::handle_api_return for SEND_RESPONSE_HDR event
                 Key: TS-2380
                 URL: https://issues.apache.org/jira/browse/TS-2380
             Project: Traffic Server
          Issue Type: Bug
          Components: Core, HTTP
            Reporter: Thach Tran


We're shipping Linux servers with ATS 4.0.1 installed and configured to proxy HTTP and I occasionally see this ATS crash appears on our central crash reporting system. I haven't been able to reproduce this but it appears often enough to let me think there could be an issue with ATS in some edge cases. The stack trace looks like the following.

{noformat}
The process was terminated with 'signal 11, Segmentation fault'

Threads
=======

Id   Target Id         Frame 
  24   Thread 0x2b3be3eaf700 (LWP 13058) 0x00002b3be336d683 in epoll_wait () from /lib/x86_64/libc.so.6
  23   Thread 0x2b3beb1dd700 (LWP 13066) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  22   Thread 0x2b3be8aeb700 (LWP 13062) 0x00002b3be336d683 in epoll_wait () from /lib/x86_64/libc.so.6
  21   Thread 0x2b3bebbe7700 (LWP 13071) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  20   Thread 0x2b3beb5e1700 (LWP 13068) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  19   Thread 0x2b3bfc55c700 (LWP 13078) 0x00002b3be295bf2d in accept () at ../sysdeps/unix/syscall-template.S:81
  18   Thread 0x2b3be8bec700 (LWP 13063) 0x00002b3be336d683 in epoll_wait () from /lib/x86_64/libc.so.6
  17   Thread 0x2b3be3dae700 (LWP 13057) 0x00002b3be336d683 in epoll_wait () from /lib/x86_64/libc.so.6
  16   Thread 0x2b3be8ced700 (LWP 13064) 0x00002b3be336d683 in epoll_wait () from /lib/x86_64/libc.so.6
  15   Thread 0x2b3bfc35a700 (LWP 13077) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  14   Thread 0x2b3beb3df700 (LWP 13067) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  13   Thread 0x2b3bfc259700 (LWP 13076) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  12   Thread 0x2b3beb7e3700 (LWP 13069) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  11   Thread 0x2b3bf0cd2700 (LWP 13075) 0x00002b3be295bf2d in accept () at ../sysdeps/unix/syscall-template.S:81
  10   Thread 0x2b3bebde9700 (LWP 13072) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  9    Thread 0x2b3beb9e5700 (LWP 13070) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  8    Thread 0x2b3bebfeb700 (LWP 13073) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  7    Thread 0x2b3be960f700 (LWP 13065) pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  6    Thread 0x2b3bf0ad0700 (LWP 13074) 0x00002b3be295bf2d in accept () at ../sysdeps/unix/syscall-template.S:81
  5    Thread 0x2b3be8504700 (LWP 13061) 0x00002b3be333c23d in nanosleep () from /lib/x86_64/libc.so.6
  4    Thread 0x2b3be8302700 (LWP 13060) 0x00002b3be333c23d in nanosleep () from /lib/x86_64/libc.so.6
  3    Thread 0x2b3be8100700 (LWP 13059) 0x00002b3be333c23d in nanosleep () from /lib/x86_64/libc.so.6
  2    Thread 0x2b3be38a8700 (LWP 13056) 0x00002b3be333c23d in nanosleep () from /lib/x86_64/libc.so.6
* 1    Thread 0x2b3be3635140 (LWP 13055) 0x0000000000562fb9 in HttpSM::handle_api_return (this=0x2b3bf0f7de70) at HttpSM.cc:1562

Backtrace from signalled thread
===============================

#0  0x0000000000562fb9 in HttpSM::handle_api_return (this=0x2b3bf0f7de70) at HttpSM.cc:1562
No locals.
#1  0x0000000000562dc0 in HttpSM::state_api_callout (this=0x2b3bf0f7de70, event=60000, data=0x0) at HttpSM.cc:1486
        api_next = HttpSM::API_RETURN_CONTINUE
        __FUNCTION__ =           "state_api_callout"
#2  0x000000000056255b in HttpSM::state_api_callback (this=0x2b3bf0f7de70, event=60000, data=0x0) at HttpSM.cc:1305
        __FUNCTION__ =           "state_api_callback"
#3  0x00000000004f796b in TSHttpTxnReenable (txnp=0x2b3bf0f7de70, event=TS_EVENT_HTTP_CONTINUE) at InkAPI.cc:5558
        trylock = {
          m = {
            m_ptr = 0x2b3bf4164e10
          }, 
          lock_acquired = true
        }
        sm = 0x2b3bf0f7de70
        eth = 0x2b3be39ab010
#4  0x00002b3beadd8b15 in cont_header_filter (event=TS_EVENT_HTTP_SEND_RESPONSE_HDR, edata=0x2b3bf0f7de70) at header_filter.cc:90
        txnp = 0x2b3bf0f7de70
        hook = TS_HTTP_SEND_RESPONSE_HDR_HOOK
        reqp = 0x2b3bf0f7e5a8
        hdr_loc = 0x2b3bf0026898
#5  0x00000000004ec383 in INKContInternal::handle_event (this=0x2a357b0, event=60007, edata=0x2b3bf0f7de70) at InkAPI.cc:993
No locals.
#6  0x00000000004e3abe in Continuation::handleEvent (this=0x2a357b0, event=60007, data=0x2b3bf0f7de70) at ../iocore/eventsystem/I_Continuation.h:146
No locals.
#7  0x00000000004ecb9d in APIHook::invoke (this=0x2a368c0, event=60007, edata=0x2b3bf0f7de70) at InkAPI.cc:1212
No locals.
#8  0x0000000000562b4e in HttpSM::state_api_callout (this=0x2b3bf0f7de70, event=0, data=0x0) at HttpSM.cc:1412
        plugin_lock = false
        plugin_mutex = {
          m_ptr = 0x0
        }
        hook = 0x2a368c0
        api_next = HttpSM::API_RETURN_UNKNOWN
        __FUNCTION__ =           "state_api_callout"
#9  0x000000000056dab1 in HttpSM::do_api_callout_internal (this=0x2b3bf0f7de70) at HttpSM.cc:4736
No locals.
#10 0x000000000057927a in HttpSM::do_api_callout (this=0x2b3bf0f7de70) at HttpSM.cc:504
No locals.
#11 0x0000000000574e13 in HttpSM::set_next_state (this=0x2b3bf0f7de70) at HttpSM.cc:7034
        __FUNCTION__ =           "set_next_state"
#12 0x0000000000573d6e in HttpSM::call_transact_and_set_next_state (this=0x2b3bf0f7de70, f=0x5830b6 <HttpTransact::HandleApiErrorJump(HttpTransact::State*)>) at HttpSM.cc:6747
        __FUNCTION__ =           "call_transact_and_set_next_state"
#13 0x00000000005623c4 in HttpSM::state_common_wait_for_transform_read (this=0x2b3bf0f7de70, t_info=0x2b3bf0f80000, tunnel_handler=(int (HttpSM::*)(HttpSM * const, int, void *)) 0x5660ea <HttpSM::tunnel_handler_post(int, void*)>, event=3, data=0x2b3c1c000990) at HttpSM.cc:1278
        __FUNCTION__ =           "state_common_wait_for_transform_read"
        c = 0x2b3bf0f7fa68
#14 0x0000000000561f2d in HttpSM::state_request_wait_for_transform_read (this=0x2b3bf0f7de70, event=3, data=0x2b3c1c000990) at HttpSM.cc:1174
        __FUNCTION__ =           "state_request_wait_for_transform_read"
        size = 9223372036854775807
#15 0x0000000000565f74 in HttpSM::main_handler (this=0x2b3bf0f7de70, event=2000, data=0x2b3c1c000990) at HttpSM.cc:2503
        jump_point = NULL
        __FUNCTION__ =           "main_handler"
        vc_entry = 0x0
#16 0x00000000004e3abe in Continuation::handleEvent (this=0x2b3bf0f7de70, event=2000, data=0x2b3c1c000990) at ../iocore/eventsystem/I_Continuation.h:146
No locals.
#17 0x0000000000528e1f in TransformTerminus::handle_event (this=0x2b3c1c000908, event=1) at Transform.cc:173
        val = 1
        __FUNCTION__ =           "handle_event"
#18 0x00000000004e3abe in Continuation::handleEvent (this=0x2b3c1c000908, event=1, data=0x2b3bf416c520) at ../iocore/eventsystem/I_Continuation.h:146
No locals.
#19 0x00000000006c4def in EThread::process_event (this=0x2b3be39ab010, e=0x2b3bf416c520, calling_code=1) at UnixEThread.cc:141
        c_temp = 0x2b3c1c000908
        lock = {
          m = {
            m_ptr = 0x2b3bf4164e10
          }, 
          lock_acquired = true
        }
#20 0x00000000006c4fc3 in EThread::execute (this=0x2b3be39ab010) at UnixEThread.cc:192
        done_one = false
        e = 0x2b3bf416c520
        NegativeQueue = {
          <DLL<Event, Event::Link_link>> = {
            head = 0x2915ea0
          }, 
          members of Queue<Event, Event::Link_link>: 
          tail = 0x2915960
        }
        next_time = 1384545575927012429
#21 0x000000000050c7fb in main (argv=0x7fff6c38fae8) at Main.cc:1690
        admin_user_p = true
        __FUNCTION__ =           "main"
        stacksize = 1048576
        user = 0x28d22a0 "\320\344\326\342;+"
        main_thread = 0x2b3be38a9010
        mlock_flags = 0
        machine_addr = {
          sa = {
            sa_family = 0, 
            sa_data =               '\000' <repeats 13 times>
          }, 
          sin = {
            sin_family = 0, 
            sin_port = 0, 
            sin_addr = {
              s_addr = 0
            }, 
            sin_zero =               "\000\000\000\000\000\000\000"
          }, 
          sin6 = {
            sin6_family = 0, 
            sin6_port = 0, 
            sin6_flowinfo = 0, 
            sin6_addr = {
              __in6_u = {
                __u6_addr8 =                   '\000' <repeats 15 times>, 
                __u6_addr16 =                   {[0] = 0,
                  [1] = 0,
                  [2] = 0,
                  [3] = 0,
                  [4] = 0,
                  [5] = 0,
                  [6] = 0,
                  [7] = 0}, 
                __u6_addr32 =                   {[0] = 0,
                  [1] = 0,
                  [2] = 0,
                  [3] = 0}
              }
            }, 
            sin6_scope_id = 0
          }
        }
        cluster_type = 3
        num_remap_threads = 0
{noformat}

I looked at {{HttpSM.cc:1562}} and it appears to be this line.
{noformat}
if (ua_session) {
  ua_session->get_netvc()->set_inactivity_timeout(HRTIME_SECONDS(t_state.txn_conf->transaction_no_activity_timeout_in));
{noformat}

Could this be the case of client has given up and disconnected which led to {{ua_session->get_netvc()}} returned invalid pointer? Anybody saw this before?



--
This message was sent by Atlassian JIRA
(v6.1#6144)