Posted to issues@trafficserver.apache.org by "Sudheer Vinukonda (JIRA)" <ji...@apache.org> on 2014/12/28 17:21:13 UTC
[jira] [Comment Edited] (TS-2780) Core dump in SpdyRequest::clear()
in production testing of SPDY
[ https://issues.apache.org/jira/browse/TS-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055129#comment-14055129 ]
Sudheer Vinukonda edited comment on TS-2780 at 12/28/14 4:20 PM:
-----------------------------------------------------------------
I am finally able to reproduce this error message and understand the issue. It happens due to a race condition when the client sends a {{DATA}} frame while ATS sends an error in {{SYN_REPLY}} (e.g., a POST transaction where the client tries to send a large data file, but ATS denies POST transactions from that client via an ip_allow ACL).
{code}
E.g. % spdycat -3 -v https://abc.com -d /tmp/1 > /tmp/2 (where /tmp/1 is a large data file with about 5K lines).
{code}
The issue lies in the code in {{spdy_on_data_chunk_recv_callback()}}, {{spdy_on_data_recv_callback()}}, etc. When the error occurs, the request is deleted and erased from {{SpdyClientSession->req_map}}. However, a data frame may still arrive after the req object has been deleted. The code below correctly ignores such data frames, since the request has already been deleted - but the statement {{SpdyRequest *req = sm->req_map[stream_id];}} has the side effect of inserting an element with key {{stream_id}} and value {{NULL}} into the {{req_map}}. The correct way to check whether the req has been erased from the map is to look it up with {{find()}} and test whether the returned iterator is valid (i.e., not {{end()}}). Will test and upload a patch shortly.
{code}
void
spdy_on_data_chunk_recv_callback(spdylay_session * /*session*/, uint8_t /*flags*/,
                                 int32_t stream_id, const uint8_t *data,
                                 size_t len, void *user_data)
{
  SpdyClientSession *sm = (SpdyClientSession *)user_data;
  SpdyRequest *req = sm->req_map[stream_id];
  //
  // SpdyRequest has been deleted on error, drop this data;
  //
  if (!req)
    return;
{code}
> Core dump in SpdyRequest::clear() in production testing of SPDY
> ---------------------------------------------------------------
>
> Key: TS-2780
> URL: https://issues.apache.org/jira/browse/TS-2780
> Project: Traffic Server
> Issue Type: Bug
> Components: SPDY
> Reporter: Sudheer Vinukonda
> Assignee: Bryan Call
> Labels: spdy, yahoo
> Fix For: 5.1.0
>
> Attachments: TS-2780.diff, ts2780.diff
>
>
> Noticed the below core dump in SpdyRequest::clear() during production testing on our proxy platform.
> {code}
> NOTE: Traffic Server received Sig 11: Segmentation fault
> /home/y/bin/traffic_server - STACK TRACE:
> /lib64/libpthread.so.0(+0x3420a0f500)[0x2b47cff7d500]
> /home/y/bin/traffic_server(_ZN11SpdyRequest5clearEv+0x14)[0x5d06d4]
> /home/y/bin/traffic_server(_ZN6SpdySM5clearEv+0x34)[0x5d0b84]
> /home/y/bin/traffic_server[0x5d13d1]
> /home/y/bin/traffic_server(_ZN18UnixNetVConnection14readSignalDoneEiP10NetHandler+0x3d)[0x6efbcd]
> /home/y/bin/traffic_server(_ZN17SSLNetVConnection11net_read_ioEP10NetHandlerP7EThread+0xc76)[0x6df3e6]
> /home/y/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x1f2)[0x6e6b12]
> /home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x8f)[0x714aaf]
> /home/y/bin/traffic_server(_ZN7EThread7executeEv+0x493)[0x715453]
> /home/y/bin/traffic_server[0x713e5a]
> /lib64/libpthread.so.0(+0x3420a07851)[0x2b47cff75851]
> /lib64/libc.so.6(clone+0x6d)[0x34202e894d]
> {code}
> gdb output shows that req may be a null pointer in SpdySM::clear()
> {code}
> (gdb) bt
> #0 SpdyRequest::clear (this=0x0) at SpdySM.cc:44
> #1 0x00000000005d0b84 in SpdySM::clear (this=0x2aefdc4b46a0) at SpdySM.cc:116
> #2 0x00000000005d13d1 in spdy_default_handler (contp=<value optimized out>, event=TS_EVENT_VCONN_INACTIVITY_TIMEOUT, edata=<value optimized out>) at SpdySM.cc:258
> #3 0x00000000006eeabb in handleEvent (event=<value optimized out>, vc=0x2aefcc141ca0) at ../../iocore/eventsystem/I_Continuation.h:146
> #4 read_signal_and_update (event=<value optimized out>, vc=0x2aefcc141ca0) at UnixNetVConnection.cc:216
> #5 0x00000000006f3cd4 in UnixNetVConnection::mainEvent (this=0x2aefcc141ca0, event=<value optimized out>, e=<value optimized out>) at UnixNetVConnection.cc:1152
> #6 0x00000000006e7d0e in handleEvent (this=0x1634af0, event=<value optimized out>, e=0x136e850) at ../../iocore/eventsystem/I_Continuation.h:146
> #7 InactivityCop::check_inactivity (this=0x1634af0, event=<value optimized out>, e=0x136e850) at UnixNet.cc:67
> #8 0x0000000000714aaf in handleEvent (this=0x2aef48f91010, e=0x136e850, calling_code=2) at I_Continuation.h:146
> #9 EThread::process_event (this=0x2aef48f91010, e=0x136e850, calling_code=2) at UnixEThread.cc:145
> #10 0x0000000000715623 in EThread::execute (this=0x2aef48f91010) at UnixEThread.cc:224
> #11 0x0000000000713e5a in spawn_thread_internal (a=0x163d000) at Thread.cc:88
> #12 0x00002aed0a8a1851 in start_thread () from /lib64/libpthread.so.0
> #13 0x00000034202e894d in clone () from /lib64/libc.so.6
> (gdb) print this
> $1 = (SpdyRequest * const) 0x0
> (gdb) up
> #1 0x00000000005d0b84 in SpdySM::clear (this=0x2aefdc4b46a0) at SpdySM.cc:116
> 116 in SpdySM.cc
> (gdb) print req
> $2 = (SpdyRequest *) 0x0
> (gdb) print iter
> $3 = {first = 25, second = }
> (gdb) print req_map
> $4 = std::map with 3 elements = {
> [1] = 0x2aefd81e5ae0,
> [25] = 0x0,
> [27] = 0x0
> }
> {code}