Posted to issues@trafficserver.apache.org by "Alan M. Carroll (JIRA)" <ji...@apache.org> on 2011/09/03 01:50:09 UTC

[jira] [Commented] (TS-934) Proxy Mutex null pointer crash

    [ https://issues.apache.org/jira/browse/TS-934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096479#comment-13096479 ] 

Alan M. Carroll commented on TS-934:
------------------------------------

bcall reports seeing something that looks very much like this problem (a crash in do_io_close at the ProxyMutexPtr dereference, with "this" equal to 0). He does not see it at 65K TPS but does at 140K TPS on the 3.0.1 codebase, which does not include the TS-911 fix.

> Proxy Mutex null pointer crash
> ------------------------------
>
>                 Key: TS-934
>                 URL: https://issues.apache.org/jira/browse/TS-934
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 3.1.0
>         Environment: Debian 6.0.2 quadcore, forward transparent proxy.
>            Reporter: Alan M. Carroll
>            Assignee: Alan M. Carroll
>             Fix For: 3.1.1
>
>
> [Client report]
> We had the cache crash gracefully twice last night on a segfault.  Both 
> times the callstack produced by trafficserver's signal handler was:
> /usr/bin/traffic_server[0x529596]
> /lib/libpthread.so.0(+0xef60)[0x2ab09a897f60]
> [0x2ab09e7c0a10]
> /usr/bin/traffic_server(HttpServerSession::do_io_close(int)+0xa8)[0x567a3c]
> /usr/bin/traffic_server(HttpVCTable::cleanup_entry(HttpVCTableEntry*)+0x4c)[0x56aff6]
> /usr/bin/traffic_server(HttpVCTable::cleanup_all()+0x64)[0x56b07a]
> /usr/bin/traffic_server(HttpSM::kill_this()+0x120)[0x57c226]
> /usr/bin/traffic_server(HttpSM::main_handler(int, void*)+0x208)[0x571b28]
> /usr/bin/traffic_server(Continuation::handleEvent(int, void*)+0x69)[0x4e4623]
> I went through the disassembly and the instruction that it is on in 
> ::do_io_close is loading the value of diags (not dereferencing it), so it 
> is unlikely that that threw a segfault (unless this is somehow in 
> thread local storage and that is corrupt).
> The kernel message claimed that the instruction pointer was 0x4e438e 
> which in this build is in ProxyMutexPtr::operator ->() on the 
> instruction that dereferences the object pointer to get the stored mutex 
> pointer (bingo!), so it would seem that at some point we are 
> dereferencing a null "safe" pointer.
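
To make the failure mode described above concrete, here is a minimal sketch of how a counted "safe" pointer can still fault inside operator->(). It assumes the handle is a member of an object that is itself null, which is one reading of the "value of 0 for this" report, not a confirmed diagnosis. SketchMutex, SketchMutexPtr, SketchSession and sketch_close are hypothetical names invented for illustration; they are not the Traffic Server classes or their real layout.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the Traffic Server classes.
struct SketchMutex {
  int refcount = 0;   // number of SketchMutexPtr handles pointing here
  bool held = false;  // toy "locked" flag
};

// A counted handle. operator-> reads m_ptr through `this`, so calling it on a
// handle that lives inside a null object faults before any check could run.
class SketchMutexPtr {
public:
  explicit SketchMutexPtr(SketchMutex *p = nullptr) : m_ptr(p) {
    if (m_ptr) ++m_ptr->refcount;
  }
  ~SketchMutexPtr() {
    if (m_ptr) {
      assert(m_ptr->refcount > 0);
      --m_ptr->refcount;
    }
  }
  SketchMutexPtr(const SketchMutexPtr &) = delete;
  SketchMutexPtr &operator=(const SketchMutexPtr &) = delete;

  SketchMutex *operator->() const { return m_ptr; } // unchecked, like a raw pointer
  explicit operator bool() const { return m_ptr != nullptr; }

private:
  SketchMutex *m_ptr;
};

// Assumed shape: the session owns a handle to its mutex.
struct SketchSession {
  explicit SketchSession(SketchMutex *m) : mutex(m) {}
  SketchMutexPtr mutex;
};

// Defensive close: validate the session and its handle before using ->.
// Returns false when there is nothing safe to dereference.
bool sketch_close(SketchSession *session) {
  if (session == nullptr || !session->mutex) {
    return false;
  }
  session->mutex->held = false;
  return true;
}
```

Calling sketch_close with a null session, or with a session whose handle is empty, returns false instead of reaching operator->(). A guard like this only hides the symptom; it does not explain why the session or its mutex went away under load.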
