Posted to issues@trafficserver.apache.org by "Thomas Jackson (JIRA)" <ji...@apache.org> on 2016/11/12 00:53:58 UTC

[jira] [Created] (TS-5052) Segfault in HostDB sync if something fails while not holding the parent continuation mutex

Thomas Jackson created TS-5052:
----------------------------------

             Summary: Segfault in HostDB sync if something fails while not holding the parent continuation mutex
                 Key: TS-5052
                 URL: https://issues.apache.org/jira/browse/TS-5052
             Project: Traffic Server
          Issue Type: Bug
            Reporter: Thomas Jackson


What we noticed was the following in traffic.out:

{code}
Server {0x2af761e0d700} WARNING: <P_RefCountCache.h:510 (initialize_storage)> Unable to create temporary file /var/trafficserver/host.db.syncing, unable to persist hostdb: -13 error:Permission denied
traffic_server: Segmentation fault (Address not mapped to object [0x28])
traffic_server - STACK TRACE: 
{code}

That led me to dig into it, and it turns out the issue is related to changes made after the HostDB rewrite to move syncing outside of the main NET threads. Before, all calls to this syncer were done in a single net thread, wherever it was initially scheduled. Now we bounce between ET_TASK threads and ET_NET threads (to avoid switching, lock contention, etc.), but the error handlers weren't updated to handle this situation.

To fix this, I've added "set_error" and "return_error" methods to the RefCountCacheSerializer that take this into consideration. Specifically, the serializer returns the error immediately if it is already running on the calling thread; otherwise it reschedules itself onto that thread and *then* returns the error.
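
The reschedule-then-return behaviour described above can be sketched in standalone C++. This is only an illustration of the pattern, not the actual patch: EventThread, SerializerSketch, Delivery and run_demo are hypothetical names built on std::thread rather than the Traffic Server event system, and the signatures of set_error and return_error are assumed (only the two method names come from this issue).

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical stand-in for an event thread (ET_NET or ET_TASK): runs queued
// tasks one at a time on its own std::thread.
class EventThread {
public:
  EventThread() : worker_([this] { run(); }), id_(worker_.get_id()) {}
  ~EventThread() { stop(); }

  void post(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push(std::move(task));
    }
    cv_.notify_one();
  }
  // Drain everything queued so far, then join the worker.
  void stop() {
    if (worker_.joinable()) {
      post(nullptr); // an empty task tells the loop to exit
      worker_.join();
    }
  }
  std::thread::id id() const { return id_; }

private:
  void run() {
    for (;;) {
      std::unique_lock<std::mutex> lock(mutex_);
      cv_.wait(lock, [this] { return !tasks_.empty(); });
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop();
      lock.unlock();
      if (!task) return;
      task();
    }
  }
  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  std::thread worker_; // declared after the queue state that run() uses
  std::thread::id id_;
};

// Hypothetical serializer: the error callback must only run on the owner thread.
class SerializerSketch {
public:
  SerializerSketch(EventThread &owner, std::function<void(int)> on_error)
    : owner_(owner), on_error_(std::move(on_error)) {}

  void set_error(int err) { error_ = err; } // assumed: only records the failure

  void return_error() {
    assert(error_ != 0);
    if (std::this_thread::get_id() == owner_.id()) {
      on_error_(error_); // already on the owner thread: report immediately
    } else {
      owner_.post([this] { return_error(); }); // hop to the owner thread first
    }
  }

private:
  EventThread &owner_;
  std::function<void(int)> on_error_;
  int error_ = 0;
};

struct Delivery {
  int error = 0;
  bool on_owner_thread = false;
};

// Fail on a "task" thread and observe where the error callback runs.
Delivery run_demo() {
  EventThread net, task;
  Delivery seen;
  SerializerSketch s(net, [&](int err) {
    seen.error = err;
    seen.on_owner_thread = (std::this_thread::get_id() == net.id());
  });
  task.post([&] {
    s.set_error(-13); // same code as in the log above (permission denied)
    s.return_error(); // called off the owner thread, so it hops to net
  });
  task.stop(); // the failing job has run and queued the hop
  net.stop();  // the hop has run and delivered the error
  return seen;
}
```

In this sketch the error callback stands in for calling back into the parent continuation. Routing every failure through one return_error path means an error raised on the "task" thread is never delivered from that thread, which is the situation the old error handlers did not account for.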


