Posted to issues@trafficserver.apache.org by "Leif Hedstrom (JIRA)" <ji...@apache.org> on 2016/03/16 17:09:33 UTC

[jira] [Commented] (TS-4278) HostDB sync causes active transactions to block for 100's of ms

    [ https://issues.apache.org/jira/browse/TS-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197581#comment-15197581 ] 

Leif Hedstrom commented on TS-4278:
-----------------------------------

Nice catch. I'm +1 on setting this to 0 as the default for v7.0.0 (maybe file a separate Jira for that, if you agree). Unless, of course, we manage to replace HostDB entirely for 7.0.0 :). That said, would it help if we scheduled this on a task thread? That would also imply changing MultiCacheBase and how it schedules.

Fwiw, we run traffic_server with the "-k" option in production, to force a flush on every startup. But this is better IMO.

Also, I noticed that setting this config to 0 on a running system does not stop it from syncing. Checking the code, we do reload the config, but it doesn't seem to allow for the case of disabling the continuation when we set it to 0. [~shinrich] Can you confirm this? If so, should we fix that for 6.2?
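
To make the reload case concrete, here is a minimal standalone sketch of the behavior I'd expect. This is an assumption-laden illustration, not the actual HostDB or MultiCacheBase code: SyncSchedule and apply_frequency are hypothetical names, and the real fix would have to cancel the scheduled continuation rather than flip a flag.

```cpp
// Hypothetical stand-in for the HostDB sync schedule.
// NOT the Traffic Server Continuation/Event API; names are illustrative only.
class SyncSchedule {
public:
  // Apply a freshly (re)loaded sync frequency, in seconds.
  // A value of 0 (or less) means "never sync".
  void apply_frequency(int freq_sec) {
    if (freq_sec <= 0) {
      // The case that seems to be missing today: a reload to 0
      // has to cancel whatever periodic sync is already scheduled.
      scheduled_  = false;
      period_sec_ = 0;
      return;
    }
    // Otherwise (re)schedule the sync with the new period.
    scheduled_  = true;
    period_sec_ = freq_sec;
  }

  bool is_scheduled() const { return scheduled_; }
  int period_sec() const { return period_sec_; }

private:
  bool scheduled_  = false;
  int period_sec_  = 0;
};
```

With this shape, calling apply_frequency(120) and then apply_frequency(0) leaves nothing scheduled, which is what a reload to 0 should do.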

> HostDB sync causes active transactions to block for 100's of ms
> ---------------------------------------------------------------
>
>                 Key: TS-4278
>                 URL: https://issues.apache.org/jira/browse/TS-4278
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: HostDB
>            Reporter: Susan Hinrichs
>             Fix For: 6.2.0
>
>
> When HostDB syncs to disk (by default every two minutes), active transactions will block when they reach HttpSM::do_hostdb_lookup. This is because do_hostdb_lookup calls hostDBProcessor.getbyname_imm, which attempts to get the bucket locks. The delays generally last for 500-1200ms. This blocks the event loop, so no other actions will be performed by the net handler until the lock is dropped.
> I'm assuming that the bucket locks are grabbed by the sync logic. When I increased proxy.config.cache.hostdb.sync_frequency to 1200, the slowdown every two minutes went away. Fortunately, setting proxy.config.cache.hostdb.sync_frequency to 0 seems to eliminate the sync completely, which will be my suggested solution internally.
> I tried reducing the size of the hostdb table, but that didn't seem to affect the delay time.
> The delay only showed up reliably on a loaded system. Running my httperf test case on a machine with no other activity did not show the delays.
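
For reference, the workaround described in the issue would look roughly like this in records.config. The setting name and the values 0 and 1200 come from the report; the CONFIG line layout and the INT type are my assumption based on the usual records.config syntax, so check them against your version. Since changing the value on a running system did not appear to stop the sync, a restart may be needed for 0 to take effect.

```
# Disable the periodic HostDB sync to disk (0 = never sync).
CONFIG proxy.config.cache.hostdb.sync_frequency INT 0

# Alternative tried in the report: sync far less often (value in seconds).
# CONFIG proxy.config.cache.hostdb.sync_frequency INT 1200
```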



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)