Posted to issues@trafficserver.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/05/20 20:40:12 UTC

[jira] [Commented] (TS-4331) Hostdb consistency problems due to MultiCache

    [ https://issues.apache.org/jira/browse/TS-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15294146#comment-15294146 ] 

ASF GitHub Bot commented on TS-4331:
------------------------------------

GitHub user jacksontj opened a pull request:

    https://github.com/apache/trafficserver/pull/653

    TS-4331: Major re-write of hostdb

    This is still a WIP, but I figure it's about time to get some feedback on the code here.
    
    The primary goal of this is to fix the crashing issues described in TS-4331, where basically multicache wasn't keeping track of who referenced various items within the cache.
    
    At this point my TODO list still is:
    - Configurable size/item limits
    - configurable max RR/SRV records (instead of a constant)
    - Additional tests
    -- tests for syncing/loading cache from disk
    - cleanup alloc() method in RefCountCache (right now it's using new; need to either move to ClassAllocator or just pad the iobuf allocation and put them in there)
    - bounds checking in alloc() (specifically that we could even allocate an item of the size requested)
    - additional metrics
    -- number of items
    -- total size
    -- hit rate
    
    I'm sure more will come up, I'll do my best to keep this list ^^ up-to-date
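
As an illustration of the reference-tracking idea described above, here is a minimal sketch using std::shared_ptr. It is not the RefCountCache from the pull request: SketchCache, Record, and their members are hypothetical names chosen for this example. It only shows why an entry that a reader still holds can safely outlive its eviction from the cache.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical record type for illustration; not the real HostDBInfo layout.
struct Record {
  std::string hostname;
  int ttl_seconds;
};

// Hypothetical cache: every entry is reference counted, so evicting an entry
// only drops the cache's own reference. The memory is released once the last
// reader lets go, instead of being reused underneath them.
class SketchCache {
public:
  void put(const std::string &key, Record rec) {
    items_[key] = std::make_shared<const Record>(std::move(rec));
  }

  // Returns a counted reference, or an empty pointer when the key is absent.
  std::shared_ptr<const Record> get(const std::string &key) const {
    auto it = items_.find(key);
    if (it == items_.end()) {
      return nullptr;
    }
    return it->second;
  }

  void evict(const std::string &key) { items_.erase(key); }

  std::size_t size() const { return items_.size(); }

private:
  std::unordered_map<std::string, std::shared_ptr<const Record>> items_;
};
```

A caller that did get() before an evict() keeps a valid Record until its shared_ptr goes out of scope, which is the property the old MultiCache heap lacked.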

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jacksontj/trafficserver hostdb_cleanup

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/trafficserver/pull/653.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #653
    
----



> Hostdb consistency problems due to MultiCache
> ---------------------------------------------
>
>                 Key: TS-4331
>                 URL: https://issues.apache.org/jira/browse/TS-4331
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: HostDB
>            Reporter: Thomas Jackson
>            Assignee: Thomas Jackson
>             Fix For: 7.0.0
>
>
> This ticket is for the correct long-term fix to TS-4207.
> The following is pulled from a comment that sums up the issue:
> {quote}
> Leif Hedstrom I have spent a decent amount of time on this while I was OOO on vacation the last couple of weeks. It seems that the root cause of this issue has always existed, and that with the addition of always storing the hostname (https://github.com/apache/trafficserver/commit/0e703e1e) we are just causing the issue to happen all the time.
> To understand the issue I'll give a little background on how hostdb currently works. Basically hostdb is just a wrapper around this templated struct called MultiCache. MultiCache is "multi" not because it is templated, but because it has two types of storage (static-- blocks and dynamic-- alloc). The static side of the cache can hold N HostDBInfo structs (the results of DNS queries). The dynamic side is used to store the round robin records and various strings associated with the record. The size of this dynamic space is defined as (N x estimated_heap_bytes_per_entry). The basic problem we are running into is that we are putting too much pressure on the dynamic heap-- such that the heap is getting re-used while people still have references to items in that space.
> So, I've actually been working on re-writing MultiCache to allocate the entire required block at once (so we don't have this problem where the parent exists but not the children), but I'm not certain if we want such a change to go into the 6.x branch (I'm willing to discuss if we want). If we aren't comfortable with such a large change I suggest just accounting for the hostname size in the estimated_heap_bytes_per_entry as a stopgap solution. The maximum allowable size is 253 (so 254 with null terminator), but we could pick a smaller number (~120 or so seems to be more reasonable). Alternatively you can increase the number of records in hostdb (and the size accordingly) to increase the dynamic heap size.
> TL;DR: almost done with the long term solution, but I'm not sure if we want to merge that into 6.x-- alternatively we can do a simple workaround in 6.x (https://github.com/apache/trafficserver/pull/553)
> {quote}
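
The heap sizing described in the quote can be sketched as plain arithmetic. This is only an illustration under stated assumptions: the function names are hypothetical, and the 254-byte figure is the maximum hostname size (253 plus a null terminator) mentioned above, not a constant taken from the Traffic Server source.

```cpp
#include <cstddef>

// Maximum hostname size from the discussion: 253 bytes plus a null terminator.
constexpr std::size_t kMaxHostnameBytes = 254;

// Dynamic heap size as described: N records times the per-entry estimate.
constexpr std::size_t dynamic_heap_bytes(std::size_t n_records,
                                         std::size_t estimated_heap_bytes_per_entry) {
  return n_records * estimated_heap_bytes_per_entry;
}

// Stopgap idea from the quote: add a hostname allowance to the per-entry
// estimate. The allowance is capped at the maximum hostname size.
constexpr std::size_t padded_estimate(std::size_t base_estimate,
                                      std::size_t hostname_allowance) {
  return base_estimate +
         (hostname_allowance < kMaxHostnameBytes ? hostname_allowance
                                                 : kMaxHostnameBytes);
}
```

For example, with a hypothetical base estimate of 100 bytes and the ~120-byte hostname allowance suggested above, dynamic_heap_bytes(1000, padded_estimate(100, 120)) comes to 220000 bytes for 1000 records, versus 100000 bytes without the allowance.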


