You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@trafficserver.apache.org by "Manjesh Nilange (JIRA)" <ji...@apache.org> on 2010/01/13 23:45:54 UTC
[jira] Commented: (TS-87) Performance improvement: Avoid linear
search in remap code
[ https://issues.apache.org/jira/browse/TS-87?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799980#action_12799980 ]
Manjesh Nilange commented on TS-87:
-----------------------------------
Proposed change:
Design:
We create a trie for all the mappings for a given host, specific to a <url_type, port> combination. The <url_type, port> combination is required so that mappings for different purposes exist separately. Each node of this trie will have a pointer to the mapping and it's rank (if one exists) and a array of child node pointers.
Insert: We first hash the host field of the incoming request. We use the trie collection for that host, if it exists or else create a new one. We then find/create a trie for the <url_type, port> combination of the incoming request and then store the mapping in the trie.
Search: We first find hash the host field and find the trie collection specific to that host. We then use the <url_type, port> of the incoming request to find a trie within that collection. We then traverse that trie using the incoming request's path. As we traverse, we retain the best-ranked mapping we encounter. This ensures both these properties:
1) incoming path /abcd should match configured path /a - this works because we encounter the mapping for /a in the earlier the traversal itself
2) if mappings are configured for both /a and /abcd and /abcd has a better rank (using line number for now), the mapping for /abcd will be returned as it'll win over the first mapping of /a owing to rank.
In the hash table, we store a collection of tries for every host string.
New classes:
Trie - A template class that implements a trie; It's a template because it can be instantiated to store anything in the nodes. Right now it's only instantiation is to store url_mapping pointers.
UrlMappingPathIndex - Serves as an abstraction for the trie collection. This class hides away the <url_type, port> differentiating and the actual Trie. UrlRewrite.cc just stores UrlMappingPathIndex objects in the hash table and is not aware of any other details.
> Performance improvement: Avoid linear search in remap code
> ----------------------------------------------------------
>
> Key: TS-87
> URL: https://issues.apache.org/jira/browse/TS-87
> Project: Traffic Server
> Issue Type: Improvement
> Components: Core
> Reporter: Manjesh Nilange
> Assignee: Manjesh Nilange
> Priority: Minor
> Attachments: trie.patch
>
>
> Currently, the remap code stores all the remap rules in a hash table keyed by the host field of the "from URL" of the rule. Entries with the same host field are chained in a linked list. When looking for a mapping, the code has to do a linear traversal of this list, based on the path of the incoming request to find a matching rule. This performance should be improved as overhead can be substantial in cases where there are hundreds of the entries for the same host.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.