Posted to issues@trafficcontrol.apache.org by "Jeff Elsloo (JIRA)" <ji...@apache.org> on 2017/06/27 13:45:00 UTC

[jira] [Resolved] (TC-401) Traffic Router Serves OFFLINE Caches

     [ https://issues.apache.org/jira/browse/TC-401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Elsloo resolved TC-401.
----------------------------
    Resolution: Fixed

> Traffic Router Serves OFFLINE Caches
> ------------------------------------
>
>                 Key: TC-401
>                 URL: https://issues.apache.org/jira/browse/TC-401
>             Project: Traffic Control
>          Issue Type: Bug
>          Components: Traffic Router
>    Affects Versions: 2.0.0
>            Reporter: Jeff Elsloo
>             Fix For: 2.1.0
>
>
> We identified an issue that causes Traffic Router to keep serving an {{OFFLINE}} cache indefinitely after a snapshot of the CRConfig. The inverse also occurs: a cache that was previously set to {{OFFLINE}} never has traffic routed to it after it is set back to {{ONLINE}} or {{REPORTED}} (referred to only as {{ONLINE}} henceforth).
> The bug is caused by {{ConfigHandler.processConfig()}} clearing the cache locations from the {{NetworkNode}} before swapping out the instance of {{CacheRegister}}. While the cache locations have been cleared but the prior {{CacheRegister}} is still in place, a race condition can occur in which the {{CacheLocation}} for a given cache group from the prior config is set on the recently cleared {{NetworkNode}}. When this happens, the {{List<Cache>}} is the prior config's list for that cache group, so any host state change between {{ONLINE}} and {{OFFLINE}} is not reflected. This is because a {{Cache}} drops out of the CRConfig when it transitions to {{OFFLINE}} and reappears when it is set to {{ONLINE}}. Contrast this with a transition from {{ONLINE}} to {{ADMIN_DOWN}}: the {{Cache}} remains in the CRConfig, so we simply use the status to determine whether the cache is available, and the software works as designed.
> The race is possible because of the way we use lazy loading to associate network ranges within the CZF with {{CacheLocations}} within a given {{NetworkNode}} representing that section of the CZF. In {{TrafficRouter}}, during cache selection, if we have a hit in the coverage zone file but the {{CacheLocation}} is uninitialized, we obtain the {{CacheLocation}} from {{CacheRegister}} and set it for that specific {{NetworkNode}}. If our {{NetworkNode}} is cleared but our {{CacheRegister}} has yet to be swapped, we set the {{NetworkNode}} to the old {{CacheLocation}}, which, as mentioned, holds a reference to the prior {{List<Cache>}}. Because the {{NetworkNode}} is then no longer uninitialized, nothing ever populates it with the new {{CacheLocation}} and new {{List<Cache>}}.
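
The interleaving described above can be replayed deterministically in a small sketch. This is not the Traffic Router source: the class names NetworkNode, CacheRegister and CacheLocation come from the issue, but every field, method and string here (selectCaches, replay, "group-a", "cache-1", "cache-2") is a hypothetical stand-in, and strings take the place of Cache objects. The swap-before-clear ordering is shown only as one ordering that avoids the stale reference; the issue text does not say how the actual fix was made.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical stand-ins for the classes named in the issue.
class LazyLoadRaceSketch {
    static class CacheLocation {
        final List<String> caches; // stands in for List<Cache>
        CacheLocation(List<String> caches) { this.caches = caches; }
    }

    static class CacheRegister {
        final Map<String, CacheLocation> locations = new HashMap<>();
    }

    static class NetworkNode {
        final String cacheGroup;
        CacheLocation location; // null means "uninitialized"
        NetworkNode(String cacheGroup) { this.cacheGroup = cacheGroup; }
    }

    // The currently active register, shared by all requests.
    static CacheRegister register;

    // Lazy association during cache selection: an uninitialized node
    // resolves its location from whichever register is active right now.
    static List<String> selectCaches(NetworkNode node) {
        if (node.location == null) {
            node.location = register.locations.get(node.cacheGroup);
        }
        return node.location.caches;
    }

    static CacheRegister registerWith(String... caches) {
        CacheRegister r = new CacheRegister();
        r.locations.put("group-a", new CacheLocation(Arrays.asList(caches)));
        return r;
    }

    // Replays one config update with a request arriving between the two
    // steps, then returns what a later request is served.
    static List<String> replay(boolean clearBeforeSwap) {
        register = registerWith("cache-1", "cache-2");
        NetworkNode node = new NetworkNode("group-a");
        selectCaches(node); // node initialized from the old config

        // New snapshot: cache-2 was set OFFLINE, so it is absent.
        CacheRegister next = registerWith("cache-1");

        if (clearBeforeSwap) {
            node.location = null; // step 1: clear the node
            selectCaches(node);   // request re-populates it from the OLD register
            register = next;      // step 2: swap happens too late
        } else {
            register = next;      // step 1: swap the register
            selectCaches(node);   // request still sees the old list, once
            node.location = null; // step 2: clear; next lookup uses the new register
        }
        return selectCaches(node);
    }

    public static void main(String[] args) {
        System.out.println(replay(true));  // [cache-1, cache-2]: OFFLINE cache still served
        System.out.println(replay(false)); // [cache-1]
    }
}
```

In the clear-before-swap replay the node is never null again after the racing request, so the stale list persists until the next config update. In the other ordering a racing request can still see the old list, but only until the clear runs.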



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)