Posted to issues@trafficcontrol.apache.org by GitBox <gi...@apache.org> on 2022/05/05 22:16:51 UTC

[GitHub] [trafficcontrol] rawlinp opened a new pull request, #6810: Blueprint: Cache Config Service

rawlinp opened a new pull request, #6810:
URL: https://github.com/apache/trafficcontrol/pull/6810

   Introduce a blueprint for the Cache Config Service.
   
   
   <!--
   Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements.  See the NOTICE file
   distributed with this work for additional information
   regarding copyright ownership.  The ASF licenses this file
   to you under the Apache License, Version 2.0 (the
   "License"); you may not use this file except in compliance
   with the License.  You may obtain a copy of the License at
   
       http://www.apache.org/licenses/LICENSE-2.0
   
   Unless required by applicable law or agreed to in writing,
   software distributed under the License is distributed on an
   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   KIND, either express or implied.  See the License for the
   specific language governing permissions and limitations
   under the License.
   -->
   




[GitHub] [trafficcontrol] ocket8888 commented on a diff in pull request #6810: Blueprint: Cache Config Service

Posted by GitBox <gi...@apache.org>.
ocket8888 commented on code in PR #6810:
URL: https://github.com/apache/trafficcontrol/pull/6810#discussion_r887197646


##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+# Cache Config Service
+
+## Problem Description
+In order to remove the Traffic Ops database as a bottleneck in the distribution
+of cache configuration data to CDN caching servers, we need a way to replicate
+the data to a set of Cache Config Services (CCS) from which the caching servers
+will request it. Instead of having thousands of caching servers
+simultaneously making requests to Traffic Ops (or to an intermediary
+caching proxy in front of Traffic Ops), the Cache Config Services will request
+the necessary data directly from Traffic Ops on behalf of the caching servers.
+The Cache Config Services will then bundle the data into CDN-specific
+"snapshots" which they will then serve to requesting caching servers (via
+`t3c`). Because the Cache Config Services are horizontally scalable, the
+bottleneck on the Traffic Ops database will be remediated, and caching servers
+will be able to request and process updates much more frequently than they
+can today.
+
+Additionally, reducing the number of Traffic Ops APIs used by `t3c`
+reduces the amount of coupling between them, making each easier to change
+without breaking the other. As long as the data snapshot is kept stable and
+backwards-compatible, `t3c` won't be affected by most necessary breaking
+changes made to the Traffic Ops API. If the Traffic Ops API has a breaking
+change, backwards-compatibility changes would be made within the Cache Config
+Service so that it does not impact `t3c`.
+
+## Proposed Change
+`t3c` will be updated to optionally request data from the Cache Config Snapshot
+API which will be reverse-proxied by Traffic Ops to the Cache Config Services.
+The Traffic Ops reverse proxy functionality will be implemented separately, and
+this blueprint will depend on that functionality. The Cache Config Services
+will periodically poll Traffic Ops to check for queued updates. If updates are
+queued for a cache in a given CDN, the Cache Config Services will request all
+the necessary data from Traffic Ops for generating cache configuration for that
+given CDN then bundle it into a CDN-specific "snapshot" (with the timestamp of
+when the cache was queued). Caches will then request the snapshot with that
+particular timestamp from the Cache Config Service and use it to generate their
+configuration files.
+
+![](img/cache-config-service.png "Architectural diagram of Cache Config Service")
+
+### Traffic Portal Impact
+n/a
+
+### Traffic Ops Impact
+The new Cache Config Service(s) will be reverse-proxied through Traffic Ops.
+Since they will increase the efficiency of propagating CDN configuration data,
+load may be increased on the Traffic Ops servers themselves (mainly network,
+some CPU), but load on the Traffic Ops database will be greatly reduced.
+
+To increase the efficiency of Cache Config Services polling Traffic Ops for
+queued updates/revalidations, a new API will be added to Traffic Ops which
+returns the latest `config_update_time` and `revalidate_update_time` of the
+servers in a given CDN. Whenever either value increases for a given CDN, the
+Cache Config Service(s) will request all the necessary data from Traffic Ops
+then merge it into a single JSON snapshot to serve to caching servers.
+
+#### REST API Impact
+For Traffic Ops, something like this:
+
+`GET /cdn_update_times`
+
+Query params: none
+
+Response:
+```
+{
+  "response": [
+    {
+      "cdn": "foo-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    },
+    {
+      "cdn": "bar-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"

Review Comment:
   Nit but these timestamps aren't actually in RFC3339 format; they need a `Z` suffix.
   
   [Playground](https://go.dev/play/p/KqXYUbD0ABr)
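
The point above can be checked with Go's standard `time` package. This is a minimal sketch, not taken from the linked playground; the helper name `isRFC3339` is hypothetical and exists only for illustration. It tries the timestamp exactly as written in the blueprint's example response, then the same value with a `Z` suffix.

```go
package main

import (
	"fmt"
	"time"
)

// isRFC3339 reports whether s parses with Go's time.RFC3339 layout.
// (Hypothetical helper, named here only for this illustration.)
func isRFC3339(s string) bool {
	_, err := time.Parse(time.RFC3339, s)
	return err == nil
}

func main() {
	// Timestamp as written in the blueprint's example response: no zone designator.
	fmt.Println(isRFC3339("1970-01-01T00:00:01.234"))
	// The same value with an explicit UTC designator.
	fmt.Println(isRFC3339("1970-01-01T00:00:01.234Z"))
}
```

The layout requires a zone designator (`Z` or a numeric offset), so the first call reports a parse failure and the second succeeds.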



##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+#### REST API Impact
+For Traffic Ops, something like this:
+
+`GET /cdn_update_times`
+
+Query params: none
+
+Response:
+```
+{
+  "response": [
+    {
+      "cdn": "foo-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    },
+    {
+      "cdn": "bar-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    }
+  ]
+}
+```
+
+For the Cache Config Service:
+
+`GET /cache_config_snapshots?cdn=foo-cdn&t=1` (including but not limited to the following
+top-level fields, and each object will only contain fields that `t3c` requires):
+
+Query params:
+`cdn`: the name of the CDN
+`t`: the timestamp (in Unix epoch) of the snapshot (which will correspond to at
+least one server's `config_update_time` or `revalidate_update_time`)
+
+Response:
+```json
+{
+  "response": {
+    "servers": [],
+    "cachegroups": [],
+    "globalParams": [],
+    "cacheKeyConfigParams": [],
+    "remapConfigParams": [],
+    "parentConfigParams": [],
+    "profiles": [],
+    "deliveryServices": [],
+    "deliveryServiceServers": [],
+    "jobs": [],
+    "cdn": {},
+    "deliveryServiceRegexes": [],
+    "serverCapabilities": {},
+    "dsRequiredCapabilities": {},
+    "uriSigningKeys": {},
+    "urlSigKeys": {},
+    "sslKeys": [],
+    "topologies": [],
+  }
+}
+```
+
+
+#### Client Impact
+A new TO Go client method will be added for the `GET /cdn_update_times` API,
+but because `GET /cache_config_snapshots` will be served by the Cache Config
+Service, it may not have a corresponding method in the TO Go client (although
+the TO Go client "raw" method could still be used to request this API).
+
+#### Data Model / Database Impact
+The Traffic Ops data model and database schema will remain unchanged. The
+proposed Cache Config Services may or may not use a traditional database --
+they may just store the latest generated snapshots in memory.
+
+### t3c Impact
+`t3c` will be updated to optionally request data from the new Cache Config
+Services (via the Traffic Ops reverse proxy functionality). This will likely be
+a new CLI flag so that the new data request path can be enabled at-will.
+Eventually, the option will be required so that `t3c` doesn't have to maintain

Review Comment:
   > Eventually, the option will be required...
   
   If the behavior eventually becomes mandatory, wouldn't that obviate the need for the command-line option to exist at all? 



##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+For the Cache Config Service:
+
+`GET /cache_config_snapshots?cdn=foo-cdn&t=1` (including but not limited to the following
+top-level fields, and each object will only contain fields that `t3c` requires):
+
+Query params:
+`cdn`: the name of the CDN
+`t`: the timestamp (in Unix epoch) of the snapshot (which will correspond to at

Review Comment:
   Our [API Guidelines](https://traffic-control-cdn.readthedocs.io/en/latest/development/api_guidelines.html#date-time-format) state:
   
   > Wherever date/times are accepted as input, [Traffic Ops API](https://traffic-control-cdn.readthedocs.io/en/latest/api/index.html#to-api) endpoints MUST accept either format [Unix epoch timestamps or RFC3339 timestamp strings]...
   
   So to conform with that this would also need to support RFC3339-format timestamps.
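
One way the `t` query parameter could accept both formats is sketched below. This is an illustration only: `parseSnapshotTime` is a hypothetical name, and treating a bare integer as epoch *seconds* is an assumption, since the blueprint only says "Unix epoch" without giving a unit.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// parseSnapshotTime is a hypothetical helper that accepts either a Unix epoch
// timestamp (assumed here to be in seconds) or an RFC3339 string.
func parseSnapshotTime(raw string) (time.Time, error) {
	// A plain integer is treated as Unix epoch seconds (an assumption).
	if secs, err := strconv.ParseInt(raw, 10, 64); err == nil {
		return time.Unix(secs, 0).UTC(), nil
	}
	// Anything else must be a valid RFC3339 timestamp.
	t, err := time.Parse(time.RFC3339, raw)
	if err != nil {
		return time.Time{}, fmt.Errorf("timestamp %q is neither Unix epoch seconds nor RFC3339: %w", raw, err)
	}
	return t.UTC(), nil
}

func main() {
	for _, raw := range []string{"1", "1970-01-01T00:00:01Z", "yesterday"} {
		t, err := parseSnapshotTime(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(t.Format(time.RFC3339))
	}
}
```

Normalizing both inputs to UTC lets `t=1` and `t=1970-01-01T00:00:01Z` resolve to the same snapshot.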



##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+### t3c Impact
+`t3c` will be updated to optionally request data from the new Cache Config
+Services (via the Traffic Ops reverse proxy functionality). This will likely be
+a new CLI flag so that the new data request path can be enabled at-will.
+Eventually, the option will be required so that `t3c` doesn't have to maintain
+two different data request paths.
+
+### Traffic Monitor Impact
+n/a
+
+### Traffic Router Impact
+n/a
+
+### Traffic Stats Impact
+n/a
+
+### Traffic Vault Impact
+n/a
+
+### Documentation Impact
+The new `cdn_update_times` Traffic Ops API will be documented in the usual
+manner. Cache Config Service sections may be added to the existing
+documentation, including the documentation of its `cache_config_snapshots` API.

Review Comment:
   It'd be neat if you could just make a `README.rst` in the service's dedicated directory and the `setup.py` for our docs just included it automagically. That would take some tinkering, and it's not really related to this blueprint, but it would be nice to document the service one time, closest to its source code.



##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+#### REST API Impact
+For Traffic Ops, something like this:
+
+`GET /cdn_update_times`

Review Comment:
   Why the new endpoint instead of adding two properties to a CDN as represented in `/cdns`?
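
Whichever endpoint ends up carrying the two timestamps, the polling logic described in the blueprint could look roughly like the sketch below. The type and function names (`cdnUpdateTimes`, `changedCDNs`) are hypothetical, the field names are copied from the blueprint's example response, and the sample payloads use `Z`-suffixed timestamps so that Go's `time.Time` JSON decoding accepts them.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// cdnUpdateTimes mirrors one element of the example response in the blueprint.
type cdnUpdateTimes struct {
	CDN                    string    `json:"cdn"`
	LatestConfigUpdateTime time.Time `json:"latestConfigUpdateTime"`
	LatestRevalUpdateTime  time.Time `json:"latestRevalUpdateTime"`
}

type updateTimesResponse struct {
	Response []cdnUpdateTimes `json:"response"`
}

// changedCDNs decodes one poll result and returns the CDNs that are new or
// whose config or revalidation time advanced since the last poll.
// It records the latest values in seen.
func changedCDNs(seen map[string]cdnUpdateTimes, body []byte) ([]string, error) {
	var resp updateTimesResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	var changed []string
	for _, cur := range resp.Response {
		prev, ok := seen[cur.CDN]
		if !ok ||
			cur.LatestConfigUpdateTime.After(prev.LatestConfigUpdateTime) ||
			cur.LatestRevalUpdateTime.After(prev.LatestRevalUpdateTime) {
			changed = append(changed, cur.CDN)
		}
		seen[cur.CDN] = cur
	}
	return changed, nil
}

func main() {
	seen := map[string]cdnUpdateTimes{}
	// Two simulated poll results; the second advances foo-cdn's config update time.
	polls := []string{
		`{"response":[{"cdn":"foo-cdn","latestConfigUpdateTime":"1970-01-01T00:00:01.234Z","latestRevalUpdateTime":"1970-01-01T00:00:05.678Z"}]}`,
		`{"response":[{"cdn":"foo-cdn","latestConfigUpdateTime":"1970-01-01T00:00:09Z","latestRevalUpdateTime":"1970-01-01T00:00:05.678Z"}]}`,
	}
	for _, body := range polls {
		changed, err := changedCDNs(seen, []byte(body))
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println("CDNs needing a new snapshot:", changed)
	}
}
```

The same comparison would work if the two fields were instead properties on each `/cdns` object. Only the decoding struct would change.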



##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+# Cache Config Service
+
+## Problem Description
+In order to remove the Traffic Ops database as a bottleneck in the distribution
+of cache configuration data to CDN caching servers, we need a way to replicate
+the data to a set of Cache Config Services (CCS) from which the caching servers
+will request it. Instead of having thousands of caching servers
+simultaneously making requests to Traffic Ops (or to an intermediary
+caching proxy in front of Traffic Ops), the Cache Config Services will request
+the necessary data directly from Traffic Ops on behalf of the caching servers.
+The Cache Config Services will then bundle the data into CDN-specific
+"snapshots" which they will then serve to requesting caching servers (via
+`t3c`). Because the Cache Config Services are horizontally scalable, the
+bottleneck on the Traffic Ops database will be remediated, and caching servers
+will be able to request and process updates much more frequently than they
+can today.
+
+Additionally, reducing the number of Traffic Ops APIs used by `t3c`
+reduces the amount of coupling between them, making each easier to change
+without breaking the other. As long as the data snapshot is kept stable and
+backwards-compatible, `t3c` won't be affected by most necessary breaking
+changes made to the Traffic Ops API. If the Traffic Ops API has a breaking
+change, backwards-compatibility changes would be made within the Cache Config
+Service so that it does not impact `t3c`.
+
+## Proposed Change
+`t3c` will be updated to optionally request data from the Cache Config Snapshot
+API which will be reverse-proxied by Traffic Ops to the Cache Config Services.
+The Traffic Ops reverse proxy functionality will be implemented separately, and
+this blueprint will depend on that functionality. The Cache Config Services
+will periodically poll Traffic Ops to check for queued updates. If updates are
+queued for a cache in a given CDN, the Cache Config Services will request all
+the necessary data from Traffic Ops for generating cache configuration for that
+given CDN then bundle it into a CDN-specific "snapshot" (with the timestamp of
+when the cache was queued). Caches will then request the snapshot with that
+particular timestamp from the Cache Config Service and use it to generate their
+configuration files.
+
+![](img/cache-config-service.png "Architectural diagram of Cache Config Service")
+
+### Traffic Portal Impact
+n/a
+
+### Traffic Ops Impact
+The new Cache Config Service(s) will be reverse-proxied through Traffic Ops.
+Since they will increase the efficiency of propagating CDN configuration data,
+load may be increased on the Traffic Ops servers themselves (mainly network,
+some CPU), but load on the Traffic Ops database will be greatly reduced.
+
+To increase the efficiency of Cache Config Services polling Traffic Ops for
+queued updates/revalidations, a new API will be added to Traffic Ops which
+returns the latest `config_update_time` and `revalidate_update_time` of the
+servers in a given CDN. Whenever either value increases for a given CDN, the
+Cache Config Service(s) will request all the necessary data from Traffic Ops
+then merge it into a single JSON snapshot to serve to caching servers.
+
+#### REST API Impact
+For Traffic Ops, something like this:
+
+`GET /cdn_update_times`
+
+Query params: none
+
+Response:
+```json
+{
+  "response": [
+    {
+      "cdn": "foo-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    },
+    {
+      "cdn": "bar-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    }
+  ]
+}
+```
+
+For the Cache Config Service:
+
+`GET /cache_config_snapshots?cdn=foo-cdn&t=1` (including but not limited to the following
+top-level fields, and each object will only contain fields that `t3c` requires):
+
+Query params:
+`cdn`: the name of the CDN
+`t`: the timestamp (in Unix epoch) of the snapshot (which will correspond to at
+least one server's `config_update_time` or `revalidate_update_time`)
+
+Response:
+```json
+{
+  "response": {
+    "servers": [],
+    "cachegroups": [],
+    "globalParams": [],
+    "cacheKeyConfigParams": [],
+    "remapConfigParams": [],
+    "parentConfigParams": [],
+    "profiles": [],
+    "deliveryServices": [],
+    "deliveryServiceServers": [],
+    "jobs": [],
+    "cdn": {},
+    "deliveryServiceRegexes": [],
+    "serverCapabilities": {},
+    "dsRequiredCapabilities": {},
+    "uriSigningKeys": {},
+    "urlSigKeys": {},
+    "sslKeys": [],
+    "topologies": []
+  }
+}
+```
+
+
+#### Client Impact
+A new TO Go client method will be added for the `GET /cdn_update_times` API,
+but because `GET /cache_config_snapshots` will be served by the Cache Config
+Service, it may not have a corresponding method in the TO Go client (although
+the TO Go client "raw" method could still be used to request this API).
+
+#### Data Model / Database Impact
+The Traffic Ops data model and database schema will remain unchanged. The
+proposed Cache Config Services may or may not use a traditional database --
+they may just store the latest generated snapshots in memory.
+
+### t3c Impact
+`t3c` will be updated to optionally request data from the new Cache Config
+Services (via the Traffic Ops reverse proxy functionality). This will likely be
+a new CLI flag so that the new data request path can be enabled at-will.
+Eventually, the option will be required so that `t3c` doesn't have to maintain
+two different data request paths.
+
+### Traffic Monitor Impact
+n/a
+
+### Traffic Router Impact
+n/a
+
+### Traffic Stats Impact
+n/a
+
+### Traffic Vault Impact
+n/a
+
+### Documentation Impact
+The new `cdn_update_times` Traffic Ops API will be documented in the usual
+manner. Cache Config Service sections may be added to the existing
+documentation, including the documentation of its `cache_config_snapshots` API.
+The new `t3c` CLI flag will be added to its documentation.
+
+### Testing Impact
+The new `cdn_update_times` Traffic Ops API will be tested via the Traffic Ops API tests.
+
+The new Cache Config Service will have its own unit and integration tests.
+
+The `t3c` integration tests may need to be updated to use the Cache Config
+Service for data retrieval in addition to Traffic Ops.
+
+### Performance Impact
+One of the primary goals of this blueprint is to shift `t3c` request load off
+of the Traffic Ops database bottleneck onto horizontally-scalable Cache Config
+Services. This will allow CDNs to propagate changes much more quickly than they
+can today. Because the Cache Config Services will be reverse-proxied through
+Traffic Ops, network load will increase on Traffic Ops servers if `t3c` is run
+more frequently, but the additional CPU load should be fairly minimal because
+the reverse-proxying should not be CPU-intensive. That said, Traffic Ops can
+also be scaled out horizontally if necessary to spread out network and CPU
+load more effectively.
+
+In order to minimize any Traffic Ops database load from requests to Traffic Ops
+from the Cache Config Services, they will poll/query Traffic Ops in a
+consistent-hash-like manner so that they are not all making requests to Traffic

Review Comment:
   What is meant by "a consistent-hash-like manner"? Does this have to do with deciding which of a set of TO instances to query, or with the timing of making such requests?
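
To make the timing reading of the question concrete: one way instances could avoid polling Traffic Ops at the same moment is for each to derive a stable offset within the polling interval from a hash of its own identifier. The Go sketch below shows only that one reading and is not taken from the blueprint. The function `pollOffset` and the instance names are hypothetical, and a true consistent hash (for example, one that assigns CDNs to instances) would be a different design.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"time"
)

// pollOffset maps an instance identifier to a stable offset in the range
// [0, interval), so that different instances tend to poll at different moments.
// interval must be positive.
func pollOffset(instanceID string, interval time.Duration) time.Duration {
	h := fnv.New32a()
	h.Write([]byte(instanceID)) // Write on a hash.Hash never returns an error
	return time.Duration(uint64(h.Sum32()) % uint64(interval))
}

func main() {
	interval := 30 * time.Second
	for _, id := range []string{"ccs-1", "ccs-2", "ccs-3"} {
		// Each instance would wait this long after the start of every
		// interval before querying Traffic Ops.
		fmt.Printf("%s polls %v into each %v window\n", id, pollOffset(id, interval), interval)
	}
}
```

Because the offset depends only on the identifier and the interval, an instance keeps its slot across restarts, and adding or removing instances does not move the others.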



##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+Response:
+```json
+{
+  "response": {
+    "servers": [],
+    "cachegroups": [],
+    "globalParams": [],
+    "cacheKeyConfigParams": [],
+    "remapConfigParams": [],
+    "parentConfigParams": [],
+    "profiles": [],
+    "deliveryServices": [],
+    "deliveryServiceServers": [],
+    "jobs": [],
+    "cdn": {},
+    "deliveryServiceRegexes": [],
+    "serverCapabilities": {},
+    "dsRequiredCapabilities": {},

Review Comment:
   Assuming these are maps of some identifier for a server/DS to their respective Capabilities (remember server short hostnames aren't unique, so it shouldn't use that), could these just be properties of their respective object? e.g.
   ```jsonc
   {
     //...
     "servers": [
       //...
       {
         //...
         "capabilities": []
       }
     ],
     "deliveryServices": [
       //...
       {
         //...
         "requiredCapabilities": []
       }
     ]
   }
   ```
   
   Now that I'm looking through, seems like the same would apply to `uriSigningKeys`, `urlSigKeys`, possibly `sslKeys`, `deliveryServiceRegexes`, probably `jobs`, `deliveryServiceServers` (although probably in reverse such that servers have a list of assigned Delivery Services), and maybe `cacheKeyConfigParams`, `parentConfigParams`, and `remapConfigParams`.



##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+Response:
+```json
+{
+  "response": {
+    "servers": [],
+    "cachegroups": [],
+    "globalParams": [],
+    "cacheKeyConfigParams": [],
+    "remapConfigParams": [],
+    "parentConfigParams": [],

Review Comment:
   Why are these all separated from their presumably assigned `profiles`? If they're global and therefore not strictly tied to any Profile, why aren't they in `globalParams`?





[GitHub] [trafficcontrol] rawlinp commented on a diff in pull request #6810: Blueprint: Cache Config Service

Posted by GitBox <gi...@apache.org>.
rawlinp commented on code in PR #6810:
URL: https://github.com/apache/trafficcontrol/pull/6810#discussion_r906336931


##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+Response:
+```json
+{
+  "response": {
+    "servers": [],
+    "cachegroups": [],
+    "globalParams": [],
+    "cacheKeyConfigParams": [],
+    "remapConfigParams": [],
+    "parentConfigParams": [],
+    "profiles": [],
+    "deliveryServices": [],
+    "deliveryServiceServers": [],
+    "jobs": [],
+    "cdn": {},
+    "deliveryServiceRegexes": [],
+    "serverCapabilities": {},
+    "dsRequiredCapabilities": {},

Review Comment:
   That's a good suggestion. I meant for this to basically map to the `t3cutil.ConfigData` struct, and I can't say for sure why that struct was made that way. I imagine it's because those are all separate TO APIs and since that's how `t3c` requests the data, that's how `t3c` passes it around. In order to reduce unnecessary churn in `t3c`, I think it makes sense to keep it that way; otherwise, we'd have to change a lot of `t3c` to handle the new data structure for not much value.
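
For illustration, here is a minimal Go sketch of the flat layout under discussion, with one field per top-level key in the blueprint's example response. The `snapshot` type and `decodeSnapshot` function are hypothetical and are not the real `t3cutil.ConfigData`. Each field is left as raw JSON because the blueprint does not define the element types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// snapshot lists the top-level fields from the blueprint's example response.
// Values stay as raw JSON so that the element types can be decided later.
type snapshot struct {
	Servers                json.RawMessage `json:"servers"`
	CacheGroups            json.RawMessage `json:"cachegroups"`
	GlobalParams           json.RawMessage `json:"globalParams"`
	CacheKeyConfigParams   json.RawMessage `json:"cacheKeyConfigParams"`
	RemapConfigParams      json.RawMessage `json:"remapConfigParams"`
	ParentConfigParams     json.RawMessage `json:"parentConfigParams"`
	Profiles               json.RawMessage `json:"profiles"`
	DeliveryServices       json.RawMessage `json:"deliveryServices"`
	DeliveryServiceServers json.RawMessage `json:"deliveryServiceServers"`
	Jobs                   json.RawMessage `json:"jobs"`
	CDN                    json.RawMessage `json:"cdn"`
	DeliveryServiceRegexes json.RawMessage `json:"deliveryServiceRegexes"`
	ServerCapabilities     json.RawMessage `json:"serverCapabilities"`
	DSRequiredCapabilities json.RawMessage `json:"dsRequiredCapabilities"`
	URISigningKeys         json.RawMessage `json:"uriSigningKeys"`
	URLSigKeys             json.RawMessage `json:"urlSigKeys"`
	SSLKeys                json.RawMessage `json:"sslKeys"`
	Topologies             json.RawMessage `json:"topologies"`
}

// snapshotResponse is the envelope with the top-level "response" key.
type snapshotResponse struct {
	Response snapshot `json:"response"`
}

// decodeSnapshot unwraps the envelope and returns the snapshot it contains.
func decodeSnapshot(body []byte) (snapshot, error) {
	var resp snapshotResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return snapshot{}, fmt.Errorf("decoding snapshot: %w", err)
	}
	return resp.Response, nil
}

func main() {
	body := []byte(`{"response": {"servers": [], "cdn": {}, "topologies": []}}`)
	snap, err := decodeSnapshot(body)
	if err != nil {
		fmt.Println(err)
		return
	}
	// Fields absent from the body are left nil.
	fmt.Printf("servers=%s cdn=%s jobs present=%t\n", snap.Servers, snap.CDN, snap.Jobs != nil)
}
```

Nesting capabilities or keys under their owning server or Delivery Service, as suggested above, would change these struct definitions and every `t3c` call site that reads them, which is the churn being weighed here.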



##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+### t3c Impact
+`t3c` will be updated to optionally request data from the new Cache Config
+Services (via the Traffic Ops reverse proxy functionality). This will likely be
+a new CLI flag so that the new data request path can be enabled at-will.
+Eventually, the option will be required so that `t3c` doesn't have to maintain

Review Comment:
   Sure, once it's made mandatory, I'm sure `t3c` would just remove the CLI option altogether. I can call that out.



##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+# Cache Config Service
+
+## Problem Description
+In order to remove the Traffic Ops database as a bottleneck in the distribution
+of cache configuration data to CDN caching servers, we need a way to replicate
+the data to a set of Cache Config Services (CCS) from which the caching servers
+will request it. Instead of having thousands of caching servers
+simultaneously making requests to Traffic Ops (or to an intermediary
+caching proxy in front of Traffic Ops), the Cache Config Services will request
+the necessary data directly from Traffic Ops on behalf of the caching servers.
+The Cache Config Services will then bundle the data into CDN-specific
+"snapshots" which they will then serve to requesting caching servers (via
+`t3c`). Because the Cache Config Services are horizontally scalable, the
+bottleneck on the Traffic Ops database will be remediated, and caching servers
+will be able to request and process updates much more frequently than they
+can today.
+
+Additionally, by reducing the number of Traffic Ops APIs used by `t3c`, it
+reduces the amount of coupling between them, making each easier to change
+without breaking the other. As long as the data snapshot is kept stable and
+backwards-compatible, `t3c` won't be affected by most necessary breaking
+changes made to the Traffic Ops API. If the Traffic Ops API has a breaking
+change, backwards-compatibility changes would be made within the Cache Config
+Service so that it does not impact `t3c`.
+
+## Proposed Change
+`t3c` will be updated to optionally request data from the Cache Config Snapshot
+API which will be reverse-proxied by Traffic Ops to the Cache Config Services.
+The Traffic Ops reverse proxy functionality will be implemented separately, and
+this blueprint will depend on that functionality. The Cache Config Services
+will periodically poll Traffic Ops to check for queued updates. If updates are
+queued for a cache in a given CDN, the Cache Config Services will request all
+the necessary data from Traffic Ops for generating cache configuration for that
+given CDN then bundle it into a CDN-specific "snapshot" (with the timestamp of
+when the cache was queued). Caches will then request the snapshot with that
+particular timestamp from the Cache Config Service and use it to generate their
+configuration files.
+
+![](img/cache-config-service.png "Architectural diagram of Cache Config Service")
+
+### Traffic Portal Impact
+n/a
+
+### Traffic Ops Impact
+The new Cache Config Service(s) will be reverse-proxied through Traffic Ops.
+Since they will increase the efficiency of propagating CDN configuration data,
+load may be increased on the Traffic Ops servers themselves (mainly network,
+some CPU), but load on the Traffic Ops database will be greatly reduced.
+
+To increase the efficiency of Cache Config Services polling Traffic Ops for
+queued updates/revalidations, a new API will be added to Traffic Ops which
+returns the latest `config_update_time` and `revalidate_update_time` of the
+servers in a given CDN. Whenever this value increases for a given CDN, the
+Cache Config Service(s) will request all the necessary data from Traffic Ops
+then merge it into a single JSON snapshot to serve to caching servers.
+
+#### REST API Impact
+For Traffic Ops, something like this:
+
+`GET /cdn_update_times`
+
+Query params: none
+
+Response:
+```
+{
+  "response": [
+    {
+      "cdn": "foo-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    },
+    {
+      "cdn": "bar-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    }
+  ]
+}
+```
+
+For the Cache Config Service:
+
+`GET /cache_config_snapshots?cdn=foo-cdn&t=1` (including but not limited to the following
+top-level fields, and each object will only contain fields that `t3c` requires):
+
+Query params:
+`cdn`: the name of the CDN
+`t`: the timestamp (in Unix epoch) of the snapshot (which will correspond to at
+least one server's `config_update_time` or `revalidate_update_time`)
+
+Response:
+```json
+{
+  "response": {
+    "servers": [],
+    "cachegroups": [],
+    "globalParams": [],
+    "cacheKeyConfigParams": [],
+    "remapConfigParams": [],
+    "parentConfigParams": [],
+    "profiles": [],
+    "deliveryServices": [],
+    "deliveryServiceServers": [],
+    "jobs": [],
+    "cdn": {},
+    "deliveryServiceRegexes": [],
+    "serverCapabilities": {},
+    "dsRequiredCapabilities": {},
+    "uriSigningKeys": {},
+    "urlSigKeys": {},
+    "sslKeys": [],
+    "topologies": []
+  }
+}
+```
+
+
+#### Client Impact
+A new TO Go client method will be added for the `GET /cdn_update_times` API,
+but because `GET /cache_config_snapshots` will be served by the Cache Config
+Service, it may not have a corresponding method in the TO Go client (although
+the TO Go client "raw" method could still be used to request this API).
+
+#### Data Model / Database Impact
+The Traffic Ops data model and database schema will remain unchanged. The
+proposed Cache Config Services may or may not use a traditional database --
+they may just store the latest generated snapshots in memory.
+
+### t3c Impact
+`t3c` will be updated to optionally request data from the new Cache Config
+Services (via the Traffic Ops reverse proxy functionality). This will likely be
+a new CLI flag so that the new data request path can be enabled at-will.
+Eventually, the option will be required so that `t3c` doesn't have to maintain
+two different data request paths.
+
+### Traffic Monitor Impact
+n/a
+
+### Traffic Router Impact
+n/a
+
+### Traffic Stats Impact
+n/a
+
+### Traffic Vault Impact
+n/a
+
+### Documentation Impact
+The new `cdn_update_times` Traffic Ops API will be documented in the usual
+manner. Cache Config Service sections may be added to the existing
+documentation, including the documentation of its `cache_config_snapshots` API.

Review Comment:
   Cool suggestion, I've added that to this section



##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+# Cache Config Service
+
+## Problem Description
+In order to remove the Traffic Ops database as a bottleneck in the distribution
+of cache configuration data to CDN caching servers, we need a way to replicate
+the data to a set of Cache Config Services (CCS) from which the caching servers
+will request it. Instead of having thousands of caching servers
+simultaneously making requests to Traffic Ops (or to an intermediary
+caching proxy in front of Traffic Ops), the Cache Config Services will request
+the necessary data directly from Traffic Ops on behalf of the caching servers.
+The Cache Config Services will then bundle the data into CDN-specific
+"snapshots" which they will then serve to requesting caching servers (via
+`t3c`). Because the Cache Config Services are horizontally scalable, the
+bottleneck on the Traffic Ops database will be remediated, and caching servers
+will be able to request and process updates much more frequently than they
+can today.
+
+Additionally, by reducing the number of Traffic Ops APIs used by `t3c`, it
+reduces the amount of coupling between them, making each easier to change
+without breaking the other. As long as the data snapshot is kept stable and
+backwards-compatible, `t3c` won't be affected by most necessary breaking
+changes made to the Traffic Ops API. If the Traffic Ops API has a breaking
+change, backwards-compatibility changes would be made within the Cache Config
+Service so that it does not impact `t3c`.
+
+## Proposed Change
+`t3c` will be updated to optionally request data from the Cache Config Snapshot
+API which will be reverse-proxied by Traffic Ops to the Cache Config Services.
+The Traffic Ops reverse proxy functionality will be implemented separately, and
+this blueprint will depend on that functionality. The Cache Config Services
+will periodically poll Traffic Ops to check for queued updates. If updates are
+queued for a cache in a given CDN, the Cache Config Services will request all
+the necessary data from Traffic Ops for generating cache configuration for that
+given CDN then bundle it into a CDN-specific "snapshot" (with the timestamp of
+when the cache was queued). Caches will then request the snapshot with that
+particular timestamp from the Cache Config Service and use it to generate their
+configuration files.
+
+![](img/cache-config-service.png "Architectural diagram of Cache Config Service")
+
+### Traffic Portal Impact
+n/a
+
+### Traffic Ops Impact
+The new Cache Config Service(s) will be reverse-proxied through Traffic Ops.
+Since they will increase the efficiency of propagating CDN configuration data,
+load may be increased on the Traffic Ops servers themselves (mainly network,
+some CPU), but load on the Traffic Ops database will be greatly reduced.
+
+To increase the efficiency of Cache Config Services polling Traffic Ops for
+queued updates/revalidations, a new API will be added to Traffic Ops which
+returns the latest `config_update_time` and `revalidate_update_time` of the
+servers in a given CDN. Whenever this value increases for a given CDN, the
+Cache Config Service(s) will request all the necessary data from Traffic Ops
+then merge it into a single JSON snapshot to serve to caching servers.
+
+#### REST API Impact
+For Traffic Ops, something like this:
+
+`GET /cdn_update_times`
+
+Query params: none
+
+Response:
+```
+{
+  "response": [
+    {
+      "cdn": "foo-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    },
+    {
+      "cdn": "bar-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    }
+  ]
+}
+```
+
+For the Cache Config Service:
+
+`GET /cache_config_snapshots?cdn=foo-cdn&t=1` (including but not limited to the following
+top-level fields, and each object will only contain fields that `t3c` requires):
+
+Query params:
+`cdn`: the name of the CDN
+`t`: the timestamp (in Unix epoch) of the snapshot (which will correspond to at
+least one server's `config_update_time` or `revalidate_update_time`)
+
+Response:
+```json
+{
+  "response": {
+    "servers": [],
+    "cachegroups": [],
+    "globalParams": [],
+    "cacheKeyConfigParams": [],
+    "remapConfigParams": [],
+    "parentConfigParams": [],

Review Comment:
   This data structure basically just maps to the `t3cutil.ConfigData` struct, except with some changes to make it more per-CDN as opposed to per-server. That said, there's been some pushback offline from @rob05c about this snapshot being per-CDN (among other things), so a future revision will likely make this per-server. Therefore, `profiles` would become `profile` again (among other things to make it per-server).
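
For illustration only, the per-CDN shape quoted above could be decoded with something like the following minimal sketch. The `Snapshot` type, `decodeSnapshot`, and the sample payload (including the `hostName` key) are hypothetical names made up here, not the real `t3cutil.ConfigData` definition. Only a few of the quoted top-level fields are shown, and element types are kept opaque with `json.RawMessage` because the blueprint does not fix them yet.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Snapshot is a hypothetical type mirroring a subset of the top-level fields
// in the quoted response. It is not the actual t3cutil.ConfigData struct.
type Snapshot struct {
	Servers          []json.RawMessage `json:"servers"`
	Cachegroups      []json.RawMessage `json:"cachegroups"`
	Profiles         []json.RawMessage `json:"profiles"`
	DeliveryServices []json.RawMessage `json:"deliveryServices"`
	CDN              json.RawMessage   `json:"cdn"`
	Topologies       []json.RawMessage `json:"topologies"`
}

// snapshotResponse models the "response" envelope around the snapshot.
type snapshotResponse struct {
	Response Snapshot `json:"response"`
}

// decodeSnapshot parses a response body into a Snapshot.
func decodeSnapshot(body []byte) (Snapshot, error) {
	var r snapshotResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return Snapshot{}, fmt.Errorf("decoding snapshot: %w", err)
	}
	return r.Response, nil
}

func main() {
	// Sample payload invented for this sketch.
	body := []byte(`{"response":{"servers":[{"hostName":"edge-01"}],"profiles":[],"cdn":{"name":"foo-cdn"}}}`)
	snap, err := decodeSnapshot(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(snap.Servers), len(snap.Profiles), string(snap.CDN))
}
```

If the snapshot does become per-server, `Profiles` would presumably collapse to a single `json.RawMessage` field, with the envelope handling unchanged.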



##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+# Cache Config Service
+
+## Problem Description
+In order to remove the Traffic Ops database as a bottleneck in the distribution
+of cache configuration data to CDN caching servers, we need a way to replicate
+the data to a set of Cache Config Services (CCS) from which the caching servers
+will request it. Instead of having thousands of caching servers
+simultaneously making requests to Traffic Ops (or to an intermediary
+caching proxy in front of Traffic Ops), the Cache Config Services will request
+the necessary data directly from Traffic Ops on behalf of the caching servers.
+The Cache Config Services will then bundle the data into CDN-specific
+"snapshots" which they will then serve to requesting caching servers (via
+`t3c`). Because the Cache Config Services are horizontally scalable, the
+bottleneck on the Traffic Ops database will be remediated, and caching servers
+will be able to request and process updates much more frequently than they
+can today.
+
+Additionally, by reducing the number of Traffic Ops APIs used by `t3c`, it
+reduces the amount of coupling between them, making each easier to change
+without breaking the other. As long as the data snapshot is kept stable and
+backwards-compatible, `t3c` won't be affected by most necessary breaking
+changes made to the Traffic Ops API. If the Traffic Ops API has a breaking
+change, backwards-compatibility changes would be made within the Cache Config
+Service so that it does not impact `t3c`.
+
+## Proposed Change
+`t3c` will be updated to optionally request data from the Cache Config Snapshot
+API which will be reverse-proxied by Traffic Ops to the Cache Config Services.
+The Traffic Ops reverse proxy functionality will be implemented separately, and
+this blueprint will depend on that functionality. The Cache Config Services
+will periodically poll Traffic Ops to check for queued updates. If updates are
+queued for a cache in a given CDN, the Cache Config Services will request all
+the necessary data from Traffic Ops for generating cache configuration for that
+given CDN then bundle it into a CDN-specific "snapshot" (with the timestamp of
+when the cache was queued). Caches will then request the snapshot with that
+particular timestamp from the Cache Config Service and use it to generate their
+configuration files.
+
+![](img/cache-config-service.png "Architectural diagram of Cache Config Service")
+
+### Traffic Portal Impact
+n/a
+
+### Traffic Ops Impact
+The new Cache Config Service(s) will be reverse-proxied through Traffic Ops.
+Since they will increase the efficiency of propagating CDN configuration data,
+load may be increased on the Traffic Ops servers themselves (mainly network,
+some CPU), but load on the Traffic Ops database will be greatly reduced.
+
+To increase the efficiency of Cache Config Services polling Traffic Ops for
+queued updates/revalidations, a new API will be added to Traffic Ops which
+returns the latest `config_update_time` and `revalidate_update_time` of the
+servers in a given CDN. Whenever this value increases for a given CDN, the
+Cache Config Service(s) will request all the necessary data from Traffic Ops
+then merge it into a single JSON snapshot to serve to caching servers.
+
+#### REST API Impact
+For Traffic Ops, something like this:
+
+`GET /cdn_update_times`
+
+Query params: none
+
+Response:
+```
+{
+  "response": [
+    {
+      "cdn": "foo-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    },
+    {
+      "cdn": "bar-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    }
+  ]
+}
+```
+
+For the Cache Config Service:
+
+`GET /cache_config_snapshots?cdn=foo-cdn&t=1` (including but not limited to the following
+top-level fields, and each object will only contain fields that `t3c` requires):
+
+Query params:
+`cdn`: the name of the CDN
+`t`: the timestamp (in Unix epoch) of the snapshot (which will correspond to at

Review Comment:
   Done
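
For reference, a request URL for the quoted `cache_config_snapshots` endpoint could be assembled roughly as below. This is a sketch under assumptions: `snapshotURL` and the base URL are hypothetical, the real path would likely carry a Traffic Ops API version prefix that the blueprint does not specify, and `t` is passed as Unix epoch seconds as the quoted description states.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// snapshotURL builds a request URL for the proposed snapshot endpoint.
// base is a hypothetical Traffic Ops URL; any path already on base is
// replaced, since the final API prefix is not defined in the blueprint.
func snapshotURL(base, cdn string, unixSeconds int64) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", fmt.Errorf("parsing base URL: %w", err)
	}
	u.Path = "/cache_config_snapshots"
	q := url.Values{}
	q.Set("cdn", cdn)
	q.Set("t", strconv.FormatInt(unixSeconds, 10))
	u.RawQuery = q.Encode() // Encode also escapes CDN names with special characters
	return u.String(), nil
}

func main() {
	s, err := snapshotURL("https://trafficops.example.invalid", "foo-cdn", 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```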



##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+# Cache Config Service
+
+## Problem Description
+In order to remove the Traffic Ops database as a bottleneck in the distribution
+of cache configuration data to CDN caching servers, we need a way to replicate
+the data to a set of Cache Config Services (CCS) from which the caching servers
+will request it. Instead of having thousands of caching servers
+simultaneously making requests to Traffic Ops (or to an intermediary
+caching proxy in front of Traffic Ops), the Cache Config Services will request
+the necessary data directly from Traffic Ops on behalf of the caching servers.
+The Cache Config Services will then bundle the data into CDN-specific
+"snapshots" which they will then serve to requesting caching servers (via
+`t3c`). Because the Cache Config Services are horizontally scalable, the
+bottleneck on the Traffic Ops database will be remediated, and caching servers
+will be able to request and process updates much more frequently than they
+can today.
+
+Additionally, by reducing the number of Traffic Ops APIs used by `t3c`, it
+reduces the amount of coupling between them, making each easier to change
+without breaking the other. As long as the data snapshot is kept stable and
+backwards-compatible, `t3c` won't be affected by most necessary breaking
+changes made to the Traffic Ops API. If the Traffic Ops API has a breaking
+change, backwards-compatibility changes would be made within the Cache Config
+Service so that it does not impact `t3c`.
+
+## Proposed Change
+`t3c` will be updated to optionally request data from the Cache Config Snapshot
+API which will be reverse-proxied by Traffic Ops to the Cache Config Services.
+The Traffic Ops reverse proxy functionality will be implemented separately, and
+this blueprint will depend on that functionality. The Cache Config Services
+will periodically poll Traffic Ops to check for queued updates. If updates are
+queued for a cache in a given CDN, the Cache Config Services will request all
+the necessary data from Traffic Ops for generating cache configuration for that
+given CDN then bundle it into a CDN-specific "snapshot" (with the timestamp of
+when the cache was queued). Caches will then request the snapshot with that
+particular timestamp from the Cache Config Service and use it to generate their
+configuration files.
+
+![](img/cache-config-service.png "Architectural diagram of Cache Config Service")
+
+### Traffic Portal Impact
+n/a
+
+### Traffic Ops Impact
+The new Cache Config Service(s) will be reverse-proxied through Traffic Ops.
+Since they will increase the efficiency of propagating CDN configuration data,
+load may be increased on the Traffic Ops servers themselves (mainly network,
+some CPU), but load on the Traffic Ops database will be greatly reduced.
+
+To increase the efficiency of Cache Config Services polling Traffic Ops for
+queued updates/revalidations, a new API will be added to Traffic Ops which
+returns the latest `config_update_time` and `revalidate_update_time` of the
+servers in a given CDN. Whenever this value increases for a given CDN, the
+Cache Config Service(s) will request all the necessary data from Traffic Ops
+then merge it into a single JSON snapshot to serve to caching servers.
+
+#### REST API Impact
+For Traffic Ops, something like this:
+
+`GET /cdn_update_times`
+
+Query params: none
+
+Response:
+```
+{
+  "response": [
+    {
+      "cdn": "foo-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    },
+    {
+      "cdn": "bar-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"

Review Comment:
   Done



##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+# Cache Config Service
+
+## Problem Description
+In order to remove the Traffic Ops database as a bottleneck in the distribution
+of cache configuration data to CDN caching servers, we need a way to replicate
+the data to a set of Cache Config Services (CCS) from which the caching servers
+will request it. Instead of having thousands of caching servers
+simultaneously making requests to Traffic Ops (or to an intermediary
+caching proxy in front of Traffic Ops), the Cache Config Services will request
+the necessary data directly from Traffic Ops on behalf of the caching servers.
+The Cache Config Services will then bundle the data into CDN-specific
+"snapshots" which they will then serve to requesting caching servers (via
+`t3c`). Because the Cache Config Services are horizontally scalable, the
+bottleneck on the Traffic Ops database will be remediated, and caching servers
+will be able to request and process updates much more frequently than they
+can today.
+
+Additionally, by reducing the number of Traffic Ops APIs used by `t3c`, it
+reduces the amount of coupling between them, making each easier to change
+without breaking the other. As long as the data snapshot is kept stable and
+backwards-compatible, `t3c` won't be affected by most necessary breaking
+changes made to the Traffic Ops API. If the Traffic Ops API has a breaking
+change, backwards-compatibility changes would be made within the Cache Config
+Service so that it does not impact `t3c`.
+
+## Proposed Change
+`t3c` will be updated to optionally request data from the Cache Config Snapshot
+API which will be reverse-proxied by Traffic Ops to the Cache Config Services.
+The Traffic Ops reverse proxy functionality will be implemented separately, and
+this blueprint will depend on that functionality. The Cache Config Services
+will periodically poll Traffic Ops to check for queued updates. If updates are
+queued for a cache in a given CDN, the Cache Config Services will request all
+the necessary data from Traffic Ops for generating cache configuration for that
+given CDN then bundle it into a CDN-specific "snapshot" (with the timestamp of
+when the cache was queued). Caches will then request the snapshot with that
+particular timestamp from the Cache Config Service and use it to generate their
+configuration files.
+
+![](img/cache-config-service.png "Architectural diagram of Cache Config Service")
+
+### Traffic Portal Impact
+n/a
+
+### Traffic Ops Impact
+The new Cache Config Service(s) will be reverse-proxied through Traffic Ops.
+Since they will increase the efficiency of propagating CDN configuration data,
+load may be increased on the Traffic Ops servers themselves (mainly network,
+some CPU), but load on the Traffic Ops database will be greatly reduced.
+
+To increase the efficiency of Cache Config Services polling Traffic Ops for
+queued updates/revalidations, a new API will be added to Traffic Ops which
+returns the latest `config_update_time` and `revalidate_update_time` of the
+servers in a given CDN. Whenever this value increases for a given CDN, the
+Cache Config Service(s) will request all the necessary data from Traffic Ops
+then merge it into a single JSON snapshot to serve to caching servers.
+
+#### REST API Impact
+For Traffic Ops, something like this:
+
+`GET /cdn_update_times`

Review Comment:
   I considered that, but leant more towards this since the users of this API would have no need for the other `/cdns` fields. That said, it's not a big deal if you'd prefer just adding them to `/cdns`. Let me know either way.
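
To make the intended consumer concrete, here is a rough sketch of how a Cache Config Service might use a dedicated `cdn_update_times` response to decide which CDNs need a fresh snapshot. Everything named here (`tracker`, `newTracker`, `changedCDNs`, `tsLayout`) is hypothetical, and the timestamp layout simply matches the quoted example response, so it is an assumption about the final format. A CDN is reported when the later of its two timestamps moves past the last value seen.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// tsLayout matches the timestamps in the quoted example response; the final
// wire format is an assumption of this sketch.
const tsLayout = "2006-01-02T15:04:05.000"

// cdnUpdateTime mirrors one element of the quoted response array.
type cdnUpdateTime struct {
	CDN                    string `json:"cdn"`
	LatestConfigUpdateTime string `json:"latestConfigUpdateTime"`
	LatestRevalUpdateTime  string `json:"latestRevalUpdateTime"`
}

type updateTimesResponse struct {
	Response []cdnUpdateTime `json:"response"`
}

// tracker remembers the newest update time seen per CDN (hypothetical type).
type tracker struct {
	seen map[string]time.Time
}

func newTracker() *tracker {
	return &tracker{seen: make(map[string]time.Time)}
}

// latest returns the later of the config and revalidate timestamps.
func latest(u cdnUpdateTime) (time.Time, error) {
	c, err := time.Parse(tsLayout, u.LatestConfigUpdateTime)
	if err != nil {
		return time.Time{}, err
	}
	r, err := time.Parse(tsLayout, u.LatestRevalUpdateTime)
	if err != nil {
		return time.Time{}, err
	}
	if r.After(c) {
		return r, nil
	}
	return c, nil
}

// changedCDNs returns the CDNs whose latest update time increased since the
// previous call, and records the new values.
func (t *tracker) changedCDNs(body []byte) ([]string, error) {
	var resp updateTimesResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decoding update times: %w", err)
	}
	var changed []string
	for _, u := range resp.Response {
		ts, err := latest(u)
		if err != nil {
			return nil, fmt.Errorf("cdn %s: %w", u.CDN, err)
		}
		if prev, ok := t.seen[u.CDN]; !ok || ts.After(prev) {
			t.seen[u.CDN] = ts
			changed = append(changed, u.CDN)
		}
	}
	return changed, nil
}

func main() {
	body := []byte(`{"response":[{"cdn":"foo-cdn","latestConfigUpdateTime":"1970-01-01T00:00:01.234","latestRevalUpdateTime":"1970-01-01T00:00:05.678"}]}`)
	tr := newTracker()
	for i := 0; i < 2; i++ {
		changed, err := tr.changedCDNs(body)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println("poll", i, "changed:", changed)
	}
}
```

The same comparison would work if the two fields were added to `/cdns` instead; the poller would just ignore the other fields.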



##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+# Cache Config Service
+
+## Problem Description
+In order to remove the Traffic Ops database as a bottleneck in the distribution
+of cache configuration data to CDN caching servers, we need a way to replicate
+the data to a set of Cache Config Services (CCS) from which the caching servers
+will request it. Instead of having thousands of caching servers
+simultaneously making requests to Traffic Ops (or to an intermediary
+caching proxy in front of Traffic Ops), the Cache Config Services will request
+the necessary data directly from Traffic Ops on behalf of the caching servers.
+The Cache Config Services will then bundle the data into CDN-specific
+"snapshots" which they will then serve to requesting caching servers (via
+`t3c`). Because the Cache Config Services are horizontally scalable, the
+bottleneck on the Traffic Ops database will be remediated, and caching servers
+will be able to request and process updates much more frequently than they
+can today.
+
+Additionally, by reducing the number of Traffic Ops APIs used by `t3c`, it
+reduces the amount of coupling between them, making each easier to change
+without breaking the other. As long as the data snapshot is kept stable and
+backwards-compatible, `t3c` won't be affected by most necessary breaking
+changes made to the Traffic Ops API. If the Traffic Ops API has a breaking
+change, backwards-compatibility changes would be made within the Cache Config
+Service so that it does not impact `t3c`.
+
+## Proposed Change
+`t3c` will be updated to optionally request data from the Cache Config Snapshot
+API which will be reverse-proxied by Traffic Ops to the Cache Config Services.
+The Traffic Ops reverse proxy functionality will be implemented separately, and
+this blueprint will depend on that functionality. The Cache Config Services
+will periodically poll Traffic Ops to check for queued updates. If updates are
+queued for a cache in a given CDN, the Cache Config Services will request all
+the necessary data from Traffic Ops for generating cache configuration for that
+given CDN then bundle it into a CDN-specific "snapshot" (with the timestamp of
+when the cache was queued). Caches will then request the snapshot with that
+particular timestamp from the Cache Config Service and use it to generate their
+configuration files.
+
+![](img/cache-config-service.png "Architectural diagram of Cache Config Service")
+
+### Traffic Portal Impact
+n/a
+
+### Traffic Ops Impact
+The new Cache Config Service(s) will be reverse-proxied through Traffic Ops.
+Since they will increase the efficiency of propagating CDN configuration data,
+load may be increased on the Traffic Ops servers themselves (mainly network,
+some CPU), but load on the Traffic Ops database will be greatly reduced.
+
+To increase the efficiency of Cache Config Services polling Traffic Ops for
+queued updates/revalidations, a new API will be added to Traffic Ops which
+returns the latest `config_update_time` and `revalidate_update_time` of the
+servers in a given CDN. Whenever this value increases for a given CDN, the
+Cache Config Service(s) will request all the necessary data from Traffic Ops
+then merge it into a single JSON snapshot to serve to caching servers.
+
+#### REST API Impact
+For Traffic Ops, something like this:
+
+`GET /cdn_update_times`
+
+Query params: none
+
+Response:
+```
+{
+  "response": [
+    {
+      "cdn": "foo-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    },
+    {
+      "cdn": "bar-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    }
+  ]
+}
+```
+
+For the Cache Config Service:
+
+`GET /cache_config_snapshots?cdn=foo-cdn&t=1` (including but not limited to the following
+top-level fields, and each object will only contain fields that `t3c` requires):
+
+Query params:
+`cdn`: the name of the CDN
+`t`: the timestamp (in Unix epoch) of the snapshot (which will correspond to at
+least one server's `config_update_time` or `revalidate_update_time`)
+
+Response:
+```json
+{
+  "response": {
+    "servers": [],
+    "cachegroups": [],
+    "globalParams": [],
+    "cacheKeyConfigParams": [],
+    "remapConfigParams": [],
+    "parentConfigParams": [],
+    "profiles": [],
+    "deliveryServices": [],
+    "deliveryServiceServers": [],
+    "jobs": [],
+    "cdn": {},
+    "deliveryServiceRegexes": [],
+    "serverCapabilities": {},
+    "dsRequiredCapabilities": {},
+    "uriSigningKeys": {},
+    "urlSigKeys": {},
+    "sslKeys": [],
+    "topologies": []
+  }
+}
+```
+
+
+#### Client Impact
+A new TO Go client method will be added for the `GET /cdn_update_times` API,
+but because `GET /cache_config_snapshots` will be served by the Cache Config
+Service, it may not have a corresponding method in the TO Go client (although
+the TO Go client "raw" method could still be used to request this API).
+
+#### Data Model / Database Impact
+The Traffic Ops data model and database schema will remain unchanged. The
+proposed Cache Config Services may or may not use a traditional database --
+they may just store the latest generated snapshots in memory.
+
+### t3c Impact
+`t3c` will be updated to optionally request data from the new Cache Config
+Services (via the Traffic Ops reverse proxy functionality). This will likely be
+a new CLI flag so that the new data request path can be enabled at-will.
+Eventually, the option will be required so that `t3c` doesn't have to maintain
+two different data request paths.
+
+### Traffic Monitor Impact
+n/a
+
+### Traffic Router Impact
+n/a
+
+### Traffic Stats Impact
+n/a
+
+### Traffic Vault Impact
+n/a
+
+### Documentation Impact
+The new `cdn_update_times` Traffic Ops API will be documented in the usual
+manner. Cache Config Service sections may be added to the existing
+documentation, including the documentation of its `cache_config_snapshots` API.
+The new `t3c` CLI flag will be added to its documentation.
+
+### Testing Impact
+The new `cdn_update_times` Traffic Ops API will be tested via the Traffic Ops API tests.
+
+The new Cache Config Service will have its own unit and integration tests.
+
+The `t3c` integration tests may need to be updated to use the Cache Config
+Service for data retrieval in addition to Traffic Ops.
+
+### Performance Impact
+One of the primary goals of this blueprint is to shift `t3c` request load off
+of the Traffic Ops database bottleneck onto horizontally-scalable Cache Config
+Services. This will allow CDNs to propagate changes much more quickly than they
+can today. Because the Cache Config Services will be reverse-proxied through
+Traffic Ops, network load will increase on Traffic Ops servers if `t3c` is run
+more frequently, but the additional CPU load should be fairly minimal because
+the reverse-proxying should not be CPU-intensive. That said, Traffic Ops can
+also be scaled out horizontally if necessary to spread out network and CPU
+load more effectively.
+
+In order to minimize any Traffic Ops database load from requests to Traffic Ops
+from the Cache Config Services, they will poll/query Traffic Ops in a
+consistent-hash-like manner so that they are not all making requests to Traffic

Review Comment:
   Hmm... "consistent-hash-like manner" is probably not very clear. Maybe "consistent, evenly-dispersed manner" would be better. It's about the timing of making requests to TO. I can make this more clear.





[GitHub] [trafficcontrol] ocket8888 commented on a diff in pull request #6810: Blueprint: Cache Config Service

Posted by GitBox <gi...@apache.org>.
ocket8888 commented on code in PR #6810:
URL: https://github.com/apache/trafficcontrol/pull/6810#discussion_r887191321


##########
blueprints/cache-config-service.md:
##########
@@ -0,0 +1,253 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+# Cache Config Service
+
+## Problem Description
+In order to remove the Traffic Ops database as a bottleneck in the distribution
+of cache configuration data to CDN caching servers, we need a way to replicate
+the data to a set of Cache Config Services (CCS) from which the caching servers
+will request it. Instead of having thousands of caching servers
+simultaneously making requests to Traffic Ops (or to an intermediary
+caching proxy in front of Traffic Ops), the Cache Config Services will request
+the necessary data directly from Traffic Ops on behalf of the caching servers.
+The Cache Config Services will then bundle the data into CDN-specific
+"snapshots" which they will then serve to requesting caching servers (via
+`t3c`). Because the Cache Config Services are horizontally scalable, the
+bottleneck on the Traffic Ops database will be remediated, and caching servers
+will be able to request and process updates much more frequently than they
+can today.
+
+Additionally, reducing the number of Traffic Ops APIs used by `t3c` reduces
+the amount of coupling between them, making each easier to change
+without breaking the other. As long as the data snapshot is kept stable and
+backwards-compatible, `t3c` won't be affected by most necessary breaking
+changes made to the Traffic Ops API. If the Traffic Ops API has a breaking
+change, backwards-compatibility changes would be made within the Cache Config
+Service so that it does not impact `t3c`.
+
+## Proposed Change
+`t3c` will be updated to optionally request data from the Cache Config Snapshot
+API which will be reverse-proxied by Traffic Ops to the Cache Config Services.
+The Traffic Ops reverse proxy functionality will be implemented separately, and
+this blueprint will depend on that functionality. The Cache Config Services
+will periodically poll Traffic Ops to check for queued updates. If updates are
+queued for a cache in a given CDN, the Cache Config Services will request all
+the necessary data from Traffic Ops for generating cache configuration for that
+given CDN then bundle it into a CDN-specific "snapshot" (with the timestamp of
+when the cache was queued). Caches will then request the snapshot with that
+particular timestamp from the Cache Config Service and use it to generate their
+configuration files.
+
+![](img/cache-config-service.png "Architectural diagram of Cache Config Service")
+
+### Traffic Portal Impact
+n/a
+
+### Traffic Ops Impact
+The new Cache Config Service(s) will be reverse-proxied through Traffic Ops.
+Since they will increase the efficiency of propagating CDN configuration data,
+load may be increased on the Traffic Ops servers themselves (mainly network,
+some CPU), but load on the Traffic Ops database will be greatly reduced.
+
+To increase the efficiency of Cache Config Services polling Traffic Ops for
+queued updates/revalidations, a new API will be added to Traffic Ops which
+returns the latest `config_update_time` and `revalidate_update_time` of the
+servers in a given CDN. Whenever this value increases for a given CDN, the
+Cache Config Service(s) will request all the necessary data from Traffic Ops
+then merge it into a single JSON snapshot to serve to caching servers.
+
+#### REST API Impact
+For Traffic Ops, something like this:
+
+`GET /cdn_update_times`
+
+Query params: none
+
+Response:
+```json
+{
+  "response": [
+    {
+      "cdn": "foo-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    },
+    {
+      "cdn": "bar-cdn",
+      "latestConfigUpdateTime": "1970-01-01T00:00:01.234",
+      "latestRevalUpdateTime": "1970-01-01T00:00:05.678"
+    }
+  ]
+}
+```
+
+For the Cache Config Service:
+
+`GET /cache_config_snapshots?cdn=foo-cdn&t=1` (including but not limited to the following
+top-level fields, and each object will only contain fields that `t3c` requires):
+
+Query params:
+`cdn`: the name of the CDN
+`t`: the timestamp (in Unix epoch) of the snapshot (which will correspond to at
+least one server's `config_update_time` or `revalidate_update_time`)
+
+Response:
+```json
+{
+  "response": {
+    "servers": [],
+    "cachegroups": [],
+    "globalParams": [],
+    "cacheKeyConfigParams": [],
+    "remapConfigParams": [],
+    "parentConfigParams": [],
+    "profiles": [],
+    "deliveryServices": [],
+    "deliveryServiceServers": [],
+    "jobs": [],
+    "cdn": {},
+    "deliveryServiceRegexes": [],
+    "serverCapabilities": {},
+    "dsRequiredCapabilities": {},

Review Comment:
   Assuming these are maps of some identifier for a server/DS to their respective Capabilities (remember server short hostnames aren't unique, so it shouldn't use that), could these just be properties of their respective object? e.g.
   ```jsonc
   {
     //...
     "servers": [
       //...
       {
         //...
         "capabilities": []
       }
     ],
     "deliveryServices": [
       //...
       {
         //...
         "requiredCapabilities": []
       }
     ]
   }
   ```
   
   Now that I'm looking through, seems like the same would apply to `uriSigningKeys`, `urlSigKeys`, possibly `sslKeys`, `deliveryServiceRegexes`, probably `jobs`, `deliveryServiceServers` (although probably in reverse such that servers have a list of assigned Delivery Services), and maybe `cacheKeyConfigParams`, `parentConfigParams`, and `remapConfigParams`.
   
   Just seems like it would spare clients from some back-tracking to build connections between things.
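
The back-tracking in question can be illustrated with a short Go sketch. Everything here is hypothetical: the `Server` type, its JSON field names, and the choice of a numeric server ID as the map key are assumptions made for illustration (the blueprint does not say how `serverCapabilities` is keyed). `embedCapabilities` is the join a client would have to perform itself if capabilities arrive in a separate top-level map. With the nested layout suggested above, the service would do it once when building the snapshot.

```go
package main

import "fmt"

// Server is a hypothetical, trimmed-down server entry from a snapshot.
type Server struct {
	ID           int      `json:"id"`
	HostName     string   `json:"hostName"`
	Capabilities []string `json:"capabilities"`
}

// embedCapabilities returns copies of the servers with their capabilities
// attached, looked up by server ID in a separate map (ID -> capabilities).
// Servers without an entry get an empty, non-nil list.
func embedCapabilities(servers []Server, caps map[int][]string) []Server {
	out := make([]Server, len(servers))
	for i, s := range servers {
		s.Capabilities = append([]string{}, caps[s.ID]...)
		out[i] = s
	}
	return out
}

func main() {
	// Two servers sharing a short hostname, which is why the key is an ID.
	servers := []Server{{ID: 1, HostName: "edge"}, {ID: 2, HostName: "edge"}}
	caps := map[int][]string{1: {"RAM"}}
	for _, s := range embedCapabilities(servers, caps) {
		fmt.Println(s.ID, s.HostName, s.Capabilities)
	}
}
```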


