Posted to dev@trafficcontrol.apache.org by "Durfey, Ryan" <Ry...@comcast.com> on 2017/09/01 18:53:27 UTC

Configuration Management - Webex Meeting Notes

Thanks to everyone who attended and contributed.  Below are my raw notes.  I plan to summarize these further in the wiki and kick off new email threads for key topics that remain open.  I would love to get some presentations organized for the Atlanta meeting on these topics.

I am also working on how to host future virtual international meetings with 25-50 heads.  I would love any feedback or ideas people have on what works best.

Ryan

 Discussion Notes

  1.  Configuration of Multiple Caching Engines – how do we set up for this long term?
     *   In agreement: moving towards multiple caching proxies
     *   ATS & NGINX are both definitely on the table; any other specific caches?
        i.   None specifically mentioned, but there may be others like Varnish

  1.  Configuration Format & Storage – database tables vs. config files/JSON
     *   Moving away from the database as storage for configs?  No.
     *   Lua Engine – discussed a JSON blob, but we were talked out of this.  Use a structured database for this.
  2.  State Machine
     *   How do we push/remove configs in correct order without breaking the CDN?
        i.   If we cut rollout time to seconds, temporary conflicts may not be a great concern, but this assumes the config push completes without issue and does so very rapidly.

     *   Push out configs without snapshot or queue, with specific order, in a way that does not affect others
     *   Discussing from the Traffic Ops standpoint.  This also affects Traffic Router and live traffic on the CDN.  Must consider the other components and order of operations.  Must make updates in parallel.  Need to architect in parallel.
     *   Breaking CRConfig into separate components is another element.
     *   State machine will be separate from ORT; leaning towards implementation in Traffic Ops
        i.   Must be used by Traffic Portal
        ii.  Should be integrated with TO?  Decision for down the road.
        iii. Make it more generic on TO, to generate configs for all the other components that use CRConfig.  Won’t use Lua for those.  We would need a translation layer.  Use a templated approach on the TO side, which would then be modified as needed.
        iv.  Next Gen ORT client is taking shape.
        v.   Need to separate DS, Traffic Router, and component configs.  DS configs must be autonomous.  Will need a translation layer.
        vi.  Translation layer for each component: rather than have TO put out different formats, it puts out one format and each component must be able to interpret it.  This makes configs generic and puts the work on each component.
        vii. Will this limit what TO outputs?  Pushes changes into the components.
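
To make the ordering question above concrete, here is a minimal sketch, assuming a dependency map between components. The component names, the map itself, and the `push`/`revert` callbacks are all hypothetical; nothing here reflects a decided design.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each component lists the components that
# must receive the new config before it does.
DEPENDS_ON = {
    "traffic_monitor": set(),
    "caches": {"traffic_monitor"},
    "traffic_router": {"caches"},
}


def rollout_order(depends_on):
    """Return a push order in which every component follows its prerequisites."""
    return list(TopologicalSorter(depends_on).static_order())


def roll_out(depends_on, push, revert):
    """Push in dependency order; on failure, revert applied components in reverse.

    Returns (succeeded, components that had been applied).
    """
    applied = []
    for component in rollout_order(depends_on):
        try:
            push(component)
        except Exception:
            # Undo the most recently applied component first.
            for done in reversed(applied):
                revert(done)
            return False, applied
        applied.append(component)
    return True, applied
```

With the map above, `rollout_order(DEPENDS_ON)` yields Traffic Monitor, then the caches, then Traffic Router. A real state machine would also need to wait for each component to confirm the change before moving on.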

  1.  Lua Rules Engine – creating greater options for customized service
     *   Rules engine would allow users to input generic rules, which would be interpreted behind the scenes and implemented via a Lua plugin
        i.   An example of a generic rule would be something like “on this trigger, perform this action”, e.g. on requests to origin, add this special header for access.

     *   Many existing plugins could be implemented in Lua.  It is much safer and much less likely to crash the server, since Lua coroutines are non-blocking
        i.   Not all plugins can be replaced by Lua, but many can.

     *   Supporting Lua helps with cross caching engine support.
     *   Lua is not the same across ATS / NGINX, so you would need to translate, but it was written by the same person.
     *   Extension of the CDN without code; benefit of not having to update Traffic Ops
     *   How does this affect non-technical users?  How can they still build services?
        i.   The goal is to make simple rules available to end users.  They don’t have to configure a script; they build a simple rule.
        ii.  Restrict raw scripting to CDN Dev/Ops, but simplify rules access for everyone
        iii. How do we prevent end users from having too much access and potentially manipulating the system?
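
A rough sketch of what a generic "on this trigger, perform this action" rule could look like as data, with a small interpreter standing in for the Lua plugin. The trigger name, the action name, and the header are made up for illustration. Limiting end users to a fixed table of actions is one possible answer to the access question above, not an agreed approach.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    trigger: str   # hypothetical hook name, e.g. "origin_request"
    action: str    # hypothetical action name, e.g. "add_header"
    params: dict


def add_header(request, params):
    """Return a copy of the request with one extra header set."""
    headers = dict(request["headers"])
    headers[params["name"]] = params["value"]
    return {**request, "headers": headers}


# End users may only pick from this table; raw scripting stays with CDN Dev/Ops.
ACTIONS = {"add_header": add_header}


def apply_rules(rules, trigger, request):
    """Apply, in order, every rule registered for the given trigger."""
    for rule in rules:
        if rule.trigger != trigger:
            continue
        if rule.action not in ACTIONS:
            raise ValueError(f"action not allowed: {rule.action!r}")
        request = ACTIONS[rule.action](request, rule.params)
    return request
```

For example, a rule `Rule("origin_request", "add_header", {"name": "X-Origin-Auth", "value": "token"})` adds that header only when the interpreter is invoked for the `origin_request` trigger.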

  1.  Delivery Service Configuration Versioning (DSCV)
     *   Allows rapid deployment, testing, and rollback of configs
     *   Decouple changing config from deployment
     *   Allows users to change config, deployment decisions can be made separately and can be regulated
     *   Changes can be compared
     *   Two types
        i.   Simple – a single delivery service moves from deployed version 5 to version 7 (replaced in all caches)
        ii.  Multiple Versions – allow multiple versions to be deployed simultaneously

     *   Start with Simple for now, move to multiple versions in future
     *   Tell caches the delivery service and version
     *   Concerns
        i.   If we store versions in memory, that could cause issues in Traffic Monitor/Router
        ii.  Notion of an instant rollback is difficult.
           *   DNSSEC zone signing is an issue
           *   State machine issues for ordering / coordinating of changes
        iii. Example: when changing a property on a service, sometimes Traffic Monitor may need to know first.  Must take into account the coordination between components

     *   DSCV provides an ID to every configuration
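
The "Simple" DSCV case can be sketched as follows, assuming one delivery service and an in-memory store. The class and method names are hypothetical. The point is that saving a version and deploying it are separate steps, and that two versions can be compared.

```python
class DeliveryServiceVersions:
    """In-memory version store for a single delivery service (illustrative only)."""

    def __init__(self):
        self._versions = {}    # version ID -> config dict
        self._next_id = 1
        self.deployed = None   # version ID currently deployed, if any

    def save(self, config):
        """Store a new config version without deploying it; return its ID."""
        version_id = self._next_id
        self._versions[version_id] = dict(config)
        self._next_id += 1
        return version_id

    def deploy(self, version_id):
        """Mark a stored version as deployed; return the previously deployed ID."""
        if version_id not in self._versions:
            raise KeyError(f"unknown version {version_id}")
        previous = self.deployed
        self.deployed = version_id
        return previous

    def diff(self, old_id, new_id):
        """Return {key: (old value, new value)} for every key that differs."""
        old, new = self._versions[old_id], self._versions[new_id]
        return {
            key: (old.get(key), new.get(key))
            for key in sorted(old.keys() | new.keys())
            if old.get(key) != new.get(key)
        }
```

Rolling back is then just `deploy()` with the ID returned by the previous `deploy()` call. This sketch ignores the harder concerns listed above, such as DNSSEC zone signing and cross-component ordering.
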
  1.  Time Stamping Config Versions – How to track versioning
     *   By having a deployed version table with history, you don’t remove rows; you just mark them as removed.  You can then see the version history by date in the table.
     *   You can say the ID is the creation date, but that may not simplify things.  This is a trade-off requiring further discussion.
     *   Sounds like we would like to have an ID field to keep things simple (i.e. deploy config 7 vs. config 5 instead of deploy config 08/31/2017 18:05:36 vs. 08/31/2017 18:15:36)
     *   We can have both an ID and a timestamp and either could be used to stipulate a particular version
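
One way to picture the deployed version table is the sketch below, using an in-memory SQLite database. The table and column names are assumptions, not an existing Traffic Ops schema. Rows are never deleted, only marked as removed, and a version can be looked up either by ID or by timestamp.

```python
import sqlite3


def open_history():
    """Create an in-memory database holding the (hypothetical) history table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE deployed_version (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               delivery_service TEXT NOT NULL,
               config_version INTEGER NOT NULL,
               deployed_at TEXT NOT NULL,   -- ISO 8601, so strings sort by time
               removed_at TEXT              -- NULL while the row is active
           )"""
    )
    return db


def deploy(db, delivery_service, config_version, at):
    """Mark the active row as removed (never delete it), then add the new row."""
    db.execute(
        "UPDATE deployed_version SET removed_at = ? "
        "WHERE delivery_service = ? AND removed_at IS NULL",
        (at, delivery_service),
    )
    cursor = db.execute(
        "INSERT INTO deployed_version (delivery_service, config_version, deployed_at) "
        "VALUES (?, ?, ?)",
        (delivery_service, config_version, at),
    )
    return cursor.lastrowid


def version_at(db, delivery_service, when):
    """Return the config version that was deployed at the given time, or None."""
    row = db.execute(
        "SELECT config_version FROM deployed_version "
        "WHERE delivery_service = ? AND deployed_at <= ? "
        "AND (removed_at IS NULL OR removed_at > ?) "
        "ORDER BY deployed_at DESC LIMIT 1",
        (delivery_service, when, when),
    ).fetchone()
    return row[0] if row else None
```

Deploying config 5 at `2017-08-31T18:05:36` and config 7 at `2017-08-31T18:15:36` leaves two rows in the table, and `version_at()` for any time in between still returns 5.
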
  2.  ORT – What is the future?
     *   Rewrite using the new Go Traffic Ops API?
     *   Derek is working on this in Go, using the TO client
     *   ORT doesn’t care what it talks to
     *   Does ORT work as we want it to?  Not yet. Change from script to a service with active communication
     *   Need to start a thread on this topic
     *   Not the highest priority yet
  3.  Self Service
     *   Tenancy – restrictions by tenant; work by Nir, about to go into production
     *   Roles & Capabilities – need to have the ops role to create a service.  Naama worked on this.  Need to enforce.
     *   These are the first two steps for self-service ^^^
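
As a rough illustration of how the two steps combine, the check below requires both a capability (from the user's role) and a tenant match before a service can be created. The role names, capability strings, and flat tenant comparison are assumptions for this sketch; the real tenancy model may differ, for example by being hierarchical.

```python
from dataclasses import dataclass

# Hypothetical role -> capability map.
ROLE_CAPABILITIES = {
    "ops": {"delivery_service:create", "delivery_service:read"},
    "read_only": {"delivery_service:read"},
}


@dataclass
class User:
    name: str
    role: str
    tenant: str


def can_create_service(user, service_tenant):
    """Allow creation only with the capability AND within the user's own tenant."""
    capabilities = ROLE_CAPABILITIES.get(user.role, set())
    has_capability = "delivery_service:create" in capabilities
    same_tenant = user.tenant == service_tenant
    return has_capability and same_tenant
```
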
  4.  Generic Configuration Model
     *   We can make the configuration parameters for a CDN service all generic (i.e. not specific to a caching engine)
     *   Each caching engine implemented would need a layer written to read the generic config and translate it to the appropriate settings for the cache updates.
     *   This puts the complexity where it belongs: on the cache owner, who interprets and translates, instead of on the TC developers, who would otherwise need to know how to configure all types of caches.
     *   This also fits with generic instructions in Rules Engine where each cache would need an interpreter for generic rules
     *   Vendor Specific Configuration
        i.   NGINX caches know how to deal with certain configs.  May need new parameters.
        ii.  Delivery service profile needs to be taken into consideration.
        iii. Profiles could go back to being a server profile only
        iv.  Delivery service profiles lack structure currently?
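
The generic model with per-engine translation layers might look roughly like the sketch below. The generic field names (`host`, `origin`, `cache_ttl_seconds`) are invented for illustration, and the emitted NGINX and ATS lines are simplified fragments that have not been validated against either engine.

```python
# Hypothetical generic delivery service config, not tied to any caching engine.
GENERIC = {
    "host": "video.example.com",
    "origin": "http://origin.example.com",
    "cache_ttl_seconds": 3600,
}


def to_nginx(cfg):
    """Translate the generic config into a simplified NGINX server block."""
    return "\n".join([
        "server {",
        f"    server_name {cfg['host']};",
        "    location / {",
        f"        proxy_pass {cfg['origin']};",
        f"        proxy_cache_valid 200 {cfg['cache_ttl_seconds']}s;",
        "    }",
        "}",
    ])


def to_ats(cfg):
    """Translate the generic config into simplified ATS config file lines."""
    return {
        "remap.config": f"map http://{cfg['host']}/ {cfg['origin']}/",
        "cache.config": f"dest_domain={cfg['host']} ttl-in-cache={cfg['cache_ttl_seconds']}s",
    }


# Each caching engine owns its translator; Traffic Ops only emits GENERIC.
TRANSLATORS = {"nginx": to_nginx, "ats": to_ats}


def translate(engine, cfg):
    """Look up the engine's translator and apply it to the generic config."""
    if engine not in TRANSLATORS:
        raise ValueError(f"no translator for caching engine {engine!r}")
    return TRANSLATORS[engine](cfg)
```

Adding another engine such as Varnish would mean registering one more translator, with no change to the generic format.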


 Wiki: https://cwiki.apache.org/confluence/display/TC/Configuration+Management



Ryan Durfey
Sr. Product Manager - CDN | Comcast Technology Solutions
1899 Wynkoop Ste. 550 | Denver, CO 80202
M | 303-524-5099
ryan_durfey@comcast.com
CDN Support (24x7): 866-405-2993 or cdn_support@comcast.com