You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@trafficcontrol.apache.org by ocket 8888 <oc...@gmail.com> on 2020/08/25 16:48:45 UTC

Splitting up servers

Hello everyone, I'd like to discuss something that the Traffic Ops Working
Group
has been working on: splitting servers apart.

Servers have a lot of properties, and most are specifically important to
Cache
Servers - made all the more clear by the recent addition of multiple network
interfaces. We propose they be split up into different objects based on
type -
which will also help reduce (if not totally eliminate) the use of custom
Types
for servers. This will also eliminate the need for hacky ways of searching
for
certain kinds of servers - e.g. checking for a profile name that matches
"ATS_.*" to determine if something is a cache server and searching for a
Type
that matches ".*EDGE.*" to determine if something is an edge-tier or
mid-tier
Cache Server (both of which are real checks in place today).

The new objects would be:

- Cache Servers - exactly what it sounds like
- Infrastructure Servers - catch-all for anything that doesn't fit in a
different category, e.g. Grafana
- Origins - This should ideally eat the concept of "ORG"-type servers so
that we ONLY have Origins to express the concept of an Origin server.
- Traffic Monitors - exactly what it sounds like
- Traffic Ops Servers - exactly what it sounds like
- Traffic Portals - exactly what it sounds like
- Traffic Routers - exactly what it sounds like
- Traffic Stats Servers - exactly what it sounds like - but InfluxDB
servers would be Infrastructure Servers; this is just whatever machine is
running the actual Traffic Stats program.
- Traffic Vaults - exactly what it sounds like

I have a Draft PR (https://github.com/apache/trafficcontrol/pull/4986)
ready for
a blueprint to split out Traffic Portals already, to give you a sort of
idea of
what that would look like. I don't want to get too bogged down in what
properties each one will have exactly, since that's best decided on a
case-by-case basis and each should have its own blueprint, but I'm more
looking
for feedback on the concept of splitting apart servers in general.

If you do have questions about what properties each is semi-planned to have,
though, I can answer it or point you at the current draft of the API design
document which contains all those answers.

Re: [EXTERNAL] Re: Splitting up servers

Posted by ocket 8888 <oc...@gmail.com>.

I definitely consider getting rid of the "ALL" CDN a benefit of the
process, but it's neither the only goal nor the main one.

You wouldn't need to have multiple pages/tables, though the existing one
would need to be modified - you'd just have a new form and a drop down
selection on hitting the "create" button, kind of like how Delivery
Services are now.

I don't think you are dealing with a single server concept today. You're
dealing with a single endpoint, but interpreting each object differently
based on parts of that object. This is an attempt to formalize the
distinctions and simplify the distinct types to only what they need to be.
I guess I do see the advantage of getting all of them with one `/servers`
endpoint, though.

> For example, you must not downgrade TO after TM is upgraded to a version
dependent on one of these application configurations.

Not at all. Traffic Monitor supports two concurrent API versions: the
latest and the penultimate. You must not downgrade TO by two full API
versions to work with a TM that's at a certain TO API version - but that's
true anyway because our API versions can change at most as frequently as
ATC release versions, and we only support 2 concurrent ATC versions.

Regarding letting components configure themselves... I agree that would be
ideal. But because the configuration is already taking place in TO I never
considered that there would be a way to convince people to change that
locus of configuration. It seems like it's in there because that's what
people want.

On Tue, Aug 25, 2020 at 10:31 PM Gray, Jonathan <Jo...@comcast.com>
wrote:

> Ok, I didn't follow that the ALL CDN wasn't the issue here, but really
> profiles/parameters.  I'm +1 on slowly and carefully dismantling that.
>
> As an Operator, the objects themselves don't really matter to me so long
> as there are dedicated fields in TP for that information. I am still -1 on
> making multiple pages/grids though.
>
> As an API Consumer, I still prefer a singular concept of a server to work
> with.  Today it's easy to simply apply JQ/JMESPATH filters (with the not
> simple ID integer lookups) on that response to get what I need.
>
> Sounds almost like you want a per type Propset to maybe bind to each
> server.  I'm not sure I'm a huge fan of putting application specific data
> on a server object like the ones you mention because it creates new
> dependencies between applications and the API.  For example, you must not
> downgrade TO after TM is upgraded to a version dependent on one of these
> application configurations.
>
> Shifting gears a bit, Is there a reason these TM fields aren't just in the
> TM config file on disk managed through whatever system people happen to be
> using to manage those?  Per delivery service things I understand need
> special logic which is why atstccfg is a thing, but application config
> should be with the application I believe.  That moves all the
> responsibility and documentation back to the application instead of the TO
> API, lets each application and application version enforce what they expect
> independent of one another, and generally makes things easier to triage.
> It's just as easy today to have TMs with mismatched profiles/parameters as
> it is for the config on disk to be out of sync.
>
> Jonathan G
>
>
> On 8/25/20, 9:54 PM, "ocket 8888" <oc...@gmail.com> wrote:
>
>     Well, the main reason to separate them instead of just making
> constraints
>     based on type is so that configuration things can be taken out of
> Profiles
>     and Parameters and put into the object definitions themselves. Traffic
>     Portal is a bad example of that, because it has no TO-side
> configuration. A
>     better example would be the current spec for Traffic Monitor objects,
> which
>     would move
>
>     - eventCount
>     - healthPollingInterval
>     - heartbeatPollingInterval
>     - threadCount
>     - timePad
>
>     out of Parameters for better validation, simpler documentation, and
> greater
>     transparency when you look at the object as returned by the API.
>
>     So what you'd wind up with, then, isn't so much a subset of server
>     properties, but an intersection with some common set of properties but
> each
>     "sub-type" of server having its own, unique properties.
>
>     On Tue, Aug 25, 2020 at 9:35 PM Gray, Jonathan <
> Jonathan_Gray@comcast.com>
>     wrote:
>
>     > Ah, ok.  I didn't realize you were only talking about the API.
>     >
>     > As an Operations user, I like having them all together in one place
>     > because it gives me one place to go search a grid for.  If I'm
> trying to
>     > locate an IP for example, I don't have to hunt through 3 different TP
>     > pages/grids to maybe find it.  I just want TO/TP to not let me do
> the wrong
>     > thing with data entry in the places we havn't caught already.
>     >
>     > As an API consumer, I like having them all together because it's a
> single
>     > call to get all the data I need for multiple purposes with a common
> schema
>     > to leverage when addressing fields.  I agree with Dave's comment
> earlier
>     > still that if server fields don't make sense except on server types,
> we
>     > should add those to a type or type union object and peel them away
> from
>     > server that way.  From a performance perspective, the extra non-cache
>     > servers are negligible by comparison.  It also makes more routes and
>     > increases the API complexity to use.  If the goal is to try and deal
> with
>     > the ALL CDN, it's not all that different than dealing with NULL/nil
> as a
>     > consumer of the API.  All the safeguards that would have to exist
> are the
>     > same in either case.
>     >
>     > What advantage does making new API objects have over adding a
> database
>     > constraint and/or TO API Check and/or TP UI omission?  The ALL CDN is
>     > already not an option in TP, so I'm not sure how issue #4324 came
> about in
>     > practice.
>     >
>     > Jonathan G
>     >
>     >
>     > On 8/25/20, 8:03 PM, "ocket 8888" <oc...@gmail.com> wrote:
>     >
>     >     The database setup is totally irrelevant. That was never
> discussed in
>     > the
>     >     working group, it's just an implementation detail of the API. I
>     > included it
>     >     in the blueprint because there's a section on database impact,
> but
>     > however
>     >     it gets done is fine. This is purely about improving the API.
>     >
>     >     On Tue, Aug 25, 2020 at 4:26 PM Gray, Jonathan <
>     > Jonathan_Gray@comcast.com>
>     >     wrote:
>     >
>     >     > A server is a server.  We were bad and put non-server, or
> special
>     > server
>     >     > type data on everything.  Server should remain server, but you
> can
>     > take all
>     >     > the ancillary columns off into one or more separate tables
> joined on
>     > the PK
>     >     > of a server row.  For now, that would be everything.  As TO
> adjusts
>     > to no
>     >     > longer need the extra concepts, those joined nonsense rows for
> other
>     > types
>     >     > can be dropped.  That said, significant changes to TODB are
>     > complicated and
>     >     > risky so we would want to move with care and foresight into
> other
>     > things of
>     >     > significant effort so there's not a conflict by accident or
>     > instability
>     >     > delivered until it's complete.
>     >     >
>     >     > Jonathan G
>     >     >
>     >     > On 8/25/20, 3:57 PM, "Jeremy Mitchell" <mi...@gmail.com>
>     > wrote:
>     >     >
>     >     >     Oh, this would help us get rid of the "ALL" cdn as well
> which
>     > has kind
>     >     > of
>     >     >     been a pain. Lots of "if cdn != ALL, then do something..."
> in the
>     >     >     codebase..which eventually leads to bugs like this:
>     >     >
>     >     >
>     >     >
>     >
> https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/issues/4324__;!!CQl3mcHX2A!S8EeyRyjEHQ4cqwQ1ZGzR3tX4StRhNk2mtP19puNOM6EeXzXN8hgwCpKGRG2HqIvzrYc$
>     >     >
>     >     >     On Tue, Aug 25, 2020 at 3:42 PM Zach Hoffman <
> zach@zrhoffman.net
>     > >
>     >     > wrote:
>     >     >
>     >     >     > +1 for splitting cache servers and infra servers.
> Currently,
>     > each
>     >     > server
>     >     >     > must be associated with a CDN and  cache group.
>     >     >     >
>     >     >     > While that part may seem logical by itself, when updates
> are
>     > queued
>     >     > on a
>     >     >     > CDN, it sets the upd_pending column on *all* of that
> CDN's
>     > servers,
>     >     >     > including servers of RASCAL type, servers of CCR type,
> etc.
>     > Although
>     >     > this
>     >     >     > doesn't hurt anything, as Jeremy has said, side effects
> like
>     > these
>     >     > make
>     >     >     > database-level validation difficult, so a table split of
> some
>     > kind
>     >     > seems
>     >     >     > like a step in the right direction.
>     >     >     >
>     >     >     > -Zach
>     >     >     >
>     >     >     > On Tue, Aug 25, 2020 at 2:22 PM Jeremy Mitchell <
>     >     > mitchell852@gmail.com>
>     >     >     > wrote:
>     >     >     >
>     >     >     > > If you look at the columns of the servers table,
> you'll see
>     > that
>     >     > most are
>     >     >     > > specific to "cache servers", so I definitely think that
>     > should be
>     >     >     > > addressed. Overloaded tables make it hard
> (impossible?) to
>     > do any
>     >     >     > > database-level validation and I thought we wanted to
> move in
>     > that
>     >     >     > direction
>     >     >     > > where possible.
>     >     >     > >
>     >     >     > > At the very least I think we should have these tables
> to
>     > capture
>     >     > all our
>     >     >     > > "server objects":
>     >     >     > >
>     >     >     > > - cache_servers (formerly known as servers)
>     >     >     > > - infra_servers
>     >     >     > > - origins
>     >     >     > >
>     >     >     > > Now whether the API mirrors the tables is another
>     > discussion. I
>     >     > don't
>     >     >     > think
>     >     >     > > we strive for that but sometimes GET
> /api/cache_servers just
>     > seems
>     >     > to
>     >     >     > make
>     >     >     > > sense.
>     >     >     > >
>     >     >     > > Jeremy
>     >     >     > >
>     >     >     > >
>     >     >     > >
>     >     >     > >
>     >     >     > >
>     >     >     > > On Tue, Aug 25, 2020 at 12:19 PM Gray, Jonathan <
>     >     >     > Jonathan_Gray@comcast.com
>     >     >     > > >
>     >     >     > > wrote:
>     >     >     > >
>     >     >     > > > I agree with Dave here.  Instead of trying to make
> our
>     > database
>     >     > and API
>     >     >     > > > identical, we should focus on doing better
> relational data
>     >     > modeling
>     >     >     > > inside
>     >     >     > > > the database and letting that roll upward into TO
> with more
>     >     > specific
>     >     >     > > > queries and stronger data integrity inside the
> database.
>     >     >     > > >
>     >     >     > > > Jonathan G
>     >     >     > > >
>     >     >     > > > On 8/25/20, 11:20 AM, "Dave Neuman" <
> neuman@apache.org>
>     > wrote:
>     >     >     > > >
>     >     >     > > >     This feels extremely heavy handed to me.  I don't
>     > think we
>     >     > should
>     >     >     > try
>     >     >     > > > to
>     >     >     > > >     build out a new table for different server types
> which
>     > will
>     >     > mostly
>     >     >     > > > have all
>     >     >     > > >     the same columns.  I could maybe see a total of 3
>     > tables for
>     >     >     > caches,
>     >     >     > > >     origins (which already exists), and other
> things, but
>     > even
>     >     > then I
>     >     >     > > > would be
>     >     >     > > >     hesitant to think it was a great idea.  Even if
> we
>     > have a
>     >     > caches
>     >     >     > > > table, we
>     >     >     > > >     still have to put some sort of typing in place to
>     >     > distinguish edges
>     >     >     > > and
>     >     >     > > >     mids and with the addition of flexible
> topologies,
>     > even that
>     >     > is
>     >     >     > > muddy;
>     >     >     > > > it
>     >     >     > > >     might be better to call them forward and reverse
>     > proxies
>     >     > instead,
>     >     >     > but
>     >     >     > > > that
>     >     >     > > >     is a different conversation.  I think while it
> may
>     > seem like
>     >     > this
>     >     >     > > > solves a
>     >     >     > > >     lot of problems on the surface, I still think
> some of
>     > the
>     >     > things
>     >     >     > you
>     >     >     > > > are
>     >     >     > > >     trying to address will remain and we will have
> new
>     > problems
>     >     > on top
>     >     >     > of
>     >     >     > > >     that.
>     >     >     > > >
>     >     >     > > >     I think we should think about addressing this
> problem
>     > with a
>     >     > better
>     >     >     > > > way of
>     >     >     > > >     identifying server types that can be accounted
> for in
>     > code
>     >     > instead
>     >     >     > of
>     >     >     > > >     searching for strings, adding some validation to
> our
>     > API
>     >     > based on
>     >     >     > the
>     >     >     > > >     server types (e.g. only require some fields for
>     > caches), and
>     >     > also
>     >     >     > by
>     >     >     > > >     thinking about the way we do our API and maybe
> trying
>     > to get
>     >     > away
>     >     >     > > from
>     >     >     > > >     "based on database tables" to be "based on use
> cases".
>     >     >     > > >
>     >     >     > > >     --Dave
>     >     >     > > >
>     >     >     > > >     On Tue, Aug 25, 2020 at 10:49 AM ocket 8888 <
>     >     > ocket8888@gmail.com>
>     >     >     > > > wrote:
>     >     >     > > >
>     >     >     > > >     > Hello everyone, I'd like to discuss something
> that
>     > the
>     >     > Traffic
>     >     >     > Ops
>     >     >     > > > Working
>     >     >     > > >     > Group
>     >     >     > > >     > has been working on: splitting servers apart.
>     >     >     > > >     >
>     >     >     > > >     > Servers have a lot of properties, and most are
>     > specifically
>     >     >     > > > important to
>     >     >     > > >     > Cache
>     >     >     > > >     > Servers - made all the more clear by the recent
>     > addition of
>     >     >     > > multiple
>     >     >     > > >     > network
>     >     >     > > >     > interfaces. We propose they be split up into
>     > different
>     >     > objects
>     >     >     > > based
>     >     >     > > > on
>     >     >     > > >     > type -
>     >     >     > > >     > which will also help reduce (if not totally
>     > eliminate) the
>     >     > use of
>     >     >     > > > custom
>     >     >     > > >     > Types
>     >     >     > > >     > for servers. This will also eliminate the need
> for
>     > hacky
>     >     > ways of
>     >     >     > > > searching
>     >     >     > > >     > for
>     >     >     > > >     > certain kinds of servers - e.g. checking for a
>     > profile
>     >     > name that
>     >     >     > > > matches
>     >     >     > > >     > "ATS_.*" to determine if something is a cache
> server
>     > and
>     >     >     > searching
>     >     >     > > > for a
>     >     >     > > >     > Type
>     >     >     > > >     > that matches ".*EDGE.*" to determine if
> something is
>     > an
>     >     > edge-tier
>     >     >     > > or
>     >     >     > > >     > mid-tier
>     >     >     > > >     > Cache Server (both of which are real checks in
> place
>     >     > today).
>     >     >     > > >     >
>     >     >     > > >     > The new objects would be:
>     >     >     > > >     >
>     >     >     > > >     > - Cache Servers - exactly what it sounds like
>     >     >     > > >     > - Infrastructure Servers - catch-all for
> anything
>     > that
>     >     > doesn't
>     >     >     > fit
>     >     >     > > > in a
>     >     >     > > >     > different category, e.g. Grafana
>     >     >     > > >     > - Origins - This should ideally eat the
> concept of
>     >     > "ORG"-type
>     >     >     > > > servers so
>     >     >     > > >     > that we ONLY have Origins to express the
> concept of
>     > an
>     >     > Origin
>     >     >     > > server.
>     >     >     > > >     > - Traffic Monitors - exactly what it sounds
> like
>     >     >     > > >     > - Traffic Ops Servers - exactly what it sounds
> like
>     >     >     > > >     > - Traffic Portals - exactly what it sounds like
>     >     >     > > >     > - Traffic Routers - exactly what it sounds like
>     >     >     > > >     > - Traffic Stats Servers - exactly what it
> sounds
>     > like - but
>     >     >     > > InfluxDB
>     >     >     > > >     > servers would be Infrastructure Servers; this
> is just
>     >     > whatever
>     >     >     > > > machine is
>     >     >     > > >     > running the actual Traffic Stats program.
>     >     >     > > >     > - Traffic Vaults - exactly what it sounds like
>     >     >     > > >     >
>     >     >     > > >     > I have a Draft PR (
>     >     >     > > >
>     >     >     > >
>     >     >     >
>     >     >
>     >
> https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/pull/4986__;!!CQl3mcHX2A!SSbbYaDqtrWYwoO2tJM5Q1FjEiah5oVxE2I8kUagnUPqF3nKfj3k9miwq8px91-RjQAG$
>     >     >     > > > )
>     >     >     > > >     > ready for
>     >     >     > > >     > a blueprint to split out Traffic Portals
> already, to
>     > give
>     >     > you a
>     >     >     > > sort
>     >     >     > > > of
>     >     >     > > >     > idea of
>     >     >     > > >     > what that would look like. I don't want to get
> too
>     > bogged
>     >     > down in
>     >     >     > > > what
>     >     >     > > >     > properties each one will have exactly, since
> that's
>     > best
>     >     > decided
>     >     >     > > on a
>     >     >     > > >     > case-by-case basis and each should have its own
>     > blueprint,
>     >     > but
>     >     >     > I'm
>     >     >     > > > more
>     >     >     > > >     > looking
>     >     >     > > >     > for feedback on the concept of splitting apart
>     > servers in
>     >     >     > general.
>     >     >     > > >     >
>     >     >     > > >     > If you do have questions about what properties
> each
>     > is
>     >     >     > semi-planned
>     >     >     > > > to
>     >     >     > > >     > have,
>     >     >     > > >     > though, I can answer it or point you at the
> current
>     > draft
>     >     > of the
>     >     >     > > API
>     >     >     > > > design
>     >     >     > > >     > document which contains all those answers.
>     >     >     > > >     >
>     >     >     > > >
>     >     >     > > >
>     >     >     > >
>     >     >     >
>     >     >
>     >     >
>     >
>     >
>
>

Re: [EXTERNAL] Re: Splitting up servers

Posted by "Gray, Jonathan" <Jo...@comcast.com>.

Ok, I didn't follow that the ALL CDN wasn't the issue here, but really profiles/parameters.  I'm +1 on slowly and carefully dismantling that.

As an Operator, the objects themselves don't really matter to me so long as there are dedicated fields in TP for that information. I am still -1 on making multiple pages/grids though.

As an API Consumer, I still prefer a singular concept of a server to work with.  Today it's easy to simply apply JQ/JMESPATH filters (with the not simple ID integer lookups) on that response to get what I need.

Sounds almost like you want a per type Propset to maybe bind to each server.  I'm not sure I'm a huge fan of putting application specific data on a server object like the ones you mention because it creates new dependencies between applications and the API.  For example, you must not downgrade TO after TM is upgraded to a version dependent on one of these application configurations.

Shifting gears a bit, Is there a reason these TM fields aren't just in the TM config file on disk managed through whatever system people happen to be using to manage those?  Per delivery service things I understand need special logic which is why atstccfg is a thing, but application config should be with the application I believe.  That moves all the responsibility and documentation back to the application instead of the TO API, lets each application and application version enforce what they expect independent of one another, and generally makes things easier to triage.  It's just as easy today to have TMs with mismatched profiles/parameters as it is for the config on disk to be out of sync.

Jonathan G


On 8/25/20, 9:54 PM, "ocket 8888" <oc...@gmail.com> wrote:

    Well, the main reason to separate them instead of just making constraints
    based on type is so that configuration things can be taken out of Profiles
    and Parameters and put into the object definitions themselves. Traffic
    Portal is a bad example of that, because it has no TO-side configuration. A
    better example would be the current spec for Traffic Monitor objects, which
    would move

    - eventCount
    - healthPollingInterval
    - heartbeatPollingInterval
    - threadCount
    - timePad

    out of Parameters for better validation, simpler documentation, and greater
    transparency when you look at the object as returned by the API.

    So what you'd wind up with, then, isn't so much a subset of server
    properties, but an intersection with some common set of properties but each
    "sub-type" of server having its own, unique properties.

    On Tue, Aug 25, 2020 at 9:35 PM Gray, Jonathan <Jo...@comcast.com>
    wrote:

    > Ah, ok.  I didn't realize you were only talking about the API.
    >
    > As an Operations user, I like having them all together in one place
    > because it gives me one place to go search a grid for.  If I'm trying to
    > locate an IP for example, I don't have to hunt through 3 different TP
    > pages/grids to maybe find it.  I just want TO/TP to not let me do the wrong
    > thing with data entry in the places we havn't caught already.
    >
    > As an API consumer, I like having them all together because it's a single
    > call to get all the data I need for multiple purposes with a common schema
    > to leverage when addressing fields.  I agree with Dave's comment earlier
    > still that if server fields don't make sense except on server types, we
    > should add those to a type or type union object and peel them away from
    > server that way.  From a performance perspective, the extra non-cache
    > servers are negligible by comparison.  It also makes more routes and
    > increases the API complexity to use.  If the goal is to try and deal with
    > the ALL CDN, it's not all that different than dealing with NULL/nil as a
    > consumer of the API.  All the safeguards that would have to exist are the
    > same in either case.
    >
    > What advantage does making new API objects have over adding a database
    > constraint and/or TO API Check and/or TP UI omission?  The ALL CDN is
    > already not an option in TP, so I'm not sure how issue #4324 came about in
    > practice.
    >
    > Jonathan G
    >
    >
    > On 8/25/20, 8:03 PM, "ocket 8888" <oc...@gmail.com> wrote:
    >
    >     The database setup is totally irrelevant. That was never discussed in
    > the
    >     working group, it's just an implementation detail of the API. I
    > included it
    >     in the blueprint because there's a section on database impact, but
    > however
    >     it gets done is fine. This is purely about improving the API.
    >
    >     On Tue, Aug 25, 2020 at 4:26 PM Gray, Jonathan <
    > Jonathan_Gray@comcast.com>
    >     wrote:
    >
    >     > A server is a server.  We were bad and put non-server, or special
    > server
    >     > type data on everything.  Server should remain server, but you can
    > take all
    >     > the ancillary columns off into one or more separate tables joined on
    > the PK
    >     > of a server row.  For now, that would be everything.  As TO adjusts
    > to no
    >     > longer need the extra concepts, those joined nonsense rows for other
    > types
    >     > can be dropped.  That said, significant changes to TODB are
    > complicated and
    >     > risky so we would want to move with care and foresight into other
    > things of
    >     > significant effort so there's not a conflict by accident or
    > instability
    >     > delivered until it's complete.
    >     >
    >     > Jonathan G
    >     >
    >     > On 8/25/20, 3:57 PM, "Jeremy Mitchell" <mi...@gmail.com>
    > wrote:
    >     >
    >     >     Oh, this would help us get rid of the "ALL" cdn as well which
    > has kind
    >     > of
    >     >     been a pain. Lots of "if cdn != ALL, then do something..." in the
    >     >     codebase..which eventually leads to bugs like this:
    >     >
    >     >
    >     >
    > https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/issues/4324__;!!CQl3mcHX2A!S8EeyRyjEHQ4cqwQ1ZGzR3tX4StRhNk2mtP19puNOM6EeXzXN8hgwCpKGRG2HqIvzrYc$
    >     >
    >     >     On Tue, Aug 25, 2020 at 3:42 PM Zach Hoffman <zach@zrhoffman.net
    > >
    >     > wrote:
    >     >
    >     >     > +1 for splitting cache servers and infra servers. Currently,
    > each
    >     > server
    >     >     > must be associated with a CDN and  cache group.
    >     >     >
    >     >     > While that part may seem logical by itself, when updates are
    > queued
    >     > on a
    >     >     > CDN, it sets the upd_pending column on *all* of that CDN's
    > servers,
    >     >     > including servers of RASCAL type, servers of CCR type, etc.
    > Although
    >     > this
    >     >     > doesn't hurt anything, as Jeremy has said, side effects like
    > these
    >     > make
    >     >     > database-level validation difficult, so a table split of some
    > kind
    >     > seems
    >     >     > like a step in the right direction.
    >     >     >
    >     >     > -Zach
    >     >     >
    >     >     > On Tue, Aug 25, 2020 at 2:22 PM Jeremy Mitchell <
    >     > mitchell852@gmail.com>
    >     >     > wrote:
    >     >     >
    >     >     > > If you look at the columns of the servers table, you'll see
    > that
    >     > most are
    >     >     > > specific to "cache servers", so I definitely think that
    > should be
    >     >     > > addressed. Overloaded tables make it hard (impossible?) to
    > do any
    >     >     > > database-level validation and I thought we wanted to move in
    > that
    >     >     > direction
    >     >     > > where possible.
    >     >     > >
    >     >     > > At the very least I think we should have these tables to
    > capture
    >     > all our
    >     >     > > "server objects":
    >     >     > >
    >     >     > > - cache_servers (formerly known as servers)
    >     >     > > - infra_servers
    >     >     > > - origins
    >     >     > >
    >     >     > > Now whether the API mirrors the tables is another
    > discussion. I
    >     > don't
    >     >     > think
    >     >     > > we strive for that but sometimes GET /api/cache_servers just
    > seems
    >     > to
    >     >     > make
    >     >     > > sense.
    >     >     > >
    >     >     > > Jeremy
    >     >     > >
    >     >     > >
    >     >     > >
    >     >     > >
    >     >     > >
    >     >     > > On Tue, Aug 25, 2020 at 12:19 PM Gray, Jonathan <
    >     >     > Jonathan_Gray@comcast.com
    >     >     > > >
    >     >     > > wrote:
    >     >     > >
    >     >     > > > I agree with Dave here.  Instead of trying to make our
    > database
    >     > and API
    >     >     > > > identical, we should focus on doing better relational data
    >     > modeling
    >     >     > > inside
    >     >     > > > the database and letting that roll upward into TO with more
    >     > specific
    >     >     > > > queries and stronger data integrity inside the database.
    >     >     > > >
    >     >     > > > Jonathan G
    >     >     > > >
    >     >     > > > On 8/25/20, 11:20 AM, "Dave Neuman" <ne...@apache.org>
    > wrote:
    >     >     > > >
    >     >     > > >     This feels extremely heavy handed to me.  I don't
    > think we
    >     > should
    >     >     > try
    >     >     > > > to
    >     >     > > >     build out a new table for different server types which
    > will
    >     > mostly
    >     >     > > > have all
    >     >     > > >     the same columns.  I could maybe see a total of 3
    > tables for
    >     >     > caches,
    >     >     > > >     origins (which already exists), and other things, but
    > even
    >     > then I
    >     >     > > > would be
    >     >     > > >     hesitant to think it was a great idea.  Even if we
    > have a
    >     > caches
    >     >     > > > table, we
    >     >     > > >     still have to put some sort of typing in place to
    >     > distinguish edges
    >     >     > > and
    >     >     > > >     mids and with the addition of flexible topologies,
    > even that
    >     > is
    >     >     > > muddy;
    >     >     > > > it
    >     >     > > >     might be better to call them forward and reverse
    > proxies
    >     > instead,
    >     >     > but
    >     >     > > > that
    >     >     > > >     is a different conversation.  I think while it may
    > seem like
    >     > this
    >     >     > > > solves a
    >     >     > > >     lot of problems on the surface, I still think some of
    > the
    >     > things
    >     >     > you
    >     >     > > > are
    >     >     > > >     trying to address will remain and we will have new
    > problems
    >     > on top
    >     >     > of
    >     >     > > >     that.
    >     >     > > >
    >     >     > > >     I think we should think about addressing this problem
    > with a
    >     > better
    >     >     > > > way of
    >     >     > > >     identifying server types that can be accounted for in
    > code
    >     > instead
    >     >     > of
    >     >     > > >     searching for strings, adding some validation to our
    > API
    >     > based on
    >     >     > the
    >     >     > > >     server types (e.g. only require some fields for
    > caches), and
    >     > also
    >     >     > by
    >     >     > > >     thinking about the way we do our API and maybe trying
    > to get
    >     > away
    >     >     > > from
    >     >     > > >     "based on database tables" to be "based on use cases".
    >     >     > > >
    >     >     > > >     --Dave
    >     >     > > >
    >     >     > > >     On Tue, Aug 25, 2020 at 10:49 AM ocket 8888 <
    >     > ocket8888@gmail.com>
    >     >     > > > wrote:
    >     >     > > >
    >     >     > > >     > Hello everyone, I'd like to discuss something that
    > the
    >     > Traffic
    >     >     > Ops
    >     >     > > > Working
    >     >     > > >     > Group
    >     >     > > >     > has been working on: splitting servers apart.
    >     >     > > >     >
    >     >     > > >     > Servers have a lot of properties, and most are
    > specifically
    >     >     > > > important to
    >     >     > > >     > Cache
    >     >     > > >     > Servers - made all the more clear by the recent
    > addition of
    >     >     > > multiple
    >     >     > > >     > network
    >     >     > > >     > interfaces. We propose they be split up into
    > different
    >     > objects
    >     >     > > based
    >     >     > > > on
    >     >     > > >     > type -
    >     >     > > >     > which will also help reduce (if not totally
    > eliminate) the
    >     > use of
    >     >     > > > custom
    >     >     > > >     > Types
    >     >     > > >     > for servers. This will also eliminate the need for
    > hacky
    >     > ways of
    >     >     > > > searching
    >     >     > > >     > for
    >     >     > > >     > certain kinds of servers - e.g. checking for a
    > profile
    >     > name that
    >     >     > > > matches
    >     >     > > >     > "ATS_.*" to determine if something is a cache server
    > and
    >     >     > searching
    >     >     > > > for a
    >     >     > > >     > Type
    >     >     > > >     > that matches ".*EDGE.*" to determine if something is
    > an
    >     > edge-tier
    >     >     > > or
    >     >     > > >     > mid-tier
    >     >     > > >     > Cache Server (both of which are real checks in place
    >     > today).
    >     >     > > >     >
    >     >     > > >     > The new objects would be:
    >     >     > > >     >
    >     >     > > >     > - Cache Servers - exactly what it sounds like
    >     >     > > >     > - Infrastructure Servers - catch-all for anything
    > that
    >     > doesn't
    >     >     > fit
    >     >     > > > in a
    >     >     > > >     > different category, e.g. Grafana
    >     >     > > >     > - Origins - This should ideally eat the concept of
    >     > "ORG"-type
    >     >     > > > servers so
    >     >     > > >     > that we ONLY have Origins to express the concept of
    > an
    >     > Origin
    >     >     > > server.
    >     >     > > >     > - Traffic Monitors - exactly what it sounds like
    >     >     > > >     > - Traffic Ops Servers - exactly what it sounds like
    >     >     > > >     > - Traffic Portals - exactly what it sounds like
    >     >     > > >     > - Traffic Routers - exactly what it sounds like
    >     >     > > >     > - Traffic Stats Servers - exactly what it sounds
    > like - but
    >     >     > > InfluxDB
    >     >     > > >     > servers would be Infrastructure Servers; this is just
    >     > whatever
    >     >     > > > machine is
    >     >     > > >     > running the actual Traffic Stats program.
    >     >     > > >     > - Traffic Vaults - exactly what it sounds like
    >     >     > > >     >
    >     >     > > >     > I have a Draft PR (
    >     >     > > >
    >     >     > >
    >     >     >
    >     >
    > https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/pull/4986__;!!CQl3mcHX2A!SSbbYaDqtrWYwoO2tJM5Q1FjEiah5oVxE2I8kUagnUPqF3nKfj3k9miwq8px91-RjQAG$
    >     >     > > > )
    >     >     > > >     > ready for
    >     >     > > >     > a blueprint to split out Traffic Portals already, to
    > give
    >     > you a
    >     >     > > sort
    >     >     > > > of
    >     >     > > >     > idea of
    >     >     > > >     > what that would look like. I don't want to get too
    > bogged
    >     > down in
    >     >     > > > what
    >     >     > > >     > properties each one will have exactly, since that's
    > best
    >     > decided
    >     >     > > on a
    >     >     > > >     > case-by-case basis and each should have its own
    > blueprint,
    >     > but
    >     >     > I'm
    >     >     > > > more
    >     >     > > >     > looking
    >     >     > > >     > for feedback on the concept of splitting apart
    > servers in
    >     >     > general.
    >     >     > > >     >
    >     >     > > >     > If you do have questions about what properties each
    > is
    >     >     > semi-planned
    >     >     > > > to
    >     >     > > >     > have,
    >     >     > > >     > though, I can answer it or point you at the current
    > draft
    >     > of the
    >     >     > > API
    >     >     > > > design
    >     >     > > >     > document which contains all those answers.
    >     >     > > >     >
    >     >     > > >
    >     >     > > >
    >     >     > >
    >     >     >
    >     >
    >     >
    >
    >

Re: [EXTERNAL] Re: Splitting up servers

Posted by ocket 8888 <oc...@gmail.com>.

Well, the main reason to separate them instead of just making constraints
based on type is so that configuration things can be taken out of Profiles
and Parameters and put into the object definitions themselves. Traffic
Portal is a bad example of that, because it has no TO-side configuration. A
better example would be the current spec for Traffic Monitor objects, which
would move

- eventCount
- healthPollingInterval
- heartbeatPollingInterval
- threadCount
- timePad

out of Parameters for better validation, simpler documentation, and greater
transparency when you look at the object as returned by the API.

So what you'd wind up with, then, isn't so much a subset of server
properties, but an intersection with some common set of properties but each
"sub-type" of server having its own, unique properties.

On Tue, Aug 25, 2020 at 9:35 PM Gray, Jonathan <Jo...@comcast.com>
wrote:

> Ah, ok.  I didn't realize you were only talking about the API.
>
> As an Operations user, I like having them all together in one place
> because it gives me one place to go search a grid for.  If I'm trying to
> locate an IP for example, I don't have to hunt through 3 different TP
> pages/grids to maybe find it.  I just want TO/TP to not let me do the wrong
> thing with data entry in the places we havn't caught already.
>
> As an API consumer, I like having them all together because it's a single
> call to get all the data I need for multiple purposes with a common schema
> to leverage when addressing fields.  I agree with Dave's comment earlier
> still that if server fields don't make sense except on server types, we
> should add those to a type or type union object and peel them away from
> server that way.  From a performance perspective, the extra non-cache
> servers are negligible by comparison.  It also makes more routes and
> increases the API complexity to use.  If the goal is to try and deal with
> the ALL CDN, it's not all that different than dealing with NULL/nil as a
> consumer of the API.  All the safeguards that would have to exist are the
> same in either case.
>
> What advantage does making new API objects have over adding a database
> constraint and/or TO API Check and/or TP UI omission?  The ALL CDN is
> already not an option in TP, so I'm not sure how issue #4324 came about in
> practice.
>
> Jonathan G
>
>
> On 8/25/20, 8:03 PM, "ocket 8888" <oc...@gmail.com> wrote:
>
>     The database setup is totally irrelevant. That was never discussed in
> the
>     working group, it's just an implementation detail of the API. I
> included it
>     in the blueprint because there's a section on database impact, but
> however
>     it gets done is fine. This is purely about improving the API.
>
>     On Tue, Aug 25, 2020 at 4:26 PM Gray, Jonathan <
> Jonathan_Gray@comcast.com>
>     wrote:
>
>     > A server is a server.  We were bad and put non-server, or special
> server
>     > type data on everything.  Server should remain server, but you can
> take all
>     > the ancillary columns off into one or more separate tables joined on
> the PK
>     > of a server row.  For now, that would be everything.  As TO adjusts
> to no
>     > longer need the extra concepts, those joined nonsense rows for other
> types
>     > can be dropped.  That said, significant changes to TODB are
> complicated and
>     > risky so we would want to move with care and foresight into other
> things of
>     > significant effort so there's not a conflict by accident or
> instability
>     > delivered until it's complete.
>     >
>     > Jonathan G
>     >
>     > On 8/25/20, 3:57 PM, "Jeremy Mitchell" <mi...@gmail.com>
> wrote:
>     >
>     >     Oh, this would help us get rid of the "ALL" cdn as well which
> has kind
>     > of
>     >     been a pain. Lots of "if cdn != ALL, then do something..." in the
>     >     codebase..which eventually leads to bugs like this:
>     >
>     >
>     >
> https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/issues/4324__;!!CQl3mcHX2A!S8EeyRyjEHQ4cqwQ1ZGzR3tX4StRhNk2mtP19puNOM6EeXzXN8hgwCpKGRG2HqIvzrYc$
>     >
>     >     On Tue, Aug 25, 2020 at 3:42 PM Zach Hoffman <zach@zrhoffman.net
> >
>     > wrote:
>     >
>     >     > +1 for splitting cache servers and infra servers. Currently,
> each
>     > server
>     >     > must be associated with a CDN and  cache group.
>     >     >
>     >     > While that part may seem logical by itself, when updates are
> queued
>     > on a
>     >     > CDN, it sets the upd_pending column on *all* of that CDN's
> servers,
>     >     > including servers of RASCAL type, servers of CCR type, etc.
> Although
>     > this
>     >     > doesn't hurt anything, as Jeremy has said, side effects like
> these
>     > make
>     >     > database-level validation difficult, so a table split of some
> kind
>     > seems
>     >     > like a step in the right direction.
>     >     >
>     >     > -Zach
>     >     >
>     >     > On Tue, Aug 25, 2020 at 2:22 PM Jeremy Mitchell <
>     > mitchell852@gmail.com>
>     >     > wrote:
>     >     >
>     >     > > If you look at the columns of the servers table, you'll see
> that
>     > most are
>     >     > > specific to "cache servers", so I definitely think that
> should be
>     >     > > addressed. Overloaded tables make it hard (impossible?) to
> do any
>     >     > > database-level validation and I thought we wanted to move in
> that
>     >     > direction
>     >     > > where possible.
>     >     > >
>     >     > > At the very least I think we should have these tables to
> capture
>     > all our
>     >     > > "server objects":
>     >     > >
>     >     > > - cache_servers (formerly known as servers)
>     >     > > - infra_servers
>     >     > > - origins
>     >     > >
>     >     > > Now whether the API mirrors the tables is another
> discussion. I
>     > don't
>     >     > think
>     >     > > we strive for that but sometimes GET /api/cache_servers just
> seems
>     > to
>     >     > make
>     >     > > sense.
>     >     > >
>     >     > > Jeremy
>     >     > >
>     >     > >
>     >     > >
>     >     > >
>     >     > >
>     >     > > On Tue, Aug 25, 2020 at 12:19 PM Gray, Jonathan <
>     >     > Jonathan_Gray@comcast.com
>     >     > > >
>     >     > > wrote:
>     >     > >
>     >     > > > I agree with Dave here.  Instead of trying to make our
> database
>     > and API
>     >     > > > identical, we should focus on doing better relational data
>     > modeling
>     >     > > inside
>     >     > > > the database and letting that roll upward into TO with more
>     > specific
>     >     > > > queries and stronger data integrity inside the database.
>     >     > > >
>     >     > > > Jonathan G
>     >     > > >
>     >     > > > On 8/25/20, 11:20 AM, "Dave Neuman" <ne...@apache.org>
> wrote:
>     >     > > >
>     >     > > >     This feels extremely heavy handed to me.  I don't
> think we
>     > should
>     >     > try
>     >     > > > to
>     >     > > >     build out a new table for different server types which
> will
>     > mostly
>     >     > > > have all
>     >     > > >     the same columns.  I could maybe see a total of 3
> tables for
>     >     > caches,
>     >     > > >     origins (which already exists), and other things, but
> even
>     > then I
>     >     > > > would be
>     >     > > >     hesitant to think it was a great idea.  Even if we
> have a
>     > caches
>     >     > > > table, we
>     >     > > >     still have to put some sort of typing in place to
>     > distinguish edges
>     >     > > and
>     >     > > >     mids and with the addition of flexible topologies,
> even that
>     > is
>     >     > > muddy;
>     >     > > > it
>     >     > > >     might be better to call them forward and reverse
> proxies
>     > instead,
>     >     > but
>     >     > > > that
>     >     > > >     is a different conversation.  I think while it may
> seem like
>     > this
>     >     > > > solves a
>     >     > > >     lot of problems on the surface, I still think some of
> the
>     > things
>     >     > you
>     >     > > > are
>     >     > > >     trying to address will remain and we will have new
> problems
>     > on top
>     >     > of
>     >     > > >     that.
>     >     > > >
>     >     > > >     I think we should think about addressing this problem
> with a
>     > better
>     >     > > > way of
>     >     > > >     identifying server types that can be accounted for in
> code
>     > instead
>     >     > of
>     >     > > >     searching for strings, adding some validation to our
> API
>     > based on
>     >     > the
>     >     > > >     server types (e.g. only require some fields for
> caches), and
>     > also
>     >     > by
>     >     > > >     thinking about the way we do our API and maybe trying
> to get
>     > away
>     >     > > from
>     >     > > >     "based on database tables" to be "based on use cases".
>     >     > > >
>     >     > > >     --Dave
>     >     > > >
>     >     > > >     On Tue, Aug 25, 2020 at 10:49 AM ocket 8888 <
>     > ocket8888@gmail.com>
>     >     > > > wrote:
>     >     > > >
>     >     > > >     > Hello everyone, I'd like to discuss something that
> the
>     > Traffic
>     >     > Ops
>     >     > > > Working
>     >     > > >     > Group
>     >     > > >     > has been working on: splitting servers apart.
>     >     > > >     >
>     >     > > >     > Servers have a lot of properties, and most are
> specifically
>     >     > > > important to
>     >     > > >     > Cache
>     >     > > >     > Servers - made all the more clear by the recent
> addition of
>     >     > > multiple
>     >     > > >     > network
>     >     > > >     > interfaces. We propose they be split up into
> different
>     > objects
>     >     > > based
>     >     > > > on
>     >     > > >     > type -
>     >     > > >     > which will also help reduce (if not totally
> eliminate) the
>     > use of
>     >     > > > custom
>     >     > > >     > Types
>     >     > > >     > for servers. This will also eliminate the need for
> hacky
>     > ways of
>     >     > > > searching
>     >     > > >     > for
>     >     > > >     > certain kinds of servers - e.g. checking for a
> profile
>     > name that
>     >     > > > matches
>     >     > > >     > "ATS_.*" to determine if something is a cache server
> and
>     >     > searching
>     >     > > > for a
>     >     > > >     > Type
>     >     > > >     > that matches ".*EDGE.*" to determine if something is
> an
>     > edge-tier
>     >     > > or
>     >     > > >     > mid-tier
>     >     > > >     > Cache Server (both of which are real checks in place
>     > today).
>     >     > > >     >
>     >     > > >     > The new objects would be:
>     >     > > >     >
>     >     > > >     > - Cache Servers - exactly what it sounds like
>     >     > > >     > - Infrastructure Servers - catch-all for anything
> that
>     > doesn't
>     >     > fit
>     >     > > > in a
>     >     > > >     > different category, e.g. Grafana
>     >     > > >     > - Origins - This should ideally eat the concept of
>     > "ORG"-type
>     >     > > > servers so
>     >     > > >     > that we ONLY have Origins to express the concept of
> an
>     > Origin
>     >     > > server.
>     >     > > >     > - Traffic Monitors - exactly what it sounds like
>     >     > > >     > - Traffic Ops Servers - exactly what it sounds like
>     >     > > >     > - Traffic Portals - exactly what it sounds like
>     >     > > >     > - Traffic Routers - exactly what it sounds like
>     >     > > >     > - Traffic Stats Servers - exactly what it sounds
> like - but
>     >     > > InfluxDB
>     >     > > >     > servers would be Infrastructure Servers; this is just
>     > whatever
>     >     > > > machine is
>     >     > > >     > running the actual Traffic Stats program.
>     >     > > >     > - Traffic Vaults - exactly what it sounds like
>     >     > > >     >
>     >     > > >     > I have a Draft PR (
>     >     > > >
>     >     > >
>     >     >
>     >
> https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/pull/4986__;!!CQl3mcHX2A!SSbbYaDqtrWYwoO2tJM5Q1FjEiah5oVxE2I8kUagnUPqF3nKfj3k9miwq8px91-RjQAG$
>     >     > > > )
>     >     > > >     > ready for
>     >     > > >     > a blueprint to split out Traffic Portals already, to
> give
>     > you a
>     >     > > sort
>     >     > > > of
>     >     > > >     > idea of
>     >     > > >     > what that would look like. I don't want to get too
> bogged
>     > down in
>     >     > > > what
>     >     > > >     > properties each one will have exactly, since that's
> best
>     > decided
>     >     > > on a
>     >     > > >     > case-by-case basis and each should have its own
> blueprint,
>     > but
>     >     > I'm
>     >     > > > more
>     >     > > >     > looking
>     >     > > >     > for feedback on the concept of splitting apart
> servers in
>     >     > general.
>     >     > > >     >
>     >     > > >     > If you do have questions about what properties each
> is
>     >     > semi-planned
>     >     > > > to
>     >     > > >     > have,
>     >     > > >     > though, I can answer it or point you at the current
> draft
>     > of the
>     >     > > API
>     >     > > > design
>     >     > > >     > document which contains all those answers.
>     >     > > >     >
>     >     > > >
>     >     > > >
>     >     > >
>     >     >
>     >
>     >
>
>

Re: [EXTERNAL] Re: Splitting up servers

Posted by "Gray, Jonathan" <Jo...@comcast.com>.

Ah, ok.  I didn't realize you were only talking about the API.

As an Operations user, I like having them all together in one place because it gives me one place to go search a grid for.  If I'm trying to locate an IP for example, I don't have to hunt through 3 different TP pages/grids to maybe find it.  I just want TO/TP to not let me do the wrong thing with data entry in the places we havn't caught already.

As an API consumer, I like having them all together because it's a single call to get all the data I need for multiple purposes with a common schema to leverage when addressing fields.  I agree with Dave's comment earlier still that if server fields don't make sense except on server types, we should add those to a type or type union object and peel them away from server that way.  From a performance perspective, the extra non-cache servers are negligible by comparison.  It also makes more routes and increases the API complexity to use.  If the goal is to try and deal with the ALL CDN, it's not all that different than dealing with NULL/nil as a consumer of the API.  All the safeguards that would have to exist are the same in either case.

What advantage does making new API objects have over adding a database constraint and/or TO API Check and/or TP UI omission?  The ALL CDN is already not an option in TP, so I'm not sure how issue #4324 came about in practice.

Jonathan G


On 8/25/20, 8:03 PM, "ocket 8888" <oc...@gmail.com> wrote:

    The database setup is totally irrelevant. That was never discussed in the
    working group, it's just an implementation detail of the API. I included it
    in the blueprint because there's a section on database impact, but however
    it gets done is fine. This is purely about improving the API.

    On Tue, Aug 25, 2020 at 4:26 PM Gray, Jonathan <Jo...@comcast.com>
    wrote:

    > A server is a server.  We were bad and put non-server, or special server
    > type data on everything.  Server should remain server, but you can take all
    > the ancillary columns off into one or more separate tables joined on the PK
    > of a server row.  For now, that would be everything.  As TO adjusts to no
    > longer need the extra concepts, those joined nonsense rows for other types
    > can be dropped.  That said, significant changes to TODB are complicated and
    > risky so we would want to move with care and foresight into other things of
    > significant effort so there's not a conflict by accident or instability
    > delivered until it's complete.
    >
    > Jonathan G
    >
    > On 8/25/20, 3:57 PM, "Jeremy Mitchell" <mi...@gmail.com> wrote:
    >
    >     Oh, this would help us get rid of the "ALL" cdn as well which has kind
    > of
    >     been a pain. Lots of "if cdn != ALL, then do something..." in the
    >     codebase..which eventually leads to bugs like this:
    >
    >
    > https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/issues/4324__;!!CQl3mcHX2A!S8EeyRyjEHQ4cqwQ1ZGzR3tX4StRhNk2mtP19puNOM6EeXzXN8hgwCpKGRG2HqIvzrYc$
    >
    >     On Tue, Aug 25, 2020 at 3:42 PM Zach Hoffman <za...@zrhoffman.net>
    > wrote:
    >
    >     > +1 for splitting cache servers and infra servers. Currently, each
    > server
    >     > must be associated with a CDN and  cache group.
    >     >
    >     > While that part may seem logical by itself, when updates are queued
    > on a
    >     > CDN, it sets the upd_pending column on *all* of that CDN's servers,
    >     > including servers of RASCAL type, servers of CCR type, etc. Although
    > this
    >     > doesn't hurt anything, as Jeremy has said, side effects like these
    > make
    >     > database-level validation difficult, so a table split of some kind
    > seems
    >     > like a step in the right direction.
    >     >
    >     > -Zach
    >     >
    >     > On Tue, Aug 25, 2020 at 2:22 PM Jeremy Mitchell <
    > mitchell852@gmail.com>
    >     > wrote:
    >     >
    >     > > If you look at the columns of the servers table, you'll see that
    > most are
    >     > > specific to "cache servers", so I definitely think that should be
    >     > > addressed. Overloaded tables make it hard (impossible?) to do any
    >     > > database-level validation and I thought we wanted to move in that
    >     > direction
    >     > > where possible.
    >     > >
    >     > > At the very least I think we should have these tables to capture
    > all our
    >     > > "server objects":
    >     > >
    >     > > - cache_servers (formerly known as servers)
    >     > > - infra_servers
    >     > > - origins
    >     > >
    >     > > Now whether the API mirrors the tables is another discussion. I
    > don't
    >     > think
    >     > > we strive for that but sometimes GET /api/cache_servers just seems
    > to
    >     > make
    >     > > sense.
    >     > >
    >     > > Jeremy
    >     > >
    >     > >
    >     > >
    >     > >
    >     > >
    >     > > On Tue, Aug 25, 2020 at 12:19 PM Gray, Jonathan <
    >     > Jonathan_Gray@comcast.com
    >     > > >
    >     > > wrote:
    >     > >
    >     > > > I agree with Dave here.  Instead of trying to make our database
    > and API
    >     > > > identical, we should focus on doing better relational data
    > modeling
    >     > > inside
    >     > > > the database and letting that roll upward into TO with more
    > specific
    >     > > > queries and stronger data integrity inside the database.
    >     > > >
    >     > > > Jonathan G
    >     > > >
    >     > > > On 8/25/20, 11:20 AM, "Dave Neuman" <ne...@apache.org> wrote:
    >     > > >
    >     > > >     This feels extremely heavy handed to me.  I don't think we
    > should
    >     > try
    >     > > > to
    >     > > >     build out a new table for different server types which will
    > mostly
    >     > > > have all
    >     > > >     the same columns.  I could maybe see a total of 3 tables for
    >     > caches,
    >     > > >     origins (which already exists), and other things, but even
    > then I
    >     > > > would be
    >     > > >     hesitant to think it was a great idea.  Even if we have a
    > caches
    >     > > > table, we
    >     > > >     still have to put some sort of typing in place to
    > distinguish edges
    >     > > and
    >     > > >     mids and with the addition of flexible topologies, even that
    > is
    >     > > muddy;
    >     > > > it
    >     > > >     might be better to call them forward and reverse proxies
    > instead,
    >     > but
    >     > > > that
    >     > > >     is a different conversation.  I think while it may seem like
    > this
    >     > > > solves a
    >     > > >     lot of problems on the surface, I still think some of the
    > things
    >     > you
    >     > > > are
    >     > > >     trying to address will remain and we will have new problems
    > on top
    >     > of
    >     > > >     that.
    >     > > >
    >     > > >     I think we should think about addressing this problem with a
    > better
    >     > > > way of
    >     > > >     identifying server types that can be accounted for in code
    > instead
    >     > of
    >     > > >     searching for strings, adding some validation to our API
    > based on
    >     > the
    >     > > >     server types (e.g. only require some fields for caches), and
    > also
    >     > by
    >     > > >     thinking about the way we do our API and maybe trying to get
    > away
    >     > > from
    >     > > >     "based on database tables" to be "based on use cases".
    >     > > >
    >     > > >     --Dave
    >     > > >
    >     > > >     On Tue, Aug 25, 2020 at 10:49 AM ocket 8888 <
    > ocket8888@gmail.com>
    >     > > > wrote:
    >     > > >
    >     > > >     > Hello everyone, I'd like to discuss something that the
    > Traffic
    >     > Ops
    >     > > > Working
    >     > > >     > Group
    >     > > >     > has been working on: splitting servers apart.
    >     > > >     >
    >     > > >     > Servers have a lot of properties, and most are specifically
    >     > > > important to
    >     > > >     > Cache
    >     > > >     > Servers - made all the more clear by the recent addition of
    >     > > multiple
    >     > > >     > network
    >     > > >     > interfaces. We propose they be split up into different
    > objects
    >     > > based
    >     > > > on
    >     > > >     > type -
    >     > > >     > which will also help reduce (if not totally eliminate) the
    > use of
    >     > > > custom
    >     > > >     > Types
    >     > > >     > for servers. This will also eliminate the need for hacky
    > ways of
    >     > > > searching
    >     > > >     > for
    >     > > >     > certain kinds of servers - e.g. checking for a profile
    > name that
    >     > > > matches
    >     > > >     > "ATS_.*" to determine if something is a cache server and
    >     > searching
    >     > > > for a
    >     > > >     > Type
    >     > > >     > that matches ".*EDGE.*" to determine if something is an
    > edge-tier
    >     > > or
    >     > > >     > mid-tier
    >     > > >     > Cache Server (both of which are real checks in place
    > today).
    >     > > >     >
    >     > > >     > The new objects would be:
    >     > > >     >
    >     > > >     > - Cache Servers - exactly what it sounds like
    >     > > >     > - Infrastructure Servers - catch-all for anything that
    > doesn't
    >     > fit
    >     > > > in a
    >     > > >     > different category, e.g. Grafana
    >     > > >     > - Origins - This should ideally eat the concept of
    > "ORG"-type
    >     > > > servers so
    >     > > >     > that we ONLY have Origins to express the concept of an
    > Origin
    >     > > server.
    >     > > >     > - Traffic Monitors - exactly what it sounds like
    >     > > >     > - Traffic Ops Servers - exactly what it sounds like
    >     > > >     > - Traffic Portals - exactly what it sounds like
    >     > > >     > - Traffic Routers - exactly what it sounds like
    >     > > >     > - Traffic Stats Servers - exactly what it sounds like - but
    >     > > InfluxDB
    >     > > >     > servers would be Infrastructure Servers; this is just
    > whatever
    >     > > > machine is
    >     > > >     > running the actual Traffic Stats program.
    >     > > >     > - Traffic Vaults - exactly what it sounds like
    >     > > >     >
    >     > > >     > I have a Draft PR (
    >     > > >
    >     > >
    >     >
    > https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/pull/4986__;!!CQl3mcHX2A!SSbbYaDqtrWYwoO2tJM5Q1FjEiah5oVxE2I8kUagnUPqF3nKfj3k9miwq8px91-RjQAG$
    >     > > > )
    >     > > >     > ready for
    >     > > >     > a blueprint to split out Traffic Portals already, to give
    > you a
    >     > > sort
    >     > > > of
    >     > > >     > idea of
    >     > > >     > what that would look like. I don't want to get too bogged
    > down in
    >     > > > what
    >     > > >     > properties each one will have exactly, since that's best
    > decided
    >     > > on a
    >     > > >     > case-by-case basis and each should have its own blueprint,
    > but
    >     > I'm
    >     > > > more
    >     > > >     > looking
    >     > > >     > for feedback on the concept of splitting apart servers in
    >     > general.
    >     > > >     >
    >     > > >     > If you do have questions about what properties each is
    >     > semi-planned
    >     > > > to
    >     > > >     > have,
    >     > > >     > though, I can answer it or point you at the current draft
    > of the
    >     > > API
    >     > > > design
    >     > > >     > document which contains all those answers.
    >     > > >     >
    >     > > >
    >     > > >
    >     > >
    >     >
    >
    >

Re: [EXTERNAL] Re: Splitting up servers

Posted by ocket 8888 <oc...@gmail.com>.

The database setup is totally irrelevant. That was never discussed in the
working group, it's just an implementation detail of the API. I included it
in the blueprint because there's a section on database impact, but however
it gets done is fine. This is purely about improving the API.

On Tue, Aug 25, 2020 at 4:26 PM Gray, Jonathan <Jo...@comcast.com>
wrote:

> A server is a server.  We were bad and put non-server, or special server
> type data on everything.  Server should remain server, but you can take all
> the ancillary columns off into one or more separate tables joined on the PK
> of a server row.  For now, that would be everything.  As TO adjusts to no
> longer need the extra concepts, those joined nonsense rows for other types
> can be dropped.  That said, significant changes to TODB are complicated and
> risky so we would want to move with care and foresight into other things of
> significant effort so there's not a conflict by accident or instability
> delivered until it's complete.
>
> Jonathan G
>
> On 8/25/20, 3:57 PM, "Jeremy Mitchell" <mi...@gmail.com> wrote:
>
>     Oh, this would help us get rid of the "ALL" cdn as well which has kind
> of
>     been a pain. Lots of "if cdn != ALL, then do something..." in the
>     codebase..which eventually leads to bugs like this:
>
>
> https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/issues/4324__;!!CQl3mcHX2A!S8EeyRyjEHQ4cqwQ1ZGzR3tX4StRhNk2mtP19puNOM6EeXzXN8hgwCpKGRG2HqIvzrYc$
>
>     On Tue, Aug 25, 2020 at 3:42 PM Zach Hoffman <za...@zrhoffman.net>
> wrote:
>
>     > +1 for splitting cache servers and infra servers. Currently, each
> server
>     > must be associated with a CDN and  cache group.
>     >
>     > While that part may seem logical by itself, when updates are queued
> on a
>     > CDN, it sets the upd_pending column on *all* of that CDN's servers,
>     > including servers of RASCAL type, servers of CCR type, etc. Although
> this
>     > doesn't hurt anything, as Jeremy has said, side effects like these
> make
>     > database-level validation difficult, so a table split of some kind
> seems
>     > like a step in the right direction.
>     >
>     > -Zach
>     >
>     > On Tue, Aug 25, 2020 at 2:22 PM Jeremy Mitchell <
> mitchell852@gmail.com>
>     > wrote:
>     >
>     > > If you look at the columns of the servers table, you'll see that
> most are
>     > > specific to "cache servers", so I definitely think that should be
>     > > addressed. Overloaded tables make it hard (impossible?) to do any
>     > > database-level validation and I thought we wanted to move in that
>     > direction
>     > > where possible.
>     > >
>     > > At the very least I think we should have these tables to capture
> all our
>     > > "server objects":
>     > >
>     > > - cache_servers (formerly known as servers)
>     > > - infra_servers
>     > > - origins
>     > >
>     > > Now whether the API mirrors the tables is another discussion. I
> don't
>     > think
>     > > we strive for that but sometimes GET /api/cache_servers just seems
> to
>     > make
>     > > sense.
>     > >
>     > > Jeremy
>     > >
>     > >
>     > >
>     > >
>     > >
>     > > On Tue, Aug 25, 2020 at 12:19 PM Gray, Jonathan <
>     > Jonathan_Gray@comcast.com
>     > > >
>     > > wrote:
>     > >
>     > > > I agree with Dave here.  Instead of trying to make our database
> and API
>     > > > identical, we should focus on doing better relational data
> modeling
>     > > inside
>     > > > the database and letting that roll upward into TO with more
> specific
>     > > > queries and stronger data integrity inside the database.
>     > > >
>     > > > Jonathan G
>     > > >
>     > > > On 8/25/20, 11:20 AM, "Dave Neuman" <ne...@apache.org> wrote:
>     > > >
>     > > >     This feels extremely heavy handed to me.  I don't think we
> should
>     > try
>     > > > to
>     > > >     build out a new table for different server types which will
> mostly
>     > > > have all
>     > > >     the same columns.  I could maybe see a total of 3 tables for
>     > caches,
>     > > >     origins (which already exists), and other things, but even
> then I
>     > > > would be
>     > > >     hesitant to think it was a great idea.  Even if we have a
> caches
>     > > > table, we
>     > > >     still have to put some sort of typing in place to
> distinguish edges
>     > > and
>     > > >     mids and with the addition of flexible topologies, even that
> is
>     > > muddy;
>     > > > it
>     > > >     might be better to call them forward and reverse proxies
> instead,
>     > but
>     > > > that
>     > > >     is a different conversation.  I think while it may seem like
> this
>     > > > solves a
>     > > >     lot of problems on the surface, I still think some of the
> things
>     > you
>     > > > are
>     > > >     trying to address will remain and we will have new problems
> on top
>     > of
>     > > >     that.
>     > > >
>     > > >     I think we should think about addressing this problem with a
> better
>     > > > way of
>     > > >     identifying server types that can be accounted for in code
> instead
>     > of
>     > > >     searching for strings, adding some validation to our API
> based on
>     > the
>     > > >     server types (e.g. only require some fields for caches), and
> also
>     > by
>     > > >     thinking about the way we do our API and maybe trying to get
> away
>     > > from
>     > > >     "based on database tables" to be "based on use cases".
>     > > >
>     > > >     --Dave
>     > > >
>     > > >     On Tue, Aug 25, 2020 at 10:49 AM ocket 8888 <
> ocket8888@gmail.com>
>     > > > wrote:
>     > > >
>     > > >     > Hello everyone, I'd like to discuss something that the
> Traffic
>     > Ops
>     > > > Working
>     > > >     > Group
>     > > >     > has been working on: splitting servers apart.
>     > > >     >
>     > > >     > Servers have a lot of properties, and most are specifically
>     > > > important to
>     > > >     > Cache
>     > > >     > Servers - made all the more clear by the recent addition of
>     > > multiple
>     > > >     > network
>     > > >     > interfaces. We propose they be split up into different
> objects
>     > > based
>     > > > on
>     > > >     > type -
>     > > >     > which will also help reduce (if not totally eliminate) the
> use of
>     > > > custom
>     > > >     > Types
>     > > >     > for servers. This will also eliminate the need for hacky
> ways of
>     > > > searching
>     > > >     > for
>     > > >     > certain kinds of servers - e.g. checking for a profile
> name that
>     > > > matches
>     > > >     > "ATS_.*" to determine if something is a cache server and
>     > searching
>     > > > for a
>     > > >     > Type
>     > > >     > that matches ".*EDGE.*" to determine if something is an
> edge-tier
>     > > or
>     > > >     > mid-tier
>     > > >     > Cache Server (both of which are real checks in place
> today).
>     > > >     >
>     > > >     > The new objects would be:
>     > > >     >
>     > > >     > - Cache Servers - exactly what it sounds like
>     > > >     > - Infrastructure Servers - catch-all for anything that
> doesn't
>     > fit
>     > > > in a
>     > > >     > different category, e.g. Grafana
>     > > >     > - Origins - This should ideally eat the concept of
> "ORG"-type
>     > > > servers so
>     > > >     > that we ONLY have Origins to express the concept of an
> Origin
>     > > server.
>     > > >     > - Traffic Monitors - exactly what it sounds like
>     > > >     > - Traffic Ops Servers - exactly what it sounds like
>     > > >     > - Traffic Portals - exactly what it sounds like
>     > > >     > - Traffic Routers - exactly what it sounds like
>     > > >     > - Traffic Stats Servers - exactly what it sounds like - but
>     > > InfluxDB
>     > > >     > servers would be Infrastructure Servers; this is just
> whatever
>     > > > machine is
>     > > >     > running the actual Traffic Stats program.
>     > > >     > - Traffic Vaults - exactly what it sounds like
>     > > >     >
>     > > >     > I have a Draft PR (
>     > > >
>     > >
>     >
> https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/pull/4986__;!!CQl3mcHX2A!SSbbYaDqtrWYwoO2tJM5Q1FjEiah5oVxE2I8kUagnUPqF3nKfj3k9miwq8px91-RjQAG$
>     > > > )
>     > > >     > ready for
>     > > >     > a blueprint to split out Traffic Portals already, to give
> you a
>     > > sort
>     > > > of
>     > > >     > idea of
>     > > >     > what that would look like. I don't want to get too bogged
> down in
>     > > > what
>     > > >     > properties each one will have exactly, since that's best
> decided
>     > > on a
>     > > >     > case-by-case basis and each should have its own blueprint,
> but
>     > I'm
>     > > > more
>     > > >     > looking
>     > > >     > for feedback on the concept of splitting apart servers in
>     > general.
>     > > >     >
>     > > >     > If you do have questions about what properties each is
>     > semi-planned
>     > > > to
>     > > >     > have,
>     > > >     > though, I can answer it or point you at the current draft
> of the
>     > > API
>     > > > design
>     > > >     > document which contains all those answers.
>     > > >     >
>     > > >
>     > > >
>     > >
>     >
>
>

Re: [EXTERNAL] Re: Splitting up servers

Posted by "Gray, Jonathan" <Jo...@comcast.com>.

A server is a server.  We were bad and put non-server, or special server type data on everything.  Server should remain server, but you can take all the ancillary columns off into one or more separate tables joined on the PK of a server row.  For now, that would be everything.  As TO adjusts to no longer need the extra concepts, those joined nonsense rows for other types can be dropped.  That said, significant changes to TODB are complicated and risky so we would want to move with care and foresight into other things of significant effort so there's not a conflict by accident or instability delivered until it's complete.

Jonathan G

On 8/25/20, 3:57 PM, "Jeremy Mitchell" <mi...@gmail.com> wrote:

    Oh, this would help us get rid of the "ALL" cdn as well which has kind of
    been a pain. Lots of "if cdn != ALL, then do something..." in the
    codebase..which eventually leads to bugs like this:

    https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/issues/4324__;!!CQl3mcHX2A!S8EeyRyjEHQ4cqwQ1ZGzR3tX4StRhNk2mtP19puNOM6EeXzXN8hgwCpKGRG2HqIvzrYc$

    On Tue, Aug 25, 2020 at 3:42 PM Zach Hoffman <za...@zrhoffman.net> wrote:

    > +1 for splitting cache servers and infra servers. Currently, each server
    > must be associated with a CDN and  cache group.
    >
    > While that part may seem logical by itself, when updates are queued on a
    > CDN, it sets the upd_pending column on *all* of that CDN's servers,
    > including servers of RASCAL type, servers of CCR type, etc. Although this
    > doesn't hurt anything, as Jeremy has said, side effects like these make
    > database-level validation difficult, so a table split of some kind seems
    > like a step in the right direction.
    >
    > -Zach
    >
    > On Tue, Aug 25, 2020 at 2:22 PM Jeremy Mitchell <mi...@gmail.com>
    > wrote:
    >
    > > If you look at the columns of the servers table, you'll see that most are
    > > specific to "cache servers", so I definitely think that should be
    > > addressed. Overloaded tables make it hard (impossible?) to do any
    > > database-level validation and I thought we wanted to move in that
    > direction
    > > where possible.
    > >
    > > At the very least I think we should have these tables to capture all our
    > > "server objects":
    > >
    > > - cache_servers (formerly known as servers)
    > > - infra_servers
    > > - origins
    > >
    > > Now whether the API mirrors the tables is another discussion. I don't
    > think
    > > we strive for that but sometimes GET /api/cache_servers just seems to
    > make
    > > sense.
    > >
    > > Jeremy
    > >
    > >
    > >
    > >
    > >
    > > On Tue, Aug 25, 2020 at 12:19 PM Gray, Jonathan <
    > Jonathan_Gray@comcast.com
    > > >
    > > wrote:
    > >
    > > > I agree with Dave here.  Instead of trying to make our database and API
    > > > identical, we should focus on doing better relational data modeling
    > > inside
    > > > the database and letting that roll upward into TO with more specific
    > > > queries and stronger data integrity inside the database.
    > > >
    > > > Jonathan G
    > > >
    > > > On 8/25/20, 11:20 AM, "Dave Neuman" <ne...@apache.org> wrote:
    > > >
    > > >     This feels extremely heavy handed to me.  I don't think we should
    > try
    > > > to
    > > >     build out a new table for different server types which will mostly
    > > > have all
    > > >     the same columns.  I could maybe see a total of 3 tables for
    > caches,
    > > >     origins (which already exists), and other things, but even then I
    > > > would be
    > > >     hesitant to think it was a great idea.  Even if we have a caches
    > > > table, we
    > > >     still have to put some sort of typing in place to distinguish edges
    > > and
    > > >     mids and with the addition of flexible topologies, even that is
    > > muddy;
    > > > it
    > > >     might be better to call them forward and reverse proxies instead,
    > but
    > > > that
    > > >     is a different conversation.  I think while it may seem like this
    > > > solves a
    > > >     lot of problems on the surface, I still think some of the things
    > you
    > > > are
    > > >     trying to address will remain and we will have new problems on top
    > of
    > > >     that.
    > > >
    > > >     I think we should think about addressing this problem with a better
    > > > way of
    > > >     identifying server types that can be accounted for in code instead
    > of
    > > >     searching for strings, adding some validation to our API based on
    > the
    > > >     server types (e.g. only require some fields for caches), and also
    > by
    > > >     thinking about the way we do our API and maybe trying to get away
    > > from
    > > >     "based on database tables" to be "based on use cases".
    > > >
    > > >     --Dave
    > > >
    > > >     On Tue, Aug 25, 2020 at 10:49 AM ocket 8888 <oc...@gmail.com>
    > > > wrote:
    > > >
    > > >     > Hello everyone, I'd like to discuss something that the Traffic
    > Ops
    > > > Working
    > > >     > Group
    > > >     > has been working on: splitting servers apart.
    > > >     >
    > > >     > Servers have a lot of properties, and most are specifically
    > > > important to
    > > >     > Cache
    > > >     > Servers - made all the more clear by the recent addition of
    > > multiple
    > > >     > network
    > > >     > interfaces. We propose they be split up into different objects
    > > based
    > > > on
    > > >     > type -
    > > >     > which will also help reduce (if not totally eliminate) the use of
    > > > custom
    > > >     > Types
    > > >     > for servers. This will also eliminate the need for hacky ways of
    > > > searching
    > > >     > for
    > > >     > certain kinds of servers - e.g. checking for a profile name that
    > > > matches
    > > >     > "ATS_.*" to determine if something is a cache server and
    > searching
    > > > for a
    > > >     > Type
    > > >     > that matches ".*EDGE.*" to determine if something is an edge-tier
    > > or
    > > >     > mid-tier
    > > >     > Cache Server (both of which are real checks in place today).
    > > >     >
    > > >     > The new objects would be:
    > > >     >
    > > >     > - Cache Servers - exactly what it sounds like
    > > >     > - Infrastructure Servers - catch-all for anything that doesn't
    > fit
    > > > in a
    > > >     > different category, e.g. Grafana
    > > >     > - Origins - This should ideally eat the concept of "ORG"-type
    > > > servers so
    > > >     > that we ONLY have Origins to express the concept of an Origin
    > > server.
    > > >     > - Traffic Monitors - exactly what it sounds like
    > > >     > - Traffic Ops Servers - exactly what it sounds like
    > > >     > - Traffic Portals - exactly what it sounds like
    > > >     > - Traffic Routers - exactly what it sounds like
    > > >     > - Traffic Stats Servers - exactly what it sounds like - but
    > > InfluxDB
    > > >     > servers would be Infrastructure Servers; this is just whatever
    > > > machine is
    > > >     > running the actual Traffic Stats program.
    > > >     > - Traffic Vaults - exactly what it sounds like
    > > >     >
    > > >     > I have a Draft PR (
    > > >
    > >
    > https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/pull/4986__;!!CQl3mcHX2A!SSbbYaDqtrWYwoO2tJM5Q1FjEiah5oVxE2I8kUagnUPqF3nKfj3k9miwq8px91-RjQAG$
    > > > )
    > > >     > ready for
    > > >     > a blueprint to split out Traffic Portals already, to give you a
    > > sort
    > > > of
    > > >     > idea of
    > > >     > what that would look like. I don't want to get too bogged down in
    > > > what
    > > >     > properties each one will have exactly, since that's best decided
    > > on a
    > > >     > case-by-case basis and each should have its own blueprint, but
    > I'm
    > > > more
    > > >     > looking
    > > >     > for feedback on the concept of splitting apart servers in
    > general.
    > > >     >
    > > >     > If you do have questions about what properties each is
    > semi-planned
    > > > to
    > > >     > have,
    > > >     > though, I can answer it or point you at the current draft of the
    > > API
    > > > design
    > > >     > document which contains all those answers.
    > > >     >
    > > >
    > > >
    > >
    >

Re: [EXTERNAL] Re: Splitting up servers

Posted by Jeremy Mitchell <mi...@gmail.com>.

Oh, this would help us get rid of the "ALL" cdn as well which has kind of
been a pain. Lots of "if cdn != ALL, then do something..." in the
codebase..which eventually leads to bugs like this:

https://github.com/apache/trafficcontrol/issues/4324

On Tue, Aug 25, 2020 at 3:42 PM Zach Hoffman <za...@zrhoffman.net> wrote:

> +1 for splitting cache servers and infra servers. Currently, each server
> must be associated with a CDN and  cache group.
>
> While that part may seem logical by itself, when updates are queued on a
> CDN, it sets the upd_pending column on *all* of that CDN's servers,
> including servers of RASCAL type, servers of CCR type, etc. Although this
> doesn't hurt anything, as Jeremy has said, side effects like these make
> database-level validation difficult, so a table split of some kind seems
> like a step in the right direction.
>
> -Zach
>
> On Tue, Aug 25, 2020 at 2:22 PM Jeremy Mitchell <mi...@gmail.com>
> wrote:
>
> > If you look at the columns of the servers table, you'll see that most are
> > specific to "cache servers", so I definitely think that should be
> > addressed. Overloaded tables make it hard (impossible?) to do any
> > database-level validation and I thought we wanted to move in that
> direction
> > where possible.
> >
> > At the very least I think we should have these tables to capture all our
> > "server objects":
> >
> > - cache_servers (formerly known as servers)
> > - infra_servers
> > - origins
> >
> > Now whether the API mirrors the tables is another discussion. I don't
> think
> > we strive for that but sometimes GET /api/cache_servers just seems to
> make
> > sense.
> >
> > Jeremy
> >
> >
> >
> >
> >
> > On Tue, Aug 25, 2020 at 12:19 PM Gray, Jonathan <
> Jonathan_Gray@comcast.com
> > >
> > wrote:
> >
> > > I agree with Dave here.  Instead of trying to make our database and API
> > > identical, we should focus on doing better relational data modeling
> > inside
> > > the database and letting that roll upward into TO with more specific
> > > queries and stronger data integrity inside the database.
> > >
> > > Jonathan G
> > >
> > > On 8/25/20, 11:20 AM, "Dave Neuman" <ne...@apache.org> wrote:
> > >
> > >     This feels extremely heavy handed to me.  I don't think we should
> try
> > > to
> > >     build out a new table for different server types which will mostly
> > > have all
> > >     the same columns.  I could maybe see a total of 3 tables for
> caches,
> > >     origins (which already exists), and other things, but even then I
> > > would be
> > >     hesitant to think it was a great idea.  Even if we have a caches
> > > table, we
> > >     still have to put some sort of typing in place to distinguish edges
> > and
> > >     mids and with the addition of flexible topologies, even that is
> > muddy;
> > > it
> > >     might be better to call them forward and reverse proxies instead,
> but
> > > that
> > >     is a different conversation.  I think while it may seem like this
> > > solves a
> > >     lot of problems on the surface, I still think some of the things
> you
> > > are
> > >     trying to address will remain and we will have new problems on top
> of
> > >     that.
> > >
> > >     I think we should think about addressing this problem with a better
> > > way of
> > >     identifying server types that can be accounted for in code instead
> of
> > >     searching for strings, adding some validation to our API based on
> the
> > >     server types (e.g. only require some fields for caches), and also
> by
> > >     thinking about the way we do our API and maybe trying to get away
> > from
> > >     "based on database tables" to be "based on use cases".
> > >
> > >     --Dave
> > >
> > >     On Tue, Aug 25, 2020 at 10:49 AM ocket 8888 <oc...@gmail.com>
> > > wrote:
> > >
> > >     > Hello everyone, I'd like to discuss something that the Traffic
> Ops
> > > Working
> > >     > Group
> > >     > has been working on: splitting servers apart.
> > >     >
> > >     > Servers have a lot of properties, and most are specifically
> > > important to
> > >     > Cache
> > >     > Servers - made all the more clear by the recent addition of
> > multiple
> > >     > network
> > >     > interfaces. We propose they be split up into different objects
> > based
> > > on
> > >     > type -
> > >     > which will also help reduce (if not totally eliminate) the use of
> > > custom
> > >     > Types
> > >     > for servers. This will also eliminate the need for hacky ways of
> > > searching
> > >     > for
> > >     > certain kinds of servers - e.g. checking for a profile name that
> > > matches
> > >     > "ATS_.*" to determine if something is a cache server and
> searching
> > > for a
> > >     > Type
> > >     > that matches ".*EDGE.*" to determine if something is an edge-tier
> > or
> > >     > mid-tier
> > >     > Cache Server (both of which are real checks in place today).
> > >     >
> > >     > The new objects would be:
> > >     >
> > >     > - Cache Servers - exactly what it sounds like
> > >     > - Infrastructure Servers - catch-all for anything that doesn't
> fit
> > > in a
> > >     > different category, e.g. Grafana
> > >     > - Origins - This should ideally eat the concept of "ORG"-type
> > > servers so
> > >     > that we ONLY have Origins to express the concept of an Origin
> > server.
> > >     > - Traffic Monitors - exactly what it sounds like
> > >     > - Traffic Ops Servers - exactly what it sounds like
> > >     > - Traffic Portals - exactly what it sounds like
> > >     > - Traffic Routers - exactly what it sounds like
> > >     > - Traffic Stats Servers - exactly what it sounds like - but
> > InfluxDB
> > >     > servers would be Infrastructure Servers; this is just whatever
> > > machine is
> > >     > running the actual Traffic Stats program.
> > >     > - Traffic Vaults - exactly what it sounds like
> > >     >
> > >     > I have a Draft PR (
> > >
> >
> https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/pull/4986__;!!CQl3mcHX2A!SSbbYaDqtrWYwoO2tJM5Q1FjEiah5oVxE2I8kUagnUPqF3nKfj3k9miwq8px91-RjQAG$
> > > )
> > >     > ready for
> > >     > a blueprint to split out Traffic Portals already, to give you a
> > sort
> > > of
> > >     > idea of
> > >     > what that would look like. I don't want to get too bogged down in
> > > what
> > >     > properties each one will have exactly, since that's best decided
> > on a
> > >     > case-by-case basis and each should have its own blueprint, but
> I'm
> > > more
> > >     > looking
> > >     > for feedback on the concept of splitting apart servers in
> general.
> > >     >
> > >     > If you do have questions about what properties each is
> semi-planned
> > > to
> > >     > have,
> > >     > though, I can answer it or point you at the current draft of the
> > API
> > > design
> > >     > document which contains all those answers.
> > >     >
> > >
> > >
> >
>

Re: [EXTERNAL] Re: Splitting up servers

Posted by Zach Hoffman <za...@zrhoffman.net>.

+1 for splitting cache servers and infra servers. Currently, each server
must be associated with a CDN and  cache group.

While that part may seem logical by itself, when updates are queued on a
CDN, it sets the upd_pending column on *all* of that CDN's servers,
including servers of RASCAL type, servers of CCR type, etc. Although this
doesn't hurt anything, as Jeremy has said, side effects like these make
database-level validation difficult, so a table split of some kind seems
like a step in the right direction.

-Zach

On Tue, Aug 25, 2020 at 2:22 PM Jeremy Mitchell <mi...@gmail.com>
wrote:

> If you look at the columns of the servers table, you'll see that most are
> specific to "cache servers", so I definitely think that should be
> addressed. Overloaded tables make it hard (impossible?) to do any
> database-level validation and I thought we wanted to move in that direction
> where possible.
>
> At the very least I think we should have these tables to capture all our
> "server objects":
>
> - cache_servers (formerly known as servers)
> - infra_servers
> - origins
>
> Now whether the API mirrors the tables is another discussion. I don't think
> we strive for that but sometimes GET /api/cache_servers just seems to make
> sense.
>
> Jeremy
>
>
>
>
>
> On Tue, Aug 25, 2020 at 12:19 PM Gray, Jonathan <Jonathan_Gray@comcast.com
> >
> wrote:
>
> > I agree with Dave here.  Instead of trying to make our database and API
> > identical, we should focus on doing better relational data modeling
> inside
> > the database and letting that roll upward into TO with more specific
> > queries and stronger data integrity inside the database.
> >
> > Jonathan G
> >
> > On 8/25/20, 11:20 AM, "Dave Neuman" <ne...@apache.org> wrote:
> >
> >     This feels extremely heavy handed to me.  I don't think we should try
> > to
> >     build out a new table for different server types which will mostly
> > have all
> >     the same columns.  I could maybe see a total of 3 tables for caches,
> >     origins (which already exists), and other things, but even then I
> > would be
> >     hesitant to think it was a great idea.  Even if we have a caches
> > table, we
> >     still have to put some sort of typing in place to distinguish edges
> and
> >     mids and with the addition of flexible topologies, even that is
> muddy;
> > it
> >     might be better to call them forward and reverse proxies instead, but
> > that
> >     is a different conversation.  I think while it may seem like this
> > solves a
> >     lot of problems on the surface, I still think some of the things you
> > are
> >     trying to address will remain and we will have new problems on top of
> >     that.
> >
> >     I think we should think about addressing this problem with a better
> > way of
> >     identifying server types that can be accounted for in code instead of
> >     searching for strings, adding some validation to our API based on the
> >     server types (e.g. only require some fields for caches), and also by
> >     thinking about the way we do our API and maybe trying to get away
> from
> >     "based on database tables" to be "based on use cases".
> >
> >     --Dave
> >
> >     On Tue, Aug 25, 2020 at 10:49 AM ocket 8888 <oc...@gmail.com>
> > wrote:
> >
> >     > Hello everyone, I'd like to discuss something that the Traffic Ops
> > Working
> >     > Group
> >     > has been working on: splitting servers apart.
> >     >
> >     > Servers have a lot of properties, and most are specifically
> > important to
> >     > Cache
> >     > Servers - made all the more clear by the recent addition of
> multiple
> >     > network
> >     > interfaces. We propose they be split up into different objects
> based
> > on
> >     > type -
> >     > which will also help reduce (if not totally eliminate) the use of
> > custom
> >     > Types
> >     > for servers. This will also eliminate the need for hacky ways of
> > searching
> >     > for
> >     > certain kinds of servers - e.g. checking for a profile name that
> > matches
> >     > "ATS_.*" to determine if something is a cache server and searching
> > for a
> >     > Type
> >     > that matches ".*EDGE.*" to determine if something is an edge-tier
> or
> >     > mid-tier
> >     > Cache Server (both of which are real checks in place today).
> >     >
> >     > The new objects would be:
> >     >
> >     > - Cache Servers - exactly what it sounds like
> >     > - Infrastructure Servers - catch-all for anything that doesn't fit
> > in a
> >     > different category, e.g. Grafana
> >     > - Origins - This should ideally eat the concept of "ORG"-type
> > servers so
> >     > that we ONLY have Origins to express the concept of an Origin
> server.
> >     > - Traffic Monitors - exactly what it sounds like
> >     > - Traffic Ops Servers - exactly what it sounds like
> >     > - Traffic Portals - exactly what it sounds like
> >     > - Traffic Routers - exactly what it sounds like
> >     > - Traffic Stats Servers - exactly what it sounds like - but
> InfluxDB
> >     > servers would be Infrastructure Servers; this is just whatever
> > machine is
> >     > running the actual Traffic Stats program.
> >     > - Traffic Vaults - exactly what it sounds like
> >     >
> >     > I have a Draft PR (
> >
> https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/pull/4986__;!!CQl3mcHX2A!SSbbYaDqtrWYwoO2tJM5Q1FjEiah5oVxE2I8kUagnUPqF3nKfj3k9miwq8px91-RjQAG$
> > )
> >     > ready for
> >     > a blueprint to split out Traffic Portals already, to give you a
> sort
> > of
> >     > idea of
> >     > what that would look like. I don't want to get too bogged down in
> > what
> >     > properties each one will have exactly, since that's best decided
> on a
> >     > case-by-case basis and each should have its own blueprint, but I'm
> > more
> >     > looking
> >     > for feedback on the concept of splitting apart servers in general.
> >     >
> >     > If you do have questions about what properties each is semi-planned
> > to
> >     > have,
> >     > though, I can answer it or point you at the current draft of the
> API
> > design
> >     > document which contains all those answers.
> >     >
> >
> >
>

Re: [EXTERNAL] Re: Splitting up servers

Posted by Jeremy Mitchell <mi...@gmail.com>.

If you look at the columns of the servers table, you'll see that most are
specific to "cache servers", so I definitely think that should be
addressed. Overloaded tables make it hard (impossible?) to do any
database-level validation and I thought we wanted to move in that direction
where possible.

At the very least I think we should have these tables to capture all our
"server objects":

- cache_servers (formerly known as servers)
- infra_servers
- origins

Now whether the API mirrors the tables is another discussion. I don't think
we strive for that but sometimes GET /api/cache_servers just seems to make
sense.

Jeremy





On Tue, Aug 25, 2020 at 12:19 PM Gray, Jonathan <Jo...@comcast.com>
wrote:

> I agree with Dave here.  Instead of trying to make our database and API
> identical, we should focus on doing better relational data modeling inside
> the database and letting that roll upward into TO with more specific
> queries and stronger data integrity inside the database.
>
> Jonathan G
>
> On 8/25/20, 11:20 AM, "Dave Neuman" <ne...@apache.org> wrote:
>
>     This feels extremely heavy handed to me.  I don't think we should try
> to
>     build out a new table for different server types which will mostly
> have all
>     the same columns.  I could maybe see a total of 3 tables for caches,
>     origins (which already exists), and other things, but even then I
> would be
>     hesitant to think it was a great idea.  Even if we have a caches
> table, we
>     still have to put some sort of typing in place to distinguish edges and
>     mids and with the addition of flexible topologies, even that is muddy;
> it
>     might be better to call them forward and reverse proxies instead, but
> that
>     is a different conversation.  I think while it may seem like this
> solves a
>     lot of problems on the surface, I still think some of the things you
> are
>     trying to address will remain and we will have new problems on top of
>     that.
>
>     I think we should think about addressing this problem with a better
> way of
>     identifying server types that can be accounted for in code instead of
>     searching for strings, adding some validation to our API based on the
>     server types (e.g. only require some fields for caches), and also by
>     thinking about the way we do our API and maybe trying to get away from
>     "based on database tables" to be "based on use cases".
>
>     --Dave
>
>     On Tue, Aug 25, 2020 at 10:49 AM ocket 8888 <oc...@gmail.com>
> wrote:
>
>     > Hello everyone, I'd like to discuss something that the Traffic Ops
> Working
>     > Group
>     > has been working on: splitting servers apart.
>     >
>     > Servers have a lot of properties, and most are specifically
> important to
>     > Cache
>     > Servers - made all the more clear by the recent addition of multiple
>     > network
>     > interfaces. We propose they be split up into different objects based
> on
>     > type -
>     > which will also help reduce (if not totally eliminate) the use of
> custom
>     > Types
>     > for servers. This will also eliminate the need for hacky ways of
> searching
>     > for
>     > certain kinds of servers - e.g. checking for a profile name that
> matches
>     > "ATS_.*" to determine if something is a cache server and searching
> for a
>     > Type
>     > that matches ".*EDGE.*" to determine if something is an edge-tier or
>     > mid-tier
>     > Cache Server (both of which are real checks in place today).
>     >
>     > The new objects would be:
>     >
>     > - Cache Servers - exactly what it sounds like
>     > - Infrastructure Servers - catch-all for anything that doesn't fit
> in a
>     > different category, e.g. Grafana
>     > - Origins - This should ideally eat the concept of "ORG"-type
> servers so
>     > that we ONLY have Origins to express the concept of an Origin server.
>     > - Traffic Monitors - exactly what it sounds like
>     > - Traffic Ops Servers - exactly what it sounds like
>     > - Traffic Portals - exactly what it sounds like
>     > - Traffic Routers - exactly what it sounds like
>     > - Traffic Stats Servers - exactly what it sounds like - but InfluxDB
>     > servers would be Infrastructure Servers; this is just whatever
> machine is
>     > running the actual Traffic Stats program.
>     > - Traffic Vaults - exactly what it sounds like
>     >
>     > I have a Draft PR (
> https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/pull/4986__;!!CQl3mcHX2A!SSbbYaDqtrWYwoO2tJM5Q1FjEiah5oVxE2I8kUagnUPqF3nKfj3k9miwq8px91-RjQAG$
> )
>     > ready for
>     > a blueprint to split out Traffic Portals already, to give you a sort
> of
>     > idea of
>     > what that would look like. I don't want to get too bogged down in
> what
>     > properties each one will have exactly, since that's best decided on a
>     > case-by-case basis and each should have its own blueprint, but I'm
> more
>     > looking
>     > for feedback on the concept of splitting apart servers in general.
>     >
>     > If you do have questions about what properties each is semi-planned
> to
>     > have,
>     > though, I can answer it or point you at the current draft of the API
> design
>     > document which contains all those answers.
>     >
>
>

Re: [EXTERNAL] Re: Splitting up servers

Posted by "Gray, Jonathan" <Jo...@comcast.com>.

I agree with Dave here.  Instead of trying to make our database and API identical, we should focus on doing better relational data modeling inside the database and letting that roll upward into TO with more specific queries and stronger data integrity inside the database.

Jonathan G

On 8/25/20, 11:20 AM, "Dave Neuman" <ne...@apache.org> wrote:

    This feels extremely heavy handed to me.  I don't think we should try to
    build out a new table for different server types which will mostly have all
    the same columns.  I could maybe see a total of 3 tables for caches,
    origins (which already exists), and other things, but even then I would be
    hesitant to think it was a great idea.  Even if we have a caches table, we
    still have to put some sort of typing in place to distinguish edges and
    mids and with the addition of flexible topologies, even that is muddy; it
    might be better to call them forward and reverse proxies instead, but that
    is a different conversation.  I think while it may seem like this solves a
    lot of problems on the surface, I still think some of the things you are
    trying to address will remain and we will have new problems on top of
    that.

    I think we should think about addressing this problem with a better way of
    identifying server types that can be accounted for in code instead of
    searching for strings, adding some validation to our API based on the
    server types (e.g. only require some fields for caches), and also by
    thinking about the way we do our API and maybe trying to get away from
    "based on database tables" to be "based on use cases".

    --Dave

    On Tue, Aug 25, 2020 at 10:49 AM ocket 8888 <oc...@gmail.com> wrote:

    > Hello everyone, I'd like to discuss something that the Traffic Ops Working
    > Group
    > has been working on: splitting servers apart.
    >
    > Servers have a lot of properties, and most are specifically important to
    > Cache
    > Servers - made all the more clear by the recent addition of multiple
    > network
    > interfaces. We propose they be split up into different objects based on
    > type -
    > which will also help reduce (if not totally eliminate) the use of custom
    > Types
    > for servers. This will also eliminate the need for hacky ways of searching
    > for
    > certain kinds of servers - e.g. checking for a profile name that matches
    > "ATS_.*" to determine if something is a cache server and searching for a
    > Type
    > that matches ".*EDGE.*" to determine if something is an edge-tier or
    > mid-tier
    > Cache Server (both of which are real checks in place today).
    >
    > The new objects would be:
    >
    > - Cache Servers - exactly what it sounds like
    > - Infrastructure Servers - catch-all for anything that doesn't fit in a
    > different category, e.g. Grafana
    > - Origins - This should ideally eat the concept of "ORG"-type servers so
    > that we ONLY have Origins to express the concept of an Origin server.
    > - Traffic Monitors - exactly what it sounds like
    > - Traffic Ops Servers - exactly what it sounds like
    > - Traffic Portals - exactly what it sounds like
    > - Traffic Routers - exactly what it sounds like
    > - Traffic Stats Servers - exactly what it sounds like - but InfluxDB
    > servers would be Infrastructure Servers; this is just whatever machine is
    > running the actual Traffic Stats program.
    > - Traffic Vaults - exactly what it sounds like
    >
    > I have a Draft PR (https://urldefense.com/v3/__https://github.com/apache/trafficcontrol/pull/4986__;!!CQl3mcHX2A!SSbbYaDqtrWYwoO2tJM5Q1FjEiah5oVxE2I8kUagnUPqF3nKfj3k9miwq8px91-RjQAG$ )
    > ready for
    > a blueprint to split out Traffic Portals already, to give you a sort of
    > idea of
    > what that would look like. I don't want to get too bogged down in what
    > properties each one will have exactly, since that's best decided on a
    > case-by-case basis and each should have its own blueprint, but I'm more
    > looking
    > for feedback on the concept of splitting apart servers in general.
    >
    > If you do have questions about what properties each is semi-planned to
    > have,
    > though, I can answer it or point you at the current draft of the API design
    > document which contains all those answers.
    >

Re: Splitting up servers

Posted by Dave Neuman <ne...@apache.org>.

This feels extremely heavy handed to me.  I don't think we should try to
build out a new table for different server types which will mostly have all
the same columns.  I could maybe see a total of 3 tables for caches,
origins (which already exists), and other things, but even then I would be
hesitant to think it was a great idea.  Even if we have a caches table, we
still have to put some sort of typing in place to distinguish edges and
mids and with the addition of flexible topologies, even that is muddy; it
might be better to call them forward and reverse proxies instead, but that
is a different conversation.  I think while it may seem like this solves a
lot of problems on the surface, I still think some of the things you are
trying to address will remain and we will have new problems on top of
that.

I think we should think about addressing this problem with a better way of
identifying server types that can be accounted for in code instead of
searching for strings, adding some validation to our API based on the
server types (e.g. only require some fields for caches), and also by
thinking about the way we do our API and maybe trying to get away from
"based on database tables" to be "based on use cases".

--Dave

On Tue, Aug 25, 2020 at 10:49 AM ocket 8888 <oc...@gmail.com> wrote:

> Hello everyone, I'd like to discuss something that the Traffic Ops Working
> Group
> has been working on: splitting servers apart.
>
> Servers have a lot of properties, and most are specifically important to
> Cache
> Servers - made all the more clear by the recent addition of multiple
> network
> interfaces. We propose they be split up into different objects based on
> type -
> which will also help reduce (if not totally eliminate) the use of custom
> Types
> for servers. This will also eliminate the need for hacky ways of searching
> for
> certain kinds of servers - e.g. checking for a profile name that matches
> "ATS_.*" to determine if something is a cache server and searching for a
> Type
> that matches ".*EDGE.*" to determine if something is an edge-tier or
> mid-tier
> Cache Server (both of which are real checks in place today).
>
> The new objects would be:
>
> - Cache Servers - exactly what it sounds like
> - Infrastructure Servers - catch-all for anything that doesn't fit in a
> different category, e.g. Grafana
> - Origins - This should ideally eat the concept of "ORG"-type servers so
> that we ONLY have Origins to express the concept of an Origin server.
> - Traffic Monitors - exactly what it sounds like
> - Traffic Ops Servers - exactly what it sounds like
> - Traffic Portals - exactly what it sounds like
> - Traffic Routers - exactly what it sounds like
> - Traffic Stats Servers - exactly what it sounds like - but InfluxDB
> servers would be Infrastructure Servers; this is just whatever machine is
> running the actual Traffic Stats program.
> - Traffic Vaults - exactly what it sounds like
>
> I have a Draft PR (https://github.com/apache/trafficcontrol/pull/4986)
> ready for
> a blueprint to split out Traffic Portals already, to give you a sort of
> idea of
> what that would look like. I don't want to get too bogged down in what
> properties each one will have exactly, since that's best decided on a
> case-by-case basis and each should have its own blueprint, but I'm more
> looking
> for feedback on the concept of splitting apart servers in general.
>
> If you do have questions about what properties each is semi-planned to
> have,
> though, I can answer it or point you at the current draft of the API design
> document which contains all those answers.
>