Posted to dev@trafficcontrol.apache.org by David Neuman <da...@gmail.com> on 2018/04/02 18:43:07 UTC

Re: Question about the poll model of the Traffic Monitor

Hi Zhilin,
Is it possible to get this design doc added to our wiki?  I created a design
docs page here (https://cwiki.apache.org/confluence/display/TC/Design+Docs).
I think it would be good to get the document there so it doesn't get lost
over time.

Thanks!
Dave

On Wed, Mar 28, 2018 at 10:41 PM, Zhilin Huang (zhilhuan) <
zhilhuan@cisco.com> wrote:

> Hi Guys,
>
> Thanks a lot for the discussion. I should have put the design up for review
> earlier; sorry for the delay. Here is the link to the design doc:
> https://docs.google.com/document/d/1vgq-pGNoLLYf7Y3cu5hWu67TUKpN5hucrp-ZS9nSsd4/edit?usp=sharing
>
> Short summary for the feature design:
> ---
> There is a feature request from the market to add secondary IP support on
> edge cache servers, along with the ability to assign a delivery service to
> a secondary IP of an edge cache.
>
> This feature requires Traffic Ops to support secondary IP configuration for
> edge caches, and delivery service assignment to a secondary IP.
>
> Traffic Monitor should also monitor the connectivity of the configured
> secondary IPs, and Traffic Router needs to resolve the streamer FQDN to the
> secondary IP assigned in a delivery service.
>
> Traffic Server should record the IP serving each client request, and should
> reject requests for a delivery service that arrive on an IP not assigned to
> it.
>
> This design takes compatibility into consideration: if no secondary IP is
> configured, or some parts of the system have not been upgraded to a version
> that supports this feature, traffic will be served by the primary IPs as
> before.
> ---
>
> Replies to Robert's comments are embedded in the email thread below. Any
> further comments are welcome and much appreciated.
>
> Thanks,
> Zhilin
>
>
>
>
> On 29/03/2018, 10:19 AM, "Neil Hao (nbaoping)" <nb...@cisco.com>
> wrote:
>
>     Hi Robert/Nir,
>
>     Thanks very much for the quick and detailed replies, and sorry that I
>     didn’t make the whole feature clear. This is our Secondary IP feature,
>     a big feature that will bring changes to all the components in Traffic
>     Control. I thought our teammate had reviewed the design with you guys
>     before, but it seems not. After some discussion, we will start the full
>     feature design review with you guys soon; I think it will be better to
>     continue the discussion after that.
>
>     Thanks,
>     Neil
>
>     On 3/29/18, 1:16 AM, "Robert Butts" <ro...@gmail.com> wrote:
>
>         I agree with Nir, it's not as simple as changing a structure to `[]URL`,
>         it's a bigger architectural design question.
>
>         How do you plan to mark caches Unavailable if they're unhealthy on one
>         interface, but healthy on another?
>
>         Right now, Traffic Router needs a boolean for each cache, it doesn't know
>         anything about multiple network interfaces, IPv4 vs IPv6, etc. It only
>         knows the FQDN, which is all the clients it's giving DNS records to will
>         know when they request the cache.
>
>         Questions:
>         Is a cache marked Unavailable when any interface is unreachable? Or all of
>         them?
> ZH> Actually, we care about IP availability rather than interface
> availability. Please take a look at 3.1.2 of the design doc.
>
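Robert's "any vs all" question above can be made concrete with a small Go sketch. This is not Traffic Monitor code: `Policy`, `IPHealth`, and `CacheAvailable` are hypothetical names, and the addresses are documentation placeholders. It only illustrates how per-IP poll results might collapse into the single boolean per cache that Traffic Router consumes today.

```go
package main

import "fmt"

// Policy selects how per-IP health collapses into one cache-level boolean.
// Hypothetical: the thread only asks whether this should be configurable.
type Policy int

const (
	UnavailableIfAnyDown Policy = iota // one unreachable IP marks the cache Unavailable
	UnavailableIfAllDown               // the cache stays available while any IP is reachable
)

// IPHealth is the result of one connectivity poll against one address.
type IPHealth struct {
	IP        string
	Reachable bool
}

// CacheAvailable reduces per-IP results to a single boolean.
// A cache with no polled IPs is reported as unavailable.
func CacheAvailable(results []IPHealth, p Policy) bool {
	if len(results) == 0 {
		return false
	}
	up := 0
	for _, r := range results {
		if r.Reachable {
			up++
		}
	}
	if p == UnavailableIfAnyDown {
		return up == len(results)
	}
	return up > 0
}

func main() {
	polls := []IPHealth{
		{IP: "192.0.2.10", Reachable: true},
		{IP: "192.0.2.11", Reachable: false},
	}
	fmt.Println(CacheAvailable(polls, UnavailableIfAnyDown)) // false
	fmt.Println(CacheAvailable(polls, UnavailableIfAllDown)) // true
}
```

Zhilin's reply points to per-IP availability instead; how the design doc combines those results is described in its section 3.1.2 and is not reproduced here.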
>         What if an interface is reachable, but one interface reports different
>         stats than another interface? For example, what if someone configures a
>         different caching proxy (ATS) on each interface?
> ZH> We will only use one ATS instance to serve traffic from all configured IPs.
>
>         How are stats aggregated? Should the monitor aggregate all stats from
>         different polls and interfaces together, and consider them the same
>         "server"? If not, how do we reconcile the different stats with what the
>         Monitor reports on `CrStates` and `CacheStats`? If so, again, what happens
>         if different interfaces have different ATS instances, so e.g. the byte
>         count on one is 100, and the other is 1000, then 101, then 1001. It simply
>         won't work. Do we handle that? Or just ignore it, and document "all
>         interfaces must report the same stats"? Do we try to detect that and give a
>         useful error or warning?
> ZH> The bandwidth of the interfaces will be aggregated. We will only have one
> ATS instance serving traffic from all interfaces. The connectivity check is
> IP-based, and the stats collection is interface-based. Please take a look
> at 3.1.2 of the design doc for details.
>
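The byte-count example in Robert's question above can be worked through in Go. This is a sketch only, not Traffic Monitor code, and `deltas` is a hypothetical helper. It shows why samples from two independent ATS instances cannot be treated as one cumulative counter: the consecutive differences swing between large positive and negative values.

```go
package main

import "fmt"

// deltas returns the difference between each pair of consecutive samples,
// which is the first step in deriving a rate from a cumulative byte counter.
func deltas(samples []int64) []int64 {
	if len(samples) < 2 {
		return nil
	}
	out := make([]int64, 0, len(samples)-1)
	for i := 1; i < len(samples); i++ {
		out = append(out, samples[i]-samples[i-1])
	}
	return out
}

func main() {
	// Samples from the thread: two ATS instances polled alternately.
	mixed := []int64{100, 1000, 101, 1001}
	fmt.Println(deltas(mixed)) // [900 -899 900]

	// Kept apart, each instance's counter advances by 1 between its polls.
	fmt.Println(deltas([]int64{100, 101}))   // [1]
	fmt.Println(deltas([]int64{1000, 1001})) // [1]
}
```

With a single ATS instance behind all IPs, as Zhilin's reply describes, every poll reads the same counters and this failure mode does not arise.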
>         In Traffic Ops, servers have specific data used for polling. Traffic
>         Monitor gets the stats URI path from Parameters, and the URI IP from the
>         Servers table. It doesn't use the FQDN, Server Host or Server Domain. Where
>         would these other interfaces come from? Parameters? Or another table linked
>         to the servers table? (I'd really, really rather we didn't put more data in
>         unsafe Parameters, which may not exist, may not be properly formatted, need
>         safety checks in all code that ever uses them, and are confusing and opaque
>         to new users) Would these other interfaces be in addition to using the IP
>         from the Server table? Or replace it?
>
>         Do we have config options for all of these? Only some of them? In the
>         config file, or Traffic Ops fields?
> ZH> Please take a look at 3.1.1 of the design doc. Basically, we will add
> new APIs, or new fields to existing APIs, so this feature implementation
> will not impact existing functionality.
>
>         I'd like to hear the use case too, and e.g. why it isn't better to simply
>         make each different interface a different server in Traffic Ops?
> ZH> We discussed this solution too, but the main issue is that running the ort
> script for one server would overwrite the ATS configuration for another
> server. The use case is that our customer wants different clients to be served
> by different IPs. For example, a mobile client would be served by a different
> IP than a PC client.
>         How is the Traffic Router routing to them, anyway? Are you setting up the
>         same DNS record to point to the IPs of all interfaces?
> ZH> For each edge, each DS will be assigned to a single IP. If no
> secondary IP is specified, it will behave just as it does today. Please
> take a look at 3.1.3 of the design doc.
>         How is that configured in Traffic Ops then? I.e. which interfaces are
>         configured as the Server IP and IP6? Are we certain there aren't other
>         issues in other Traffic Control components, with a Server IP and IP6 not
>         having a one-to-one relationship with the FQDN A/AAAA record?
> ZH> Please check 3.1.1 of the design doc. There will be new pages for
> secondary IP configuration; the current functionality should not be
> impacted.
>
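The assignment rule in Zhilin's reply above (each delivery service maps to a single IP per edge, falling back to today's behavior when no secondary IP is specified) could be sketched in Go as follows. The `EdgeCache` type, its fields, and the delivery service names are hypothetical and are not taken from the design doc or from Traffic Router.

```go
package main

import "fmt"

// EdgeCache holds the primary address plus optional per-delivery-service
// secondary addresses. Type and field names are assumptions for illustration.
type EdgeCache struct {
	PrimaryIP    string
	SecondaryIPs map[string]string // delivery service ID -> assigned secondary IP
}

// AddressFor returns the IP a delivery service resolves to on this cache:
// the assigned secondary IP if there is one, otherwise the primary IP.
func (c EdgeCache) AddressFor(ds string) string {
	if ip, ok := c.SecondaryIPs[ds]; ok && ip != "" {
		return ip
	}
	return c.PrimaryIP
}

func main() {
	edge := EdgeCache{
		PrimaryIP:    "192.0.2.10",
		SecondaryIPs: map[string]string{"ds-mobile": "192.0.2.11"},
	}
	fmt.Println(edge.AddressFor("ds-mobile")) // 192.0.2.11
	fmt.Println(edge.AddressFor("ds-pc"))     // 192.0.2.10
}
```

A cache with no secondary IPs configured always returns its primary IP, which matches the compatibility goal stated in the design summary.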
>         Do we need to take the bigger step, of having a Traffic Ops Server have an
>         array of IPs? That's a lot more work (especially making sure it works
>         everywhere, e.g. Traffic Router), but it solves a lot of questions and
>         hackery, gives us a lot more flexibility, and matches the physical reality
>         better.
> ZH> In this design we are trying to avoid impact on current functionality
> and to stay compatible with earlier versions, so we add extra tables or
> fields for secondary IPs.
>
>         I'm not opposed to the idea, but we need to think through the architecture,
>         we need to be sure the added complexity is worth it over existing
>         solutions, we need to make all the options (e.g. Unavailable if any vs all)
>         configurable, and we need to make sure the common simple case of a single
>         Server IP and IP6 still works without additional configuration complexity.
> ZH> Yes, agreed. We are trying not to impact the existing solution. Please
> take a look at the design doc for more details.
>
>
>
>         On Wed, Mar 28, 2018 at 10:19 AM, Nir Sopher <ni...@qwilt.com> wrote:
>
>         > Hi Eric/Neil,
>         > Isn't the question of supporting multiple interfaces per server a much
>         > wider question, architecture-wise?
>         > What would be the desired behavior if the monitoring shows that only one of
>         > the interfaces is down? Will the router send traffic to the healthy
>         > interfaces? How?
>         > Nir
>         >
>         > On Wed, Mar 28, 2018, 19:10 Eric Friedrich (efriedri) <efriedri@cisco.com>
>         > wrote:
>         >
>         > > The use case behind this question probably deserves a longer dev@ email.
>         > >
>         > > I will oversimplify: we are extending TC to support multiple IPv4 (or
>         > > multiple IPv6) addresses per edge cache (across 1 or more NICs).
>         > >
>         > > Assume all addresses are reachable from the TM.
>         > >
>         > > —Eric
>         > >
>         > >
>         > > > On Mar 28, 2018, at 11:37 AM, Robert Butts <robert.o.butts@gmail.com>
>         > > > wrote:
>         > > >
>         > > > When you say different interfaces, do you mean IPv4 versus IPv6? Or
>         > > > something else?
>         > > >
>         > > > If you mean IPv4 vs IPv6, we have a PR for that from Dylan Volz
>         > > > https://github.com/apache/incubator-trafficcontrol/pull/1627
>         > > >
>         > > > I'm hoping to get to it early next week, just haven't found the time to
>         > > > review and test it yet.
>         > > >
>         > > > Or did you mean something else by "interface"? Linux network interfaces?
>         > > > Ports?
>         > > >
>         > > >
>         > > > On Wed, Mar 28, 2018 at 12:02 AM, Neil Hao (nbaoping) <nbaoping@cisco.com>
>         > > > wrote:
>         > > >
>         > > >> Hi,
>         > > >>
>         > > >> Currently, we poll exactly one URL per cache server, for one
>         > > >> interface. We’d now like to add support for multiple interfaces, so
>         > > >> we need multiple requests to query each interface of the cache server.
>         > > >> I checked the Traffic Monitor code, and it seems we don’t support this
>         > > >> kind of polling, right?
>         > > >>
>         > > >> I can see two ways to support this:
>         > > >> 1) The first way: change the ‘Urls’ field in the HttpPollerConfig from
>         > > >> ‘map[string]PollConfig’ to ‘map[string][]PollConfig’, so that we can have
>         > > >> multiple polling configs to query the info for multiple interfaces.
>         > > >>
>         > > >> 2) The second way: change the ‘URL’ field in the PollConfig from ‘string’
>         > > >> to ‘[]string’.
>         > > >>
>         > > >> Either way, it seems this will bring a fairly big change to the
>         > > >> current polling model. I’m not sure if I’m heading in the right
>         > > >> direction; do you have any suggestions?
>         > > >>
>         > > >> Thanks,
>         > > >> Neil
>         > > >>
>         > >
>         > >
>         >
>
>
>
>
>
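The two options Neil lists in the original message quoted above can be compared side by side in Go. This is a minimal sketch, not the Traffic Monitor source: only the names `HttpPollerConfig`, `PollConfig`, `Urls`, and `URL` and their quoted types come from the thread. The `Opt1`/`Opt2` type names, the helper functions, the assumption that the map key identifies a cache, and the URLs are all made up for illustration, and the real structs may carry more fields.

```go
package main

import "fmt"

// Current shape as described in the thread: one PollConfig, one URL, per key.
type PollConfig struct {
	URL string
}

type HttpPollerConfig struct {
	Urls map[string]PollConfig // key assumed to be a cache identifier
}

// Option 1: the map value becomes a slice, one PollConfig per interface.
type HttpPollerConfigOpt1 struct {
	Urls map[string][]PollConfig
}

// Option 2: PollConfig itself carries several URLs.
type PollConfigOpt2 struct {
	URL []string
}

type HttpPollerConfigOpt2 struct {
	Urls map[string]PollConfigOpt2
}

// urlsOpt1 lists every URL to poll for one cache under option 1.
func urlsOpt1(c HttpPollerConfigOpt1, cache string) []string {
	var out []string
	for _, pc := range c.Urls[cache] {
		out = append(out, pc.URL)
	}
	return out
}

// urlsOpt2 lists every URL to poll for one cache under option 2.
func urlsOpt2(c HttpPollerConfigOpt2, cache string) []string {
	return c.Urls[cache].URL
}

func main() {
	// Placeholder stats URLs on documentation addresses.
	a, b := "http://192.0.2.10/stats", "http://192.0.2.11/stats"

	opt1 := HttpPollerConfigOpt1{Urls: map[string][]PollConfig{"edge-01": {{URL: a}, {URL: b}}}}
	opt2 := HttpPollerConfigOpt2{Urls: map[string]PollConfigOpt2{"edge-01": {URL: []string{a, b}}}}

	fmt.Println(urlsOpt1(opt1, "edge-01"))
	fmt.Println(urlsOpt2(opt2, "edge-01"))
}
```

Both shapes yield the same list of poll targets. The difference is where any other per-poll settings would live: option 1 lets each interface carry its own `PollConfig`, while option 2 shares one `PollConfig` across all URLs of a cache.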

Re: Question about the poll model of the Traffic Monitor

Posted by "Zhilin Huang (zhilhuan)" <zh...@cisco.com>.
Thanks, Eric.

It works. I added the design doc which links to the google doc.

Thanks,
Zhilin


On 03/04/2018, 8:19 PM, "Eric Friedrich (efriedri)" <ef...@cisco.com> wrote:

    Zhilin-
      I added you to the Wiki permissions. Please try again
    
    —Eric
    
    > On Apr 3, 2018, at 2:00 AM, Zhilin Huang (zhilhuan) <zh...@cisco.com> wrote:
    > 
    > Hi Dave,
    > 
    > I could not find the edit button on this page. Looks like I do not have the authority to add the doc.
    > 
    > Thanks,
    > Zhilin


Re: Question about the poll model of the Traffic Monitor

Posted by "Eric Friedrich (efriedri)" <ef...@cisco.com>.
Zhilin-
  I added you to the Wiki permissions. Please try again

—Eric

> On Apr 3, 2018, at 2:00 AM, Zhilin Huang (zhilhuan) <zh...@cisco.com> wrote:
> 
> Hi Dave,
> 
> I could not find the edit button on this page. Looks like I do not have the authority to add the doc.
> 
> Thanks,
> Zhilin
> 
> 
> On 03/04/2018, 2:43 AM, "David Neuman" <da...@gmail.com> wrote:
> 
>    Hi Zhilin,
>    Is it possible to get this design doc added to our wiki?  I create a design
>    docs page here (https://cwiki.apache.org/confluence/display/TC/Design+Docs).
>    I think it would be good to get the document there so it doesn't get lost
>    over time.
> 
>    Thanks!
>    Dave
> 
>    On Wed, Mar 28, 2018 at 10:41 PM, Zhilin Huang (zhilhuan) <
>    zhilhuan@cisco.com> wrote:
> 
>> Hi Guys,
>> 
>> Thanks a lot for the discussion. I should put the design earlier for
>> review, and sorry for the delay. Here is the link for the design doc:
>> https://docs.google.com/document/d/1vgq-pGNoLLYf7Y3cu5hWu67TUKpN5hucrp
>> -ZS9nSsd4/edit?usp=sharing
>> 
>> Short summary for the feature design:
>> ---
>> There is feature request from market to add secondary IPs support on edge
>> cache servers, and the functionality to assign a delivery service to a
>> secondary IP of an edge cache.
>> 
>> This feature requires Traffic Ops implementation to support secondary IP
>> configuration for edge cache, and delivery service assignment to secondary
>> IP.
>> 
>> Traffic Monitor should also monitor connectivity of secondary IPs
>> configured. And Traffic Router needs support to resolve streamer FQDN to
>> secondary IP assigned in a delivery service.
>> 
>> Traffic Server should record the IP serving client request. And should
>> reject request to an unassigned IP for a delivery service.
>> 
>> This design has taken compatibility into consideration: if no secondary IP
>> configured, or some parts of the system has not been upgraded to the
>> version supports this feature, the traffic will be served by primary IPs as
>> before.
>> ---
>> 
>> Replies for Robert's comments is embedded in the email thread. Much
>> appreciated and welcome to any further comments.
>> 
>> Thanks,
>> Zhilin
>> 
>> 
>> 
>> 
>> On 29/03/2018, 10:19 AM, "Neil Hao (nbaoping)" <nb...@cisco.com>
>> wrote:
>> 
>>    Hi Robert/Nir,
>> 
>>    Thanks very much for the quick and detail reply, and sorry for that I
>> didn’t make the whole feature clearly. Actually, it’s our Secondary IP
>> feature, which is a big feature that will bring change to all the
>> components in the Traffic Control. I thought our teammate reviewed the
>> design with you guys before, but it seems not. And after discussion, we
>> will start the whole feature design review with you guys soon, I think it
>> will be better to continue the discussion after that.
>> 
>>    Thanks,
>>    Neil
>> 
>>    On 3/29/18, 1:16 AM, "Robert Butts" <ro...@gmail.com> wrote:
>> 
>>        I agree with Nir, it's not as simple as changing a structure to
>> `[]URL`,
>>        it's a bigger architectural design question.
>> 
>>        How do you plan to mark caches Unavailable if they're unhealthy on
>> one
>>        interface, but healthy on another?
>> 
>>        Right now, Traffic Router needs a boolean for each cache, it
>> doesn't know
>>        anything about multiple network interfaces, IPv4 vs IPv6, etc. It
>> only
>>        knows the FQDN, which is all the clients it's giving DNS records
>> to will
>>        know when they request the cache.
>> 
>>        Questions:
>>        Is a cache marked Unavailable when any interface is unreachable?
>> Or all of
>>        them?
>> ZH> Actually, we will care about an IP availability instead of interface
>> availability. Please take a look at 3.1.2 of the design doc.
>> 
>>        What if an interface is reachable, but one interface reports
>> different
>>        stats than another interface? For example, what if someone
>> configures a
>>        different caching proxy (ATS) on each interface?
>> ZH> Will only use 1 ATS to serve traffic from all IPs configured.
>> 
>>        How are stats aggregated? Should the monitor aggregate all stats
>> from
>>        different polls and interfaces together, and consider them the same
>>        "server"? If not, how do we reconcile the different stats with
>> what the
>>        Monitor reports on `CrStates` and `CacheStats`? If so, again, what
>> happens
>>        if different interfaces have different ATS instances, so e.g. the
>> byte
>>        count on one is 100, and the other is 1000, then 101, then 1001.
>> It simply
>>        won't work. Do we handle that? Or just ignore it, and document "all
>>        interfaces must report the same stats"? Do we try to detect that
>> and give a
>>        useful error or warning?
>> ZH> The bandwidth for interfaces will be aggregated. We will only have 1
>> ATS to serve traffic from all interfaces. The connectivity check is IP
>> based. And the stats collection will be interface based. Please take a look
>> at 3.1.2 of the design doc for details.
>> 
>>        In Traffic Ops, servers have specific data used for polling.
>> Traffic
>>        Monitor gets the stats URI path from Parameters, and the URI IP
>> from the
>>        Servers table. It doesn't use the FQDN, Server Host or Server
>> Domain. Where
>>        would these other interfaces come from? Parameters? Or another
>> table linked
>>        to the servers table? (I'd really, really rather we didn't put
>> more data in
>>        unsafe Parameters, which can not exist, not be properly formatted,
>> need
>>        safety checks in all code that ever uses them, and are confusing
>> and opaque
>>        to new users) Would these other interfaces be in addition to using
>> the IP
>>        from the Server table? Or replace it?
>> 
>>        Do we have config options for all of these? Only some of them? In
>> the
>>        config file, or Traffic Ops fields?
>> ZH> Please take a look at 3.1.1 of the design doc. Basically, we will add
>> new APIs, or new fields to existing APIs. So this feature implementation
>> will not impact existing functionality.
>> 
>>        I'd like to hear the use case too, and e.g. why it isn't better to
>> simply
>>        make each different interface a different server in Traffic Ops?
>> How is the
>> ZH> We discussed this solution too. But the main issue is running ort
>> script for one server will overwrite the ATS configuration for anther
>> server. The use case is our customer want different client to be served by
>> different IP. For example a mobile client will be served by different IP of
>> a PC client.
>>        Traffic Router routing to them, anyway? Are you setting up the
>> same DNS
>>        record to point to the IPs of all interfaces? How is that
>> configured in
>> ZH> For each edge, each DS will be assigned to a single IP. If no
>> secondary IP specified, it will work just as the behavior today. Please
>> take a look at 3.1.3 of the design doc.
>>        Traffic Ops then? I.e. which interfaces are configured as the
>> Server IP and
>>        IP6? Are we certain there aren't other issues in other Traffic
>> Control
>>        components, with a Server IP and IP6 not having a one-to-one
>> relationship
>>        with the FQDN A/AAAA record?
>> ZH> Please check 3.1.1 of the design doc. There will be new pages for
>> secondary IPs configuration, the current functionality should not be
>> impacted.
>> 
>>        Do we need to take the bigger step, of having a Traffic Ops Server have
>>        an array of IPs? That's a lot more work (especially making sure it works
>>        everywhere, e.g. Traffic Router), but it solves a lot of questions and
>>        hackery, gives us a lot more flexibility, and matches the physical
>>        reality better.
>> ZH> When making this design, we tried to avoid impacting current
>> functionality and to stay compatible with earlier versions. So we add extra
>> tables or fields for secondary IPs.
>> 
>>        I'm not opposed to the idea, but we need to think through the
>>        architecture, we need to be sure the added complexity is worth it over
>>        existing solutions, we need to make all the options (e.g. Unavailable
>>        if any vs all) configurable, and we need to make sure the common simple
>>        case of a single Server IP and IP6 still works without additional
>>        configuration complexity.
>> ZH> Yes, agree with you. We are trying not to impact the existing
>> solution. Please take a look at the design doc for more details.
>> 
>> 
>> 
>>        On Wed, Mar 28, 2018 at 10:19 AM, Nir Sopher <ni...@qwilt.com>
>> wrote:
>> 
>>> Hi Eric/Neil,
>>> Isn't the question of supporting multi interfaces per server a much wider
>>> question? Architectural wise.
>>> What would be the desired behavior if the monitoring shows that only one of
>>> the interfaces is down? Will the router send traffic to the healthy
>>> interfaces? How?
>>> Nir
>>> 
>>> On Wed, Mar 28, 2018, 19:10 Eric Friedrich (efriedri) <
>> efriedri@cisco.com>
>>> wrote:
>>> 
>>>> The use case behind this question probably deserves a longer dev@ email.
>>>> 
>>>> I will oversimplify: we are extending TC to support multiple IPv4 (or
>>>> multiple IPv6) addresses per edge cache (across 1 or more NICs).
>>>> 
>>>> Assume all addresses are reachable from the TM.
>>>> 
>>>> —Eric
>>>> 
>>>> 
>>>>> On Mar 28, 2018, at 11:37 AM, Robert Butts <
>> robert.o.butts@gmail.com>
>>>> wrote:
>>>>> 
>>>>> When you say different interfaces, do you mean IPv4 versus IPv6? Or
>>>>> something else?
>>>>> 
>>>>> If you mean IPv4 vs IPv6, we have a PR for that from Dylan Volz
>>>>> https://github.com/apache/incubator-trafficcontrol/pull/1627
>>>>> 
>>>>> I'm hoping to get to it early next week, just haven't found the time to
>>>>> review and test it yet.
>>>>> 
>>>>> Or did you mean something else by "interface"? Linux network interfaces?
>>>>> Ports?
>>>>> 
>>>>> 
>>>>> On Wed, Mar 28, 2018 at 12:02 AM, Neil Hao (nbaoping) <
>>>> nbaoping@cisco.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> Currently, we poll exactly one URL per cache server, for one interface.
>>>>>> We’d now like to add support for multiple interfaces, so we need
>>>>>> multiple requests to query each interface of the cache server. I checked
>>>>>> the Traffic Monitor code, and it seems we don’t support this kind of
>>>>>> polling, right?
>>>>>> 
>>>>>> I can think of two ways to support this:
>>>>>> 1) Change the ‘Urls’ field in the HttpPollerConfig from
>>>>>> ‘map[string]PollConfig’ to ‘map[string][]PollConfig’, so that we can have
>>>>>> multiple polling configs to query the multiple interfaces’ info.
>>>>>> 
>>>>>> 2) Change the ‘URL’ field in the PollConfig from ‘string’ to ‘[]string’.
>>>>>> 
>>>>>> Either way, it seems to be a fairly big change to the current polling
>>>>>> model. I’m not sure if I’m heading in the right direction. Do you have
>>>>>> any suggestions?
>>>>>> 
>>>>>> Thanks,
>>>>>> Neil
>>>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 
>> 
>> 
>> 
> 
> 

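For readers following the thread, the first option Neil describes above
(changing ‘Urls’ from ‘map[string]PollConfig’ to ‘map[string][]PollConfig’)
might look roughly like the sketch below. Only the names HttpPollerConfig,
PollConfig, Urls and URL come from the thread; the struct contents are
reduced to those fields, and the Target type, the Targets helper, the cache
IDs and the example URLs are all hypothetical. It is an illustration of the
shape of the change, not the actual Traffic Monitor code.

```go
package main

import (
	"fmt"
	"sort"
)

// PollConfig is a reduced stand-in for the type named in the thread.
// Only the URL field is taken from the thread; other fields are omitted.
type PollConfig struct {
	URL string
}

// HttpPollerConfig shows option 1: each cache ID maps to a slice of
// poll configs (one per address) instead of a single PollConfig.
type HttpPollerConfig struct {
	Urls map[string][]PollConfig
}

// Target is a hypothetical (cache, URL) pair that a poller would request.
type Target struct {
	CacheID string
	URL     string
}

// Targets flattens the config into one entry per URL, ordered by cache ID
// so the result is deterministic despite Go's random map iteration order.
func Targets(cfg HttpPollerConfig) []Target {
	ids := make([]string, 0, len(cfg.Urls))
	for id := range cfg.Urls {
		ids = append(ids, id)
	}
	sort.Strings(ids)

	var out []Target
	for _, id := range ids {
		for _, pc := range cfg.Urls[id] {
			out = append(out, Target{CacheID: id, URL: pc.URL})
		}
	}
	return out
}

func main() {
	// Hypothetical cache IDs and URLs (documentation address range).
	cfg := HttpPollerConfig{Urls: map[string][]PollConfig{
		"edge-01": {
			{URL: "http://192.0.2.10/stats"},
			{URL: "http://192.0.2.11/stats"},
		},
		"edge-02": {{URL: "http://192.0.2.20/stats"}},
	}}
	for _, t := range Targets(cfg) {
		fmt.Println(t.CacheID, t.URL)
	}
}
```

The second option (‘URL’ becoming ‘[]string’) would carry the same
information but share every other PollConfig setting across a cache's
addresses. As Robert and Nir point out in the thread, neither data-structure
change answers how results from several addresses combine into one cache
state.
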

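Robert's "Unavailable if any vs all" question in the quoted thread can be
made concrete with a small sketch. It assumes Traffic Router still needs a
single boolean per cache, as he describes. The Policy type, its constants
and the CacheAvailable function are hypothetical names for illustration
only, and the sketch does not claim to reflect what the design doc
specifies.

```go
package main

import "fmt"

// Policy is a hypothetical setting for collapsing per-address
// reachability into the single per-cache boolean.
type Policy int

const (
	// UnavailableIfAny marks the cache unavailable if any address is down.
	UnavailableIfAny Policy = iota
	// UnavailableIfAll marks the cache unavailable only if every address is down.
	UnavailableIfAll
)

// CacheAvailable reduces a map of address -> reachable to one boolean.
// A cache with no polled addresses is treated as unavailable.
func CacheAvailable(reachable map[string]bool, p Policy) bool {
	if len(reachable) == 0 {
		return false
	}
	up := 0
	for _, ok := range reachable {
		if ok {
			up++
		}
	}
	if p == UnavailableIfAny {
		return up == len(reachable)
	}
	return up > 0
}

func main() {
	// Hypothetical addresses: one reachable, one not.
	r := map[string]bool{"192.0.2.10": true, "192.0.2.11": false}
	fmt.Println("any-policy available:", CacheAvailable(r, UnavailableIfAny))
	fmt.Println("all-policy available:", CacheAvailable(r, UnavailableIfAll))
}
```

With one address up and one down, the two policies disagree, which is why
the thread asks for the choice to be configurable.
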
Re: Question about the poll model of the Traffic Monitor

Posted by "Zhilin Huang (zhilhuan)" <zh...@cisco.com>.
Hi Dave,

I could not find the edit button on this page. It looks like I do not have permission to add the doc.

Thanks,
Zhilin

