Posted to issues@trafficcontrol.apache.org by GitBox <gi...@apache.org> on 2021/08/10 22:04:59 UTC

[GitHub] [trafficcontrol] jrushford opened a new pull request #6097: Add the Traffic Monitor health client.

jrushford opened a new pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097


   Adds to the trafficcontrol-cache-config package a client that runs on each Traffic Server cache and monitors the health status
   of its parent caches as reported by Traffic Monitor.  The client uses the Traffic Server traffic_ctl tool to mark
   parents UP or DOWN based on the health status reported by Traffic Monitor.
   
   ## Which Traffic Control components are affected by this PR?
   
   - Traffic Control Cache Config (T3C, formerly ORT)
   - Traffic Monitor
   
   ## What is the best way to verify this PR?
   Run the unit tests.
   
   ## PR submission checklist
   - [x] This PR has tests 
   - [x] This PR has documentation 
   - [ ] This PR has a CHANGELOG.md entry
   - [x] This PR **DOES NOT FIX A SERIOUS SECURITY VULNERABILITY** (see [the Apache Software Foundation's security guidelines](https://apache.org/security) for details)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficcontrol.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rob05c commented on a change in pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097#discussion_r688664329



##########
File path: cache-config/tm-health-client/tmutil/tmutil.go
##########
@@ -0,0 +1,674 @@
+package tmutil
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import (
+	"bufio"
+	"bytes"
+	"errors"
+	"io/ioutil"
+	"math/rand"
+	"net"
+	"os"
+	"os/exec"
+	"strings"
+	"time"
+
+	"github.com/apache/trafficcontrol/cache-config/tm-health-client/config"
+	"github.com/apache/trafficcontrol/lib/go-log"
+	"github.com/apache/trafficcontrol/lib/go-tc"
+	"github.com/apache/trafficcontrol/traffic_monitor/datareq"
+	"github.com/apache/trafficcontrol/traffic_monitor/tmclient"
+	"gopkg.in/yaml.v2"
+)
+
+const serverRequest = "https://tp.cdn.comcast.net/api/3.0/servers?type=RASCAL"
+
+type ParentAvailable interface {
+	available() bool
+}
+
+// the necessary fields of a trafficserver parents config file needed to
+// read the file and keep track of it's modification time.
+type ParentConfigFile struct {
+	Filename       string
+	LastModifyTime int64
+}
+
+// the necessary data required to keep track of trafficserver config
+// files, lists of parents a trafficserver instance uses, and directory
+// locations used for configuration and trafficserver executables.
+type ParentInfo struct {
+	ParentDotConfig        ParentConfigFile
+	StrategiesDotYaml      ParentConfigFile
+	TrafficServerBinDir    string
+	TrafficServerConfigDir string
+	Parents                map[string]ParentStatus
+	Cfg                    config.Cfg
+}
+
+// when reading the strategies.yaml, these fields are used to help
+// parse out fail_over objects
+type FailOver struct {
+	MaxSimpleRetries      int      `yaml:"max_simple_retries,omitempty"`
+	MaxUnavailableRetries int      `yaml:"max_unavailable_retries,omitempty"`
+	RingMode              string   `yaml:"ring_mode,omitempty"`
+	ResponseCodes         []int    `yaml:"response_codes,omitempty"`
+	MarkDownCodes         []int    `yaml:"markdown_codes,omitempty"`
+	HealthCheck           []string `yaml:"health_check,omitempty"`
+}
+
+// the trafficserver 'HostStatus' fields that are necessary to interface
+// with the trafficserver 'traffic_ctl' command.
+type ParentStatus struct {
+	Fqdn         string
+	ActiveReason bool
+	LocalReason  bool
+	ManualReason bool
+}
+
+// used to get the overall parent availablity from the
+// HostStatus markdown reasons.  all markdown reasons
+// must be true for a parent to be considered available
+func (p ParentStatus) available() bool {
+	if !p.ActiveReason {
+		return false
+	} else if !p.LocalReason {
+		return false
+	} else if !p.ManualReason {
+		return false
+	}
+	return true
+}
+
+// used to log that a parent's status is either UP or
+// DOWN based upon the HostStatus reason codes.  to
+// be considered UP, all reason codes must be 'true'
+func (p ParentStatus) Status() string {
+	if !p.ActiveReason {
+		return "DOWN"
+	} else if !p.LocalReason {
+		return "DOWN"
+	} else if !p.ManualReason {
+		return "DOWN"
+	}
+	return "UP"
+}
+
+type StatusReason int
+
+// these are the HostStatus reason codes used withing
+// trafficserver.
+const (
+	ACTIVE StatusReason = iota
+	LOCAL
+	MANUAL
+)
+
+// used for logging a parent's HostStatus reason code
+// setting.
+func (s StatusReason) String() string {
+	switch s {
+	case ACTIVE:
+		return "ACTIVE"
+	case LOCAL:
+		return "LOCAL"
+	case MANUAL:
+		return "MANUAL"
+	}
+	return "UNDEFINED"
+}
+
+// the fields used from strategies.yaml that describe
+// a parent.
+type Host struct {
+	HostName  string     `yaml:"host"`
+	Protocols []Protocol `yaml:"protocol"`
+}
+
+// the protocol object in strategies.yaml that help to
+// describe a parent.
+type Protocol struct {
+	Scheme           string  `yaml:"scheme"`
+	Port             int     `yaml:"port"`
+	Health_check_url string  `yaml:"health_check_url,omitempty"`
+	Weight           float64 `yaml:"weight,omitempty"`
+}
+
+// a trafficserver strategy object from strategies.yaml
+type Strategy struct {
+	Strategy        string   `yaml:"strategy"`
+	Policy          string   `yaml:"policy"`
+	HashKey         string   `yaml:"hash_key,omitempty"`
+	GoDirect        bool     `yaml:"go_direct,omitempty"`
+	ParentIsProxy   bool     `yaml:"parent_is_proxy,omitempty"`
+	CachePeerResult bool     `yaml:"cache_peer_result,omitempty"`
+	Scheme          string   `yaml:"scheme"`
+	FailOvers       FailOver `yaml:"failover,omitempty"`
+}
+
+// the top level array defintions in a trafficserver strategies.yaml
+// configuration file.
+type Strategies struct {
+	Strategy []Strategy    `yaml:"strategies"`
+	Hosts    []Host        `yaml:"hosts"`
+	Groups   []interface{} `yaml:"groups"`
+}
+
+// used at startup to load a trafficservers list of parents from
+// it's 'parent.config', 'strategies.yaml' and current parent
+// status from trafficservers HostStatus subsystem.
+func NewParentInfo(cfg config.Cfg) (*ParentInfo, error) {
+
+	parentConfig := cfg.TrafficServerConfigDir + "/parent.config"
+	modTime, err := getFileModificationTime(parentConfig)
+	if err != nil {
+		return nil, errors.New("error reading parent.config: " + err.Error())
+	}
+	parents := ParentConfigFile{
+		Filename:       parentConfig,
+		LastModifyTime: modTime,
+	}
+
+	stratyaml := cfg.TrafficServerConfigDir + "/strategies.yaml"
+	modTime, err = getFileModificationTime(stratyaml)
+	if err != nil {
+		return nil, errors.New("error reading strategies.yaml: " + err.Error())
+	}
+
+	strategies := ParentConfigFile{
+		Filename:       cfg.TrafficServerConfigDir + "/strategies.yaml",
+		LastModifyTime: modTime,
+	}
+
+	parentInfo := ParentInfo{
+		ParentDotConfig:        parents,
+		StrategiesDotYaml:      strategies,
+		TrafficServerBinDir:    cfg.TrafficServerBinDir,
+		TrafficServerConfigDir: cfg.TrafficServerConfigDir,
+	}
+
+	// initialize the trafficserver parents map
+	parentStatus := make(map[string]ParentStatus)
+
+	// read parent.config
+	if err := parentInfo.readParentConfig(parentStatus); err != nil {
+		return nil, errors.New("loading parent.config file: " + err.Error())
+	}
+
+	// read strategies.yaml
+	if err := parentInfo.readStrategies(parentStatus); err != nil {
+		return nil, errors.New("loading parent strategies.yaml file: " + err.Error())
+	}
+
+	// collect the trafficserver parent status from the HostStatus subsystem
+	if err := parentInfo.readHostStatus(parentStatus); err != nil {
+		return nil, errors.New("reading trafficserver host status: " + err.Error())
+	}
+
+	log.Infof("startup loaded %d parent records\n", len(parentStatus))
+
+	parentInfo.Parents = parentStatus
+	parentInfo.Cfg = cfg
+
+	return &parentInfo, nil
+}
+
+// Queries a traffic monitor that is monitoring the trafficserver instance running on a host to
+// obtain the availability, health, of a parent used by trafficserver.
+func (c *ParentInfo) GetCacheStatuses() (map[tc.CacheName]datareq.CacheStatus, error) {
+
+	tmHostName, err := c.findATrafficMonitor()
+	if err != nil {
+		return nil, errors.New("finding a trafficmonitor: " + err.Error())
+	}
+	tmc := tmclient.New("http://"+tmHostName, config.GetRequestTimeout())
+
+	return tmc.CacheStatuses()
+}
+
+// The main polling function that keeps the parents list current if
+// with any changes to the trafficserver parent.config or strategies.yaml.
+// Also, it keeps parent status current with the the trafficserver HostStatus
+// subsystem.  Finally, on each poll cycle a trafficmonitor is queried to check
+// that all parents used by this trafficserver are available for use based upon
+// the trafficmonitors idea from it's health protocol.  Parents are marked up or
+// down in the trafficserver subsystem based upon that hosts current status and
+// the status that trafficmonitor health protocol has determined for a parent
+func (c *ParentInfo) PollAndUpdateCacheStatus() {
+	pollingInterval := config.GetTMPollingInterval()
+	log.Infoln("polling started")
+
+	for {
+		if err := c.UpdateParentInfo(); err != nil {
+			log.Errorf("could not load new ATS parent info: %s\n", err.Error())
+		} else {
+			log.Debugf("updated parent info, total number of parents: %d\n", len(c.Parents))
+		}
+
+		// read traffic manager cache statuses
+		caches, err := c.GetCacheStatuses()
+		if err != nil {
+			log.Errorln(err.Error())
+		}
+
+		for k, v := range caches {
+			hostName := string(k)
+			cs, ok := c.Parents[hostName]
+			if ok {
+				tmAvailable := *v.CombinedAvailable
+				if cs.available() != tmAvailable {
+					if !c.Cfg.EnableActiveMarkdowns {
+						if !tmAvailable {
+							log.Infof("TM reports that %s is not available and should be marked DOWN but, mark downs are disabled by configuration", hostName)
+						} else {
+							log.Infof("TM reports that %s is available and should be marked UP but, mark up is disabled by configuration", hostName)
+						}
+					} else {
+						if err = c.markParent(cs.Fqdn, *v.Status, tmAvailable); err != nil {
+							log.Errorln(err.Error())
+						}
+					}
+				}
+			}
+		}
+
+		time.Sleep(pollingInterval)
+	}
+}
+
+// Used by the polling function to update the parents list from
+// changes to parent.config and strategies.yaml.  The parents
+// availability is also updated to reflect the current state from
+// the trafficserver HostStatus subsystem.
+func (c *ParentInfo) UpdateParentInfo() error {
+	ptime, err := getFileModificationTime(c.ParentDotConfig.Filename)
+	if err != nil {
+		return errors.New("error reading parent.config: " + err.Error())
+	}
+	stime, err := getFileModificationTime(c.StrategiesDotYaml.Filename)
+	if err != nil {
+		return errors.New("error reading strategies.yaml: " + err.Error())
+	}
+	if c.ParentDotConfig.LastModifyTime < ptime {
+		// read parent.config
+		if err := c.readParentConfig(c.Parents); err != nil {
+			return errors.New("updating parent.config file: " + err.Error())
+		} else {
+			log.Infof("updated parents from new parent.config, total parents: %d\n", len(c.Parents))
+		}
+	}
+
+	if c.StrategiesDotYaml.LastModifyTime < stime {
+		// read strategies.yaml
+		if err := c.readStrategies(c.Parents); err != nil {
+			return errors.New("updating parent strategies.yaml file: " + err.Error())
+		} else {
+			log.Infof("updated parents from new strategies.yaml, total parents: %d\n", len(c.Parents))
+		}
+	}
+
+	// collect the trafficserver current host status
+	if err := c.readHostStatus(c.Parents); err != nil {
+		return errors.New("reading latest trafficserver host status: " + err.Error())
+	}
+
+	return nil
+}
+
+// choose an availabe trafficmonitor, returns an error if

Review comment:
       Nitpick: `availabe` -> `available`







[GitHub] [trafficcontrol] rob05c commented on a change in pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097#discussion_r688667882



##########
File path: cache-config/tm-health-client/tmutil/tmutil.go
##########
@@ -0,0 +1,674 @@
+// choose an availabe trafficmonitor, returns an error if
+// there are none
+func (c *ParentInfo) findATrafficMonitor() (string, error) {
+	var tmHostname string
+	lth := len(c.Cfg.TrafficMonitors)
+	if lth == 0 {
+		return "", errors.New("there are no available traffic monitors")
+	}
+
+	// choose one at random
+	rand.Seed(time.Now().UnixNano())
+	r := (rand.Intn(7919) % lth)
+
+	tms := make([]string, 0)
+	for k, v := range c.Cfg.TrafficMonitors {
+		if v == true {
+			log.Debugf("traffic monitor %s is available\n", k)
+			tms = append(tms, k)
+		}
+	}
+
+	lth = len(tms)
+	if lth > 0 {
+		if r < lth { // return the trafficmonitor at r, r could be the first

Review comment:
       Though it'll need to be moved inside `if lth > 0 { r := rand.Intn(lth)` because `rand.Intn(0)` panics







[GitHub] [trafficcontrol] jrushford commented on a change in pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
jrushford commented on a change in pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097#discussion_r688757109



##########
File path: cache-config/tm-health-client/README.md
##########
@@ -0,0 +1,167 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+-->
+<!--
+
+  !!!
+      This file is both a Github Readme and manpage!
+      Please make sure changes appear properly with man,
+      and follow man conventions, such as:
+      https://www.bell-labs.com/usr/dmr/www/manintro.html
+
+      A primary goal of t3c is to follow POSIX and LSB standards
+      and conventions, so it's easy to learn and use by people
+      who know Linux and other *nix systems. Providing a proper
+      manpage is a big part of that.
+  !!!
+
+-->
+# NAME
+
+tm-health-client - Traffic Monitor Health Client service
+
+# SYNOPSIS
+
+tm-health-client [-f config-file]  -h  [-l logging-directory]  -v 
+
+# DESCRIPTION
+
+The tm-health-client command is used to manage **Apache Traffic Server** parents on a
+host running **Apache Traffic Server**.  The command should be started by **systemd** 
+and run as a service. On startup, the command reads it's default configuration file

Review comment:
       fixed







[GitHub] [trafficcontrol] jrushford commented on a change in pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
jrushford commented on a change in pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097#discussion_r688757623



##########
File path: cache-config/tm-health-client/config/config.go
##########
@@ -0,0 +1,256 @@
+package config
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import (
+	"bufio"
+	"encoding/json"
+	"errors"
+	"io/ioutil"
+	"net/url"
+	"os"
+	"strings"
+	"time"
+
+	"github.com/apache/trafficcontrol/lib/go-log"
+	toclient "github.com/apache/trafficcontrol/traffic_ops/v3-client"
+	"github.com/pborman/getopt/v2"
+)
+
+var tmPollingInterval time.Duration
+var toRequestTimeout time.Duration
+
+const (
+	DefaultConfigFile             = "/etc/trafficcontrol-cache-config/tm-health-client.json"
+	DefaultLogDirectory           = "/var/log/trafficcontrol-cache-config"
+	DefaultLogFile                = "tm-health-client.log"
+	DefaultTrafficServerConfigDir = "/opt/trafficserver/etc/trafficserver"
+	DefaultTrafficServerBinDir    = "/opt/trafficserver/bin"
+)
+
+type Cfg struct {
+	CDNName                 string `json:"cdn-name"`
+	EnableActiveMarkdowns   bool   `json:"enable-active-markdowns"`
+	ReasonCode              string `json:"reason-code"`
+	TOCredentialFile        string `json:"to-credential-file"`
+	TORequestTimeOutSeconds string `json:"to-request-timeout-seconds"`
+	TOPass                  string
+	TOUrl                   string
+	TOUser                  string
+	TmPollIntervalSeconds   string          `json:"tm-poll-interval-seconds"`
+	TrafficServerConfigDir  string          `json:"trafficserver-config-dir"`
+	TrafficServerBinDir     string          `json:"trafficserver-bin-dir"`
+	TrafficMonitors         map[string]bool `json:"trafficmonitors,omitempty"`
+}
+
+type LogCfg struct {
+	LogLocationErr   string
+	LogLocationDebug string
+	LogLocationInfo  string
+	LogLocationWarn  string
+}
+
+func (lcfg LogCfg) ErrorLog() log.LogLocation   { return log.LogLocation(lcfg.LogLocationErr) }
+func (lcfg LogCfg) WarningLog() log.LogLocation { return log.LogLocation(lcfg.LogLocationWarn) }
+func (lcfg LogCfg) InfoLog() log.LogLocation    { return log.LogLocation(lcfg.LogLocationInfo) }
+func (lcfg LogCfg) DebugLog() log.LogLocation   { return log.LogLocation(lcfg.LogLocationDebug) }
+func (lcfg LogCfg) EventLog() log.LogLocation   { return log.LogLocation(log.LogLocationNull) } // not used
+
+func readCredentials(cfg *Cfg) error {
+	fn := cfg.TOCredentialFile
+	f, err := os.Open(fn)
+
+	if err != nil {
+		return errors.New("failed to open + " + fn + " :" + err.Error())
+	}
+	defer f.Close()
+
+	var to_pass_found = false
+	var to_url_found = false
+	var to_user_found = false
+
+	scanner := bufio.NewScanner(f)
+	for scanner.Scan() {
+		line := strings.TrimSpace(scanner.Text())
+		if strings.HasPrefix(line, "#") {
+			continue
+		}
+		fields := strings.Split(line, " ")
+		for _, v := range fields {
+			if strings.HasPrefix(v, "TO_") {
+				sf := strings.Split(v, "=")
+				if len(sf) == 2 {
+					if sf[0] == "TO_URL" {
+						// parse the url after trimming off any surrounding double quotes
+						cfg.TOUrl = strings.Trim(sf[1], "\"")
+						to_url_found = true
+					}
+					if sf[0] == "TO_USER" {
+						// set the TOUser after trimming off any surrounding quotes.
+						cfg.TOUser = strings.Trim(sf[1], "\"")
+						to_user_found = true
+					}
+					// set the TOPass after trimming off any surrounding quotes.
+					if sf[0] == "TO_PASS" {
+						cfg.TOPass = strings.Trim(sf[1], "\"")
+						to_pass_found = true
+					}
+				}
+			}
+		}
+	}
+	if !to_url_found || !to_user_found || !to_pass_found {
+		return errors.New("failed to retrieve one or more Traffic Ops credentials")
+	}
+
+	return nil
+}
+
+func GetConfig() (Cfg, error, bool) {
+	var err error
+	var configFile string
+	var logLocationErr = log.LogLocationStderr
+	var logLocationDebug = log.LogLocationNull
+	var logLocationInfo = log.LogLocationNull
+	var logLocationWarn = log.LogLocationNull
+
+	configFilePtr := getopt.StringLong("config-file", 'f', DefaultConfigFile, "full path to the json config file")
+	logdirPtr := getopt.StringLong("logging-dir", 'l', DefaultLogDirectory, "directory location for log files")
+	helpPtr := getopt.BoolLong("help", 'h', "Print usage information and exit")
+	verbosePtr := getopt.CounterLong("verbose", 'v', `Log verbosity. Logging is output to stderr. By default, errors are logged. To log warnings, pass '-v'. To log info, pass '-vv', debug pass '-vvv'`)
+
+	getopt.Parse()
+
+	if configFilePtr != nil {
+		configFile = *configFilePtr
+	} else {
+		configFile = DefaultConfigFile
+	}
+
+	var logfile string
+
+	logfile = *logdirPtr + "/" + DefaultLogFile

Review comment:
       Fixed, I always forget to use filepath.Join() :-)
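
For illustration, a minimal sketch of the `filepath.Join()` approach mentioned above. The helper name `logFilePath` is hypothetical and not part of the PR; only the directory and file name values are taken from the constants in the quoted diff.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// logFilePath is a hypothetical helper (not from the PR) that builds the
// log file path with filepath.Join instead of manual "/" concatenation.
// Join uses the OS-specific separator and cleans the result, so a
// trailing slash on dir does not produce a doubled separator.
func logFilePath(dir, name string) string {
	return filepath.Join(dir, name)
}

func main() {
	// Values mirror DefaultLogDirectory and DefaultLogFile from the diff.
	fmt.Println(logFilePath("/var/log/trafficcontrol-cache-config/", "tm-health-client.log"))
}
```

Compared with `dir + "/" + name`, this keeps the path well-formed whether or not the user passes a trailing slash to `--logging-dir`.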







[GitHub] [trafficcontrol] rob05c commented on a change in pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097#discussion_r688638345



##########
File path: cache-config/tm-health-client/config/config.go
##########
@@ -0,0 +1,256 @@
+package config
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import (
+	"bufio"
+	"encoding/json"
+	"errors"
+	"io/ioutil"
+	"net/url"
+	"os"
+	"strings"
+	"time"
+
+	"github.com/apache/trafficcontrol/lib/go-log"
+	toclient "github.com/apache/trafficcontrol/traffic_ops/v3-client"
+	"github.com/pborman/getopt/v2"
+)
+
+var tmPollingInterval time.Duration
+var toRequestTimeout time.Duration
+
+const (
+	DefaultConfigFile             = "/etc/trafficcontrol-cache-config/tm-health-client.json"
+	DefaultLogDirectory           = "/var/log/trafficcontrol-cache-config"
+	DefaultLogFile                = "tm-health-client.log"
+	DefaultTrafficServerConfigDir = "/opt/trafficserver/etc/trafficserver"
+	DefaultTrafficServerBinDir    = "/opt/trafficserver/bin"
+)
+
+type Cfg struct {
+	CDNName                 string `json:"cdn-name"`
+	EnableActiveMarkdowns   bool   `json:"enable-active-markdowns"`
+	ReasonCode              string `json:"reason-code"`
+	TOCredentialFile        string `json:"to-credential-file"`
+	TORequestTimeOutSeconds string `json:"to-request-timeout-seconds"`
+	TOPass                  string
+	TOUrl                   string
+	TOUser                  string
+	TmPollIntervalSeconds   string          `json:"tm-poll-interval-seconds"`
+	TrafficServerConfigDir  string          `json:"trafficserver-config-dir"`
+	TrafficServerBinDir     string          `json:"trafficserver-bin-dir"`
+	TrafficMonitors         map[string]bool `json:"trafficmonitors,omitempty"`
+}
+
+type LogCfg struct {
+	LogLocationErr   string
+	LogLocationDebug string
+	LogLocationInfo  string
+	LogLocationWarn  string
+}
+
+func (lcfg LogCfg) ErrorLog() log.LogLocation   { return log.LogLocation(lcfg.LogLocationErr) }
+func (lcfg LogCfg) WarningLog() log.LogLocation { return log.LogLocation(lcfg.LogLocationWarn) }
+func (lcfg LogCfg) InfoLog() log.LogLocation    { return log.LogLocation(lcfg.LogLocationInfo) }
+func (lcfg LogCfg) DebugLog() log.LogLocation   { return log.LogLocation(lcfg.LogLocationDebug) }
+func (lcfg LogCfg) EventLog() log.LogLocation   { return log.LogLocation(log.LogLocationNull) } // not used
+
+func readCredentials(cfg *Cfg) error {
+	fn := cfg.TOCredentialFile
+	f, err := os.Open(fn)
+
+	if err != nil {
+		return errors.New("failed to open " + fn + ": " + err.Error())
+	}
+	defer f.Close()
+
+	var to_pass_found = false
+	var to_url_found = false
+	var to_user_found = false
+
+	scanner := bufio.NewScanner(f)
+	for scanner.Scan() {
+		line := strings.TrimSpace(scanner.Text())
+		if strings.HasPrefix(line, "#") {
+			continue
+		}
+		fields := strings.Split(line, " ")
+		for _, v := range fields {
+			if strings.HasPrefix(v, "TO_") {
+				sf := strings.Split(v, "=")
+				if len(sf) == 2 {
+					if sf[0] == "TO_URL" {
+						// parse the url after trimming off any surrounding double quotes
+						cfg.TOUrl = strings.Trim(sf[1], "\"")
+						to_url_found = true
+					}
+					if sf[0] == "TO_USER" {
+						// set the TOUser after trimming off any surrounding quotes.
+						cfg.TOUser = strings.Trim(sf[1], "\"")
+						to_user_found = true
+					}
+					// set the TOPass after trimming off any surrounding quotes.
+					if sf[0] == "TO_PASS" {
+						cfg.TOPass = strings.Trim(sf[1], "\"")
+						to_pass_found = true
+					}
+				}
+			}
+		}
+	}
+	if !to_url_found || !to_user_found || !to_pass_found {
+		return errors.New("failed to retrieve one or more Traffic Ops credentials")
+	}
+
+	return nil
+}
+
+func GetConfig() (Cfg, error, bool) {
+	var err error
+	var configFile string
+	var logLocationErr = log.LogLocationStderr
+	var logLocationDebug = log.LogLocationNull
+	var logLocationInfo = log.LogLocationNull
+	var logLocationWarn = log.LogLocationNull
+
+	configFilePtr := getopt.StringLong("config-file", 'f', DefaultConfigFile, "full path to the json config file")
+	logdirPtr := getopt.StringLong("logging-dir", 'l', DefaultLogDirectory, "directory location for log files")
+	helpPtr := getopt.BoolLong("help", 'h', "Print usage information and exit")
+	verbosePtr := getopt.CounterLong("verbose", 'v', `Log verbosity. Logging is output to stderr. By default, errors are logged. To log warnings, pass '-v'. To log info, pass '-vv', debug pass '-vvv'`)
+
+	getopt.Parse()
+
+	if configFilePtr != nil {
+		configFile = *configFilePtr
+	} else {
+		configFile = DefaultConfigFile
+	}
+
+	var logfile string
+
+	logfile = *logdirPtr + "/" + DefaultLogFile
+
+	logLocationErr = logfile
+
+	if *verbosePtr == 1 {
+		logLocationWarn = logfile
+	} else if *verbosePtr == 2 {
+		logLocationInfo = logfile
+		logLocationWarn = logfile
+	} else if *verbosePtr == 3 {
+		logLocationInfo = logfile
+		logLocationWarn = logfile
+		logLocationDebug = logfile
+	}
+
+	if help := *helpPtr; help == true {
+		Usage()
+		return Cfg{}, nil, true
+	}
+
+	lcfg := LogCfg{
+		LogLocationDebug: logLocationDebug,
+		LogLocationErr:   logLocationErr,
+		LogLocationInfo:  logLocationInfo,
+		LogLocationWarn:  logLocationWarn,
+	}
+
+	if err := log.InitCfg(&lcfg); err != nil {
+		return Cfg{}, errors.New("Initializing loggers: " + err.Error() + "\n"), false
+	}
+
+	cfg := Cfg{
+		TrafficMonitors: make(map[string]bool, 0),
+	}
+
+	if err = LoadConfig(&cfg, configFile); err != nil {
+		return Cfg{}, errors.New(err.Error() + "\n"), false
+	}
+
+	if err = readCredentials(&cfg); err != nil {
+		return cfg, err, false
+	}
+
+	err = GetTrafficMonitorsStatus(&cfg)
+	if err != nil {
+		return cfg, err, false
+	}
+
+	return cfg, nil, false
+}
+
+func GetTrafficMonitorsStatus(cfg *Cfg) error {
+	u, err := url.Parse(cfg.TOUrl + "/api/3.0/servers?type=RASCAL&status=ONLINE")

Review comment:
       Am I misreading this, or is the path here never used? I was worried about hard-coding the `3.0`, but it looks like it isn't used, it's just to create a valid URL to parse?
   
   It looks like `url.Parse("?type=RASCAL&status=ONLINE")` will work: https://play.golang.org/p/gxKCH7wCZZ8
   
   Or you could also:
   ```
   qry := url.Values{}
   qry.Add("type", "RASCAL")
   qry.Add("status", "ONLINE")
   ```
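
As a self-contained sketch of the two options in this comment: the helper name `monitorQuery` is hypothetical, and how the resulting values are handed to the Traffic Ops client is not shown because the quoted diff does not include that call.

```go
package main

import (
	"fmt"
	"net/url"
)

// monitorQuery is a hypothetical helper (not from the PR) that builds the
// query parameters directly, so no API version is hard-coded in a path.
func monitorQuery() url.Values {
	qry := url.Values{}
	qry.Add("type", "RASCAL")
	qry.Add("status", "ONLINE")
	return qry
}

func main() {
	// Encode sorts the parameters by key.
	fmt.Println(monitorQuery().Encode())

	// Parsing a bare query string also works and yields an empty path.
	u, err := url.Parse("?type=RASCAL&status=ONLINE")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(u.Path == "", u.Query().Get("type"), u.Query().Get("status"))
}
```

Either form avoids embedding `/api/3.0/` in a URL whose path is never used.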







[GitHub] [trafficcontrol] rob05c commented on a change in pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097#discussion_r688630495



##########
File path: cache-config/tm-health-client/README.md
##########
@@ -0,0 +1,167 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+-->
+<!--
+
+  !!!
+      This file is both a Github Readme and manpage!
+      Please make sure changes appear properly with man,
+      and follow man conventions, such as:
+      https://www.bell-labs.com/usr/dmr/www/manintro.html
+
+      A primary goal of t3c is to follow POSIX and LSB standards
+      and conventions, so it's easy to learn and use by people
+      who know Linux and other *nix systems. Providing a proper
+      manpage is a big part of that.
+  !!!
+
+-->
+# NAME
+
+tm-health-client - Traffic Monitor Health Client service
+
+# SYNOPSIS
+
+tm-health-client [-f config-file]  -h  [-l logging-directory]  -v 
+
+# DESCRIPTION
+
+The tm-health-client command is used to manage **Apache Traffic Server** parents on a
+host running **Apache Traffic Server**.  The command should be started by **systemd** 
+and run as a service. On startup, the command reads it's default configuration file

Review comment:
       Nitpick: `it's` -> `its`







[GitHub] [trafficcontrol] jrushford commented on a change in pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
jrushford commented on a change in pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097#discussion_r688758844



##########
File path: cache-config/tm-health-client/tmutil/tmutil.go
##########
@@ -0,0 +1,674 @@
+package tmutil
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import (
+	"bufio"
+	"bytes"
+	"errors"
+	"io/ioutil"
+	"math/rand"
+	"net"
+	"os"
+	"os/exec"
+	"strings"
+	"time"
+
+	"github.com/apache/trafficcontrol/cache-config/tm-health-client/config"
+	"github.com/apache/trafficcontrol/lib/go-log"
+	"github.com/apache/trafficcontrol/lib/go-tc"
+	"github.com/apache/trafficcontrol/traffic_monitor/datareq"
+	"github.com/apache/trafficcontrol/traffic_monitor/tmclient"
+	"gopkg.in/yaml.v2"
+)
+
+const serverRequest = "https://tp.cdn.comcast.net/api/3.0/servers?type=RASCAL"
+
+type ParentAvailable interface {
+	available() bool
+}
+
+// the necessary fields of a trafficserver parents config file needed to
+// read the file and keep track of its modification time.
+type ParentConfigFile struct {
+	Filename       string
+	LastModifyTime int64
+}
+
+// the necessary data required to keep track of trafficserver config
+// files, lists of parents a trafficserver instance uses, and directory
+// locations used for configuration and trafficserver executables.
+type ParentInfo struct {
+	ParentDotConfig        ParentConfigFile
+	StrategiesDotYaml      ParentConfigFile
+	TrafficServerBinDir    string
+	TrafficServerConfigDir string
+	Parents                map[string]ParentStatus
+	Cfg                    config.Cfg
+}
+
+// when reading the strategies.yaml, these fields are used to help
+// parse out fail_over objects
+type FailOver struct {
+	MaxSimpleRetries      int      `yaml:"max_simple_retries,omitempty"`
+	MaxUnavailableRetries int      `yaml:"max_unavailable_retries,omitempty"`
+	RingMode              string   `yaml:"ring_mode,omitempty"`
+	ResponseCodes         []int    `yaml:"response_codes,omitempty"`
+	MarkDownCodes         []int    `yaml:"markdown_codes,omitempty"`
+	HealthCheck           []string `yaml:"health_check,omitempty"`
+}
+
+// the trafficserver 'HostStatus' fields that are necessary to interface
+// with the trafficserver 'traffic_ctl' command.
+type ParentStatus struct {
+	Fqdn         string
+	ActiveReason bool
+	LocalReason  bool
+	ManualReason bool
+}
+
+// used to get the overall parent availability from the
+// HostStatus markdown reasons.  all markdown reasons
+// must be true for a parent to be considered available
+func (p ParentStatus) available() bool {
+	if !p.ActiveReason {
+		return false
+	} else if !p.LocalReason {
+		return false
+	} else if !p.ManualReason {
+		return false
+	}
+	return true
+}
+
+// used to log that a parent's status is either UP or
+// DOWN based upon the HostStatus reason codes.  to
+// be considered UP, all reason codes must be 'true'
+func (p ParentStatus) Status() string {
+	if !p.ActiveReason {
+		return "DOWN"
+	} else if !p.LocalReason {
+		return "DOWN"
+	} else if !p.ManualReason {
+		return "DOWN"
+	}
+	return "UP"
+}
+
+type StatusReason int
+
+// these are the HostStatus reason codes used within
+// trafficserver.
+const (
+	ACTIVE StatusReason = iota
+	LOCAL
+	MANUAL
+)
+
+// used for logging a parent's HostStatus reason code
+// setting.
+func (s StatusReason) String() string {
+	switch s {
+	case ACTIVE:
+		return "ACTIVE"
+	case LOCAL:
+		return "LOCAL"
+	case MANUAL:
+		return "MANUAL"
+	}
+	return "UNDEFINED"
+}
+
+// the fields used from strategies.yaml that describe
+// a parent.
+type Host struct {
+	HostName  string     `yaml:"host"`
+	Protocols []Protocol `yaml:"protocol"`
+}
+
+// the protocol object in strategies.yaml that helps to
+// describe a parent.
+type Protocol struct {
+	Scheme           string  `yaml:"scheme"`
+	Port             int     `yaml:"port"`
+	Health_check_url string  `yaml:"health_check_url,omitempty"`
+	Weight           float64 `yaml:"weight,omitempty"`
+}
+
+// a trafficserver strategy object from strategies.yaml
+type Strategy struct {
+	Strategy        string   `yaml:"strategy"`
+	Policy          string   `yaml:"policy"`
+	HashKey         string   `yaml:"hash_key,omitempty"`
+	GoDirect        bool     `yaml:"go_direct,omitempty"`
+	ParentIsProxy   bool     `yaml:"parent_is_proxy,omitempty"`
+	CachePeerResult bool     `yaml:"cache_peer_result,omitempty"`
+	Scheme          string   `yaml:"scheme"`
+	FailOvers       FailOver `yaml:"failover,omitempty"`
+}
+
+// the top level array definitions in a trafficserver strategies.yaml
+// configuration file.
+type Strategies struct {
+	Strategy []Strategy    `yaml:"strategies"`
+	Hosts    []Host        `yaml:"hosts"`
+	Groups   []interface{} `yaml:"groups"`
+}
+
+// used at startup to load a trafficserver's list of parents from
+// its 'parent.config', 'strategies.yaml' and current parent
+// status from the trafficserver's HostStatus subsystem.
+func NewParentInfo(cfg config.Cfg) (*ParentInfo, error) {
+
+	parentConfig := cfg.TrafficServerConfigDir + "/parent.config"
+	modTime, err := getFileModificationTime(parentConfig)
+	if err != nil {
+		return nil, errors.New("error reading parent.config: " + err.Error())
+	}
+	parents := ParentConfigFile{
+		Filename:       parentConfig,
+		LastModifyTime: modTime,
+	}
+
+	stratyaml := cfg.TrafficServerConfigDir + "/strategies.yaml"
+	modTime, err = getFileModificationTime(stratyaml)
+	if err != nil {
+		return nil, errors.New("error reading strategies.yaml: " + err.Error())
+	}
+
+	strategies := ParentConfigFile{
+		Filename:       cfg.TrafficServerConfigDir + "/strategies.yaml",
+		LastModifyTime: modTime,
+	}
+
+	parentInfo := ParentInfo{
+		ParentDotConfig:        parents,
+		StrategiesDotYaml:      strategies,
+		TrafficServerBinDir:    cfg.TrafficServerBinDir,
+		TrafficServerConfigDir: cfg.TrafficServerConfigDir,
+	}
+
+	// initialize the trafficserver parents map
+	parentStatus := make(map[string]ParentStatus)
+
+	// read parent.config
+	if err := parentInfo.readParentConfig(parentStatus); err != nil {
+		return nil, errors.New("loading parent.config file: " + err.Error())
+	}
+
+	// read strategies.yaml
+	if err := parentInfo.readStrategies(parentStatus); err != nil {
+		return nil, errors.New("loading parent strategies.yaml file: " + err.Error())
+	}
+
+	// collect the trafficserver parent status from the HostStatus subsystem
+	if err := parentInfo.readHostStatus(parentStatus); err != nil {
+		return nil, errors.New("reading trafficserver host status: " + err.Error())
+	}
+
+	log.Infof("startup loaded %d parent records\n", len(parentStatus))
+
+	parentInfo.Parents = parentStatus
+	parentInfo.Cfg = cfg
+
+	return &parentInfo, nil
+}
+
+// Queries a traffic monitor that is monitoring the trafficserver instance running on a host to
+// obtain the availability (health) of the parents used by trafficserver.
+func (c *ParentInfo) GetCacheStatuses() (map[tc.CacheName]datareq.CacheStatus, error) {
+
+	tmHostName, err := c.findATrafficMonitor()
+	if err != nil {
+		return nil, errors.New("finding a trafficmonitor: " + err.Error())
+	}
+	tmc := tmclient.New("http://"+tmHostName, config.GetRequestTimeout())
+
+	return tmc.CacheStatuses()
+}
+
+// The main polling function that keeps the parents list current
+// with any changes to the trafficserver parent.config or strategies.yaml.
+// Also, it keeps parent status current with the trafficserver HostStatus
+// subsystem.  Finally, on each poll cycle a trafficmonitor is queried to check
+// that all parents used by this trafficserver are available for use, based upon
+// the trafficmonitor's view from its health protocol.  Parents are marked up or
+// down in the trafficserver subsystem based upon that host's current status and
+// the status that the trafficmonitor health protocol has determined for a parent.
+func (c *ParentInfo) PollAndUpdateCacheStatus() {
+	pollingInterval := config.GetTMPollingInterval()
+	log.Infoln("polling started")
+
+	for {
+		if err := c.UpdateParentInfo(); err != nil {
+			log.Errorf("could not load new ATS parent info: %s\n", err.Error())
+		} else {
+			log.Debugf("updated parent info, total number of parents: %d\n", len(c.Parents))
+		}
+
+		// read traffic monitor cache statuses
+		caches, err := c.GetCacheStatuses()
+		if err != nil {
+			log.Errorln(err.Error())
+		}
+
+		for k, v := range caches {
+			hostName := string(k)
+			cs, ok := c.Parents[hostName]
+			if ok {
+				tmAvailable := *v.CombinedAvailable
+				if cs.available() != tmAvailable {
+					if !c.Cfg.EnableActiveMarkdowns {
+						if !tmAvailable {
+							log.Infof("TM reports that %s is not available and should be marked DOWN but, mark downs are disabled by configuration", hostName)
+						} else {
+							log.Infof("TM reports that %s is available and should be marked UP but, mark up is disabled by configuration", hostName)
+						}
+					} else {
+						if err = c.markParent(cs.Fqdn, *v.Status, tmAvailable); err != nil {
+							log.Errorln(err.Error())
+						}
+					}
+				}
+			}
+		}
+
+		time.Sleep(pollingInterval)
+	}
+}
+
+// Used by the polling function to update the parents list from
+// changes to parent.config and strategies.yaml.  The parents
+// availability is also updated to reflect the current state from
+// the trafficserver HostStatus subsystem.
+func (c *ParentInfo) UpdateParentInfo() error {
+	ptime, err := getFileModificationTime(c.ParentDotConfig.Filename)
+	if err != nil {
+		return errors.New("error reading parent.config: " + err.Error())
+	}
+	stime, err := getFileModificationTime(c.StrategiesDotYaml.Filename)
+	if err != nil {
+		return errors.New("error reading strategies.yaml: " + err.Error())
+	}
+	if c.ParentDotConfig.LastModifyTime < ptime {
+		// read parent.config
+		if err := c.readParentConfig(c.Parents); err != nil {
+			return errors.New("updating parent.config file: " + err.Error())
+		} else {
+			log.Infof("updated parents from new parent.config, total parents: %d\n", len(c.Parents))
+		}
+	}
+
+	if c.StrategiesDotYaml.LastModifyTime < stime {
+		// read strategies.yaml
+		if err := c.readStrategies(c.Parents); err != nil {
+			return errors.New("updating parent strategies.yaml file: " + err.Error())
+		} else {
+			log.Infof("updated parents from new strategies.yaml, total parents: %d\n", len(c.Parents))
+		}
+	}
+
+	// collect the trafficserver current host status
+	if err := c.readHostStatus(c.Parents); err != nil {
+		return errors.New("reading latest trafficserver host status: " + err.Error())
+	}
+
+	return nil
+}
+
+// choose an available trafficmonitor, returns an error if
+// there are none
+func (c *ParentInfo) findATrafficMonitor() (string, error) {
+	var tmHostname string
+	lth := len(c.Cfg.TrafficMonitors)
+	if lth == 0 {
+		return "", errors.New("there are no available traffic monitors")
+	}
+
+	// choose one at random
+	rand.Seed(time.Now().UnixNano())
+	r := (rand.Intn(7919) % lth)
+
+	tms := make([]string, 0)
+	for k, v := range c.Cfg.TrafficMonitors {
+		if v == true {
+			log.Debugf("traffic monitor %s is available\n", k)
+			tms = append(tms, k)
+		}
+	}
+
+	lth = len(tms)
+	if lth > 0 {
+		if r < lth { // return the trafficmonitor at r, r could be the first

Review comment:
       Yeah, I was playing with primes; anyway, I've changed it.
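
The thread does not show what the selection was changed to. One possible shape, as a hedged sketch only: filter the available monitors first, then index the filtered list with `rand.Intn(len(...))`, which removes the need for the prime modulus and the range check. The function name `pickMonitor` is hypothetical; the map mirrors `Cfg.TrafficMonitors` from the quoted diff.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// pickMonitor is a hypothetical stand-in for findATrafficMonitor. It
// collects the monitors flagged as available and returns one of them
// uniformly at random, or an error if none are available.
func pickMonitor(monitors map[string]bool) (string, error) {
	available := make([]string, 0, len(monitors))
	for host, up := range monitors {
		if up {
			available = append(available, host)
		}
	}
	if len(available) == 0 {
		return "", errors.New("there are no available traffic monitors")
	}
	// rand.Intn(n) returns a value in [0, n), so it is always a valid index.
	return available[rand.Intn(len(available))], nil
}

func main() {
	host, err := pickMonitor(map[string]bool{"tm1.example.net": true, "tm2.example.net": false})
	fmt.Println(host, err)
}
```

On Go versions before 1.20 the global generator would need to be seeded once at startup rather than on every call.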







[GitHub] [trafficcontrol] rob05c commented on a change in pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097#discussion_r688632353



##########
File path: cache-config/tm-health-client/config/config.go
##########
@@ -0,0 +1,256 @@
+package config
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import (
+	"bufio"
+	"encoding/json"
+	"errors"
+	"io/ioutil"
+	"net/url"
+	"os"
+	"strings"
+	"time"
+
+	"github.com/apache/trafficcontrol/lib/go-log"
+	toclient "github.com/apache/trafficcontrol/traffic_ops/v3-client"
+	"github.com/pborman/getopt/v2"

Review comment:
       Nitpick: we like to separate external libraries with a blank line, to make them easier to read and more obvious







[GitHub] [trafficcontrol] rob05c commented on a change in pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097#discussion_r688667496



##########
File path: cache-config/tm-health-client/tmutil/tmutil.go
##########
@@ -0,0 +1,674 @@
+package tmutil
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import (
+	"bufio"
+	"bytes"
+	"errors"
+	"io/ioutil"
+	"math/rand"
+	"net"
+	"os"
+	"os/exec"
+	"strings"
+	"time"
+
+	"github.com/apache/trafficcontrol/cache-config/tm-health-client/config"
+	"github.com/apache/trafficcontrol/lib/go-log"
+	"github.com/apache/trafficcontrol/lib/go-tc"
+	"github.com/apache/trafficcontrol/traffic_monitor/datareq"
+	"github.com/apache/trafficcontrol/traffic_monitor/tmclient"
+	"gopkg.in/yaml.v2"
+)
+
+const serverRequest = "https://tp.cdn.comcast.net/api/3.0/servers?type=RASCAL"
+
+type ParentAvailable interface {
+	available() bool
+}
+
+// the necessary fields of a trafficserver parents config file needed to
+// read the file and keep track of its modification time.
+type ParentConfigFile struct {
+	Filename       string
+	LastModifyTime int64
+}
+
+// the necessary data required to keep track of trafficserver config
+// files, lists of parents a trafficserver instance uses, and directory
+// locations used for configuration and trafficserver executables.
+type ParentInfo struct {
+	ParentDotConfig        ParentConfigFile
+	StrategiesDotYaml      ParentConfigFile
+	TrafficServerBinDir    string
+	TrafficServerConfigDir string
+	Parents                map[string]ParentStatus
+	Cfg                    config.Cfg
+}
+
+// when reading the strategies.yaml, these fields are used to help
+// parse out fail_over objects
+type FailOver struct {
+	MaxSimpleRetries      int      `yaml:"max_simple_retries,omitempty"`
+	MaxUnavailableRetries int      `yaml:"max_unavailable_retries,omitempty"`
+	RingMode              string   `yaml:"ring_mode,omitempty"`
+	ResponseCodes         []int    `yaml:"response_codes,omitempty"`
+	MarkDownCodes         []int    `yaml:"markdown_codes,omitempty"`
+	HealthCheck           []string `yaml:"health_check,omitempty"`
+}
+
+// the trafficserver 'HostStatus' fields that are necessary to interface
+// with the trafficserver 'traffic_ctl' command.
+type ParentStatus struct {
+	Fqdn         string
+	ActiveReason bool
+	LocalReason  bool
+	ManualReason bool
+}
+
+// used to get the overall parent availability from the
+// HostStatus markdown reasons.  all markdown reasons
+// must be true for a parent to be considered available
+func (p ParentStatus) available() bool {
+	if !p.ActiveReason {
+		return false
+	} else if !p.LocalReason {
+		return false
+	} else if !p.ManualReason {
+		return false
+	}
+	return true
+}
+
+// used to log that a parent's status is either UP or
+// DOWN based upon the HostStatus reason codes.  to
+// be considered UP, all reason codes must be 'true'
+func (p ParentStatus) Status() string {
+	if !p.ActiveReason {
+		return "DOWN"
+	} else if !p.LocalReason {
+		return "DOWN"
+	} else if !p.ManualReason {
+		return "DOWN"
+	}
+	return "UP"
+}
+
+type StatusReason int
+
+// these are the HostStatus reason codes used within
+// trafficserver.
+const (
+	ACTIVE StatusReason = iota
+	LOCAL
+	MANUAL
+)
+
+// used for logging a parent's HostStatus reason code
+// setting.
+func (s StatusReason) String() string {
+	switch s {
+	case ACTIVE:
+		return "ACTIVE"
+	case LOCAL:
+		return "LOCAL"
+	case MANUAL:
+		return "MANUAL"
+	}
+	return "UNDEFINED"
+}
+
+// the fields used from strategies.yaml that describe
+// a parent.
+type Host struct {
+	HostName  string     `yaml:"host"`
+	Protocols []Protocol `yaml:"protocol"`
+}
+
+// the protocol object in strategies.yaml that helps to
+// describe a parent.
+type Protocol struct {
+	Scheme           string  `yaml:"scheme"`
+	Port             int     `yaml:"port"`
+	Health_check_url string  `yaml:"health_check_url,omitempty"`
+	Weight           float64 `yaml:"weight,omitempty"`
+}
+
+// a trafficserver strategy object from strategies.yaml
+type Strategy struct {
+	Strategy        string   `yaml:"strategy"`
+	Policy          string   `yaml:"policy"`
+	HashKey         string   `yaml:"hash_key,omitempty"`
+	GoDirect        bool     `yaml:"go_direct,omitempty"`
+	ParentIsProxy   bool     `yaml:"parent_is_proxy,omitempty"`
+	CachePeerResult bool     `yaml:"cache_peer_result,omitempty"`
+	Scheme          string   `yaml:"scheme"`
+	FailOvers       FailOver `yaml:"failover,omitempty"`
+}
+
+// the top level array definitions in a trafficserver strategies.yaml
+// configuration file.
+type Strategies struct {
+	Strategy []Strategy    `yaml:"strategies"`
+	Hosts    []Host        `yaml:"hosts"`
+	Groups   []interface{} `yaml:"groups"`
+}
+
+// used at startup to load a trafficserver's list of parents from
+// its 'parent.config', 'strategies.yaml' and current parent
+// status from trafficserver's HostStatus subsystem.
+func NewParentInfo(cfg config.Cfg) (*ParentInfo, error) {
+
+	parentConfig := cfg.TrafficServerConfigDir + "/parent.config"
+	modTime, err := getFileModificationTime(parentConfig)
+	if err != nil {
+		return nil, errors.New("error reading parent.config: " + err.Error())
+	}
+	parents := ParentConfigFile{
+		Filename:       parentConfig,
+		LastModifyTime: modTime,
+	}
+
+	stratyaml := cfg.TrafficServerConfigDir + "/strategies.yaml"
+	modTime, err = getFileModificationTime(stratyaml)
+	if err != nil {
+		return nil, errors.New("error reading strategies.yaml: " + err.Error())
+	}
+
+	strategies := ParentConfigFile{
+		Filename:       cfg.TrafficServerConfigDir + "/strategies.yaml",
+		LastModifyTime: modTime,
+	}
+
+	parentInfo := ParentInfo{
+		ParentDotConfig:        parents,
+		StrategiesDotYaml:      strategies,
+		TrafficServerBinDir:    cfg.TrafficServerBinDir,
+		TrafficServerConfigDir: cfg.TrafficServerConfigDir,
+	}
+
+	// initialize the trafficserver parents map
+	parentStatus := make(map[string]ParentStatus)
+
+	// read parent.config
+	if err := parentInfo.readParentConfig(parentStatus); err != nil {
+		return nil, errors.New("loading parent.config file: " + err.Error())
+	}
+
+	// read strategies.yaml
+	if err := parentInfo.readStrategies(parentStatus); err != nil {
+		return nil, errors.New("loading parent strategies.yaml file: " + err.Error())
+	}
+
+	// collect the trafficserver parent status from the HostStatus subsystem
+	if err := parentInfo.readHostStatus(parentStatus); err != nil {
+		return nil, errors.New("reading trafficserver host status: " + err.Error())
+	}
+
+	log.Infof("startup loaded %d parent records\n", len(parentStatus))
+
+	parentInfo.Parents = parentStatus
+	parentInfo.Cfg = cfg
+
+	return &parentInfo, nil
+}
+
+// Queries a traffic monitor that is monitoring the trafficserver instance running on a host to
+// obtain the availability, health, of a parent used by trafficserver.
+func (c *ParentInfo) GetCacheStatuses() (map[tc.CacheName]datareq.CacheStatus, error) {
+
+	tmHostName, err := c.findATrafficMonitor()
+	if err != nil {
+		return nil, errors.New("finding a trafficmonitor: " + err.Error())
+	}
+	tmc := tmclient.New("http://"+tmHostName, config.GetRequestTimeout())
+
+	return tmc.CacheStatuses()
+}
+
+// The main polling function that keeps the parents list current
+// with any changes to the trafficserver parent.config or strategies.yaml.
+// Also, it keeps parent status current with the trafficserver HostStatus
+// subsystem.  Finally, on each poll cycle a trafficmonitor is queried to check
+// that all parents used by this trafficserver are available for use based upon
+// the trafficmonitor's view from its health protocol.  Parents are marked up or
+// down in the trafficserver subsystem based upon that host's current status and
+// the status that the trafficmonitor health protocol has determined for a parent.
+func (c *ParentInfo) PollAndUpdateCacheStatus() {
+	pollingInterval := config.GetTMPollingInterval()
+	log.Infoln("polling started")
+
+	for {
+		if err := c.UpdateParentInfo(); err != nil {
+			log.Errorf("could not load new ATS parent info: %s\n", err.Error())
+		} else {
+			log.Debugf("updated parent info, total number of parents: %d\n", len(c.Parents))
+		}
+
+		// read traffic monitor cache statuses
+		caches, err := c.GetCacheStatuses()
+		if err != nil {
+			log.Errorln(err.Error())
+		}
+
+		for k, v := range caches {
+			hostName := string(k)
+			cs, ok := c.Parents[hostName]
+			if ok {
+				tmAvailable := *v.CombinedAvailable
+				if cs.available() != tmAvailable {
+					if !c.Cfg.EnableActiveMarkdowns {
+						if !tmAvailable {
+							log.Infof("TM reports that %s is not available and should be marked DOWN but, mark downs are disabled by configuration", hostName)
+						} else {
+							log.Infof("TM reports that %s is available and should be marked UP but, mark up is disabled by configuration", hostName)
+						}
+					} else {
+						if err = c.markParent(cs.Fqdn, *v.Status, tmAvailable); err != nil {
+							log.Errorln(err.Error())
+						}
+					}
+				}
+			}
+		}
+
+		time.Sleep(pollingInterval)
+	}
+}
+
+// Used by the polling function to update the parents list from
+// changes to parent.config and strategies.yaml.  The parents'
+// availability is also updated to reflect the current state from
+// the trafficserver HostStatus subsystem.
+func (c *ParentInfo) UpdateParentInfo() error {
+	ptime, err := getFileModificationTime(c.ParentDotConfig.Filename)
+	if err != nil {
+		return errors.New("error reading parent.config: " + err.Error())
+	}
+	stime, err := getFileModificationTime(c.StrategiesDotYaml.Filename)
+	if err != nil {
+		return errors.New("error reading strategies.yaml: " + err.Error())
+	}
+	if c.ParentDotConfig.LastModifyTime < ptime {
+		// read parent.config
+		if err := c.readParentConfig(c.Parents); err != nil {
+			return errors.New("updating parent.config file: " + err.Error())
+		} else {
+			log.Infof("updated parents from new parent.config, total parents: %d\n", len(c.Parents))
+		}
+	}
+
+	if c.StrategiesDotYaml.LastModifyTime < stime {
+		// read strategies.yaml
+		if err := c.readStrategies(c.Parents); err != nil {
+			return errors.New("updating parent strategies.yaml file: " + err.Error())
+		} else {
+			log.Infof("updated parents from new strategies.yaml, total parents: %d\n", len(c.Parents))
+		}
+	}
+
+	// collect the trafficserver current host status
+	if err := c.readHostStatus(c.Parents); err != nil {
+		return errors.New("reading latest trafficserver host status: " + err.Error())
+	}
+
+	return nil
+}
+
+// choose an availabe trafficmonitor, returns an error if
+// there are none
+func (c *ParentInfo) findATrafficMonitor() (string, error) {
+	var tmHostname string
+	lth := len(c.Cfg.TrafficMonitors)
+	if lth == 0 {
+		return "", errors.New("there are no available traffic monitors")
+	}
+
+	// choose one at random
+	rand.Seed(time.Now().UnixNano())
+	r := (rand.Intn(7919) % lth)
+
+	tms := make([]string, 0)
+	for k, v := range c.Cfg.TrafficMonitors {
+		if v == true {
+			log.Debugf("traffic monitor %s is available\n", k)
+			tms = append(tms, k)
+		}
+	}
+
+	lth = len(tms)
+	if lth > 0 {
+		if r < lth { // return the trafficmonitor at r, r could be the first

Review comment:
       Nitpick: this `if` could be removed if `r` above was changed to `r := rand.Intn(lth)`
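
The suggestion above can be sketched as follows. This is a minimal illustration, not the code from the PR: `pickMonitor` and the host names are hypothetical, and it assumes the index is drawn only after the list of available monitors has been built, so that `rand.Intn` is bounded by that list's length and no range check is needed.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// pickMonitor is a hypothetical helper (not part of the PR). It collects the
// monitors flagged as available and returns one of them at random.
func pickMonitor(monitors map[string]bool) (string, error) {
	available := make([]string, 0, len(monitors))
	for name, up := range monitors {
		if up {
			available = append(available, name)
		}
	}
	if len(available) == 0 {
		return "", errors.New("there are no available traffic monitors")
	}
	// rand.Intn(n) returns a value in [0, n), so the index is always valid
	// and the extra `if r < lth` guard becomes unnecessary.
	return available[rand.Intn(len(available))], nil
}

func main() {
	// Placeholder host names, used only for illustration.
	tm, err := pickMonitor(map[string]bool{
		"tm1.example.invalid": true,
		"tm2.example.invalid": false,
	})
	fmt.Println(tm, err)
}
```

Note that `rand.Intn` panics when its argument is zero, which is why the empty-list check comes first.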







[GitHub] [trafficcontrol] jrushford commented on a change in pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
jrushford commented on a change in pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097#discussion_r688757979



##########
File path: cache-config/tm-health-client/config/config.go
##########
@@ -0,0 +1,256 @@
+package config
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import (
+	"bufio"
+	"encoding/json"
+	"errors"
+	"io/ioutil"
+	"net/url"
+	"os"
+	"strings"
+	"time"
+
+	"github.com/apache/trafficcontrol/lib/go-log"
+	toclient "github.com/apache/trafficcontrol/traffic_ops/v3-client"
+	"github.com/pborman/getopt/v2"
+)
+
+var tmPollingInterval time.Duration
+var toRequestTimeout time.Duration
+
+const (
+	DefaultConfigFile             = "/etc/trafficcontrol-cache-config/tm-health-client.json"
+	DefaultLogDirectory           = "/var/log/trafficcontrol-cache-config"
+	DefaultLogFile                = "tm-health-client.log"
+	DefaultTrafficServerConfigDir = "/opt/trafficserver/etc/trafficserver"
+	DefaultTrafficServerBinDir    = "/opt/trafficserver/bin"
+)
+
+type Cfg struct {
+	CDNName                 string `json:"cdn-name"`
+	EnableActiveMarkdowns   bool   `json:"enable-active-markdowns"`
+	ReasonCode              string `json:"reason-code"`
+	TOCredentialFile        string `json:"to-credential-file"`
+	TORequestTimeOutSeconds string `json:"to-request-timeout-seconds"`
+	TOPass                  string
+	TOUrl                   string
+	TOUser                  string
+	TmPollIntervalSeconds   string          `json:"tm-poll-interval-seconds"`
+	TrafficServerConfigDir  string          `json:"trafficserver-config-dir"`
+	TrafficServerBinDir     string          `json:"trafficserver-bin-dir"`
+	TrafficMonitors         map[string]bool `json:"trafficmonitors,omitempty"`
+}
+
+type LogCfg struct {
+	LogLocationErr   string
+	LogLocationDebug string
+	LogLocationInfo  string
+	LogLocationWarn  string
+}
+
+func (lcfg LogCfg) ErrorLog() log.LogLocation   { return log.LogLocation(lcfg.LogLocationErr) }
+func (lcfg LogCfg) WarningLog() log.LogLocation { return log.LogLocation(lcfg.LogLocationWarn) }
+func (lcfg LogCfg) InfoLog() log.LogLocation    { return log.LogLocation(lcfg.LogLocationInfo) }
+func (lcfg LogCfg) DebugLog() log.LogLocation   { return log.LogLocation(lcfg.LogLocationDebug) }
+func (lcfg LogCfg) EventLog() log.LogLocation   { return log.LogLocation(log.LogLocationNull) } // not used
+
+func readCredentials(cfg *Cfg) error {
+	fn := cfg.TOCredentialFile
+	f, err := os.Open(fn)
+
+	if err != nil {
+		return errors.New("failed to open " + fn + ": " + err.Error())
+	}
+	defer f.Close()
+
+	var to_pass_found = false
+	var to_url_found = false
+	var to_user_found = false
+
+	scanner := bufio.NewScanner(f)
+	for scanner.Scan() {
+		line := strings.TrimSpace(scanner.Text())
+		if strings.HasPrefix(line, "#") {
+			continue
+		}
+		fields := strings.Split(line, " ")
+		for _, v := range fields {
+			if strings.HasPrefix(v, "TO_") {
+				sf := strings.Split(v, "=")
+				if len(sf) == 2 {
+					if sf[0] == "TO_URL" {
+						// parse the url after trimming off any surrounding double quotes
+						cfg.TOUrl = strings.Trim(sf[1], "\"")
+						to_url_found = true
+					}
+					if sf[0] == "TO_USER" {
+						// set the TOUser after trimming off any surrounding quotes.
+						cfg.TOUser = strings.Trim(sf[1], "\"")
+						to_user_found = true
+					}
+					// set the TOPass after trimming off any surrounding quotes.
+					if sf[0] == "TO_PASS" {
+						cfg.TOPass = strings.Trim(sf[1], "\"")
+						to_pass_found = true
+					}
+				}
+			}
+		}
+	}
+	if !to_url_found && !to_user_found && !to_pass_found {
+		return errors.New("failed to retrieve one or more TrafficOps credentials")
+	}
+
+	return nil
+}
+
+func GetConfig() (Cfg, error, bool) {
+	var err error
+	var configFile string
+	var logLocationErr = log.LogLocationStderr
+	var logLocationDebug = log.LogLocationNull
+	var logLocationInfo = log.LogLocationNull
+	var logLocationWarn = log.LogLocationNull
+
+	configFilePtr := getopt.StringLong("config-file", 'f', DefaultConfigFile, "full path to the json config file")
+	logdirPtr := getopt.StringLong("logging-dir", 'l', DefaultLogDirectory, "directory location for log files")
+	helpPtr := getopt.BoolLong("help", 'h', "Print usage information and exit")
+	verbosePtr := getopt.CounterLong("verbose", 'v', `Log verbosity. Logging is output to stderr. By default, errors are logged. To log warnings, pass '-v'. To log info, pass '-vv', debug pass '-vvv'`)
+
+	getopt.Parse()
+
+	if configFilePtr != nil {
+		configFile = *configFilePtr
+	} else {
+		configFile = DefaultConfigFile
+	}
+
+	var logfile string
+
+	logfile = *logdirPtr + "/" + DefaultLogFile
+
+	logLocationErr = logfile
+
+	if *verbosePtr == 1 {
+		logLocationWarn = logfile
+	} else if *verbosePtr == 2 {
+		logLocationInfo = logfile
+		logLocationWarn = logfile
+	} else if *verbosePtr == 3 {
+		logLocationInfo = logfile
+		logLocationWarn = logfile
+		logLocationDebug = logfile
+	}
+
+	if help := *helpPtr; help == true {
+		Usage()
+		return Cfg{}, nil, true
+	}
+
+	lcfg := LogCfg{
+		LogLocationDebug: logLocationDebug,
+		LogLocationErr:   logLocationErr,
+		LogLocationInfo:  logLocationInfo,
+		LogLocationWarn:  logLocationWarn,
+	}
+
+	if err := log.InitCfg(&lcfg); err != nil {
+		return Cfg{}, errors.New("Initializing loggers: " + err.Error() + "\n"), false
+	}
+
+	cfg := Cfg{
+		TrafficMonitors: make(map[string]bool, 0),
+	}
+
+	if err = LoadConfig(&cfg, configFile); err != nil {
+		return Cfg{}, errors.New(err.Error() + "\n"), false
+	}
+
+	if err = readCredentials(&cfg); err != nil {
+		return cfg, err, false
+	}
+
+	err = GetTrafficMonitorsStatus(&cfg)
+	if err != nil {
+		return cfg, err, false
+	}
+
+	return cfg, nil, false
+}
+
+func GetTrafficMonitorsStatus(cfg *Cfg) error {
+	u, err := url.Parse(cfg.TOUrl + "/api/3.0/servers?type=RASCAL&status=ONLINE")

Review comment:
       Fixed, yeah I had just pasted that path in.

##########
File path: cache-config/tm-health-client/tm-health-client.json
##########
@@ -0,0 +1,11 @@
+{
+  "cdn-name": "over-the-top",

Review comment:
       changed

##########
File path: cache-config/tm-health-client/tmutil/tmutil.go
##########
@@ -0,0 +1,674 @@
+package tmutil
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import (
+	"bufio"
+	"bytes"
+	"errors"
+	"io/ioutil"
+	"math/rand"
+	"net"
+	"os"
+	"os/exec"
+	"strings"
+	"time"
+
+	"github.com/apache/trafficcontrol/cache-config/tm-health-client/config"
+	"github.com/apache/trafficcontrol/lib/go-log"
+	"github.com/apache/trafficcontrol/lib/go-tc"
+	"github.com/apache/trafficcontrol/traffic_monitor/datareq"
+	"github.com/apache/trafficcontrol/traffic_monitor/tmclient"
+	"gopkg.in/yaml.v2"
+)
+
+const serverRequest = "https://tp.cdn.comcast.net/api/3.0/servers?type=RASCAL"
+
+type ParentAvailable interface {
+	available() bool
+}
+
+// the necessary fields of a trafficserver parents config file needed to
+// read the file and keep track of its modification time.
+type ParentConfigFile struct {
+	Filename       string
+	LastModifyTime int64
+}
+
+// the necessary data required to keep track of trafficserver config
+// files, lists of parents a trafficserver instance uses, and directory
+// locations used for configuration and trafficserver executables.
+type ParentInfo struct {
+	ParentDotConfig        ParentConfigFile
+	StrategiesDotYaml      ParentConfigFile
+	TrafficServerBinDir    string
+	TrafficServerConfigDir string
+	Parents                map[string]ParentStatus
+	Cfg                    config.Cfg
+}
+
+// when reading the strategies.yaml, these fields are used to help
+// parse out fail_over objects
+type FailOver struct {
+	MaxSimpleRetries      int      `yaml:"max_simple_retries,omitempty"`
+	MaxUnavailableRetries int      `yaml:"max_unavailable_retries,omitempty"`
+	RingMode              string   `yaml:"ring_mode,omitempty"`
+	ResponseCodes         []int    `yaml:"response_codes,omitempty"`
+	MarkDownCodes         []int    `yaml:"markdown_codes,omitempty"`
+	HealthCheck           []string `yaml:"health_check,omitempty"`
+}
+
+// the trafficserver 'HostStatus' fields that are necessary to interface
+// with the trafficserver 'traffic_ctl' command.
+type ParentStatus struct {
+	Fqdn         string
+	ActiveReason bool
+	LocalReason  bool
+	ManualReason bool
+}
+
+// used to get the overall parent availability from the
+// HostStatus markdown reasons.  all markdown reasons
+// must be true for a parent to be considered available
+func (p ParentStatus) available() bool {
+	if !p.ActiveReason {
+		return false
+	} else if !p.LocalReason {
+		return false
+	} else if !p.ManualReason {
+		return false
+	}
+	return true
+}
+
+// used to log that a parent's status is either UP or
+// DOWN based upon the HostStatus reason codes.  to
+// be considered UP, all reason codes must be 'true'
+func (p ParentStatus) Status() string {
+	if !p.ActiveReason {
+		return "DOWN"
+	} else if !p.LocalReason {
+		return "DOWN"
+	} else if !p.ManualReason {
+		return "DOWN"
+	}
+	return "UP"
+}
+
+type StatusReason int
+
+// these are the HostStatus reason codes used within
+// trafficserver.
+const (
+	ACTIVE StatusReason = iota
+	LOCAL
+	MANUAL
+)
+
+// used for logging a parent's HostStatus reason code
+// setting.
+func (s StatusReason) String() string {
+	switch s {
+	case ACTIVE:
+		return "ACTIVE"
+	case LOCAL:
+		return "LOCAL"
+	case MANUAL:
+		return "MANUAL"
+	}
+	return "UNDEFINED"
+}
+
+// the fields used from strategies.yaml that describe
+// a parent.
+type Host struct {
+	HostName  string     `yaml:"host"`
+	Protocols []Protocol `yaml:"protocol"`
+}
+
+// the protocol object in strategies.yaml that helps to
+// describe a parent.
+type Protocol struct {
+	Scheme           string  `yaml:"scheme"`
+	Port             int     `yaml:"port"`
+	Health_check_url string  `yaml:"health_check_url,omitempty"`
+	Weight           float64 `yaml:"weight,omitempty"`
+}
+
+// a trafficserver strategy object from strategies.yaml
+type Strategy struct {
+	Strategy        string   `yaml:"strategy"`
+	Policy          string   `yaml:"policy"`
+	HashKey         string   `yaml:"hash_key,omitempty"`
+	GoDirect        bool     `yaml:"go_direct,omitempty"`
+	ParentIsProxy   bool     `yaml:"parent_is_proxy,omitempty"`
+	CachePeerResult bool     `yaml:"cache_peer_result,omitempty"`
+	Scheme          string   `yaml:"scheme"`
+	FailOvers       FailOver `yaml:"failover,omitempty"`
+}
+
+// the top level array definitions in a trafficserver strategies.yaml
+// configuration file.
+type Strategies struct {
+	Strategy []Strategy    `yaml:"strategies"`
+	Hosts    []Host        `yaml:"hosts"`
+	Groups   []interface{} `yaml:"groups"`
+}
+
+// used at startup to load a trafficserver's list of parents from
+// its 'parent.config', 'strategies.yaml' and current parent
+// status from trafficserver's HostStatus subsystem.
+func NewParentInfo(cfg config.Cfg) (*ParentInfo, error) {
+
+	parentConfig := cfg.TrafficServerConfigDir + "/parent.config"
+	modTime, err := getFileModificationTime(parentConfig)
+	if err != nil {
+		return nil, errors.New("error reading parent.config: " + err.Error())
+	}
+	parents := ParentConfigFile{
+		Filename:       parentConfig,
+		LastModifyTime: modTime,
+	}
+
+	stratyaml := cfg.TrafficServerConfigDir + "/strategies.yaml"
+	modTime, err = getFileModificationTime(stratyaml)
+	if err != nil {
+		return nil, errors.New("error reading strategies.yaml: " + err.Error())
+	}
+
+	strategies := ParentConfigFile{
+		Filename:       cfg.TrafficServerConfigDir + "/strategies.yaml",
+		LastModifyTime: modTime,
+	}
+
+	parentInfo := ParentInfo{
+		ParentDotConfig:        parents,
+		StrategiesDotYaml:      strategies,
+		TrafficServerBinDir:    cfg.TrafficServerBinDir,
+		TrafficServerConfigDir: cfg.TrafficServerConfigDir,
+	}
+
+	// initialize the trafficserver parents map
+	parentStatus := make(map[string]ParentStatus)
+
+	// read parent.config
+	if err := parentInfo.readParentConfig(parentStatus); err != nil {
+		return nil, errors.New("loading parent.config file: " + err.Error())
+	}
+
+	// read strategies.yaml
+	if err := parentInfo.readStrategies(parentStatus); err != nil {
+		return nil, errors.New("loading parent strategies.yaml file: " + err.Error())
+	}
+
+	// collect the trafficserver parent status from the HostStatus subsystem
+	if err := parentInfo.readHostStatus(parentStatus); err != nil {
+		return nil, errors.New("reading trafficserver host status: " + err.Error())
+	}
+
+	log.Infof("startup loaded %d parent records\n", len(parentStatus))
+
+	parentInfo.Parents = parentStatus
+	parentInfo.Cfg = cfg
+
+	return &parentInfo, nil
+}
+
+// Queries a traffic monitor that is monitoring the trafficserver instance running on a host to
+// obtain the availability, health, of a parent used by trafficserver.
+func (c *ParentInfo) GetCacheStatuses() (map[tc.CacheName]datareq.CacheStatus, error) {
+
+	tmHostName, err := c.findATrafficMonitor()
+	if err != nil {
+		return nil, errors.New("finding a trafficmonitor: " + err.Error())
+	}
+	tmc := tmclient.New("http://"+tmHostName, config.GetRequestTimeout())
+
+	return tmc.CacheStatuses()
+}
+
+// The main polling function that keeps the parents list current
+// with any changes to the trafficserver parent.config or strategies.yaml.
+// Also, it keeps parent status current with the trafficserver HostStatus
+// subsystem.  Finally, on each poll cycle a trafficmonitor is queried to check
+// that all parents used by this trafficserver are available for use based upon
+// the trafficmonitor's view from its health protocol.  Parents are marked up or
+// down in the trafficserver subsystem based upon that host's current status and
+// the status that the trafficmonitor health protocol has determined for a parent.
+func (c *ParentInfo) PollAndUpdateCacheStatus() {
+	pollingInterval := config.GetTMPollingInterval()
+	log.Infoln("polling started")
+
+	for {
+		if err := c.UpdateParentInfo(); err != nil {
+			log.Errorf("could not load new ATS parent info: %s\n", err.Error())
+		} else {
+			log.Debugf("updated parent info, total number of parents: %d\n", len(c.Parents))
+		}
+
+		// read traffic monitor cache statuses
+		caches, err := c.GetCacheStatuses()
+		if err != nil {
+			log.Errorln(err.Error())
+		}
+
+		for k, v := range caches {
+			hostName := string(k)
+			cs, ok := c.Parents[hostName]
+			if ok {
+				tmAvailable := *v.CombinedAvailable
+				if cs.available() != tmAvailable {
+					if !c.Cfg.EnableActiveMarkdowns {
+						if !tmAvailable {
+							log.Infof("TM reports that %s is not available and should be marked DOWN but, mark downs are disabled by configuration", hostName)
+						} else {
+							log.Infof("TM reports that %s is available and should be marked UP but, mark up is disabled by configuration", hostName)
+						}
+					} else {
+						if err = c.markParent(cs.Fqdn, *v.Status, tmAvailable); err != nil {
+							log.Errorln(err.Error())
+						}
+					}
+				}
+			}
+		}
+
+		time.Sleep(pollingInterval)
+	}
+}
+
+// Used by the polling function to update the parents list from
+// changes to parent.config and strategies.yaml.  The parents'
+// availability is also updated to reflect the current state from
+// the trafficserver HostStatus subsystem.
+func (c *ParentInfo) UpdateParentInfo() error {
+	ptime, err := getFileModificationTime(c.ParentDotConfig.Filename)
+	if err != nil {
+		return errors.New("error reading parent.config: " + err.Error())
+	}
+	stime, err := getFileModificationTime(c.StrategiesDotYaml.Filename)
+	if err != nil {
+		return errors.New("error reading strategies.yaml: " + err.Error())
+	}
+	if c.ParentDotConfig.LastModifyTime < ptime {
+		// read parent.config
+		if err := c.readParentConfig(c.Parents); err != nil {
+			return errors.New("updating parent.config file: " + err.Error())
+		} else {
+			log.Infof("updated parents from new parent.config, total parents: %d\n", len(c.Parents))
+		}
+	}
+
+	if c.StrategiesDotYaml.LastModifyTime < stime {
+		// read strategies.yaml
+		if err := c.readStrategies(c.Parents); err != nil {
+			return errors.New("updating parent strategies.yaml file: " + err.Error())
+		} else {
+			log.Infof("updated parents from new strategies.yaml, total parents: %d\n", len(c.Parents))
+		}
+	}
+
+	// collect the trafficserver current host status
+	if err := c.readHostStatus(c.Parents); err != nil {
+		return errors.New("reading latest trafficserver host status: " + err.Error())
+	}
+
+	return nil
+}
+
+// choose an availabe trafficmonitor, returns an error if

Review comment:
       fixed







[GitHub] [trafficcontrol] rob05c commented on a change in pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097#discussion_r688633383



##########
File path: cache-config/tm-health-client/config/config.go
##########
@@ -0,0 +1,256 @@
+package config
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import (
+	"bufio"
+	"encoding/json"
+	"errors"
+	"io/ioutil"
+	"net/url"
+	"os"
+	"strings"
+	"time"
+
+	"github.com/apache/trafficcontrol/lib/go-log"
+	toclient "github.com/apache/trafficcontrol/traffic_ops/v3-client"
+	"github.com/pborman/getopt/v2"
+)
+
+var tmPollingInterval time.Duration
+var toRequestTimeout time.Duration
+
+const (
+	DefaultConfigFile             = "/etc/trafficcontrol-cache-config/tm-health-client.json"
+	DefaultLogDirectory           = "/var/log/trafficcontrol-cache-config"
+	DefaultLogFile                = "tm-health-client.log"
+	DefaultTrafficServerConfigDir = "/opt/trafficserver/etc/trafficserver"
+	DefaultTrafficServerBinDir    = "/opt/trafficserver/bin"
+)
+
+type Cfg struct {
+	CDNName                 string `json:"cdn-name"`
+	EnableActiveMarkdowns   bool   `json:"enable-active-markdowns"`
+	ReasonCode              string `json:"reason-code"`
+	TOCredentialFile        string `json:"to-credential-file"`
+	TORequestTimeOutSeconds string `json:"to-request-timeout-seconds"`
+	TOPass                  string
+	TOUrl                   string
+	TOUser                  string
+	TmPollIntervalSeconds   string          `json:"tm-poll-interval-seconds"`
+	TrafficServerConfigDir  string          `json:"trafficserver-config-dir"`
+	TrafficServerBinDir     string          `json:"trafficserver-bin-dir"`
+	TrafficMonitors         map[string]bool `json:"trafficmonitors,omitempty"`
+}
+
+type LogCfg struct {
+	LogLocationErr   string
+	LogLocationDebug string
+	LogLocationInfo  string
+	LogLocationWarn  string
+}
+
+func (lcfg LogCfg) ErrorLog() log.LogLocation   { return log.LogLocation(lcfg.LogLocationErr) }
+func (lcfg LogCfg) WarningLog() log.LogLocation { return log.LogLocation(lcfg.LogLocationWarn) }
+func (lcfg LogCfg) InfoLog() log.LogLocation    { return log.LogLocation(lcfg.LogLocationInfo) }
+func (lcfg LogCfg) DebugLog() log.LogLocation   { return log.LogLocation(lcfg.LogLocationDebug) }
+func (lcfg LogCfg) EventLog() log.LogLocation   { return log.LogLocation(log.LogLocationNull) } // not used
+
+func readCredentials(cfg *Cfg) error {
+	fn := cfg.TOCredentialFile
+	f, err := os.Open(fn)
+
+	if err != nil {
+		return errors.New("failed to open " + fn + ": " + err.Error())
+	}
+	defer f.Close()
+
+	var to_pass_found = false
+	var to_url_found = false
+	var to_user_found = false
+
+	scanner := bufio.NewScanner(f)
+	for scanner.Scan() {
+		line := strings.TrimSpace(scanner.Text())
+		if strings.HasPrefix(line, "#") {
+			continue
+		}
+		fields := strings.Split(line, " ")
+		for _, v := range fields {
+			if strings.HasPrefix(v, "TO_") {
+				sf := strings.Split(v, "=")
+				if len(sf) == 2 {
+					if sf[0] == "TO_URL" {
+						// parse the url after trimming off any surrounding double quotes
+						cfg.TOUrl = strings.Trim(sf[1], "\"")
+						to_url_found = true
+					}
+					if sf[0] == "TO_USER" {
+						// set the TOUser after trimming off any surrounding quotes.
+						cfg.TOUser = strings.Trim(sf[1], "\"")
+						to_user_found = true
+					}
+					// set the TOPass after trimming off any surrounding quotes.
+					if sf[0] == "TO_PASS" {
+						cfg.TOPass = strings.Trim(sf[1], "\"")
+						to_pass_found = true
+					}
+				}
+			}
+		}
+	}
+	if !to_url_found && !to_user_found && !to_pass_found {
+		return errors.New("failed to retrieve one or more TrafficOps credentials")
+	}
+
+	return nil
+}
+
+func GetConfig() (Cfg, error, bool) {
+	var err error
+	var configFile string
+	var logLocationErr = log.LogLocationStderr
+	var logLocationDebug = log.LogLocationNull
+	var logLocationInfo = log.LogLocationNull
+	var logLocationWarn = log.LogLocationNull
+
+	configFilePtr := getopt.StringLong("config-file", 'f', DefaultConfigFile, "full path to the json config file")
+	logdirPtr := getopt.StringLong("logging-dir", 'l', DefaultLogDirectory, "directory location for log files")
+	helpPtr := getopt.BoolLong("help", 'h', "Print usage information and exit")
+	verbosePtr := getopt.CounterLong("verbose", 'v', `Log verbosity. Logging is output to stderr. By default, errors are logged. To log warnings, pass '-v'. To log info, pass '-vv', debug pass '-vvv'`)
+
+	getopt.Parse()
+
+	if configFilePtr != nil {
+		configFile = *configFilePtr
+	} else {
+		configFile = DefaultConfigFile
+	}
+
+	var logfile string
+
+	logfile = *logdirPtr + "/" + DefaultLogFile

Review comment:
       Nitpick: better to use `filepath.Join` instead of `/`, to work on aberrant operating systems







[GitHub] [trafficcontrol] jrushford commented on a change in pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
jrushford commented on a change in pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097#discussion_r688757194



##########
File path: cache-config/tm-health-client/config/config.go
##########
@@ -0,0 +1,256 @@
+package config
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import (
+	"bufio"
+	"encoding/json"
+	"errors"
+	"io/ioutil"
+	"net/url"
+	"os"
+	"strings"
+	"time"
+
+	"github.com/apache/trafficcontrol/lib/go-log"
+	toclient "github.com/apache/trafficcontrol/traffic_ops/v3-client"
+	"github.com/pborman/getopt/v2"

Review comment:
       fixed







[GitHub] [trafficcontrol] rob05c commented on a change in pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097#discussion_r688664593



##########
File path: cache-config/tm-health-client/tmutil/tmutil.go
##########
@@ -0,0 +1,674 @@
+package tmutil
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import (
+	"bufio"
+	"bytes"
+	"errors"
+	"io/ioutil"
+	"math/rand"
+	"net"
+	"os"
+	"os/exec"
+	"strings"
+	"time"
+
+	"github.com/apache/trafficcontrol/cache-config/tm-health-client/config"
+	"github.com/apache/trafficcontrol/lib/go-log"
+	"github.com/apache/trafficcontrol/lib/go-tc"
+	"github.com/apache/trafficcontrol/traffic_monitor/datareq"
+	"github.com/apache/trafficcontrol/traffic_monitor/tmclient"
+	"gopkg.in/yaml.v2"
+)
+
+const serverRequest = "https://tp.cdn.comcast.net/api/3.0/servers?type=RASCAL"
+
+type ParentAvailable interface {
+	available() bool
+}
+
+// the necessary fields of a trafficserver parents config file needed to
+// read the file and keep track of its modification time.
+type ParentConfigFile struct {
+	Filename       string
+	LastModifyTime int64
+}
+
+// the necessary data required to keep track of trafficserver config
+// files, lists of parents a trafficserver instance uses, and directory
+// locations used for configuration and trafficserver executables.
+type ParentInfo struct {
+	ParentDotConfig        ParentConfigFile
+	StrategiesDotYaml      ParentConfigFile
+	TrafficServerBinDir    string
+	TrafficServerConfigDir string
+	Parents                map[string]ParentStatus
+	Cfg                    config.Cfg
+}
+
+// when reading the strategies.yaml, these fields are used to help
+// parse out failover objects.
+type FailOver struct {
+	MaxSimpleRetries      int      `yaml:"max_simple_retries,omitempty"`
+	MaxUnavailableRetries int      `yaml:"max_unavailable_retries,omitempty"`
+	RingMode              string   `yaml:"ring_mode,omitempty"`
+	ResponseCodes         []int    `yaml:"response_codes,omitempty"`
+	MarkDownCodes         []int    `yaml:"markdown_codes,omitempty"`
+	HealthCheck           []string `yaml:"health_check,omitempty"`
+}
+
+// the trafficserver 'HostStatus' fields that are necessary to interface
+// with the trafficserver 'traffic_ctl' command.
+type ParentStatus struct {
+	Fqdn         string
+	ActiveReason bool
+	LocalReason  bool
+	ManualReason bool
+}
+
+// used to get the overall parent availability from the
+// HostStatus markdown reasons.  all markdown reasons
+// must be true for a parent to be considered available.
+func (p ParentStatus) available() bool {
+	return p.ActiveReason && p.LocalReason && p.ManualReason
+}
+
+// used to log that a parent's status is either UP or
+// DOWN based upon the HostStatus reason codes.  to
+// be considered UP, all reason codes must be 'true'.
+func (p ParentStatus) Status() string {
+	if p.available() {
+		return "UP"
+	}
+	return "DOWN"
+}
+
+type StatusReason int
+
+// these are the HostStatus reason codes used within
+// trafficserver.
+const (
+	ACTIVE StatusReason = iota
+	LOCAL
+	MANUAL
+)
+
+// used for logging a parent's HostStatus reason code
+// setting.
+func (s StatusReason) String() string {
+	switch s {
+	case ACTIVE:
+		return "ACTIVE"
+	case LOCAL:
+		return "LOCAL"
+	case MANUAL:
+		return "MANUAL"
+	}
+	return "UNDEFINED"
+}
+
+// the fields used from strategies.yaml that describe
+// a parent.
+type Host struct {
+	HostName  string     `yaml:"host"`
+	Protocols []Protocol `yaml:"protocol"`
+}
+
+// the protocol object in strategies.yaml that helps to
+// describe a parent.
+type Protocol struct {
+	Scheme           string  `yaml:"scheme"`
+	Port             int     `yaml:"port"`
+	Health_check_url string  `yaml:"health_check_url,omitempty"`
+	Weight           float64 `yaml:"weight,omitempty"`
+}
+
+// a trafficserver strategy object from strategies.yaml
+type Strategy struct {
+	Strategy        string   `yaml:"strategy"`
+	Policy          string   `yaml:"policy"`
+	HashKey         string   `yaml:"hash_key,omitempty"`
+	GoDirect        bool     `yaml:"go_direct,omitempty"`
+	ParentIsProxy   bool     `yaml:"parent_is_proxy,omitempty"`
+	CachePeerResult bool     `yaml:"cache_peer_result,omitempty"`
+	Scheme          string   `yaml:"scheme"`
+	FailOvers       FailOver `yaml:"failover,omitempty"`
+}
+
+// the top level array definitions in a trafficserver strategies.yaml
+// configuration file.
+type Strategies struct {
+	Strategy []Strategy    `yaml:"strategies"`
+	Hosts    []Host        `yaml:"hosts"`
+	Groups   []interface{} `yaml:"groups"`
+}
+
+// used at startup to load a trafficserver's list of parents from
+// its 'parent.config' and 'strategies.yaml', and the current parent
+// status from trafficserver's HostStatus subsystem.
+func NewParentInfo(cfg config.Cfg) (*ParentInfo, error) {
+
+	parentConfig := cfg.TrafficServerConfigDir + "/parent.config"
+	modTime, err := getFileModificationTime(parentConfig)
+	if err != nil {
+		return nil, errors.New("error reading parent.config: " + err.Error())
+	}
+	parents := ParentConfigFile{
+		Filename:       parentConfig,
+		LastModifyTime: modTime,
+	}
+
+	stratyaml := cfg.TrafficServerConfigDir + "/strategies.yaml"
+	modTime, err = getFileModificationTime(stratyaml)
+	if err != nil {
+		return nil, errors.New("error reading strategies.yaml: " + err.Error())
+	}
+
+	strategies := ParentConfigFile{
+		Filename:       stratyaml,
+		LastModifyTime: modTime,
+	}
+
+	parentInfo := ParentInfo{
+		ParentDotConfig:        parents,
+		StrategiesDotYaml:      strategies,
+		TrafficServerBinDir:    cfg.TrafficServerBinDir,
+		TrafficServerConfigDir: cfg.TrafficServerConfigDir,
+	}
+
+	// initialize the trafficserver parents map
+	parentStatus := make(map[string]ParentStatus)
+
+	// read parent.config
+	if err := parentInfo.readParentConfig(parentStatus); err != nil {
+		return nil, errors.New("loading parent.config file: " + err.Error())
+	}
+
+	// read strategies.yaml
+	if err := parentInfo.readStrategies(parentStatus); err != nil {
+		return nil, errors.New("loading parent strategies.yaml file: " + err.Error())
+	}
+
+	// collect the trafficserver parent status from the HostStatus subsystem
+	if err := parentInfo.readHostStatus(parentStatus); err != nil {
+		return nil, errors.New("reading trafficserver host status: " + err.Error())
+	}
+
+	log.Infof("startup loaded %d parent records\n", len(parentStatus))
+
+	parentInfo.Parents = parentStatus
+	parentInfo.Cfg = cfg
+
+	return &parentInfo, nil
+}
+
+// Queries a traffic monitor that is monitoring the trafficserver instance running on a host to
+// obtain the availability (health) of the parents used by trafficserver.
+func (c *ParentInfo) GetCacheStatuses() (map[tc.CacheName]datareq.CacheStatus, error) {
+
+	tmHostName, err := c.findATrafficMonitor()
+	if err != nil {
+		return nil, errors.New("finding a trafficmonitor: " + err.Error())
+	}
+	tmc := tmclient.New("http://"+tmHostName, config.GetRequestTimeout())
+
+	return tmc.CacheStatuses()
+}
+
+// The main polling function that keeps the parents list current with
+// any changes to the trafficserver parent.config or strategies.yaml.
+// It also keeps parent status current with the trafficserver HostStatus
+// subsystem.  Finally, on each poll cycle a trafficmonitor is queried to check
+// that all parents used by this trafficserver are available for use, based upon
+// the trafficmonitor's view from its health protocol.  Parents are marked up or
+// down in the trafficserver subsystem based upon that host's current status and
+// the status that the trafficmonitor health protocol has determined for a parent.
+func (c *ParentInfo) PollAndUpdateCacheStatus() {
+	pollingInterval := config.GetTMPollingInterval()
+	log.Infoln("polling started")
+
+	for {
+		if err := c.UpdateParentInfo(); err != nil {
+			log.Errorf("could not load new ATS parent info: %s\n", err.Error())
+		} else {
+			log.Debugf("updated parent info, total number of parents: %d\n", len(c.Parents))
+		}
+
+		// read traffic monitor cache statuses
+		caches, err := c.GetCacheStatuses()
+		if err != nil {
+			log.Errorln(err.Error())
+		}
+
+		for k, v := range caches {
+			hostName := string(k)
+			cs, ok := c.Parents[hostName]
+			if ok {
+				tmAvailable := *v.CombinedAvailable
+				if cs.available() != tmAvailable {
+					if !c.Cfg.EnableActiveMarkdowns {
+						if !tmAvailable {
+							log.Infof("TM reports that %s is not available and should be marked DOWN, but mark downs are disabled by configuration\n", hostName)
+						} else {
+							log.Infof("TM reports that %s is available and should be marked UP, but mark ups are disabled by configuration\n", hostName)
+						}
+					} else {
+						if err = c.markParent(cs.Fqdn, *v.Status, tmAvailable); err != nil {
+							log.Errorln(err.Error())
+						}
+					}
+				}
+			}
+		}
+
+		time.Sleep(pollingInterval)
+	}
+}
+
+// Used by the polling function to update the parents list from
+// changes to parent.config and strategies.yaml.  The parents'
+// availability is also updated to reflect the current state from
+// the trafficserver HostStatus subsystem.
+func (c *ParentInfo) UpdateParentInfo() error {
+	ptime, err := getFileModificationTime(c.ParentDotConfig.Filename)
+	if err != nil {
+		return errors.New("error reading parent.config: " + err.Error())
+	}
+	stime, err := getFileModificationTime(c.StrategiesDotYaml.Filename)
+	if err != nil {
+		return errors.New("error reading strategies.yaml: " + err.Error())
+	}
+	if c.ParentDotConfig.LastModifyTime < ptime {
+		// read parent.config
+		if err := c.readParentConfig(c.Parents); err != nil {
+			return errors.New("updating parent.config file: " + err.Error())
+		} else {
+			log.Infof("updated parents from new parent.config, total parents: %d\n", len(c.Parents))
+		}
+	}
+
+	if c.StrategiesDotYaml.LastModifyTime < stime {
+		// read strategies.yaml
+		if err := c.readStrategies(c.Parents); err != nil {
+			return errors.New("updating parent strategies.yaml file: " + err.Error())
+		} else {
+			log.Infof("updated parents from new strategies.yaml, total parents: %d\n", len(c.Parents))
+		}
+	}
+
+	// collect the trafficserver current host status
+	if err := c.readHostStatus(c.Parents); err != nil {
+		return errors.New("reading latest trafficserver host status: " + err.Error())
+	}
+
+	return nil
+}
+
+// choose an available trafficmonitor, returns an error if
+// there are none

Review comment:
       Nitpick: needs a period, GoDoc comments should be complete sentences/paragraphs.
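
       For illustration, a sketch of a doc comment written as complete sentences that start with the identifier's name. The function `chooseMonitor` and its logic are hypothetical; the body of the PR's own function is not shown in this thread:

```go
package main

import (
	"errors"
	"fmt"
)

// chooseMonitor returns the first non-empty host name in candidates.
// It returns an error if there are none.
func chooseMonitor(candidates []string) (string, error) {
	for _, host := range candidates {
		if host != "" {
			return host, nil
		}
	}
	return "", errors.New("no traffic monitor available")
}

func main() {
	// tm1.example.net is a placeholder host name.
	host, err := chooseMonitor([]string{"", "tm1.example.net"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(host)
}
```

       Starting the comment with the function name and ending each sentence with a period lets `go doc` render it as readable prose.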







[GitHub] [trafficcontrol] jrushford commented on a change in pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
jrushford commented on a change in pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097#discussion_r688758309



##########
File path: cache-config/tm-health-client/tmutil/tmutil.go
##########
@@ -0,0 +1,674 @@
+// choose an available trafficmonitor, returns an error if
+// there are none

Review comment:
       done







[GitHub] [trafficcontrol] rob05c merged pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
rob05c merged pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097


   





[GitHub] [trafficcontrol] rob05c commented on a change in pull request #6097: Adds the Traffic Monitor health client.

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #6097:
URL: https://github.com/apache/trafficcontrol/pull/6097#discussion_r688639886



##########
File path: cache-config/tm-health-client/tm-health-client.json
##########
@@ -0,0 +1,11 @@
+{
+  "cdn-name": "over-the-top",

Review comment:
       Nitpick: our Comcast CDN names aren't really secret, but it's probably better to use something more generic here, like "my-cdn".
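
       A small sketch of what a generic sample value looks like when loaded, assuming a struct named `sampleConfig` (hypothetical) that models only the `cdn-name` key visible in the diff above; the other keys in the file are not shown in this thread and are left out:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sampleConfig is a hypothetical type covering only the "cdn-name" key;
// it is not the PR's config struct.
type sampleConfig struct {
	CDNName string `json:"cdn-name"`
}

// parseSample decodes a JSON document into a sampleConfig.
func parseSample(data []byte) (sampleConfig, error) {
	var cfg sampleConfig
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

func main() {
	// "my-cdn" is the placeholder value suggested in the review comment.
	cfg, err := parseSample([]byte(`{"cdn-name": "my-cdn"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.CDNName)
}
```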



