You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafficcontrol.apache.org by GitBox <gi...@apache.org> on 2020/04/27 17:14:14 UTC

[GitHub] [trafficcontrol] srijeet0406 opened a new pull request #4665: Support IMS header in Get endpoints

srijeet0406 opened a new pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665


   <!--
   ************ STOP!! ************
   If this Pull Request is intended to fix a security vulnerability, DO NOT submit it! Instead, contact
   the Apache Software Foundation Security Team at security@trafficcontrol.apache.org and follow the
   guidelines at https://www.apache.org/security/ regarding vulnerability disclosure.
   -->
   ## What does this PR (Pull Request) do?
   <!-- Explain the changes you made here. If this fixes an Issue, identify it by
   replacing the text in the checkbox item with the Issue number e.g.
   
   - [x] This PR fixes #9001 OR is not related to any Issue
   
   ^ This will automatically close Issue number 9001 when the Pull Request is
   merged (The '#' is important).
   
   Be sure you check the box properly, see the "The following criteria are ALL
   met by this PR" section for details.
   -->
   
   - [ ] This PR fixes #REPLACE_ME OR is not related to any Issue <!-- You can check for an issue here: https://github.com/apache/trafficcontrol/issues -->
   
   
   ## Which Traffic Control components are affected by this PR?
   <!-- Please delete all components from this list that are NOT affected by this
   Pull Request. Also, feel free to add the name of a tool or script that is
   affected but not on the list.
   
   Additionally, if this Pull Request does NOT affect documentation, please
   explain why documentation is not required. -->
   
   - CDN in a Box
   - Documentation
   - Grove
   - Traffic Control Client <!-- Please specify which; e.g. 'Python', 'Go', 'Java' -->
   - Traffic Monitor
   - Traffic Ops
   - Traffic Ops ORT
   - Traffic Portal
   - Traffic Router
   - Traffic Stats
   - Traffic Vault
   
   ## What is the best way to verify this PR?
   <!-- Please include here ALL the steps necessary to test your Pull Request. If
   it includes tests (and most should), outline here the steps needed to run the
   tests. If not, lay out the manual testing procedure and please explain why
   tests are unnecessary for this Pull Request. -->
   
   ## If this is a bug fix, what versions of Traffic Control are affected?
   <!-- If this PR fixes a bug, please list here all of the affected versions - to
   the best of your knowledge. It's also pretty helpful to include a commit hash
   of where 'master' is at the time this PR is opened (if it affects master),
   because what 'master' means will change over time. For example, if this PR
   fixes a bug that's present in master (at commit hash '1df853c8'), in v4.0.0,
   and in the current 4.0.1 Release candidate (e.g. RC1), then this list would
   look like:
   
   - master (1df853c8)
   - 4.0.0
   - 4.0.1 (RC1)
   
   If you don't know what other versions might have this bug, AND don't know how
   to find the commit hash of 'master', then feel free to leave this section
   blank (or, preferably, delete it entirely).
    -->
   
   
   ## The following criteria are ALL met by this PR
   <!-- Check the boxes to signify that the associated statement is true. To
   "check a box", replace the space inside of the square brackets with an 'x'.
   e.g.
   
   - [ x] <- Wrong
   - [x ] <- Wrong
   - [] <- Wrong
   - [*] <- Wrong
   - [x] <- Correct!
   
   -->
   
   - [ ] This PR includes tests OR I have explained why tests are unnecessary
   - [ ] This PR includes documentation OR I have explained why documentation is unnecessary
   - [ ] This PR includes an update to CHANGELOG.md OR such an update is not necessary
   - [ ] This PR includes any and all required license headers
   - [ ] This PR ensures that database migration sequence is correct OR this PR does not include a database migration
   - [ ] This PR **DOES NOT FIX A SERIOUS SECURITY VULNERABILITY** (see [the Apache Software Foundation's security guidelines](https://www.apache.org/security/) for details)
   
   
   ## Additional Information
   <!-- If you would like to include any additional information on the PR for
   potential reviewers please put it here.
   
   Some examples of this would be:
   
   - Before and after screenshots/gifs of the Traffic Portal if it is affected
   - Links to other dependent Pull Requests
   - References to relevant context (e.g. new/updates to dependent libraries,
   mailing list records, blueprints)
   
   Feel free to leave this section blank (or, preferably, delete it entirely).
   -->
   
   <!--
   Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements.  See the NOTICE file
   distributed with this work for additional information
   regarding copyright ownership.  The ASF licenses this file
   to you under the Apache License, Version 2.0 (the
   "License"); you may not use this file except in compliance
   with the License.  You may obtain a copy of the License at
   
       http://www.apache.org/licenses/LICENSE-2.0
   
   Unless required by applicable law or agreed to in writing,
   software distributed under the License is distributed on an
   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   KIND, either express or implied.  See the License for the
   specific language governing permissions and limitations
   under the License.
   -->
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r428243652



##########
File path: traffic_ops/app/db/migrations/20200430000000_add_on_delete_trigger.sql
##########
@@ -0,0 +1,419 @@
+-- +goose Up

Review comment:
       The new tables (ip_address and interface) do not have `lastUpdated` fields so I dont think we'll be needing entries for those tables. But I agree about the name of the migration, I have changed that.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#issuecomment-622441217


   @rawlinp @rob05c I'm close to being done with the logging and indexing changes. I'll also add the `Last-Modified` header to the responses and update the PR. I believe there's another ticket for the `ETags`, not fully sure.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r428774816



##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -118,28 +125,86 @@ func GenericCreateNameBasedID(val GenericCreator) (error, error, int) {
 	return nil, nil, http.StatusOK
 }
 
-func GenericRead(val GenericReader) ([]interface{}, error, error, int) {
+func MakeFirstQuery(val GenericReader, h map[string][]string, where string, orderBy string, pagination string, queryValues map[string]interface{}) (bool, time.Time) {
+	var max time.Time
+	ims := []string{}
+	runSecond := true
+	if h == nil {
+		return runSecond, max
+	}
+	ims = h[rfc.IfModifiedSince]
+	if len(ims) == 0 {
+		return runSecond, max
+	}
+	if l, ok := web.ParseHTTPDate(ims[0]); !ok {
+		log.Warnf("Date not parsable %v", ims[0])
+		return runSecond, max
+	} else {

Review comment:
       Ah, yeah, but `l` doesn't _have_ to be local to the if/else block scope. Also, what about naming it `imsDate`? `l` doesn't really convey what it is and makes `if l.After(v.LatestTime.Time) {` harder to understand.

##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -118,28 +125,86 @@ func GenericCreateNameBasedID(val GenericCreator) (error, error, int) {
 	return nil, nil, http.StatusOK
 }
 
-func GenericRead(val GenericReader) ([]interface{}, error, error, int) {
+func MakeFirstQuery(val GenericReader, h map[string][]string, where string, orderBy string, pagination string, queryValues map[string]interface{}) (bool, time.Time) {
+	var max time.Time
+	ims := []string{}
+	runSecond := true
+	if h == nil {
+		return runSecond, max
+	}
+	ims = h[rfc.IfModifiedSince]
+	if len(ims) == 0 {
+		return runSecond, max
+	}
+	if l, ok := web.ParseHTTPDate(ims[0]); !ok {
+		log.Warnf("Date not parsable %v", ims[0])
+		return runSecond, max
+	} else {
+		query := val.SelectMaxLastUpdatedQuery(where, orderBy, pagination, val.GetType())
+		rows, err := val.APIInfo().Tx.NamedQuery(query, queryValues)
+		defer rows.Close()
+		if err != nil {
+			log.Warnf("Couldn't get the max last updated time: %v", err)
+			return runSecond, max
+		}
+		if err == sql.ErrNoRows {
+			runSecond = false

Review comment:
       I find that unnecessary state changes make the code harder to follow -- since it is primarily just the return value, there's no real need for it to change state -- they could be `const` too.
   
   `return dontRunSecond, max` and `return runSecond, max` makes it more clear IMO, but if you don't want to change it that's fine too. Just a suggestion.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r416305815



##########
File path: traffic_ops/traffic_ops_golang/types/types.go
##########
@@ -158,6 +174,34 @@ func (tp *TOType) loadUseInTable() (error, error, string) {
 	return nil, nil, useInTable
 }
 
+func selectMaxLastUpdatedQuery(where, orderBy, pagination, where2, orderBy2, pagination2 string) string {

Review comment:
       The reason I did that is because of the way the "ParamColumns" method works. Its creating an alias for the main table and then passing that along to the dbHelpers function to create the "where" clause. If we have a deleted table, we would need to query against that table (with its own alias as well). For ex, this:
   
   `SELECT max(t) from (
   		SELECT max(last_updated) as t from type typ WHERE typ.id=12345
   		UNION ALL
   	SELECT max(last_updated) as t from deleted_type dtyp WHERE dtyp.id=12345
   		) as res`
   This is why I was thinking of splitting out the two. The only way I could think of was by changing the "ParamColumns" method, but that seemed like a more involved change. Is there a better way of doing this? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r440382470



##########
File path: traffic_ops/client/cdn.go
##########
@@ -100,10 +100,33 @@ func (to *Session) GetCDNByID(id int) ([]tc.CDN, ReqInf, error) {
 	return data.Response, reqInf, nil
 }
 
+func (to *Session) GetCDNByNameIMS(name string, header http.Header) ([]tc.CDN, ReqInf, error) {
+	url := fmt.Sprintf("%s?name=%s", API_CDNS, url.QueryEscape(name))
+	resp, remoteAddr, err := to.requestIMS(http.MethodGet, url, header)
+	reqInf := ReqInf{CacheHitStatus: CacheHitStatusMiss, RemoteAddr: remoteAddr}
+	if resp != nil {
+		reqInf.StatusCode = resp.StatusCode
+	}
+	if resp.StatusCode == http.StatusNotModified {

Review comment:
       this should be moved into the previous block because `resp` might be nil here




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r447279704



##########
File path: traffic_ops/traffic_ops_golang/cachegroup/cachegroups.go
##########
@@ -444,7 +445,7 @@ func (cg *TOCacheGroup) Read(h http.Header, useIMS bool) ([]interface{}, error,
 	}
 
 	if useIMS {
-		runSecond, maxTime := ims.TryIfModifiedSinceQuery(cg.APIInfo().Tx, h, queryValues, selectMaxLastUpdatedQuery(where))
+		runSecond, maxTime = ims.TryIfModifiedSinceQuery(cg.APIInfo().Tx, h, queryValues, selectMaxLastUpdatedQuery(where))

Review comment:
       Correct, I set the "last-modified" in the response only if we get a valid IMS request header, and if the "use_ims" flag is set to true in the conf file.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r428155185



##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -118,28 +125,86 @@ func GenericCreateNameBasedID(val GenericCreator) (error, error, int) {
 	return nil, nil, http.StatusOK
 }
 
-func GenericRead(val GenericReader) ([]interface{}, error, error, int) {
+func MakeFirstQuery(val GenericReader, h map[string][]string, where string, orderBy string, pagination string, queryValues map[string]interface{}) (bool, time.Time) {
+	var max time.Time
+	ims := []string{}
+	runSecond := true
+	if h == nil {
+		return runSecond, max
+	}
+	ims = h[rfc.IfModifiedSince]
+	if len(ims) == 0 {
+		return runSecond, max
+	}
+	if l, ok := web.ParseHTTPDate(ims[0]); !ok {
+		log.Warnf("Date not parsable %v", ims[0])
+		return runSecond, max
+	} else {
+		query := val.SelectMaxLastUpdatedQuery(where, orderBy, pagination, val.GetType())
+		rows, err := val.APIInfo().Tx.NamedQuery(query, queryValues)
+		defer rows.Close()
+		if err != nil {
+			log.Warnf("Couldn't get the max last updated time: %v", err)
+			return runSecond, max
+		}
+		if err == sql.ErrNoRows {
+			runSecond = false

Review comment:
       I was just trying to minimize the number of variables that I use in the function. I always find it easier to read a piece of code if there is one variable tracking the state of the system, rather than multiple. Just a personal preference, but let me know if you'd like me to change it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r416305815



##########
File path: traffic_ops/traffic_ops_golang/types/types.go
##########
@@ -158,6 +174,34 @@ func (tp *TOType) loadUseInTable() (error, error, string) {
 	return nil, nil, useInTable
 }
 
+func selectMaxLastUpdatedQuery(where, orderBy, pagination, where2, orderBy2, pagination2 string) string {

Review comment:
       The reason I did that is because of the way the "ParamColumns" method works. Its creating an alias for the main table and then passing that along to the dbHelpers function to create the "where" clause. If we have a deleted table, we would need to query against that table (with its own alias as well). For ex, this:
   
   ```SELECT max(t) from (
   ```SELECT max(last_updated) as t from type typ WHERE typ.id=12345
   ```UNION ALL
   ```SELECT max(last_updated) as t from deleted_type dtyp WHERE dtyp.id=12345
   ```) as res
   
   This is why I was thinking of splitting out the two. The only way I could think of was by changing the "ParamColumns" method, but that seemed like a more involved change. Is there a better way of doing this? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] mitchell852 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
mitchell852 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r447141840



##########
File path: docs/source/admin/traffic_ops.rst
##########
@@ -443,6 +443,10 @@ This file deals with the configuration parameters of running Traffic Ops itself.
 		:disabled_routes: A list of API route IDs to disable. Requests matching these routes will receive a 503 response. To find the route ID for a given path you would like to disable, run ``./traffic_ops_golang`` using the :option:`--api-routes` option to view all the route information, including route IDs and paths.
 		:ignore_unknown_routes: If ``false`` (default) return an error and prevent startup if unknown route IDs are found. Otherwise, log a warning and continue startup.
 
+:use_ims:
+    .. versionadded:: 5.0
+    This is an optional boolean value to enable the handling of the "If-Modified-Since" HTTP request header. Default: false

Review comment:
       what do you think about adding
   
   ```"use_ims": false,```
   
   to cdn.conf? rather than excluding it?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r417559896



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       In your example, I think it would be fine if we returned the full response back because the reality is that things are not deleted very often, so I don't think the theoretical savings of those rare cases is worth the cost of dev/maintenance. IMO I think it would be better to keep the deletion case simple, not fully duplicate tables, and not fully duplicate queries. If anything was deleted from a table since time T, _any_ IMS requests with time < T that involve those tables could simply return the full response. Since deletion is so rare, the extra complexity doesn't seem worth the savings.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r417670476



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       >  Without it, it'll mean any modification to any profile will make IMS fail on ORT for every server.
   
   But we have a plan to address this in the near future (cache config snapshots), to where this would no longer matter for ORT anyways. So really, the IMS here is really for optimizing things like SREs doing things in Traffic Portal, etc, not for ORT (long-term).
   
   > it doesn't seem unreasonable, it just means adding a column has to add it in both tables
   
   That is where we get into trouble. We already have problems with keeping our nullable and non-nullable Go structs in sync with each other. Unless it's been hardcoded into your brain that you need to update things in all the various places that they're duplicated, you will forget to do so. Duplication is not ideal. It's also an additional, ongoing cost to developing new features, not just maintenance.
   
   >  Our data is tiny.
   
   I'm not really concerned about the size of the data, I'm more concerned about the _contents_ of the data. Data is a liability. We have things like passwords, physical addresses, IP addresses, etc. -- all things that don't really make me comfortable knowing that we'd be holding onto them indefinitely.
   
   > The savings are not theoretical
   
   By theoretical I meant in the sense of the _amount_ of savings. I'm not disagreeing that successful IMS requests are _actual_ savings.
   
   > and I don't believe the cases are rare.
   
   I guess that really depends on what we each think "rare" means. What do you consider "rare"? Once a day? A week? A month?
   
   For reference, in the last ninety days at my company, we only issued 104 `DELETE` TO API requests. Now, that isn't including requests like `POST /profileparameter` that actually delete things under the hood, but we had 12 of those `POST /profileparameter` requests in the last ninety days. So we're talking about slightly over one thing a day on average that could potentially invalidate cache for a certain table. Also, they appeared to mostly be DELETEs of servers and delivery services, which make sense, but it's important to also note that those are also the kinds of resources that are typically asked for in large quantities via GET requests.
   
   Think about opening either of those tables (servers and DSes) in TP -- by default TP requests _all_ of those, and those are easily the two slowest pages in TP. If there were _any_ servers or DSes deleted, it doesn't matter if we tracked every single one deleted or simply tracked that certain tables had been deleted from. That cache would be invalidated the same either way. Why not try to keep the implementation simple? 
   
   > The query is not "fully duplicated" in code, the query parameters are abstracted
   
   I was referring to the `insert into deleted_table` queries -- they contain a lot of duplicated code with their corresponding `insert into table` and `select from table` queries. We would need to be careful to share as much as possible between those three queries rather than having three separate, mostly-duplicated queries, to keep in sync.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rob05c commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r417540449



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       Right, we could only track IDs, but then any change would flag as modified if anything had been deleted, even if it wasn't in the filter. 
   
   E.g. if someone deleted a Parameter on Profile Foo, and someone requested `profiles/name/bar/parameters`, it would flag as modified, even though it wasn't really.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r417530126



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       I've been trying to follow along, but I might've missed something. Why do we need to copy all the existing tables to `deleted_foo`? At a high level, don't we only care about the answer to the question "have any foos been updated or deleted since this point in time?" 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] ocket8888 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
ocket8888 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r420269096



##########
File path: traffic_ops/traffic_ops_golang/traffic_ops_golang.go
##########
@@ -59,6 +59,7 @@ func main() {
 	configFileName := flag.String("cfg", "", "The config file path")
 	dbConfigFileName := flag.String("dbcfg", "", "The db config file path")
 	riakConfigFileName := flag.String("riakcfg", "", "The riak config file path")
+	useIMS := flag.Bool("use_ims", false, "Use If-Modified-Since requests")

Review comment:
       I'm good with that too.
   
   In that case it should be noted in the [TO configuration file section](https://traffic-control-cdn.readthedocs.io/en/latest/admin/traffic_ops.html#cdn-conf).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r420301477



##########
File path: traffic_ops/traffic_ops_golang/traffic_ops_golang.go
##########
@@ -59,6 +59,7 @@ func main() {
 	configFileName := flag.String("cfg", "", "The config file path")
 	dbConfigFileName := flag.String("dbcfg", "", "The db config file path")
 	riakConfigFileName := flag.String("riakcfg", "", "The riak config file path")
+	useIMS := flag.Bool("use_ims", false, "Use If-Modified-Since requests")

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r417533955



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       We need to keep track of the deleted components because, if we just compare the "last_updated" times of the items present in the main tables with what we are supplying in the IMS header, if something got deleted at a time after the IMS time in the request, we would not take that into account and return 304 (which is incorrect, since the endpoint has changed since the IMS time, i.e., one of the objects got deleted).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rob05c commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r416000982



##########
File path: lib/go-rfc/http.go
##########
@@ -36,6 +36,7 @@ const (
 	ContentDisposition     = "Content-Disposition"      // RFC6266
 	ApplicationOctetStream = "application/octet-stream" // RFC2046§4.5.2
 	Vary                   = "Vary"                     // RFC7231§7.1.4
+	IfModifiedSince        = "If-Modified-Since"        // RFC1945

Review comment:
       RFC1945 is obsolete, should be RFC7232§3.3
   
   https://tools.ietf.org/html/rfc7232#section-3.3

##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,16 @@
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+-- deleted_type
+CREATE TABLE IF NOT EXISTS deleted_type (
+    id bigint NOT NULL,
+    name text,
+    description text,
+    use_in_table text,
+    last_updated timestamp with time zone DEFAULT now(),
+    deleted_time timestamp with time zone NOT NULL DEFAULT now()

Review comment:
       I think I suggested `deleted_time`, but thinking about it some more, deleted rows should be immutable, so `last_updated` and `deleted_time` will always be identical, right? And while `deleted_time` might be a little more intuitive, `last_updated` matches the convention in every other table, and will probably make it easier to re-use some existing query building stuff.
   
   What do you think about removing `deleted_time` and just using `last_updated` for everything?

##########
File path: traffic_ops/traffic_ops_golang/types/types.go
##########
@@ -158,6 +174,34 @@ func (tp *TOType) loadUseInTable() (error, error, string) {
 	return nil, nil, useInTable
 }
 
+func selectMaxLastUpdatedQuery(where, orderBy, pagination, where2, orderBy2, pagination2 string) string {

Review comment:
       Will `where2`, `orderBy2`, and `pagination2` ever be different? I can't think of a reason they would be. If not, should we just have `selectMaxLastUpdatedQuery(where, orderBy, pagination` and use them for both the `type` and `deleted_type` parts of the query?

##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -191,8 +271,28 @@ func GenericOptionsDelete(val GenericOptionsDeleter) (error, error, int) {
 	return nil, nil, http.StatusOK
 }
 
+func InsertInDeletedTable(val GenericDeleter, v interface{}) (error, error, int) {
+	err := val.InsertIntoDeletedQuery(v, val.APIInfo().Tx)
+	if err != nil {
+		return ParseDBError(err)
+	}
+	return nil, nil, http.StatusOK
+}
+
 // GenericDelete does a Delete (DELETE) for the given GenericDeleter object and type. This exists as a generic function, for the common use case of a simple delete with query parameters defined in the sqlx struct tags.
 func GenericDelete(val GenericDeleter) (error, error, int) {
+	var v interface{}
+	res, err := val.APIInfo().Tx.NamedQuery(val.SelectBeforeDeleteQuery(), val)

Review comment:
       Instead of selecting, deleting, then inserting the selected, can we just insert right here, and save a query? E.g. `insert into foo_deleted (id) values ($1)` ?

##########
File path: traffic_ops/traffic_ops_golang/types/types.go
##########
@@ -158,6 +174,34 @@ func (tp *TOType) loadUseInTable() (error, error, string) {
 	return nil, nil, useInTable
 }
 
+func selectMaxLastUpdatedQuery(where, orderBy, pagination, where2, orderBy2, pagination2 string) string {

Review comment:
       Also, because all tables need a `last_updated`, and all `deleted_` tables should also be identical, we should be able to use a func to build the query to reduce duplication, e.g. 
   ```
   func MakeLastUpdatedQuery(table string) string {
       return `
   SELECT MAX(t) from (
     SELECT MAX(last_updated) t FROM ` + table + ` ` + where + orderBy + pagination + `
     UNION ALL
     SELECT MAX(last_updated) t from deleted_` + table + ` ` + where + orderBy + pagination + `
   ) as res
   `
   }
   ```
   
   And then all endpoints can use that




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] mitchell852 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
mitchell852 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r447256758



##########
File path: traffic_ops/traffic_ops_golang/cachegroup/cachegroups.go
##########
@@ -444,7 +445,7 @@ func (cg *TOCacheGroup) Read(h http.Header, useIMS bool) ([]interface{}, error,
 	}
 
 	if useIMS {
-		runSecond, maxTime := ims.TryIfModifiedSinceQuery(cg.APIInfo().Tx, h, queryValues, selectMaxLastUpdatedQuery(where))
+		runSecond, maxTime = ims.TryIfModifiedSinceQuery(cg.APIInfo().Tx, h, queryValues, selectMaxLastUpdatedQuery(where))

Review comment:
       ok, just want to make sure that if you don't provide the following header:
   
   ```
   -H "If-Modified-Since: [date]"
   ```
   
   then you get back 
   
   ```
   < last-modified: Mon, 01 Jan 0001 00:00:00 UTC
   ```

##########
File path: traffic_ops/traffic_ops_golang/cachegroup/cachegroups.go
##########
@@ -444,7 +445,7 @@ func (cg *TOCacheGroup) Read(h http.Header, useIMS bool) ([]interface{}, error,
 	}
 
 	if useIMS {
-		runSecond, maxTime := ims.TryIfModifiedSinceQuery(cg.APIInfo().Tx, h, queryValues, selectMaxLastUpdatedQuery(where))
+		runSecond, maxTime = ims.TryIfModifiedSinceQuery(cg.APIInfo().Tx, h, queryValues, selectMaxLastUpdatedQuery(where))

Review comment:
       just want to make sure that's the expected result.

##########
File path: docs/source/admin/traffic_ops.rst
##########
@@ -443,6 +443,10 @@ This file deals with the configuration parameters of running Traffic Ops itself.
 		:disabled_routes: A list of API route IDs to disable. Requests matching these routes will receive a 503 response. To find the route ID for a given path you would like to disable, run ``./traffic_ops_golang`` using the :option:`--api-routes` option to view all the route information, including route IDs and paths.
 		:ignore_unknown_routes: If ``false`` (default) return an error and prevent startup if unknown route IDs are found. Otherwise, log a warning and continue startup.
 
+:use_ims:
+    .. versionadded:: 5.0
+    This is an optional boolean value to enable the handling of the "If-Modified-Since" HTTP request header. Default: false

Review comment:
       what do you think about adding
   
   ```"use_ims": false,```
   
   to cdn.conf?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rob05c commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r418391790



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       Ah, yes, if `last_updated` doesn't have an index, it definitely needs one for these queries. +1
   And +1 on logging




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] ocket8888 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
ocket8888 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r420182335



##########
File path: traffic_ops/traffic_ops_golang/traffic_ops_golang.go
##########
@@ -59,6 +59,7 @@ func main() {
 	configFileName := flag.String("cfg", "", "The config file path")
 	dbConfigFileName := flag.String("dbcfg", "", "The db config file path")
 	riakConfigFileName := flag.String("riakcfg", "", "The riak config file path")
+	useIMS := flag.Bool("use_ims", false, "Use If-Modified-Since requests")

Review comment:
       Can we make this `use-ims` instead? Hyphens are easier to type in command line flags.
   
   Also, the new option should be added to [the Traffic Ops usage section](https://traffic-control-cdn.readthedocs.io/en/latest/admin/traffic_ops.html#traffic-ops-golang) as well as to the usage example therein.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r420277292



##########
File path: traffic_ops/traffic_ops_golang/traffic_ops_golang.go
##########
@@ -59,6 +59,7 @@ func main() {
 	configFileName := flag.String("cfg", "", "The config file path")
 	dbConfigFileName := flag.String("dbcfg", "", "The db config file path")
 	riakConfigFileName := flag.String("riakcfg", "", "The riak config file path")
+	useIMS := flag.Bool("use_ims", false, "Use If-Modified-Since requests")

Review comment:
       I'll get that changed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rob05c commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r417562835



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       >I think there is a trade-off to be made here in terms of complexity vs performance gain. 
   
   I think it's a pretty big performance gain, and not too complex. Without it, it'll mean any modification to any profile will make IMS fail on ORT for every server. It's hard to know for sure without looking at detailed change data over a long period of time, but I suspect not handling params will put the IMS hit rate <10%, and with params will be >90%.
   
   > we have to keep the columns of both tables in sync forever
   
   We do. It's a little code, but it doesn't seem unreasonable, it just means adding a column has to add it in both tables.
   
   > not to mention dealing with the additional problems created by keeping deleted data forever.
   
   Personally, I don't believe this will be an issue. Our data is tiny. Speaking from experience, gigabytes of data really isn't big or hard to deal with, and our data is megabytes.
   
   But if it is an issue, it's pretty easy to truncate it, and always return Modified for an IMS older than that.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#issuecomment-622428488


   Just so we don't forget, we probably need to add the `Last-Modified` header to all responses in order for this to be fully supported, right? https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Last-Modified
   
   It seems like we need to set that to the latest `last_modified` in the response body.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rob05c commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r417552558



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       Suppose I make a request to `/api/2.0/deliveryservices?cdn=FOO` at 10:00am.
   Someone else then does a `DELETE /api/2.0/deliveryservices/cdn-bar-ds` at 10:05am.
   I then do an IMS request to `/api/2.0/deliveryservices?cdn=FOO` at 10:10am, with my `If-Modified-Since: 10:00am`.
   
   The `deleted_deliveryservices` table only has `id | last_updated`. It says "there has been a DS deleted since 10:00am, therefore I must send you the full request back.
   
   When in reality, no DS on CDN "FOO" has been deleted or modified, and it can safely return a 304 Not Modified. But there's no way for the code to determine that, without knowing the data queried on, the cdn id, of the deleted record.
   
   We don't actually need all data, just all columns that can be queried on. But that's probably most of them, it seems easier and safer to add them all.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r416305815



##########
File path: traffic_ops/traffic_ops_golang/types/types.go
##########
@@ -158,6 +174,34 @@ func (tp *TOType) loadUseInTable() (error, error, string) {
 	return nil, nil, useInTable
 }
 
+func selectMaxLastUpdatedQuery(where, orderBy, pagination, where2, orderBy2, pagination2 string) string {

Review comment:
       The reason I did that is because of the way the "ParamColumns" method works. Its creating an alias for the main table and then passing that along to the dbHelpers function to create the "where" clause. If we have a deleted table, we would need to query against that table (with its own alias as well). For ex, this:
   
   `SELECT max(t) from (
   SELECT max(last_updated) as t from type typ WHERE typ.id=12345
   UNION ALL
   SELECT max(last_updated) as t from deleted_type dtyp WHERE dtyp.id=12345
   ) as res`
   
   This is why I was thinking of splitting out the two. The only way I could think of was by changing the "ParamColumns" method, but that seemed like a more involved change. Is there a better way of doing this? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r416305815



##########
File path: traffic_ops/traffic_ops_golang/types/types.go
##########
@@ -158,6 +174,34 @@ func (tp *TOType) loadUseInTable() (error, error, string) {
 	return nil, nil, useInTable
 }
 
+func selectMaxLastUpdatedQuery(where, orderBy, pagination, where2, orderBy2, pagination2 string) string {

Review comment:
       The reason I did that is because of the way the "ParamColumns" method works. Its creating an alias for the main table and then passing that along to the dbHelpers function to create the "where" clause. If we have a deleted table, we would need to query against that table (with its own alias as well). For ex, this:
   
   ```SELECT max(t) from (
   		SELECT max(last_updated) as t from type typ WHERE typ.id=12345
   		UNION ALL
   	SELECT max(last_updated) as t from deleted_type dtyp WHERE dtyp.id=12345
   		) as res```
   This is why I was thinking of splitting out the two. The only way I could think of was by changing the "ParamColumns" method, but that seemed like a more involved change. Is there a better way of doing this? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r417547461



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       IMO, it would depend on what all the acceptable params for `DELETE` are. For ex, in some cases, we could be deleting by name, in others, by ID.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r418387622



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       Refactored the DELETE logic to use just one table with just the last timestamp of the deleted member of that particular "type".




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r428152445



##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -118,28 +125,86 @@ func GenericCreateNameBasedID(val GenericCreator) (error, error, int) {
 	return nil, nil, http.StatusOK
 }
 
-func GenericRead(val GenericReader) ([]interface{}, error, error, int) {
+func MakeFirstQuery(val GenericReader, h map[string][]string, where string, orderBy string, pagination string, queryValues map[string]interface{}) (bool, time.Time) {
+	var max time.Time
+	ims := []string{}
+	runSecond := true
+	if h == nil {
+		return runSecond, max
+	}
+	ims = h[rfc.IfModifiedSince]
+	if len(ims) == 0 {
+		return runSecond, max
+	}
+	if l, ok := web.ParseHTTPDate(ims[0]); !ok {
+		log.Warnf("Date not parsable %v", ims[0])
+		return runSecond, max
+	} else {

Review comment:
       Yeah, but the variable `l` is local to the if/else block scope, hence I had to go this route.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r428804097



##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -118,28 +125,86 @@ func GenericCreateNameBasedID(val GenericCreator) (error, error, int) {
 	return nil, nil, http.StatusOK
 }
 
-func GenericRead(val GenericReader) ([]interface{}, error, error, int) {
+func MakeFirstQuery(val GenericReader, h map[string][]string, where string, orderBy string, pagination string, queryValues map[string]interface{}) (bool, time.Time) {
+	var max time.Time
+	ims := []string{}
+	runSecond := true
+	if h == nil {
+		return runSecond, max
+	}
+	ims = h[rfc.IfModifiedSince]
+	if len(ims) == 0 {
+		return runSecond, max
+	}
+	if l, ok := web.ParseHTTPDate(ims[0]); !ok {
+		log.Warnf("Date not parsable %v", ims[0])
+		return runSecond, max
+	} else {
+		query := val.SelectMaxLastUpdatedQuery(where, orderBy, pagination, val.GetType())
+		rows, err := val.APIInfo().Tx.NamedQuery(query, queryValues)
+		defer rows.Close()
+		if err != nil {
+			log.Warnf("Couldn't get the max last updated time: %v", err)
+			return runSecond, max
+		}
+		if err == sql.ErrNoRows {
+			runSecond = false

Review comment:
       I changed it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rob05c commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r416319640



##########
File path: traffic_ops/traffic_ops_golang/types/types.go
##########
@@ -158,6 +174,34 @@ func (tp *TOType) loadUseInTable() (error, error, string) {
 	return nil, nil, useInTable
 }
 
+func selectMaxLastUpdatedQuery(where, orderBy, pagination, where2, orderBy2, pagination2 string) string {

Review comment:
       The `UNION ALL` queries can have the same alias, they're effectively separate queries. So
   ```
   `
   SELECT MAX(t) from (
     SELECT MAX(last_updated) as t FROM type typ ` + where + orderBy + pagination + `
     UNION ALL
     SELECT MAX(last_updated) as t FROM deleted_type typ ` + where + orderBy + pagination + `
   ) as res
   `
   ```
   Works fine 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] ocket8888 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
ocket8888 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r417825474



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       Looking around at the way other projects implement this, it looks like they typically do one of two things:
   
   - Implement a `last_deleted` timestamp table with one row for each table that just stores the timestamp of the last deletion. That returns payloads which invalidate unnecessarily sometimes but are still a useful improvement over nothing, or
   - Don't support IMS on collections of objects. Just use ETags for collections, and support IMS on single objects.
   
   Those both seem fine to me - and I didn't look too hard at the DB implementation tbh because I thought he wanted me specifically to look at the API code and it's still just a draft - but I share Rawlin's reservations about creating tables to shadow all data that ever existed in the database.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r416818942



##########
File path: traffic_ops/traffic_ops_golang/types/types.go
##########
@@ -158,6 +174,34 @@ func (tp *TOType) loadUseInTable() (error, error, string) {
 	return nil, nil, useInTable
 }
 
+func selectMaxLastUpdatedQuery(where, orderBy, pagination, where2, orderBy2, pagination2 string) string {

Review comment:
       👍 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r416101732



##########
File path: lib/go-rfc/http.go
##########
@@ -36,6 +36,7 @@ const (
 	ContentDisposition     = "Content-Disposition"      // RFC6266
 	ApplicationOctetStream = "application/octet-stream" // RFC2046§4.5.2
 	Vary                   = "Vary"                     // RFC7231§7.1.4
+	IfModifiedSince        = "If-Modified-Since"        // RFC1945

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rob05c commented on pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rob05c commented on pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#issuecomment-622430632


   >we probably need to add the Last-Modified header to all responses in order for this to be fully supported, right?
   We should. It isn't strictly necessary, clients can use the Date. But it's better, and there's no reason not to.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] mitchell852 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
mitchell852 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r447284555



##########
File path: traffic_ops/traffic_ops_golang/cachegroup/cachegroups.go
##########
@@ -444,7 +445,7 @@ func (cg *TOCacheGroup) Read(h http.Header, useIMS bool) ([]interface{}, error,
 	}
 
 	if useIMS {
-		runSecond, maxTime := ims.TryIfModifiedSinceQuery(cg.APIInfo().Tx, h, queryValues, selectMaxLastUpdatedQuery(where))
+		runSecond, maxTime = ims.TryIfModifiedSinceQuery(cg.APIInfo().Tx, h, queryValues, selectMaxLastUpdatedQuery(where))

Review comment:
       i guess i expected that the `last-modified` key would not even be in the response at all if IMS was not in the request header




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#issuecomment-622444171


   Since ETags are generally better, would it make sense to just implement them instead of IMS? Do we need both? Or do we just replace IMS support with ETags in the future? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r447282075



##########
File path: docs/source/admin/traffic_ops.rst
##########
@@ -443,6 +443,10 @@ This file deals with the configuration parameters of running Traffic Ops itself.
 		:disabled_routes: A list of API route IDs to disable. Requests matching these routes will receive a 503 response. To find the route ID for a given path you would like to disable, run ``./traffic_ops_golang`` using the :option:`--api-routes` option to view all the route information, including route IDs and paths.
 		:ignore_unknown_routes: If ``false`` (default) return an error and prevent startup if unknown route IDs are found. Otherwise, log a warning and continue startup.
 
+:use_ims:
+    .. versionadded:: 5.0
+    This is an optional boolean value to enable the handling of the "If-Modified-Since" HTTP request header. Default: false

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r418390844



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       >  I see solutions like denormalized outside-the-API JSON blob snapshots or intermediary proxies as hacks to work around a broken system, not the ideal.
   
   I wouldn't really call it a hack. I think it's actually a step in the right direction that will help us achieve much greater performance and efficiency in dishing out config data to thousands of caches at once. Since caches are by far the biggest client base of the TO API, it makes sense to optimize their requests as much as possible, and I think the cache-config snapshots idea is the simplest way to give us the most optimal solution for those requests. It can even have all those nice features -- perfect IMS, delta encoding, read-while-writer, 1s caching -- at a much cheaper price than implementing them all across the entire API since it's just a single endpoint.
   
   _Back to the PR at hand:_
   
   I've also been thinking, we really need to make sure the IMS DB queries are significantly cheaper compared to the non-IMS DB queries. Currently, I'm pretty sure that they will do a full sequential scan in order to find the most recent `last_updated` because it isn't indexed, and that isn't really much cheaper on the DB than sequentially scanning and returning all of the data. So IMS hits would save bandwidth, but not load on the DB, which I think may be more important. It also basically doubles the cost of an IMS miss, and we don't really know what our IMS hit percentage would be in practice.
   
   I was playing around with indexing `last_updated`, and this seems to make the `select max(last_updated)` queries much cheaper:
   ```
   create index deliveryservice_last_updated_idx on deliveryservice (last_updated DESC NULLS LAST);
   ```
   The `DESC` seems to be important because, in theory, the first timestamp in the index should be the max, so this should be a very cheap query compared to a sequential scan.
   
   There's also the chance that we're wrong about the DB performance in practice, if things are updated often enough to cause way more IMS misses than expected. In that case, it would be nice to have a feature flag to disable IMS handling. That also means we should probably log whether a request was IMS_HIT, IMS_MISS, NON_IMS, etc. Then we will be able to determine what our hit percentage is from the logs.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r447292089



##########
File path: traffic_ops/traffic_ops_golang/cachegroup/cachegroups.go
##########
@@ -444,7 +445,7 @@ func (cg *TOCacheGroup) Read(h http.Header, useIMS bool) ([]interface{}, error,
 	}
 
 	if useIMS {
-		runSecond, maxTime := ims.TryIfModifiedSinceQuery(cg.APIInfo().Tx, h, queryValues, selectMaxLastUpdatedQuery(where))
+		runSecond, maxTime = ims.TryIfModifiedSinceQuery(cg.APIInfo().Tx, h, queryValues, selectMaxLastUpdatedQuery(where))

Review comment:
       Well, I have the logic set currently so that if we get a valid (not nil) pointer in the `maxtime` variable, we set the `last-modified` value to that. So, if we get back an empty maxTime (which is the case if you dont supply an IMS header, or if the flag is set to false), we set the last-modified to `Mon, 01 Jan 0001 00:00:00 UTC`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r428244285



##########
File path: traffic_ops/app/db/migrations/20200430000000_add_on_delete_trigger.sql
##########
@@ -0,0 +1,419 @@
+-- +goose Up

Review comment:
       Oh nvm, I see the topology and other tables have the lastUpdated field, will update those.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r417549118



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       > Right, we could only track IDs, but then any change would flag as modified if anything had been deleted, even if it wasn't in the filter.
   
   I think there is a trade-off to be made here in terms of complexity vs performance gain. We can keep it really simple by ignoring the "filter" side of things, and have simple invalidations per table. Or, we can copy all of the existing data, which means we have to keep the columns of both tables in sync forever, not to mention dealing with the additional problems created by keeping deleted data forever.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rob05c commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r417691108



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       >we have a plan to address this in the near future (cache config snapshots)
   
   I see solutions like denormalized outside-the-API JSON blob snapshots or intermediary proxies as hacks to work around a broken system, not the ideal. If Traffic Ops were reasonably performant and implemented HTTP standards, and relatively simple performance optimizations like IMS, a 1s cache, and Read-While-Writer on concurrent requests, expensive and unsafe workarounds wouldn't be necessary. 
   
   There are scales where the cost of these things becomes necessary; <1 req/s on a <100MB database are several orders of magnitude below those levels.
   
   >in the last ninety days at my company, we only issued 104 DELETE
   
   Even one DELETE a day, if it's breaking an otherwise successful IMS, means increasing the TO load by thousands of times the necessary, for some brief period. If that change was one cache, it means the server has to handle full requests from every cache in the CDN checking in at once, instead of a single full request and 999 cheap IMS's.
   
   >We would need to be careful to share as much as possible between those three queries rather than having three separate, mostly-duplicated queries
   
   String building isn't hard. In fact, we already have columns duplicated in places, which should really concatenate strings to avoid duplicating all the column names.
   
   At any rate, I'm not going to -1. I'm not a primary developer on Traffic Ops anymore, and I really don't like telling other people how hard something should be for them. If decision is that it's too difficult and expensive to maintain real IMS, well, I'm not writing (much) code on the project anymore to say.
   
   But I will say, Traffic Ops is at the center of everything else in Traffic Control. If TO would do the right thing -- HTTP standards like IMS and Delta, Read-While-Writer, 1s caching, other small things to get decent performance -- these things would be a tiny, tiny fraction of the code TO has to maintain, and would make things better for every other component of Traffic Control, as well as all its operators and users.
   
   The cost of these small features in TO is less than cached blobs and slower configs, and far, far less than additional intermediary proxy service deployments.
   
   TO has had abysmal performance since the beginning. Golang is much better now, but the scalability of TO, how many requests it can handle, is still orders of magnitude lower than it should be.
   
   The things we have to do to work around poor TO performance -- proxies in front of TO, ORT dispersion and backoff and slower config deployment, TM slower health updating -- everything else in TC would be so, so much better if we were just willing to do the work to make TO performant. And the development and operational time of all those things far, far outweighs the time for TO.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] mitchell852 merged pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
mitchell852 merged pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r438943606



##########
File path: traffic_ops/traffic_ops_golang/cachegroup/cachegroups.go
##########
@@ -436,13 +438,24 @@ func (cg *TOCacheGroup) Read() ([]interface{}, error, error, int) {
 	}
 	where, orderBy, pagination, queryValues, errs := dbhelpers.BuildWhereAndOrderByAndPagination(cg.ReqInfo.Params, queryParamsToQueryCols)
 	if len(errs) > 0 {
-		return nil, util.JoinErrs(errs), nil, http.StatusBadRequest
+		return nil, util.JoinErrs(errs), nil, http.StatusBadRequest, nil
+	}
+
+	runSecond, maxTime := ims.TryIfModifiedSinceQuery(cg.APIInfo().Tx, h, queryValues, selectMaxLastUpdatedQuery(where, orderBy, pagination))
+	if useIMS {

Review comment:
       So it looks like you addressed this in generic_crud.go but we should also skip the `TryIfModifiedSinceQuery` if IMS is disabled. I noticed this here and in topologies.go, so I imagine there might be other cases to still address in other files as well.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rob05c commented on pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rob05c commented on pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#issuecomment-622432946


   >It seems like we need to set that to the latest last_modified in the response body.
   
   Also, IMS is second-precision: so for headers, as well as all the comparisons to determine if it's been modified, we need to round down to the previous second.
   
   This isn't a huge deal for users, but it is for scripts. If we don't round down, a fast script which updates-then-retrieves could easily not get the change it just made, and break.
   
   Also: when we implement ETags, this isn't an issue, they can have as much precision as we like. We should definitely add ETags and If-None-Match support ASAP, which should be easy with this framework. ETags are always better. They're not only more precise, but they can have IDs, or paths (e.g. to ensure someone doesn't erroneously send the etag from `/servers?name=foo` to `/servers?name=bar`), or any other data we need to put in there.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r438916430



##########
File path: traffic_ops/app/db/migrations/20200520000000_add_deleted_tables.sql
##########
@@ -0,0 +1,470 @@
+/*

Review comment:
       There are some new tables to account for here: https://github.com/apache/trafficcontrol/pull/4687/files

##########
File path: traffic_ops/traffic_ops_golang/deliveryservice/servers/servers.go
##########
@@ -114,11 +117,20 @@ func ReadDSSHandler(w http.ResponseWriter, r *http.Request) {
 
 	dss := TODeliveryServiceServer{}
 	dss.SetInfo(inf)
-	results, err := dss.readDSS(inf.Tx, inf.User, inf.Params, inf.IntParams, nil, nil)
+	results, err, maxTime := dss.readDSS(nil, inf.Tx, inf.User, inf.Params, inf.IntParams, nil, nil, false)
 	if err != nil {
 		api.HandleErr(w, r, inf.Tx.Tx, http.StatusInternalServerError, nil, err)
 		return
 	}
+	if maxTime != nil {

Review comment:
       There are conflicts in this file now due to the total reworking of the servers API here: https://github.com/apache/trafficcontrol/pull/4700/files




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r427680257



##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -118,28 +125,86 @@ func GenericCreateNameBasedID(val GenericCreator) (error, error, int) {
 	return nil, nil, http.StatusOK
 }
 
-func GenericRead(val GenericReader) ([]interface{}, error, error, int) {
+func MakeFirstQuery(val GenericReader, h map[string][]string, where string, orderBy string, pagination string, queryValues map[string]interface{}) (bool, time.Time) {
+	var max time.Time
+	ims := []string{}
+	runSecond := true
+	if h == nil {
+		return runSecond, max
+	}
+	ims = h[rfc.IfModifiedSince]
+	if len(ims) == 0 {
+		return runSecond, max
+	}
+	if l, ok := web.ParseHTTPDate(ims[0]); !ok {
+		log.Warnf("Date not parsable %v", ims[0])
+		return runSecond, max
+	} else {

Review comment:
       since the `if` block returns early, we could remove the `else` block here to reduce the indentation

##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,13 @@
+

Review comment:
       This migration should probably be in the same file as the other one since they both belong to the same feature.

##########
File path: traffic_ops/app/db/migrations/20200430000000_add_on_delete_trigger.sql
##########
@@ -0,0 +1,419 @@
+-- +goose Up

Review comment:
       All files need an Apache 2.0 license header unless the file type doesn't support it (in which case it would need an exception in `.dependency_license`)
   
   Also, this file name will need a date that is later than https://github.com/apache/trafficcontrol/blob/master/traffic_ops/app/db/migrations/20200507000000_interfaces.sql, and probably also needs updated to include the new tables and APIs that have been added since this PR was opened (all the new topologies tables, and the new tables introduced in that linked migration).

##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -118,28 +125,86 @@ func GenericCreateNameBasedID(val GenericCreator) (error, error, int) {
 	return nil, nil, http.StatusOK
 }
 
-func GenericRead(val GenericReader) ([]interface{}, error, error, int) {
+func MakeFirstQuery(val GenericReader, h map[string][]string, where string, orderBy string, pagination string, queryValues map[string]interface{}) (bool, time.Time) {
+	var max time.Time
+	ims := []string{}
+	runSecond := true
+	if h == nil {
+		return runSecond, max
+	}
+	ims = h[rfc.IfModifiedSince]
+	if len(ims) == 0 {
+		return runSecond, max
+	}
+	if l, ok := web.ParseHTTPDate(ims[0]); !ok {
+		log.Warnf("Date not parsable %v", ims[0])

Review comment:
       If `ims[0]` is a string, can we use `%s` instead? Also, if an SRE simply saw this warning in the log, I don't think they'd have enough information to know what it means. We should say something like `IMS request header date '%s' not parseable`.

##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -118,28 +125,86 @@ func GenericCreateNameBasedID(val GenericCreator) (error, error, int) {
 	return nil, nil, http.StatusOK
 }
 
-func GenericRead(val GenericReader) ([]interface{}, error, error, int) {
+func MakeFirstQuery(val GenericReader, h map[string][]string, where string, orderBy string, pagination string, queryValues map[string]interface{}) (bool, time.Time) {
+	var max time.Time
+	ims := []string{}
+	runSecond := true
+	if h == nil {
+		return runSecond, max
+	}
+	ims = h[rfc.IfModifiedSince]
+	if len(ims) == 0 {
+		return runSecond, max
+	}
+	if l, ok := web.ParseHTTPDate(ims[0]); !ok {
+		log.Warnf("Date not parsable %v", ims[0])
+		return runSecond, max
+	} else {
+		query := val.SelectMaxLastUpdatedQuery(where, orderBy, pagination, val.GetType())
+		rows, err := val.APIInfo().Tx.NamedQuery(query, queryValues)
+		defer rows.Close()
+		if err != nil {
+			log.Warnf("Couldn't get the max last updated time: %v", err)
+			return runSecond, max
+		}
+		if err == sql.ErrNoRows {
+			runSecond = false

Review comment:
       instead of setting `runSecond = false` before returning it, would it be more clear to create a 2nd variable at the top - `dontRunSecond = false` - and return that instead? 

##########
File path: docs/source/admin/traffic_ops.rst
##########
@@ -441,6 +441,10 @@ This file deals with the configuration parameters of running Traffic Ops itself.
 		:disabled_routes: A list of API route IDs to disable. Requests matching these routes will receive a 503 response. To find the route ID for a given path you would like to disable, run ``./traffic_ops_golang`` using the :option:`--api-routes` option to view all the route information, including route IDs and paths.
 		:ignore_unknown_routes: If ``false`` (default) return an error and prevent startup if unknown route IDs are found. Otherwise, log a warning and continue startup.
 
+:use_ims:
+    .. versionadded:: 4.1
+    This is an optional boolean value to set or unset the use of "If-Modified-Since" header in the http requests. Default: false

Review comment:
       I think this would be more clearly described as `This is an optional boolean value to enable the handling of the "If-Modified-Since" HTTP request header`

##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -20,9 +20,15 @@ package api
  */
 
 import (
+	"database/sql"
 	"errors"
 	"fmt"
+	"github.com/apache/trafficcontrol/grove/web"
+	"github.com/apache/trafficcontrol/lib/go-log"
+	"github.com/apache/trafficcontrol/lib/go-rfc"
+	ims2 "github.com/apache/trafficcontrol/traffic_ops/traffic_ops_golang/util/ims"

Review comment:
       Rather than renaming the import we should rename the `ims` variable below to something like `imsDate`

##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,13 @@
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+-- last_deleted
+CREATE TABLE IF NOT EXISTS last_deleted (
+  tab_name text NOT NULL,

Review comment:
       Mind renaming this to `table_name`? Also, it should probably be made the primary key.

##########
File path: docs/source/admin/traffic_ops.rst
##########
@@ -441,6 +441,10 @@ This file deals with the configuration parameters of running Traffic Ops itself.
 		:disabled_routes: A list of API route IDs to disable. Requests matching these routes will receive a 503 response. To find the route ID for a given path you would like to disable, run ``./traffic_ops_golang`` using the :option:`--api-routes` option to view all the route information, including route IDs and paths.
 		:ignore_unknown_routes: If ``false`` (default) return an error and prevent startup if unknown route IDs are found. Otherwise, log a warning and continue startup.
 
+:use_ims:
+    .. versionadded:: 4.1

Review comment:
       this would be added in 5.0 at this point since the 4.1.x branch is already cut

##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -118,28 +125,86 @@ func GenericCreateNameBasedID(val GenericCreator) (error, error, int) {
 	return nil, nil, http.StatusOK
 }
 
-func GenericRead(val GenericReader) ([]interface{}, error, error, int) {
+func MakeFirstQuery(val GenericReader, h map[string][]string, where string, orderBy string, pagination string, queryValues map[string]interface{}) (bool, time.Time) {

Review comment:
       `MakeFirstQuery` is not very clear as to the intention of this function. I seems to check if the IMS header is valid, and if so, run the query for selecting the max last_updated field for the request. Should we rename it `TryIfModifiedSinceQuery`? This could use a godoc comment as well describing what it does and what the return values mean.
   
   Also, `h map[string][]string` is not clear that it is actually the HTTP headers. Why not keep it as `h http.Header`?

##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -118,28 +125,86 @@ func GenericCreateNameBasedID(val GenericCreator) (error, error, int) {
 	return nil, nil, http.StatusOK
 }
 
-func GenericRead(val GenericReader) ([]interface{}, error, error, int) {
+func MakeFirstQuery(val GenericReader, h map[string][]string, where string, orderBy string, pagination string, queryValues map[string]interface{}) (bool, time.Time) {
+	var max time.Time
+	ims := []string{}
+	runSecond := true
+	if h == nil {
+		return runSecond, max
+	}
+	ims = h[rfc.IfModifiedSince]
+	if len(ims) == 0 {
+		return runSecond, max
+	}
+	if l, ok := web.ParseHTTPDate(ims[0]); !ok {
+		log.Warnf("Date not parsable %v", ims[0])
+		return runSecond, max
+	} else {
+		query := val.SelectMaxLastUpdatedQuery(where, orderBy, pagination, val.GetType())
+		rows, err := val.APIInfo().Tx.NamedQuery(query, queryValues)
+		defer rows.Close()
+		if err != nil {
+			log.Warnf("Couldn't get the max last updated time: %v", err)
+			return runSecond, max
+		}
+		if err == sql.ErrNoRows {
+			runSecond = false
+			return runSecond, max
+		}
+		// This should only ever contain one row
+		if rows.Next() {
+			v := &ims2.LatestTimestamp{}
+			if err = rows.StructScan(v); err != nil || v == nil {
+				log.Warnf("Failed to parse the max time stamp into a struct %v", err)
+				return runSecond, max
+			}
+			max = v.LatestTime.Time
+			// The request IMS time is later than the max of (lastUpdated, deleted_time)
+			if l.After(v.LatestTime.Time) {
+				runSecond = false
+				return runSecond, max
+			}
+		} else {
+			runSecond = false
+		}
+	}
+	return runSecond, max
+}
+
+func GenericRead(h http.Header, val GenericReader, useIMS bool) ([]interface{}, error, error, int, *time.Time) {
+	vals := []interface{}{}
+	code := http.StatusOK
 	where, orderBy, pagination, queryValues, errs := dbhelpers.BuildWhereAndOrderByAndPagination(val.APIInfo().Params, val.ParamColumns())
 	if len(errs) > 0 {
-		return nil, util.JoinErrs(errs), nil, http.StatusBadRequest
+		return nil, util.JoinErrs(errs), nil, http.StatusBadRequest, nil
 	}
-
+	runSecond, maxTime := MakeFirstQuery(val, h, where, orderBy, pagination, queryValues)
+	if useIMS {

Review comment:
       if `useIMS = false` I think we should probably skip the entire first query to disable as much of the IMS functionality as possible

##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -118,28 +125,86 @@ func GenericCreateNameBasedID(val GenericCreator) (error, error, int) {
 	return nil, nil, http.StatusOK
 }
 
-func GenericRead(val GenericReader) ([]interface{}, error, error, int) {
+func MakeFirstQuery(val GenericReader, h map[string][]string, where string, orderBy string, pagination string, queryValues map[string]interface{}) (bool, time.Time) {
+	var max time.Time
+	ims := []string{}
+	runSecond := true
+	if h == nil {
+		return runSecond, max
+	}
+	ims = h[rfc.IfModifiedSince]
+	if len(ims) == 0 {
+		return runSecond, max
+	}
+	if l, ok := web.ParseHTTPDate(ims[0]); !ok {
+		log.Warnf("Date not parsable %v", ims[0])
+		return runSecond, max
+	} else {
+		query := val.SelectMaxLastUpdatedQuery(where, orderBy, pagination, val.GetType())
+		rows, err := val.APIInfo().Tx.NamedQuery(query, queryValues)
+		defer rows.Close()

Review comment:
       In general I think we `defer rows.Close()` _after_ checking `err != nil`. Otherwise, `rows` might be nil, and calling close on it would cause a nil pointer deref.

##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -118,28 +125,86 @@ func GenericCreateNameBasedID(val GenericCreator) (error, error, int) {
 	return nil, nil, http.StatusOK
 }
 
-func GenericRead(val GenericReader) ([]interface{}, error, error, int) {
+func MakeFirstQuery(val GenericReader, h map[string][]string, where string, orderBy string, pagination string, queryValues map[string]interface{}) (bool, time.Time) {
+	var max time.Time
+	ims := []string{}
+	runSecond := true
+	if h == nil {
+		return runSecond, max
+	}
+	ims = h[rfc.IfModifiedSince]
+	if len(ims) == 0 {
+		return runSecond, max
+	}
+	if l, ok := web.ParseHTTPDate(ims[0]); !ok {
+		log.Warnf("Date not parsable %v", ims[0])
+		return runSecond, max
+	} else {
+		query := val.SelectMaxLastUpdatedQuery(where, orderBy, pagination, val.GetType())
+		rows, err := val.APIInfo().Tx.NamedQuery(query, queryValues)
+		defer rows.Close()
+		if err != nil {
+			log.Warnf("Couldn't get the max last updated time: %v", err)
+			return runSecond, max
+		}
+		if err == sql.ErrNoRows {
+			runSecond = false
+			return runSecond, max
+		}
+		// This should only ever contain one row
+		if rows.Next() {
+			v := &ims2.LatestTimestamp{}
+			if err = rows.StructScan(v); err != nil || v == nil {
+				log.Warnf("Failed to parse the max time stamp into a struct %v", err)
+				return runSecond, max
+			}
+			max = v.LatestTime.Time
+			// The request IMS time is later than the max of (lastUpdated, deleted_time)
+			if l.After(v.LatestTime.Time) {
+				runSecond = false
+				return runSecond, max
+			}
+		} else {
+			runSecond = false
+		}
+	}
+	return runSecond, max
+}
+
+func GenericRead(h http.Header, val GenericReader, useIMS bool) ([]interface{}, error, error, int, *time.Time) {
+	vals := []interface{}{}
+	code := http.StatusOK
 	where, orderBy, pagination, queryValues, errs := dbhelpers.BuildWhereAndOrderByAndPagination(val.APIInfo().Params, val.ParamColumns())
 	if len(errs) > 0 {
-		return nil, util.JoinErrs(errs), nil, http.StatusBadRequest
+		return nil, util.JoinErrs(errs), nil, http.StatusBadRequest, nil
 	}
-
+	runSecond, maxTime := MakeFirstQuery(val, h, where, orderBy, pagination, queryValues)
+	if useIMS {
+		if !runSecond {
+			log.Debugln("IMS HIT")

Review comment:
       oh, by logging IMS hits vs misses, I was thinking it should be in the TO access logs, similar to the ATS cache result codes: https://docs.trafficserver.apache.org/en/8.0.x/admin-guide/logging/cache-results.en.html#cache-result-codes
   
   We could have IMS_HIT, IMS_MISS, and NON_IMS. Ideally, each result code would be documented (I'm not sure if the TO access log is documented already or not).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r438943606



##########
File path: traffic_ops/traffic_ops_golang/cachegroup/cachegroups.go
##########
@@ -436,13 +438,24 @@ func (cg *TOCacheGroup) Read() ([]interface{}, error, error, int) {
 	}
 	where, orderBy, pagination, queryValues, errs := dbhelpers.BuildWhereAndOrderByAndPagination(cg.ReqInfo.Params, queryParamsToQueryCols)
 	if len(errs) > 0 {
-		return nil, util.JoinErrs(errs), nil, http.StatusBadRequest
+		return nil, util.JoinErrs(errs), nil, http.StatusBadRequest, nil
+	}
+
+	runSecond, maxTime := ims.TryIfModifiedSinceQuery(cg.APIInfo().Tx, h, queryValues, selectMaxLastUpdatedQuery(where, orderBy, pagination))
+	if useIMS {

Review comment:
       So it looks like you addressed this in generic_crud.go but we should also skip the `TryIfModifiedSinceQuery` everywhere if IMS is disabled. I noticed this here and in topologies.go, so I imagine there might be other cases to still address in other files as well.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#issuecomment-642804828


   Oh and thinking about DB-related tests to make sure each table is accounted for w.r.t. the deleted_tables table (and indexing last_updated), I think might be able to do that if we get creative with that in the DB tests: https://github.com/apache/trafficcontrol/blob/master/traffic_ops_db/test/docker/run-db-test.sh
   
   Basically, it would be great if that test failed whenever someone forgot to account for the IMS-related behavior, in order to ensure IMS is maintained for new endpoints/resources.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r420230190



##########
File path: traffic_ops/traffic_ops_golang/traffic_ops_golang.go
##########
@@ -59,6 +59,7 @@ func main() {
 	configFileName := flag.String("cfg", "", "The config file path")
 	dbConfigFileName := flag.String("dbcfg", "", "The db config file path")
 	riakConfigFileName := flag.String("riakcfg", "", "The riak config file path")
+	useIMS := flag.Bool("use_ims", false, "Use If-Modified-Since requests")

Review comment:
       Rather than being a CLI flag this should probably just be an option in TO's config file instead.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp edited a comment on pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp edited a comment on pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#issuecomment-642804828


   Oh and thinking about DB-related tests to make sure each table is accounted for w.r.t. the deleted_tables table (and indexing last_updated), I think we might be able to do that if we get creative with that in the DB tests: https://github.com/apache/trafficcontrol/blob/master/traffic_ops_db/test/docker/run-db-test.sh
   
   Basically, it would be great if that test failed whenever someone forgot to account for the IMS-related behavior, in order to ensure IMS is maintained for new endpoints/resources.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r428111149



##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -20,9 +20,15 @@ package api
  */
 
 import (
+	"database/sql"
 	"errors"
 	"fmt"
+	"github.com/apache/trafficcontrol/grove/web"
+	"github.com/apache/trafficcontrol/lib/go-log"
+	"github.com/apache/trafficcontrol/lib/go-rfc"
+	ims2 "github.com/apache/trafficcontrol/traffic_ops/traffic_ops_golang/util/ims"

Review comment:
       Aah yeah my goland does this by itself sometimes, I'll change this.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rob05c commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r417691108



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       >we have a plan to address this in the near future (cache config snapshots)
   
   I see solutions like denormalized outside-the-API JSON blob snapshots or intermediary proxies as hacks to work around a broken system, not the ideal. If Traffic Ops were reasonably performant and implemented HTTP standards, and relatively simple performance optimizations like IMS, a 1s cache, and Read-While-Writer on concurrent requests, expensive and unsafe workarounds wouldn't be necessary. 
   
   There are scales where the cost of these things becomes necessary; <1 req/s on a <100MB database are several orders of magnitude below those levels.
   
   >in the last ninety days at my company, we only issued 104 DELETE
   
   Even one DELETE a day, if it's breaking an otherwise successful IMS, means increasing the TO load by thousands of times the necessary, for some brief period. If that change was one cache, it means the server has to handle full requests from every cache in the CDN checking in at once, instead of a single full request and 999 cheap IMS's.
   
   >We would need to be careful to share as much as possible between those three queries rather than having three separate, mostly-duplicated queries
   
   String building isn't hard. In fact, we already have columns duplicated in places, which should really concatenate strings to avoid duplicating all the column names.
   
   At any rate, I'm not going to -1. I'm not a primary developer on Traffic Ops anymore, and I really don't like telling other people how hard something should be for them. If decision is that it's too difficult and expensive to maintain real IMS, well, I'm not writing (much) code on the project anymore to say.
   
   But I will say, Traffic Ops is at the center of everything else in Traffic Control. If TO would do the right things -- HTTP standards like IMS and Delta, Read-While-Writer, 1s caching, other small things to get decent performance -- these things would be a tiny, tiny fraction of the code TO has to maintain, and would make things better for every other component of Traffic Control, as well as all its operators and users.
   
   The cost of these small features in TO is less than cached blobs and slower configs, and far, far less than additional intermediary proxy service deployments.
   
   TO has had abysmal performance since the beginning. Golang is much better now, but the scalability of TO, how many requests it can handle, is still orders of magnitude lower than it should be.
   
   The things we have to do to work around poor TO performance -- proxies in front of TO, ORT dispersion and backoff and slower config deployment, TM slower health updating -- everything else in TC would be so, so much better if we were just willing to do the work to make TO performant. And the development and operational time of all those things far, far outweighs the time for TO.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rob05c commented on pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rob05c commented on pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#issuecomment-622445961


   IMO we should do both. All the logic only happens once, it's only a couple more lines to add the IMS and LM headers, and some clients can do IMS but not INM.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rob05c edited a comment on pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rob05c edited a comment on pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#issuecomment-622430632


   >we probably need to add the Last-Modified header to all responses in order for this to be fully supported, right?
   
   We should. It isn't strictly necessary, clients can use the Date. But it's better, and there's no reason not to.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r416209560



##########
File path: traffic_ops/traffic_ops_golang/api/generic_crud.go
##########
@@ -191,8 +271,28 @@ func GenericOptionsDelete(val GenericOptionsDeleter) (error, error, int) {
 	return nil, nil, http.StatusOK
 }
 
+func InsertInDeletedTable(val GenericDeleter, v interface{}) (error, error, int) {
+	err := val.InsertIntoDeletedQuery(v, val.APIInfo().Tx)
+	if err != nil {
+		return ParseDBError(err)
+	}
+	return nil, nil, http.StatusOK
+}
+
 // GenericDelete does a Delete (DELETE) for the given GenericDeleter object and type. This exists as a generic function, for the common use case of a simple delete with query parameters defined in the sqlx struct tags.
 func GenericDelete(val GenericDeleter) (error, error, int) {
+	var v interface{}
+	res, err := val.APIInfo().Tx.NamedQuery(val.SelectBeforeDeleteQuery(), val)

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r416305815



##########
File path: traffic_ops/traffic_ops_golang/types/types.go
##########
@@ -158,6 +174,34 @@ func (tp *TOType) loadUseInTable() (error, error, string) {
 	return nil, nil, useInTable
 }
 
+func selectMaxLastUpdatedQuery(where, orderBy, pagination, where2, orderBy2, pagination2 string) string {

Review comment:
       The reason I did that is because of the way the "ParamColumns" method works. Its creating an alias for the main table and then passing that along to the dbHelpers function to create the "where" clause. If we have a deleted table, we would need to query against that table (with its own alias as well). For ex, this:
   
   ```SELECT max(t) from (
   		SELECT max(last_updated) as t from type typ WHERE typ.id=12345
   		UNION ALL
   	SELECT max(last_updated) as t from deleted_type dtyp WHERE dtyp.id=12345
   		) as res```
   
   This is why I was thinking of splitting out the two. The only way I could think of was by changing the "ParamColumns" method, but that seemed like a more involved change. Is there a better way of doing this? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rob05c commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rob05c commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r416319640



##########
File path: traffic_ops/traffic_ops_golang/types/types.go
##########
@@ -158,6 +174,34 @@ func (tp *TOType) loadUseInTable() (error, error, string) {
 	return nil, nil, useInTable
 }
 
+func selectMaxLastUpdatedQuery(where, orderBy, pagination, where2, orderBy2, pagination2 string) string {

Review comment:
       The `UNION ALL` queries can have the same alias, they're effectively separate queries. So
   ```
   `SELECT max(t) from (
   		SELECT max(last_updated) as t from type typ ` + where + orderBy + pagination +
   		` UNION ALL
   	select max(last_updated) as t from deleted_type typ ` + where + orderBy + pagination +
   		` ) as res`
   ```
   Works fine 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r416102040



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,16 @@
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+-- deleted_type
+CREATE TABLE IF NOT EXISTS deleted_type (
+    id bigint NOT NULL,
+    name text,
+    description text,
+    use_in_table text,
+    last_updated timestamp with time zone DEFAULT now(),
+    deleted_time timestamp with time zone NOT NULL DEFAULT now()

Review comment:
       Yeah, makes sense to use the updated "last_updated" time from the deleted_table. Done.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] rawlinp commented on a change in pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
rawlinp commented on a change in pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#discussion_r417543915



##########
File path: traffic_ops/app/db/migrations/20200424000000_add_deleted_tables.sql
##########
@@ -0,0 +1,608 @@
+/*
+
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+*/
+
+-- +goose Up
+-- SQL in section 'Up' is executed when this migration is applied
+
+CREATE TABLE IF NOT EXISTS deleted_api_capability (
+    id bigserial PRIMARY KEY,

Review comment:
       Yeah, I understand that part, so we need _something_ to tell us that rows were deleted. But, why do we need _all_ the table data? Shouldn't all the `deleted_foo` tables only have a single column -- e.g. `deleted_at` -- that we update via an `ON DELETE` trigger similar to the trigger for `last_updated`? It seems like the `deleted_foo` tables would only ever need a single row to track when something was last deleted from its corresponding table. Maybe we only need one `deleted` table, which has a row for every _main_ table, that contains the timestamp of when something was last deleted from that referenced table?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [trafficcontrol] srijeet0406 commented on pull request #4665: Support IMS header in Get endpoints

Posted by GitBox <gi...@apache.org>.
srijeet0406 commented on pull request #4665:
URL: https://github.com/apache/trafficcontrol/pull/4665#issuecomment-622707316


   Here are the fixes in the latest commit:
   1.) Now we are logging on an IMS hit/ miss, or on a non IMS vs IMS request 
   2.) All the responses now have a `Last-Modified` header which has the "latest" ts of the response set
   3.) We are now able to turn the IMS mode on/ off by using the `run_ims` flag (which defaults to `false`)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org