Posted to issues@trafficserver.apache.org by GitBox <gi...@apache.org> on 2022/04/12 15:00:42 UTC

[GitHub] [trafficserver] amoghyermalkar123 opened a new issue, #8789: Round robin consistent hash not working.

amoghyermalkar123 opened a new issue, #8789:
URL: https://github.com/apache/trafficserver/issues/8789

   ```
   dest_domain=. method=get parent="parent1, parent2, parent3" scheme=https round_robin=consistent_hash
   ```
   The above is my parent.config file. According to the docs:
   ```
   - consistent hash of the url so that one parent is chosen for a given url. If a parent is down, the traffic that would go to the down parent is rehashed amongst the remaining parents.
   ```
   But when I request content from a child node, the same request goes out to all the parent servers. To be exact, I am testing with VOD, so the requested content is video segments. It is not that different segments are requested from different servers; there are redundant requests as well.
   For example, the child requests the same segment from all the parent servers.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [trafficserver] jrushford commented on issue #8789: Round robin consistent hash not working.

Posted by GitBox <gi...@apache.org>.
jrushford commented on issue #8789:
URL: https://github.com/apache/trafficserver/issues/8789#issuecomment-1098239444

   
   > The log says that it couldn't connect to the parent server or Response is not valid, while I see TCP_MISS 200 on both the parent servers. On top of that, the other parent server is successfully serving other content. Why would it say failed to connect for this specific request?
   
   Your cache couldn't connect to that parent but was able to connect to the other two. I can't tell from the logs why that one request failed with a connection error to the one parent.
   




[GitHub] [trafficserver] amoghyermalkar123 closed issue #8789: Round robin consistent hash not working.

Posted by GitBox <gi...@apache.org>.
amoghyermalkar123 closed issue #8789: Round robin consistent hash not working.
URL: https://github.com/apache/trafficserver/issues/8789




[GitHub] [trafficserver] jrushford commented on issue #8789: Round robin consistent hash not working.

Posted by GitBox <gi...@apache.org>.
jrushford commented on issue #8789:
URL: https://github.com/apache/trafficserver/issues/8789#issuecomment-1098105303

   @amoghyermalkar123 You're using the consistent_hash algorithm, so a hash is created from the request URL path and that hash is used to select a parent from a map. The client session is not tied to one specific parent for all the requests it makes: one request may go to parent1, the next to parent2, and so on. The parent chosen is determined by the request URL path, so a request for a specific object will always go to the same parent when there is a cache miss. It will only use another parent if the one originally chosen is unreachable or if the response from that parent times out.
   
   In your debug output, I'm not seeing any of the request URLs. Do they vary, or are your parallel requests all for the same URL? Once the response has been obtained, it should be cached on your child cache, and once cached there will be no further requests to parents for the same object until it expires from the cache.
   
   As for the connect error to parent1, I cannot say from looking at your logs. At that time, a connection failure occurred and parent selection chose another parent to try. A connection error is reported when TCP is unable to establish the connection, or it could have come from a timeout while waiting for the response. I've never seen the issue that you've described but will try to duplicate it. There are some parent proxy timeout settings in records.config that we should take a look at.
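
The selection behavior described above can be illustrated with a small, generic consistent-hash ring. This is only a sketch under stated assumptions, not the code in ParentConsistentHash.cc: the class name `ParentRing`, the helper `_hash`, the use of MD5, the `replicas` count, and the example.com segment URLs are all hypothetical choices made for illustration.

```python
import bisect
import hashlib
from urllib.parse import urlsplit


def _hash(key: str) -> int:
    """Map a string to a 64-bit integer (MD5 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class ParentRing:
    """Toy consistent-hash ring; an illustration, not the ATS implementation."""

    def __init__(self, parents, replicas=64):
        # Each parent is placed on the ring at several pseudo-random points.
        self._ring = sorted(
            (_hash(f"{parent}#{i}"), parent)
            for parent in parents
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def select(self, url, down=frozenset()):
        """Return the parent for this URL's path, skipping parents marked down."""
        path = urlsplit(url).path
        start = bisect.bisect(self._points, _hash(path))
        # Walk clockwise from the hash position until an available parent is found.
        for offset in range(len(self._ring)):
            _, parent = self._ring[(start + offset) % len(self._ring)]
            if parent not in down:
                return parent
        return None  # every parent is down


ring = ParentRing(["parent1", "parent2", "parent3"])
segments = [f"https://example.com/vod/seg{i}.ts" for i in range(6)]

for url in segments:
    first = ring.select(url)
    # The same URL path always maps to the same parent.
    assert first == ring.select(url)
    # If that parent is down, the request is rehashed to one of the others.
    fallback = ring.select(url, down={first})
    print(url, "->", first, "| fallback:", fallback)
```

In this sketch, different segment URLs may land on different parents, but repeated requests for one segment keep going to the same parent unless it is marked down, which matches the behavior described in the comment.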




[GitHub] [trafficserver] jrushford commented on issue #8789: Round robin consistent hash not working.

Posted by GitBox <gi...@apache.org>.
jrushford commented on issue #8789:
URL: https://github.com/apache/trafficserver/issues/8789#issuecomment-1096877179

   @amoghyermalkar123 which ATS version?  I'll see if I can duplicate it.




[GitHub] [trafficserver] amoghyermalkar123 commented on issue #8789: Round robin consistent hash not working.

Posted by GitBox <gi...@apache.org>.
amoghyermalkar123 commented on issue #8789:
URL: https://github.com/apache/trafficserver/issues/8789#issuecomment-1097491485

   Version 8.1.x.
   Yes, I do get the expected responses. I'm not sure how to turn on debugging for "parent_select"; I can't see any option for it in records.config. Can you let me know how to do that?
   That's the thing: the parents are not even failing, all of them are serving the content. I thought about parent retries, but in that case the request logs should have recorded a failure, and they don't.
   Both parents are serving the request, i.e. 200 OK. They are even fetching the content from the origin when it's a TCP_MISS 200.
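
For the debugging question above, debug tags in ATS are enabled through the diagnostics settings in records.config rather than through a parent-specific option. A minimal fragment, assuming a stock 8.1.x records.config, would look roughly like this:

```
CONFIG proxy.config.diags.debug.enabled INT 1
CONFIG proxy.config.diags.debug.tags STRING parent_select
```

The tags value is treated as a regular expression, so several tags can be combined, for example `parent_select|http_trans`. The debug lines are normally written to traffic.out; the exact location depends on the installation.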




[GitHub] [trafficserver] amoghyermalkar123 commented on issue #8789: Round robin consistent hash not working.

Posted by GitBox <gi...@apache.org>.
amoghyermalkar123 commented on issue #8789:
URL: https://github.com/apache/trafficserver/issues/8789#issuecomment-1098854063

   @jrushford I appreciate the detailed explanation. I see the behavior on the servers exactly as described above. Closing this issue.
   Also, if I run into more problems, I hope I'll get help :)
   Thanks.




[GitHub] [trafficserver] amoghyermalkar123 commented on issue #8789: Round robin consistent hash not working.

Posted by GitBox <gi...@apache.org>.
amoghyermalkar123 commented on issue #8789:
URL: https://github.com/apache/trafficserver/issues/8789#issuecomment-1097499216

   A few more points. First, a bit of context on the test: I sent out **parallel** requests for the VOD to the same child server.
   I expected that only one parent server would serve the content, which, as mentioned above, wasn't the case.
   
   Second, I want to understand how parent selection is implemented, i.e. the algorithm: how it is decided which content goes to a specific parent.




[GitHub] [trafficserver] amoghyermalkar123 commented on issue #8789: Round robin consistent hash not working.

Posted by GitBox <gi...@apache.org>.
amoghyermalkar123 commented on issue #8789:
URL: https://github.com/apache/trafficserver/issues/8789#issuecomment-1097911107

   Update: I realized later that there is a debug log option for watching parent selection in action. Below is a trimmed-down sequence of logs for the aforementioned issue; please help me understand the activity.
   
   ```
   [ET_NET 5] DEBUG: <HttpTransact.cc:3009 (HandleCacheOpenReadMiss)> (http_trans) [1] [HandleCacheOpenReadMiss] --- MISS
   [ET_NET 5] DEBUG: <HttpTransact.cc:3010 (HandleCacheOpenReadMiss)> (http_seq) [1] [HttpTransact::HandleCacheOpenReadMiss] Miss in cache
   [ET_NET 5] DEBUG: <HttpTransact.cc:4957 (get_ka_info_from_config)> (http_trans) [1] get_ka_info_from_config, server_info->http_version 65537, check_hostdb 0
   [ET_NET 5] DEBUG: <ParentSelection.cc:117 (findParent)> (parent_select) In ParentConfigParams::findParent(): parent_table: 0x556d29afbe90.
   [ET_NET 5] DEBUG: <ParentSelection.cc:121 (findParent)> (parent_select) policy.ParentEnable: 1
   [ET_NET 5] DEBUG: <ParentSelection.cc:840 (UpdateMatch)> (parent_select) Matched with 0x556d29a2b7a8 parent node from line 1
   [ET_NET 5] DEBUG: <ParentConsistentHash.cc:146 (selectParent)> (parent_select) ParentConsistentHash::selectParent(): Using a consistent hash parent selection strategy.
   [ET_NET 5] DEBUG: <ParentConsistentHash.cc:210 (selectParent)> (parent_select) Initial parent lookups: 1
   [ET_NET 5] DEBUG: <ParentConsistentHash.cc:309 (selectParent)> (parent_select) Additional parent lookups: 1
   [ET_NET 5] DEBUG: <ParentConsistentHash.cc:334 (selectParent)> (parent_select) Chosen parent: parent1.com.443.com.443
   [ET_NET 5] DEBUG: <ParentSelection.cc:181 (findParent)> (parent_select) PARENT_SPECIFIED
   [ET_NET 5] DEBUG: <ParentSelection.cc:182 (findParent)> (parent_select) Result for request-domain was parent parent1.com.443:443
   
   .....
   
   [ET_NET 0] DEBUG: <DNS.cc:1503 (dns_process)> (dns) Got 1 DNS records for [parent1.com]
   [ET_NET 0] DEBUG: <DNS.cc:1721 (dns_process)> (dns) received A name = parent1.com
   [ET_NET 0] DEBUG: <DNS.cc:1737 (dns_process)> (dns) received A = parent1-ip
   
   
   DEBUG: <HttpSM.cc:4685 (do_http_server_open)> (http) [1] open connection to parent1.com: parent1-ip:443
   DEBUG: <HttpSM.cc:4698 (do_http_server_open)> (http_seq) [HttpSM::do_http_server_open] Sending request to server
   DEBUG: <HttpTransact.cc:3385 (handle_response_from_parent)> (http_trans) [1] [1] failed to connect to parent 
   
   [ET_NET 5] DEBUG: <HttpTransact.cc:3413 (handle_response_from_parent)> (http_trans) [1] [handle_response_from_parent] 3 per parent attempts exhausted
   [ET_NET 5] NOTE: Parent initially marked as down parent1.com:443
   
   [ET_NET 5] DEBUG: <ParentSelection.cc:196 (nextParent)> (parent_select) ParentConfigParams::nextParent(): parent_table: 0x556d29afbe90, result->rec: 0x556d29a2b7a8
   [ET_NET 5] DEBUG: <ParentSelection.cc:212 (nextParent)> (parent_select) ParentConfigParams::nextParent(): result->r: 2, tablePtr: 0x556d29afbe90
   [ET_NET 5] DEBUG: <ParentSelection.cc:215 (nextParent)> (parent_select) Calling selectParent() from nextParent
   [ET_NET 5] DEBUG: <ParentConsistentHash.cc:146 (selectParent)> (parent_select) ParentConsistentHash::selectParent(): Using a consistent hash parent selection strategy.
   [ET_NET 5] DEBUG: <ParentConsistentHash.cc:210 (selectParent)> (parent_select) Initial parent lookups: 1
   [ET_NET 5] DEBUG: <ParentConsistentHash.cc:309 (selectParent)> (parent_select) Additional parent lookups: 1
   [ET_NET 5] DEBUG: <ParentConsistentHash.cc:334 (selectParent)> (parent_select) Chosen parent: parent2.com.443
   [ET_NET 5] DEBUG: <ParentSelection.cc:234 (nextParent)> (parent_select) Retry result for request-domain was parent parent2.com:443
   
   ```
   The log says that it couldn't connect to the parent server or Response is not valid, while I see TCP_MISS 200 on both the parent servers.

