You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/04/15 01:55:09 UTC

[GitHub] [druid] jihoonson opened a new issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader

jihoonson opened a new issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701
 
 
   ### Affected Version
   
   0.18.0
   
   ### Description
   
   This bug seems to be introduced in https://github.com/apache/druid/pull/9481. Before this PR, `DruidLeaderClient` used `ServerDiscoverySelector` to find the leader. Since only the leader Overlord and Coordinator announce themselves, `ServerDiscoverySelector` could return the current leader to `DruidLeaderClient`. But now, `DruidLeaderClient` picks up the first node returned from `DruidNodeDiscovery` which knows all announced nodes. This breaks the internal redirection of `DruidLeaderClient`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader

Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613776169
 
 
   When it tries to follow redirects, [it checks the current known leader](https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L282-L294) which calls `pickOneHost()`. This function used to use `ServerDiscoverySelector` before #9481 which returns the known leader, but now it just returns the first node from `DruidNodeDiscovery`.
   
   https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L297-L307

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader

Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613770134
 
 
   This is a release blocker for 0.18.0 since Druid doesn't work with multi-masters.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader

Posted by GitBox <gi...@apache.org>.
gianm commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613782117
 
 
   Reverting the patch makes sense to me, then for the next release, adding it back along with implementing redirects for `goAsync`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] himanshug edited a comment on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader

Posted by GitBox <gi...@apache.org>.
himanshug edited a comment on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613778345
 
 
   ignoring name of that method, when we  make request to wrong leader, we should get a HTTP 302 back and that redirect is then followed  to  get the request to right node (see https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L210 ) and newly  discovered leader  is then cached  ( see  https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L231  )
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] himanshug commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader

Posted by GitBox <gi...@apache.org>.
himanshug commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613774678
 
 
   @jihoonson `DruidLeaderClient` is designed  to follow  redirects so as to land requests to  leader. can  you please describe, what functionality is broken by this change ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader

Posted by GitBox <gi...@apache.org>.
gianm commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613781088
 
 
   The problem happens with `DruidLeaderClient.goAsync`, which lets callers provide their own HttpResponseHandler, but doesn't contain the leader-following logic. It is used by the Druid SQL system tables so they can avoid materializing the entire response for the APIs they are hitting (which could be large). Perhaps the best fix is that it should learn how to follow leaders.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] himanshug commented on issue #9701: DruidLeaderClient.goAsync() is broken

Posted by GitBox <gi...@apache.org>.
himanshug commented on issue #9701: DruidLeaderClient.goAsync() is broken
URL: https://github.com/apache/druid/issues/9701#issuecomment-613784068
 
 
   or rather, to expedite things, let us revert #9481 for now.
   
   I will revive that along with bringing in `DruidLeaderSelector.getCurrentLeader()` or follow redirect in goAsync later.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson edited a comment on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader

Posted by GitBox <gi...@apache.org>.
jihoonson edited a comment on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613776169
 
 
   When it tries to follow redirects, [it checks the current known leader](https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L282-L294) which calls `pickOneHost()`. Before #9481, this function used to use `ServerDiscoverySelector` which returns the known leader and fall back to `DruidNodeDiscovery` when there is no known leader, but now it always returns the first node from `DruidNodeDiscovery`.
   
   https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L297-L307

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on issue #9701: DruidLeaderClient.goAsync() is broken

Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9701: DruidLeaderClient.goAsync() is broken
URL: https://github.com/apache/druid/issues/9701#issuecomment-613785771
 
 
   @gianm @himanshug thanks for understanding! The PR is ready at https://github.com/apache/druid/pull/9702.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] himanshug commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader

Posted by GitBox <gi...@apache.org>.
himanshug commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613782915
 
 
   ah, I see , wasn't aware of this change. I agree, right fix is  to make it follow leader.
   
   for now, I think right way to  fix it  to  not  revert #9481 but rather use `DruidLeaderSelector` interface to get  the   leader.. so as  to not add  hardcoded dependency  back to zk. Similar to what router  is doing  `DruidLeaderSelector.getCurrentLeader()` for  example https://github.com/apache/druid/blob/master/server/src/main/java/org/apache/druid/server/AsyncManagementForwardingServlet.java#L95
   
   that will revert the behavior equivalent to using zk  specific  `ServerDiscoverySelector`
   
   let me do a PR unless someone sees problems with that ..  @gianm @jihoonson  ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader

Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613781799
 
 
   Yep, sorry for confusing. @himanshug you're right about `DruidLeaderClient.g()`, but it happens with `goAsync()` as @gianm mentioned which is reported in https://github.com/apache/druid/issues/8716.
   
   I agree that the best approach to fix this would fix https://github.com/apache/druid/issues/8716, but we are running out of time for 0.18.0, I'm still inclined to reverting https://github.com/apache/druid/pull/9481 for now.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson closed issue #9701: DruidLeaderClient.goAsync() is broken

Posted by GitBox <gi...@apache.org>.
jihoonson closed issue #9701: DruidLeaderClient.goAsync() is broken
URL: https://github.com/apache/druid/issues/9701
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader

Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613771880
 
 
   I think we can revert #9481 for both master and 0.18.0 branches since it doesn't have any changes other than removing `ServerDiscoverySelector` from `DruidLeaderClient`. @himanshug does it make sense?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] himanshug commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader

Posted by GitBox <gi...@apache.org>.
himanshug commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613778345
 
 
   ignoring name of that method, when we  make request to wrong leader, we should get a HTTP 302 back and that is redirect is then followed  to  get the request to right node (see https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L210 ) and newly  discovered leader  is then cached  ( see  https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L231  )
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org