You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/04/15 01:55:09 UTC
[GitHub] [druid] jihoonson opened a new issue #9701: DruidLeaderClient
always thinks the first node returned from DruidNodeDiscovery is the leader
jihoonson opened a new issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701
### Affected Version
0.18.0
### Description
This bug seems to be introduced in https://github.com/apache/druid/pull/9481. Before this PR, `DruidLeaderClient` used `ServerDiscoverySelector` to find the leader. Since only the leader Overlord and Coordinator announce themselves, `ServerDiscoverySelector` could return the current leader to `DruidLeaderClient`. But now, `DruidLeaderClient` picks up the first node returned from `DruidNodeDiscovery` which knows all announced nodes. This breaks the internal redirection of `DruidLeaderClient`.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] jihoonson commented on issue #9701: DruidLeaderClient
always thinks the first node returned from DruidNodeDiscovery is the leader
Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613776169
When it tries to follow redirects, [it checks the current known leader](https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L282-L294) which calls `pickOneHost()`. This function used to use `ServerDiscoverySelector` before #9481 which returns the known leader, but now it just returns the first node from `DruidNodeDiscovery`.
https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L297-L307
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] jihoonson commented on issue #9701: DruidLeaderClient
always thinks the first node returned from DruidNodeDiscovery is the leader
Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613770134
This is a release blocker for 0.18.0 since Druid doesn't work with multi-masters.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] gianm commented on issue #9701: DruidLeaderClient always
thinks the first node returned from DruidNodeDiscovery is the leader
Posted by GitBox <gi...@apache.org>.
gianm commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613782117
Reverting the patch makes sense to me, then for the next release, adding it back along with implementing redirects for `goAsync`.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] himanshug edited a comment on issue #9701:
DruidLeaderClient always thinks the first node returned from
DruidNodeDiscovery is the leader
Posted by GitBox <gi...@apache.org>.
himanshug edited a comment on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613778345
ignoring name of that method, when we make request to wrong leader, we should get a HTTP 302 back and that redirect is then followed to get the request to right node (see https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L210 ) and newly discovered leader is then cached ( see https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L231 )
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] himanshug commented on issue #9701: DruidLeaderClient
always thinks the first node returned from DruidNodeDiscovery is the leader
Posted by GitBox <gi...@apache.org>.
himanshug commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613774678
@jihoonson `DruidLeaderClient` is designed to follow redirects so as to land requests to leader. can you please describe, what functionality is broken by this change ?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] gianm commented on issue #9701: DruidLeaderClient always
thinks the first node returned from DruidNodeDiscovery is the leader
Posted by GitBox <gi...@apache.org>.
gianm commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613781088
The problem happens with `DruidLeaderClient.goAsync`, which lets callers provide their own HttpResponseHandler, but doesn't contain the leader-following logic. It is used by the Druid SQL system tables so they can avoid materializing the entire response for the APIs they are hitting (which could be large). Perhaps the best fix is that it should learn how to follow leaders.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] himanshug commented on issue #9701:
DruidLeaderClient.goAsync() is broken
Posted by GitBox <gi...@apache.org>.
himanshug commented on issue #9701: DruidLeaderClient.goAsync() is broken
URL: https://github.com/apache/druid/issues/9701#issuecomment-613784068
or rather, to expedite things, let us revert #9481 for now.
I will revive that along with bringing in `DruidLeaderSelector.getCurrentLeader()` or follow redirect in goAsync later.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] jihoonson edited a comment on issue #9701:
DruidLeaderClient always thinks the first node returned from
DruidNodeDiscovery is the leader
Posted by GitBox <gi...@apache.org>.
jihoonson edited a comment on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613776169
When it tries to follow redirects, [it checks the current known leader](https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L282-L294) which calls `pickOneHost()`. Before #9481, this function used to use `ServerDiscoverySelector` which returns the known leader and fall back to `DruidNodeDiscovery` when there is no known leader, but now it always returns the first node from `DruidNodeDiscovery`.
https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L297-L307
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] jihoonson commented on issue #9701:
DruidLeaderClient.goAsync() is broken
Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9701: DruidLeaderClient.goAsync() is broken
URL: https://github.com/apache/druid/issues/9701#issuecomment-613785771
@gianm @himanshug thanks for understanding! The PR is ready at https://github.com/apache/druid/pull/9702.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] himanshug commented on issue #9701: DruidLeaderClient
always thinks the first node returned from DruidNodeDiscovery is the leader
Posted by GitBox <gi...@apache.org>.
himanshug commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613782915
ah, I see , wasn't aware of this change. I agree, right fix is to make it follow leader.
for now, I think right way to fix it to not revert #9481 but rather use `DruidLeaderSelector` interface to get the leader.. so as to not add hardcoded dependency back to zk. Similar to what router is doing `DruidLeaderSelector.getCurrentLeader()` for example https://github.com/apache/druid/blob/master/server/src/main/java/org/apache/druid/server/AsyncManagementForwardingServlet.java#L95
that will revert the behavior equivalent to using zk specific `ServerDiscoverySelector`
let me do a PR unless someone sees problems with that .. @gianm @jihoonson ?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] jihoonson commented on issue #9701: DruidLeaderClient
always thinks the first node returned from DruidNodeDiscovery is the leader
Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613781799
Yep, sorry for confusing. @himanshug you're right about `DruidLeaderClient.g()`, but it happens with `goAsync()` as @gianm mentioned which is reported in https://github.com/apache/druid/issues/8716.
I agree that the best approach to fix this would fix https://github.com/apache/druid/issues/8716, but we are running out of time for 0.18.0, I'm still inclined to reverting https://github.com/apache/druid/pull/9481 for now.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] jihoonson closed issue #9701: DruidLeaderClient.goAsync()
is broken
Posted by GitBox <gi...@apache.org>.
jihoonson closed issue #9701: DruidLeaderClient.goAsync() is broken
URL: https://github.com/apache/druid/issues/9701
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] jihoonson commented on issue #9701: DruidLeaderClient
always thinks the first node returned from DruidNodeDiscovery is the leader
Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613771880
I think we can revert #9481 for both master and 0.18.0 branches since it doesn't have any changes other than removing `ServerDiscoverySelector` from `DruidLeaderClient`. @himanshug does it make sense?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] himanshug commented on issue #9701: DruidLeaderClient
always thinks the first node returned from DruidNodeDiscovery is the leader
Posted by GitBox <gi...@apache.org>.
himanshug commented on issue #9701: DruidLeaderClient always thinks the first node returned from DruidNodeDiscovery is the leader
URL: https://github.com/apache/druid/issues/9701#issuecomment-613778345
ignoring name of that method, when we make request to wrong leader, we should get a HTTP 302 back and that is redirect is then followed to get the request to right node (see https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L210 ) and newly discovered leader is then cached ( see https://github.com/apache/druid/blob/cda9f41e69ebc0513782805fca9ead378c1910ab/server/src/main/java/org/apache/druid/discovery/DruidLeaderClient.java#L231 )
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org