Posted to issues@kudu.apache.org by "David Alves (JIRA)" <ji...@apache.org> on 2016/12/08 16:31:58 UTC
[jira] [Resolved] (KUDU-1127) Avoid holding RPC handler threads on
replicas that are part of a degraded tablet
[ https://issues.apache.org/jira/browse/KUDU-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Alves resolved KUDU-1127.
-------------------------------
Resolution: Fixed
Assignee: David Alves
Fix Version/s: 1.2.0
Commit 4d8fe6cf2a1804bae142ddfb5e672af37dad036e did quite a bit in this regard, such as not holding threads for more than a fixed amount of time and short-circuiting the wait. We might make it async in the future, but we can open a new ticket for that.
> Avoid holding RPC handler threads on replicas that are part of a degraded tablet
> --------------------------------------------------------------------------------
>
> Key: KUDU-1127
> URL: https://issues.apache.org/jira/browse/KUDU-1127
> Project: Kudu
> Issue Type: Sub-task
> Components: tserver
> Affects Versions: Private Beta
> Reporter: Todd Lipcon
> Assignee: David Alves
> Fix For: 1.2.0
>
>
> If the client performs a snapshot scan, we may need to wait for the leader to tell us that the timestamp is "safe". If the majority of nodes in a tablet are down, this will never happen. After KUDU-689, we'll wait with a deadline, but even this multi-second wait will end up blocking a lot of RPC handlers, potentially preventing other useful work from getting done.
> We should probably short-circuit the wait in the case that we haven't heard from any leader within the election timeout and just respond immediately. Alternatively, we could make this an async callback instead of a blocking wait on the handler thread.
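
For illustration, the short circuit proposed above could look roughly like the sketch below. It is written in Python for brevity and is not Kudu's actual code: the class SafeTimeWaiter, its methods, and its parameters are all hypothetical names made up for this example. The idea it shows is a bounded wait on a condition variable that gives up right away when no leader has been heard from within the election timeout.

```python
import threading
import time


class SafeTimeWaiter:
    """Hypothetical stand-in for a replica's safe-time tracking (not Kudu's API)."""

    def __init__(self, election_timeout_s, clock=time.monotonic):
        self._cond = threading.Condition()
        self._clock = clock
        self._election_timeout_s = election_timeout_s
        self._safe_time = 0
        self._last_leader_contact = None  # no leader heard from yet

    def on_leader_message(self, safe_time):
        """Record contact from the leader and advance the safe timestamp."""
        with self._cond:
            self._last_leader_contact = self._clock()
            self._safe_time = max(self._safe_time, safe_time)
            self._cond.notify_all()

    def _leader_is_recent(self):
        """True if a leader was heard from within the election timeout."""
        if self._last_leader_contact is None:
            return False
        elapsed = self._clock() - self._last_leader_contact
        return elapsed < self._election_timeout_s

    def wait_until_safe(self, timestamp, max_wait_s):
        """Return True once `timestamp` is safe.

        Return False if the deadline passes, or immediately if no leader
        has been heard from within the election timeout (the short circuit).
        """
        deadline = self._clock() + max_wait_s
        with self._cond:
            while self._safe_time < timestamp:
                if not self._leader_is_recent():
                    return False  # short circuit: no point in blocking
                remaining = deadline - self._clock()
                if remaining <= 0:
                    return False  # fixed upper bound on the wait
                # Wake up at least once per election timeout so that a
                # leader going silent is noticed before the full deadline.
                self._cond.wait(timeout=min(remaining, self._election_timeout_s))
            return True
```

A handler would call wait_until_safe() before serving a snapshot scan and return an error to the client when it gets False. This still ties up the handler thread for up to max_wait_s in the healthy case, which is why the async callback alternative remains attractive.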
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)