Posted to issues@kudu.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2016/02/27 00:50:18 UTC
[jira] [Commented] (KUDU-1353) Alter table sometimes doesn't finish when using multiple masters

    [ https://issues.apache.org/jira/browse/KUDU-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15170133#comment-15170133 ] 

Todd Lipcon commented on KUDU-1353:
-----------------------------------

Another point here is that it's probably not just an issue for alter table. If a client tried to read a table after a similar scenario, it would also get an error that there is no leader on the tablet, right? Or would our client "speculative guess for leader" stuff make it work OK?

> Alter table sometimes doesn't finish when using multiple masters
> ----------------------------------------------------------------
>
>                 Key: KUDU-1353
>                 URL: https://issues.apache.org/jira/browse/KUDU-1353
>             Project: Kudu
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.7.0
>            Reporter: Adar Dembo
>            Assignee: Adar Dembo
>            Priority: Critical
>         Attachments: first.txt, second.txt
>
>
> Under certain circumstances, a client's AlterTable request may get "stuck" in a quorum of masters, causing the client to time out. The flaky test dashboard usually shows a couple of instances of this failure in master_failover-itest::TestRenameTableSync.
> How does this happen? Let's assume the following:
> * Three masters
> * Table with one tablet, replicated three times.
> # It starts with a complicated master leader election in which master 1 is elected for term 1, master 2 for term 2, then master 1 steps down.
> # This means the three tservers have had a chance to register with both master 1 and master 2. 
> # Now, the tablet's leader replica is elected, causing its next tablet report to master 2 to include a "dirty" entry for the tablet.
> # Master 2 is killed, master 1 is reelected for term 3.
> # The client issues AlterTable to master 1, but master 1 has no idea who the tablet leader is, so the AlterTablet RPC it issues fails.
> # At this point, there are no outstanding master->leader AlterTablet RPCs.
> # The tablet leader begins heartbeating to master 1. Its report is incremental and excludes registration information. Master 1 does not ask the tserver to register, because it already got that information in step 1 during the weird master leader election. The report includes the "tablet report" section but lists no tablets, so the master doesn't ask it to send a full report. No AlterTablet RPC is sent.
> # At this point, the alter table won't make forward progress until the tablet leader's role changes and an AlterTablet RPC is sent, which may not happen for a long time, or at all (in an integration test).
> Some possible solutions:
> # When a tserver decides to heartbeat to a new leader master, it should always send a full tablet report.
> # When the master crafts an AlterTablet RPC and tries to find the leader, it currently uses "soft state" to do so, state that's only updated in the event of a tablet role change or full tablet report. Instead, we could use "hard state" which, even though the master may be newly elected, should already include up-to-date consensus configuration information.
> # Change tservers to always heartbeat to all masters. Doing so means that following a master leader election, the new master has up-to-date "soft state" and can find the leader when issuing an AlterTablet RPC.
> # Add a list of tservers to the master's "hard state", so that TSDescriptors (needed when finding the tablet leader) can be instantiated without the need for full tablet reports.
> At the moment I'm not sure which approach makes the most sense, or if there's a better one out there.
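
The sequence described above can be sketched as a toy simulation. This is only an illustration under loud assumptions: the Master and TServer classes, their fields, and the full_report_on_new_master flag are hypothetical names invented here and are not Kudu code or its API. The model keeps just the master's "soft state" (tablet id to leader) and shows why an incremental report leaves the alter stuck, and how the first proposed solution (always send a full tablet report to a new leader master) would unblock it in this simplified model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Master:
    """Hypothetical toy master holding only the soft state relevant here."""
    name: str
    tablet_leader: Dict[str, str] = field(default_factory=dict)  # soft state
    pending_alters: Set[str] = field(default_factory=set)
    sent_rpcs: List[Tuple[str, str]] = field(default_factory=list)

    def handle_heartbeat(self, report: Dict[str, str]) -> None:
        # Soft state is only refreshed by tablets listed in the report.
        self.tablet_leader.update(report)
        self._send_pending()

    def alter_table(self, tablet: str) -> None:
        self.pending_alters.add(tablet)
        self._send_pending()

    def _send_pending(self) -> None:
        # An AlterTablet RPC can only be sent once the leader is known.
        for tablet in sorted(self.pending_alters):
            leader = self.tablet_leader.get(tablet)
            if leader is not None:
                self.sent_rpcs.append((tablet, leader))
                self.pending_alters.discard(tablet)


@dataclass
class TServer:
    """Hypothetical toy tserver that reports tablet leaders to a master."""
    name: str
    leaders: Dict[str, str] = field(default_factory=dict)  # tablet -> leader
    dirty: Set[str] = field(default_factory=set)
    last_master: Optional[str] = None

    def elect_leader(self, tablet: str) -> None:
        self.leaders[tablet] = self.name
        self.dirty.add(tablet)  # shows up in the next incremental report

    def heartbeat(self, master: Master, full_report_on_new_master: bool) -> None:
        switched = master.name != self.last_master
        if full_report_on_new_master and switched:
            report = dict(self.leaders)  # full tablet report
        else:
            report = {t: self.leaders[t] for t in self.dirty}  # incremental
        master.handle_heartbeat(report)
        self.dirty.clear()
        self.last_master = master.name


def run_scenario(full_report_on_new_master: bool) -> List[Tuple[str, str]]:
    m1, m2 = Master("master-1"), Master("master-2")
    ts = TServer("ts-1")
    # Steps 1-2: the tserver talks to both masters during the election churn.
    ts.heartbeat(m1, full_report_on_new_master)
    ts.heartbeat(m2, full_report_on_new_master)
    # Step 3: the tablet leader is elected; the dirty entry goes to master 2.
    ts.elect_leader("tablet-A")
    ts.heartbeat(m2, full_report_on_new_master)
    # Steps 4-5: master 2 dies, master 1 leads again and receives AlterTable.
    m1.alter_table("tablet-A")
    # Step 7: the tserver heartbeats to master 1.
    ts.heartbeat(m1, full_report_on_new_master)
    return m1.sent_rpcs


if __name__ == "__main__":
    print("incremental only:", run_scenario(False))
    print("full report on new master:", run_scenario(True))
```

With incremental reports only, master 1 never learns the leader in this model and sends no AlterTablet RPC. With the flag set, the full report on the switch back to master 1 carries the leader, and the pending alter goes out.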


