Posted to user@kudu.apache.org by "Ray Liu (rayliu)" <ra...@cisco.com> on 2020/07/15 06:41:34 UTC

Master cannot rejoin the cluster after it failed.

We have a Kudu cluster with 3 masters and 9 tablet servers.
When we tried to drop a table with more than a thousand tablets, the leader master crashed.
The last log entries from the crashed master are a bunch of these:
W0715 04:00:57.330158 30337 catalog_manager.cc:3485] TS cd17b92888a84d39b2adcad1ca947037 (hdsj1kud005.webex.com:7050): delete failed for tablet 4250e813a29e4ca7a2633c6015c5530d because the tablet was not found. No further retry: Not found: Tablet not found: 4250e813a29e4ca7a2633c6015c5530d

Before these delete failures, there are many entries like this:
W0715 03:59:40.047675 30336 connection.cc:361] RPC call timeout handler was delayed by 11.8487s! This may be due to a process-wide pause such as swapping, logging-related delays, or allocator lock contention. Will allow an additional 3s for a response.

So, when this leader master crashed, a new leader master was elected from the remaining two masters.
But when I restart the crashed master, it just gets stuck (it has been stuck for two hours now).
The logs are a repetition of these:

  I0715 06:30:36.438797 18042 raft_consensus.cc:465] T 00000000000000000000000000000000 P 81338568ef854b10ac0acac1d9eeeb6c [term 4 FOLLOWER]: Starting pre-election (no leader contacted us within the election timeout)
  I0715 06:30:36.438868 18042 raft_consensus.cc:487] T 00000000000000000000000000000000 P 81338568ef854b10ac0acac1d9eeeb6c [term 4  FOLLOWER]: Starting pre-election with config: opid_index: -1 OBSOLETE_local: false peers { permanent_uuid: "e8f90d84b4754a379ffedaa32b528fb4" member_type: VOTER last_known_addr { host: "master1" port: 7051 } } peers { permanent_uuid: "ba89996893e44391a10a2fc1f2c2ada3" member_type: VOTER last_known_addr { host: "master2" port: 7051 } } peers { permanent_uuid: "81338568ef854b10ac0acac1d9eeeb6c" member_type: VOTER last_known_addr { host: "master3" port: 7051 } }
  I0715 06:30:36.439072 18042 leader_election.cc:296] T 00000000000000000000000000000000 P 81338568ef854b10ac0acac1d9eeeb6c [CANDIDATE]: Term 5 pre-election: Requested pre-vote from peers e8f90d84b4754a379ffedaa32b528fb4 (master1:7051), ba89996893e44391a10a2fc1f2c2ada3 (master2:7051)
  W0715 06:30:36.439657 13256 leader_election.cc:341] T 00000000000000000000000000000000 P 81338568ef854b10ac0acac1d9eeeb6c [CANDIDATE]: Term 5 pre-election: RPC error from VoteRequest() call to peer ba89996893e44391a10a2fc1f2c2ada3 (master2:7051): Remote error: Not authorized: unauthorized access to method: RequestConsensusVote
  W0715 06:30:36.439787 13255 leader_election.cc:341] T 00000000000000000000000000000000 P 81338568ef854b10ac0acac1d9eeeb6c [CANDIDATE]: Term 5 pre-election: RPC error from VoteRequest() call to peer e8f90d84b4754a379ffedaa32b528fb4 (master1:7051): Remote error: Not authorized: unauthorized access to method: RequestConsensusVote
  I0715 06:30:36.439808 13255 leader_election.cc:310] T 00000000000000000000000000000000 P 81338568ef854b10ac0acac1d9eeeb6c [CANDIDATE]: Term 5 pre-election: Election decided. Result: candidate lost. Election summary: received 3 responses out of 3 voters: 1 yes votes; 2 no votes. yes voters: 81338568ef854b10ac0acac1d9eeeb6c; no voters: ba89996893e44391a10a2fc1f2c2ada3, e8f90d84b4754a379ffedaa32b528fb4
  I0715 06:30:36.439839 18042 raft_consensus.cc:2597] T 00000000000000000000000000000000 P 81338568ef854b10ac0acac1d9eeeb6c [term 4 FOLLOWER]: Leader pre-election lost for term 5. Reason: could not achieve majority
  W0715 06:30:36.531898 13465 server_base.cc:587] Unauthorized access attempt to method kudu.consensus.ConsensusService.UpdateConsensus from {username='kudu'} at ip:port


The master summary from ksck is
  Master Summary
               UUID               |        Address        |    Status
----------------------------------+-----------------------+--------------
 ba89996893e44391a10a2fc1f2c2ada3 | master1               | HEALTHY
 e8f90d84b4754a379ffedaa32b528fb4 | master2               | HEALTHY
 81338568ef854b10ac0acac1d9eeeb6c | master3               | UNAUTHORIZED
Error from master3: Remote error: could not fetch consensus info from master: Not authorized: unauthorized access to method: GetConsensusState (UNAUTHORIZED)
All reported replicas are:
  A = ba89996893e44391a10a2fc1f2c2ada3
  B = e8f90d84b4754a379ffedaa32b528fb4
  C = 81338568ef854b10ac0acac1d9eeeb6c
The consensus matrix is:
 Config source |        Replicas        | Current term | Config index | Committed?
---------------+------------------------+--------------+--------------+------------
 A             | A*  B   C              | 4            | -1           | Yes
 B             | A*  B   C              | 4            | -1           | Yes
 C             | [config not available] |              |              |
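
For reference, the summary above is the output of the ksck health check; a minimal invocation against the three masters looks roughly like this (master1/master2/master3 and the default RPC port 7051 stand in for our real addresses):

  # Check overall cluster health, including the masters' Raft consensus state.
  kudu cluster ksck master1:7051,master2:7051,master3:7051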

What can I do if these three masters can never reach consensus?
Is it safe to delete the --fs_data_dirs / --fs_metadata_dir / --fs_wal_dir directories for the crashed master in order to bring it back online without any data loss?

Thanks

Re: Master cannot rejoin the cluster after it failed.

Posted by "Ray Liu (rayliu)" <ra...@cisco.com>.
The root cause is that we launched the kudu-master process as a user that doesn't have superuser privileges.
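
In case someone else hits the same symptom: as far as I understand, the consensus RPCs (RequestConsensusVote, UpdateConsensus) are only accepted from a caller the receiving master recognizes as privileged, so all masters and tablet servers should be started as the same service user. A rough sketch of the fix, assuming the intended service user is kudu and the flag file path is just a placeholder (in practice the daemon is usually started by an init script or systemd unit):

  # Start the master as the kudu service user, matching the other masters.
  sudo -u kudu kudu-master --flagfile=/etc/kudu/conf/master.gflagfile

Depending on the Kudu version, the ACL-related flags (for example --superuser_acl) may also be worth reviewing, but for us the root cause was simply the wrong OS user.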




Re: Master cannot rejoin the cluster after it failed.

Posted by Attila Bukor <ab...@apache.org>.
Hi Ray,

It seems the problem is that the kudu user is not authorized to call UpdateConsensus on
the other masters. What user are the other two masters started with?
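
One quick way to check, assuming the masters run on Linux and the binary is named kudu-master, is to look at the owning user of each master process on every host:

  # Print the OS user, PID and command line of every running kudu-master.
  ps -o user=,pid=,cmd= -C kudu-master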

I wouldn't recommend wiping the master; it most likely wouldn't solve the
problem, and Kudu can't automatically recover from a deleted master, so you
would need to recreate it manually.

Attila
