Posted to issues@kudu.apache.org by "Adar Dembo (JIRA)" <ji...@apache.org> on 2019/03/18 03:55:00 UTC

[jira] [Assigned] (KUDU-2080) Masters stuck in a bad state when not starting them together on initial deployment

     [ https://issues.apache.org/jira/browse/KUDU-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adar Dembo reassigned KUDU-2080:
--------------------------------

    Assignee: Jiahongchao

> Masters stuck in a bad state when not starting them together on initial deployment
> ----------------------------------------------------------------------------------
>
>                 Key: KUDU-2080
>                 URL: https://issues.apache.org/jira/browse/KUDU-2080
>             Project: Kudu
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 1.3.0
>            Reporter: Attila Bukor
>            Assignee: Jiahongchao
>            Priority: Major
>              Labels: usability
>
> When masters are started separately on the first run, they cannot connect to each other while trying to write the consensus data, and the write fails.
> {code}
> I0726 14:15:22.894768 55240 consensus_peers.cc:503] Retrying to get permanent uuid for remote peer: member_type: VOTER last_known_addr { host: "master1.example.com" port: 7051 } attempt: 10
> W0726 14:15:22.895084 55240 consensus_peers.cc:493] Error getting permanent uuid from config peer master1.example.com:7051: Network error: Client connection negotiation failed: client connection to 10.1.0.1:7051: connect: Connection refused (error 111)
> I0726 14:15:36.235213 55240 consensus_peers.cc:503] Retrying to get permanent uuid for remote peer: member_type: VOTER last_known_addr { host: "master1.example.com" port: 7051 } attempt: 11
> W0726 14:15:36.235498 55240 consensus_peers.cc:493] Error getting permanent uuid from config peer master1.example.com:7051: Network error: Client connection negotiation failed: client connection to 10.1.0.1:7051: connect: Connection refused (error 111)
> E0726 14:15:36.235572 55240 master.cc:171] Master@0.0.0.0:7051: Unable to init master catalog manager: Timed out: Unable to initialize catalog manager: Failed to initialize sys tables async: Failed to create new distributed Raft config: Unable to resolve UUID for peer member_type: VOTER last_known_addr { host: "master1.example.com" port: 7051 }: Getting permanent uuid from master1.example.com:7051 timed out after 30000 ms.: Network error: Client connection negotiation failed: client connection to 10.1.0.1:7051: connect: Connection refused (error 111)
> F0726 14:15:36.235663 55079 master_main.cc:71] Check failed: _s.ok() Bad status: Timed out: Unable to initialize catalog manager: Failed to initialize sys tables async: Failed to create new distributed Raft config: Unable to resolve UUID for peer member_type: VOTER last_known_addr { host: "master1.example.com" port: 7051 }: Getting permanent uuid from master1.example.com:7051 timed out after 30000 ms.: Network error: Client connection negotiation failed: client connection to 10.1.0.1:7051: connect: Connection refused (error 111)
> {code}
> After this, the tablet-meta is present but the consensus-meta is missing, and startup keeps failing until every master's data directory is emptied and the masters are started again at the same time (similar to KUDU-1186):
> {code}
> I0726 14:20:52.455219 58429 sys_catalog.cc:128] Verifying existing consensus state
> E0726 14:20:52.455294 58429 master.cc:171] Master@0.0.0.0:7051: Unable to init master catalog manager: Not found: Unable to initialize catalog manager: Failed to initialize sys tables async: Unable to load consensus metadata for tablet 00000000000000000000000000000000: /data/kudu/master/data/consensus-meta/00000000000000000000000000000000: No such file or directory (error 2)
> F0726 14:20:52.455400 58268 master_main.cc:71] Check failed: _s.ok() Bad status: Not found: Unable to initialize catalog manager: Failed to initialize sys tables async: Unable to load consensus metadata for tablet 00000000000000000000000000000000: /data/kudu/master/data/consensus-meta/00000000000000000000000000000000: No such file or directory (error 2)
> {code}
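
The stuck state described above (tablet-meta present, consensus-meta missing) could be detected before a restart with a small check such as the sketch below. This is only an illustration, not part of Kudu: the function name classify_master_dir is hypothetical, the tablet ID is copied from the log above, and the layout assumes that a "tablet-meta" directory sits next to the "consensus-meta" directory shown in the error path.

```python
from pathlib import Path

# Tablet ID copied from the log lines above.
SYS_CATALOG_TABLET_ID = "00000000000000000000000000000000"


def classify_master_dir(data_root, tablet_id=SYS_CATALOG_TABLET_ID):
    """Classify a master data root as 'fresh', 'ok', 'stuck', or 'unexpected'.

    Assumption: 'tablet-meta' and 'consensus-meta' are sibling directories
    under data_root, each holding one file named after the tablet ID.
    """
    root = Path(data_root)
    has_tablet_meta = (root / "tablet-meta" / tablet_id).is_file()
    has_consensus_meta = (root / "consensus-meta" / tablet_id).is_file()

    if has_tablet_meta and has_consensus_meta:
        return "ok"
    if has_tablet_meta:
        # The state from this report: tablet metadata was written,
        # but the consensus metadata never was.
        return "stuck"
    if has_consensus_meta:
        # Not described in this report; flagged rather than guessed at.
        return "unexpected"
    return "fresh"
```

For the path in the error above, classify_master_dir("/data/kudu/master/data") would return "stuck" on an affected master, provided the assumed layout holds.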



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)