You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@kudu.apache.org by "Mike Percy (JIRA)" <ji...@apache.org> on 2018/05/08 02:38:00 UTC

[jira] [Updated] (KUDU-2433) Possible stack-smashing bug

     [ https://issues.apache.org/jira/browse/KUDU-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Percy updated KUDU-2433:
-----------------------------
    Summary: Possible stack-smashing bug  (was: Potential flake in client-negotiation-failover-itest due to negotiation timeout)

> Possible stack-smashing bug
> ---------------------------
>
>                 Key: KUDU-2433
>                 URL: https://issues.apache.org/jira/browse/KUDU-2433
>             Project: Kudu
>          Issue Type: Bug
>            Reporter: Mike Percy
>            Priority: Major
>         Attachments: client-negotiation-failover-itest.txt
>
>
> There appears to be a rare flake in client-negotiation-failover-itest in ClientFailoverOnNegotiationTimeoutITest.Kudu1580ConnectToTServer due to a negotiation timeout. This appeared on one of the test dashboards:
> {code:java}
> I0507 20:10:39.140327 1387 raft_consensus.cc:2459] T 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [term 3 FOLLOWER]: Leader pre-election lost for term 4. Reason: could not achieve majority
> W0507 20:10:39.140895 296 leader_election.cc:286] T 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer 5014824b94b64fd49da2c1a645c078aa: Network error: Client connection negotiation failed: client connection to 127.0.52.194:43942: connect: Connection refused (error 111)
> I0507 20:10:39.165484 1389 raft_consensus.cc:435] T a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd [term 3 FOLLOWER]: Starting pre-election (detected failure of leader b5e887ddd8a14cf3b2c24a0aa14be025)
> I0507 20:10:39.165598 1389 raft_consensus.cc:2704] T a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd: Snoozing failure detection for 0.853s (starting election)
> I0507 20:10:39.165699 1389 raft_consensus.cc:457] T a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd [term 3 FOLLOWER]: Starting pre-election with config: opid_index: -1 OBSOLETE_local: false peers { permanent_uuid: "b5e887ddd8a14cf3b2c24a0aa14be025" member_type: VOTER last_known_addr { host: "127.0.52.195" port: 38453 } } peers { permanent_uuid: "d1bf0f72a88f4483a1906bc8246849dd" member_type: VOTER last_known_addr { host: "127.0.52.193" port: 46057 } } peers { permanent_uuid: "5014824b94b64fd49da2c1a645c078aa" member_type: VOTER last_known_addr { host: "127.0.52.194" port: 43942 } }
> I0507 20:10:39.166122 1389 leader_election.cc:230] T a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd [CANDIDATE]: Term 4 pre-election: Requesting pre-vote from peer b5e887ddd8a14cf3b2c24a0aa14be025
> I0507 20:10:39.166318 1389 leader_election.cc:230] T a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd [CANDIDATE]: Term 4 pre-election: Requesting pre-vote from peer 5014824b94b64fd49da2c1a645c078aa
> W0507 20:10:39.166895 297 leader_election.cc:286] T a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd [CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer b5e887ddd8a14cf3b2c24a0aa14be025: Timed out: RequestConsensusVote RPC to 127.0.52.195:38453 timed out after 0.633s (SENT)
> I0507 20:10:39.166954 297 leader_election.cc:257] T a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd [CANDIDATE]: Term 4 pre-election: Election decided. Result: candidate lost.
> W0507 20:10:39.167076 296 leader_election.cc:286] T a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd [CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer 5014824b94b64fd49da2c1a645c078aa: Network error: Client connection negotiation failed: client connection to 127.0.52.194:43942: connect: Connection refused (error 111)
> I0507 20:10:39.167102 1389 raft_consensus.cc:2704] T a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd: Snoozing failure detection for 0.685s (election complete)
> I0507 20:10:39.167136 1389 raft_consensus.cc:2459] T a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd [term 3 FOLLOWER]: Leader pre-election lost for term 4. Reason: could not achieve majority
> W0507 20:10:39.179502 660 negotiation.cc:313] Failed RPC negotiation. Trace:
> 0507 20:10:36.182479 (+ 0us) reactor.cc:577] Submitting negotiation task for client connection to 127.0.52.195:38453
> 0507 20:10:36.182694 (+ 215us) negotiation.cc:98] Waiting for socket to connect
> 0507 20:10:36.182707 (+ 13us) client_negotiation.cc:160] Beginning negotiation
> 0507 20:10:36.182797 (+ 90us) client_negotiation.cc:237] Sending NEGOTIATE NegotiatePB request
> 0507 20:10:39.179166 (+2996369us) negotiation.cc:304] Negotiation complete: Timed out: Client connection negotiation failed: client connection to 127.0.52.195:38453
> Metrics: {"client-negotiator.queue_time_us":63}
> W0507 20:10:39.179760 659 meta_cache.cc:216] Tablet a0376a5e3b9f495aa00d366b35594a27: Replica b5e887ddd8a14cf3b2c24a0aa14be025 (127.0.52.195:38453) has failed: Timed out: Client connection negotiation failed: client connection to 127.0.52.195:38453
> W0507 20:10:39.242803 297 leader_election.cc:286] T 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer b5e887ddd8a14cf3b2c24a0aa14be025: Timed out: RequestConsensusVote RPC to 127.0.52.195:38453 timed out after 0.101s (SENT)
> I0507 20:10:39.242909 297 leader_election.cc:257] T 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [CANDIDATE]: Term 4 pre-election: Election decided. Result: candidate lost.
> I0507 20:10:39.243211 1389 raft_consensus.cc:2704] T 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd: Snoozing failure detection for 0.097s (election complete)
> I0507 20:10:39.243288 1389 raft_consensus.cc:2459] T 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [term 3 FOLLOWER]: Leader pre-election lost for term 4. Reason: could not achieve majority
> W0507 20:10:39.263934 297 leader_election.cc:286] T d80f4fae4b9f4d238f7f626507290390 P d1bf0f72a88f4483a1906bc8246849dd [CANDIDATE]: Term 5 pre-election: RPC error from VoteRequest() call to peer b5e887ddd8a14cf3b2c24a0aa14be025: Timed out: RequestConsensusVote RPC to 127.0.52.195:38453 timed out after 0.975s (SENT)
> I0507 20:10:39.264031 297 leader_election.cc:257] T d80f4fae4b9f4d238f7f626507290390 P d1bf0f72a88f4483a1906bc8246849dd [CANDIDATE]: Term 5 pre-election: Election decided. Result: candidate lost.
> I0507 20:10:39.264344 1389 raft_consensus.cc:2704] T d80f4fae4b9f4d238f7f626507290390 P d1bf0f72a88f4483a1906bc8246849dd: Snoozing failure detection for 0.720s (election complete)
> I0507 20:10:39.264427 1389 raft_consensus.cc:2459] T d80f4fae4b9f4d238f7f626507290390 P d1bf0f72a88f4483a1906bc8246849dd [term 4 FOLLOWER]: Leader pre-election lost for term 5. Reason: could not achieve majority
> I0507 20:10:39.340955 1389 raft_consensus.cc:435] T 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [term 3 FOLLOWER]: Starting pre-election (detected failure of leader b5e887ddd8a14cf3b2c24a0aa14be025)
> I0507 20:10:39.341055 1389 raft_consensus.cc:2704] T 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd: Snoozing failure detection for 0.311s (starting election)
> I0507 20:10:39.341125 1389 raft_consensus.cc:457] T 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [term 3 FOLLOWER]: Starting pre-election with config: opid_index: -1 OBSOLETE_local: false peers { permanent_uuid: "5014824b94b64fd49da2c1a645c078aa" member_type: VOTER last_known_addr { host: "127.0.52.194" port: 43942 } } peers { permanent_uuid: "d1bf0f72a88f4483a1906bc8246849dd" member_type: VOTER last_known_addr { host: "127.0.52.193" port: 46057 } } peers { permanent_uuid: "b5e887ddd8a14cf3b2c24a0aa14be025" member_type: VOTER last_known_addr { host: "127.0.52.195" port: 38453 } }
> I0507 20:10:39.341584 1389 leader_election.cc:230] T 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [CANDIDATE]: Term 4 pre-election: Requesting pre-vote from peer 5014824b94b64fd49da2c1a645c078aa
> I0507 20:10:39.341753 1389 leader_election.cc:230] T 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [CANDIDATE]: Term 4 pre-election: Requesting pre-vote from peer b5e887ddd8a14cf3b2c24a0aa14be025
> W0507 20:10:39.342707 296 leader_election.cc:286] T 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer 5014824b94b64fd49da2c1a645c078aa: Network error: Client connection negotiation failed: client connection to 127.0.52.194:43942: connect: Connection refused (error 111)
> W0507 20:10:39.443085 659 batcher.cc:348] Timed out: Failed to write batch of 1 ops to tablet a0376a5e3b9f495aa00d366b35594a27 after 261 attempt(s): Failed to write to server: (no server available): Write(tablet: a0376a5e3b9f495aa00d366b35594a27, num_ops: 1, num_attempts: 261) passed its deadline: Illegal state: Replica d1bf0f72a88f4483a1906bc8246849dd is not leader of this config. Role: FOLLOWER. Consensus state: current_term: 3 leader_uuid: "b5e887ddd8a14cf3b2c24a0aa14be025" committed_config { opid_index: -1 OBSOLETE_local: false peers { permanent_uuid: "b5e887ddd8a14cf3b2c24a0aa14be025" member_type: VOTER last_known_addr { host: "127.0.52.195" port: 38453 } } peers { permanent_uuid: "d1bf0f72a88f4483a1906bc8246849dd" member_type: VOTER last_known_addr { host: "127.0.52.193" port: 46057 } } peers { permanent_uuid: "5014824b94b64fd49da2c1a645c078aa" member_type: VOTER last_known_addr { host: "127.0.52.194" port: 43942 } } }
> /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/integration-tests/client-negotiation-failover-itest.cc:186: Failure
> Failed
> Bad status: IO error: Some errors occurred
> ...
> [ FAILED ] ClientFailoverOnNegotiationTimeoutITest.Kudu1580ConnectToTServer (325968 ms){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)