Posted to issues@kudu.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2017/08/07 15:24:00 UTC

[jira] [Commented] (KUDU-2090) Insert operation request timed out, UpdateConsensus RPC timed out.

    [ https://issues.apache.org/jira/browse/KUDU-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116740#comment-16116740 ] 

Jean-Daniel Cryans commented on KUDU-2090:
------------------------------------------

Hi [~LUOYAJUN], we use JIRA to track bugs, improvements, and new features, not to troubleshoot problems encountered while using Kudu. Please write to the user@ mailing list or ask for help on the Slack channel: http://kudu.apache.org/community.html

From a quick look at the log you attached, I can't pinpoint a problem. Replicating the writes definitely takes a long time, but there is no evidence as to why, since we only have logs from one machine. There are also a number of "Tablet not found" errors that need to be looked into.
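
One rough way to see which peers and tablets these warnings involve is to tally them. The following is a minimal sketch, assuming the warning lines follow the format of the samples quoted in the issue description. The regular expression is inferred from those samples only, and `summarize` is a hypothetical helper, not part of Kudu.

```python
import re
import sys
from collections import Counter

# Matches the consensus_peers.cc warnings quoted in the issue description.
# ASSUMPTION: the pattern is inferred from those sample lines, not from Kudu's source.
LINE_RE = re.compile(
    r"T (?P<tablet>[0-9a-f]+) P (?P<local>[0-9a-f]+) -> "
    r"Peer (?P<peer>[0-9a-f]+) \((?P<host>[^)]+)\): "
    r"Couldn't send request to peer .*?"
    r"(?:Error code: (?P<code>\w+) \(\d+\)\. )?"
    r"Status: Timed out: UpdateConsensus RPC"
)


def summarize(lines):
    """Count UpdateConsensus timeouts per remote host and TABLET_NOT_FOUND per tablet."""
    timeouts_per_host = Counter()
    not_found_per_tablet = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not one of the warnings of interest
        timeouts_per_host[match["host"]] += 1
        if match["code"] == "TABLET_NOT_FOUND":
            not_found_per_tablet[match["tablet"]] += 1
    return timeouts_per_host, not_found_per_tablet


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (the log path is whatever file you want to inspect):
    #   python summarize_warnings.py kudu-tserver.WARNING
    with open(sys.argv[1], errors="replace") as log_file:
        hosts, tablets = summarize(log_file)
    for host, count in hosts.most_common():
        print(f"{host}: {count} timed-out UpdateConsensus RPCs")
    for tablet, count in tablets.most_common():
        print(f"{tablet}: {count} TABLET_NOT_FOUND responses")
```

Running the same tally on the logs from each tablet server would show whether the timeouts cluster on particular hosts, which a single machine's log cannot reveal.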

bq. But we see that Kudu recommends to limit the number of tablets per server to 100 or fewer

What it says is "Recommended maximum number of tablet servers is 100", meaning the number of servers, not tablets.

> Insert operation request timed out, UpdateConsensus RPC timed out.
> ------------------------------------------------------------------
>
>                 Key: KUDU-2090
>                 URL: https://issues.apache.org/jira/browse/KUDU-2090
>             Project: Kudu
>          Issue Type: Bug
>          Components: tablet
>    Affects Versions: 1.3.0
>         Environment: Kudu 1.3.0-1.cdh5.11.0.p0.12, CentOS Linux release 7.3.1611 (Core)
>            Reporter: LUOYAJUN
>         Attachments: kudu-tserver.WARNING
>
>
> Insert operations time out, and 'UpdateConsensus RPC' warnings appear in the logs. The Kudu cluster consists of 3 masters and 8 tablet servers, with 143 tables and 1115 tablets.
> Some messages related to this issue from the tablet server log:
> W0807 03:19:45.116417 20083 consensus_peers.cc:357] T 5c0a1dbeeef04cc796d65746b5cda4dc P a8a23a2a3bb0446db77dcc85fc85530a -> Peer 622a4488ce774290b2dcd3104a06ae3c (hadoop-04:7050): Couldn't send request to peer 622a4488ce774290b2dcd3104a06ae3c for tablet 5c0a1dbeeef04cc796d65746b5cda4dc. Status: Timed out: UpdateConsensus RPC to 10.20.110.4:7050 timed out after 1.000s (SENT). Retrying in the next heartbeat period. Already tried 6 times.
> W0807 03:19:45.163341 20085 consensus_peers.cc:357] T bd89a18ccc0142d784942ecc130ff3b6 P a8a23a2a3bb0446db77dcc85fc85530a -> Peer cf66bf6093764bffa6387c241f9994c6 (hadoop-02:7050): Couldn't send request to peer cf66bf6093764bffa6387c241f9994c6 for tablet bd89a18ccc0142d784942ecc130ff3b6. Error code: TABLET_NOT_FOUND (6). Status: Timed out: UpdateConsensus RPC to 10.20.110.2:7050 timed out after 1.000s (SENT). Retrying in the next heartbeat period. Already tried 1 times.
> W0807 03:19:45.320494 20083 consensus_peers.cc:357] T 0b821119e2b849c38f981269da488fdc P a8a23a2a3bb0446db77dcc85fc85530a -> Peer 622a4488ce774290b2dcd3104a06ae3c (hadoop-04:7050): Couldn't send request to peer 622a4488ce774290b2dcd3104a06ae3c for tablet 0b821119e2b849c38f981269da488fdc. Error code: TABLET_NOT_FOUND (6). Status: Timed out: UpdateConsensus RPC to 10.20.110.4:7050 timed out after 1.000s (SENT). Retrying in the next heartbeat period. Already tried 7 times.
> W0807 03:19:45.320538 20083 consensus_peers.cc:357] T 8471841aa0114924868cfdf596e9bf95 P a8a23a2a3bb0446db77dcc85fc85530a -> Peer c0556f4e50a34b04b9f4b1ffc63f3ffb (hadoop-03:7050): Couldn't send request to peer c0556f4e50a34b04b9f4b1ffc63f3ffb for tablet 8471841aa0114924868cfdf596e9bf95. Status: Timed out: UpdateConsensus RPC to 10.20.110.3:7050 timed out after 1.000s (SENT). Retrying in the next heartbeat period. Already tried 7 times.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)