Posted to issues@kudu.apache.org by "Adar Dembo (JIRA)" <ji...@apache.org> on 2018/12/13 19:44:00 UTC

[jira] [Resolved] (KUDU-2638) kudu cluster restart very long time to reused

     [ https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adar Dembo resolved KUDU-2638.
------------------------------
       Resolution: Information Provided
    Fix Version/s: n/a

I spent some time looking at your log; here are my observations:
 * The bulk of the time appears to be spent loading tablet metadata. How many tablets are on this node? What kind of hardware is being used for the tserver's metadata directory?
 * The actual time spent compacting this tablet is minimal: ~8 seconds for all of the compaction operations to run, vs. ~2m to bootstrap the tablet.
 * This tablet only has one other peer (besides the local replica), which means both replicas need to be running before you'll be able to write to the tablet. Was the tablet's table created with a replication factor of 2?
 * Because the log was trimmed down to only the lines that reference tablet 5aae5dc9e6f4468aaf00c060152d4fed, it's much more difficult to understand what's going on. For example, I don't know how many data directories you have (which affects compaction speed). I don't know how many tablets you have, or which block manager you're using (both of which affect startup time). A full tserver log (with e.g. hostnames redacted) would yield better results.
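
To illustrate the replication factor point: Kudu replicates tablets with Raft consensus, which needs a strict majority of replicas to be up before a write can be accepted. The sketch below is plain arithmetic, not a Kudu API, and the function names are made up for this example.

```python
def majority(num_replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return num_replicas // 2 + 1


def tolerated_failures(num_replicas: int) -> int:
    """Replicas that can be down while a majority is still reachable."""
    return num_replicas - majority(num_replicas)


if __name__ == "__main__":
    # Compare a few replication factors, including the 2-replica case above.
    for n in (1, 2, 3, 5):
        print(f"replicas={n} majority={majority(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

With two replicas the majority is two, so neither replica can be down; with three replicas, one can be.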

Anyway, I'm not seeing anything actionable here, so I'm going to close this JIRA. General queries like these (i.e. "help me understand why my cluster is restarting slowly") should be directed towards the Kudu user mailing list or Slack channels.

> kudu cluster restart very long time to reused
> ---------------------------------------------
>
>                 Key: KUDU-2638
>                 URL: https://issues.apache.org/jira/browse/KUDU-2638
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: jiaqiyang
>            Priority: Major
>             Fix For: n/a
>
>
> When I restart my Kudu cluster, all tablets are unavailable.
> Running kudu cluster ksck shows:
> Table Summary
> Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
> -----+-------------+---------------+---------+------------------+------------
> t1   | HEALTHY     | 1             | 1       | 0                | 0
> t2   | UNAVAILABLE | 5             | 0       | 1                | 4
> t3   | UNAVAILABLE | 6             | 2       | 0                | 4
> t3   | UNAVAILABLE | 3             | 0       | 0                | 3


