Posted to issues@kudu.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2016/06/23 18:55:16 UTC

[jira] [Commented] (KUDU-1495) Deleted tablets may not quiesce maintenance operations in a timely fashion

    [ https://issues.apache.org/jira/browse/KUDU-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15346980#comment-15346980 ] 

Todd Lipcon commented on KUDU-1495:
-----------------------------------

Seems like the issue is this block:
{code}
    while (iter->first->running_ > 0) {
      op->cond_->Wait();
      iter = ops_.find(op);
      CHECK(iter != ops_.end()) << "Tried to unregister " << op->name()
          << ", but another thread unregistered it while we were "
          << "waiting for it to complete";
    }
{code}

We have 4 running compactions, so each time one finishes, the running count drops to 3, the op doesn't get unregistered, and another compaction gets scheduled.

Seems like we should either (a) not mark compaction as runnable on a QUIESCING tablet, or (b) have the maintenance manager 'unregister' wait for running ops before unregistering them (or somehow mark them unrunnable while waiting).
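
To make the "mark them unrunnable while waiting" idea in (b) concrete, here is a rough, self-contained sketch. It is not the actual Kudu maintenance manager: ToyOp, ToyManager, the cancelled flag, and every method name are hypothetical stand-ins, and the locking is reduced to a single mutex and condition variable. It only shows that once the scheduler skips ops flagged for unregistration, the running count can actually drain to zero.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <set>
#include <string>

// Hypothetical, simplified stand-in for a maintenance op (not Kudu's class).
struct ToyOp {
  std::string name;
  int running = 0;         // instances currently executing
  bool runnable = true;    // what the op itself reports (e.g. tablet needs compaction)
  bool cancelled = false;  // assumed flag: set once unregistration has begun
};

// Hypothetical, simplified stand-in for the maintenance manager.
class ToyManager {
 public:
  void Register(ToyOp* op) {
    std::lock_guard<std::mutex> l(mu_);
    ops_.insert(op);
  }

  // Scheduler step: start one instance of some op that is runnable and not
  // cancelled. Returns nullptr if nothing is eligible.
  ToyOp* ScheduleNext() {
    std::lock_guard<std::mutex> l(mu_);
    for (ToyOp* op : ops_) {
      if (op->runnable && !op->cancelled) {
        ++op->running;
        return op;
      }
    }
    return nullptr;
  }

  // Called when one running instance of the op completes.
  void Finish(ToyOp* op) {
    std::lock_guard<std::mutex> l(mu_);
    assert(op->running > 0);
    --op->running;
    cv_.notify_all();
  }

  // Stop new instances from being scheduled, without waiting.
  void Cancel(ToyOp* op) {
    std::lock_guard<std::mutex> l(mu_);
    op->cancelled = true;
  }

  // Cancel first, then wait for in-flight instances to drain. Because the
  // scheduler skips cancelled ops, the running count can only go down here.
  void Unregister(ToyOp* op) {
    std::unique_lock<std::mutex> l(mu_);
    op->cancelled = true;
    cv_.wait(l, [op] { return op->running == 0; });
    ops_.erase(op);
  }

  bool IsRegistered(ToyOp* op) {
    std::lock_guard<std::mutex> l(mu_);
    return ops_.count(op) > 0;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::set<ToyOp*> ops_;
};
```

In this sketch the flag is set before the wait, so the scheduler cannot start a replacement each time a running instance finishes. Option (a) would instead make the op itself report unrunnable when its tablet is QUIESCING; both could be combined.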

> Deleted tablets may not quiesce maintenance operations in a timely fashion
> --------------------------------------------------------------------------
>
>                 Key: KUDU-1495
>                 URL: https://issues.apache.org/jira/browse/KUDU-1495
>             Project: Kudu
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 0.9.0
>            Reporter: Todd Lipcon
>
> With multiple maintenance manager threads, if a tablet is very under-compacted, and you delete the tablet, it will get stuck in 'QUIESCING' state for quite some time. It seems like new maintenance operations will still start on that tablet even though it is in 'quiescing' state. This can cause a tablet to remain for quite some time running compactions even after its table has been deleted.


