Posted to issues@kudu.apache.org by "shenxingwuying (Jira)" <ji...@apache.org> on 2022/05/05 03:50:00 UTC
[jira] [Commented] (KUDU-3364) Add TimerThread to ThreadPool to support a category of problem
[ https://issues.apache.org/jira/browse/KUDU-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532021#comment-17532021 ]
shenxingwuying commented on KUDU-3364:
--------------------------------------
Put simply, I want another way to implement periodic rescheduling, one that also lets us manually trigger the task so that it runs immediately.
To illustrate my intent: I am adding a rebalance interface to the kudu CLI.
https://gerrit.cloudera.org/c/18402/12/src/kudu/master/master_service.cc#686
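A minimal sketch of what "periodic, but also manually triggerable" could look like, using only the C++ standard library. PeriodicTask, TriggerNow and run_count are hypothetical names for illustration, not Kudu APIs, and this is not the code in the Gerrit change above.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical helper (not Kudu code): runs `task` once per `period`,
// or sooner when TriggerNow() is called.
class PeriodicTask {
 public:
  PeriodicTask(std::function<void()> task, std::chrono::milliseconds period)
      : task_(std::move(task)), period_(period) {
    assert(task_ && "task must be callable");
    thread_ = std::thread([this] { RunLoop(); });
  }

  ~PeriodicTask() {
    {
      std::lock_guard<std::mutex> l(mu_);
      stop_ = true;
    }
    cv_.notify_all();
    thread_.join();
  }

  // Asynchronous trigger: returns at once, the worker thread runs the task.
  void TriggerNow() {
    {
      std::lock_guard<std::mutex> l(mu_);
      triggered_ = true;
    }
    cv_.notify_all();
  }

  // Number of completed runs so far.
  int run_count() const {
    std::lock_guard<std::mutex> l(mu_);
    return runs_;
  }

 private:
  void RunLoop() {
    std::unique_lock<std::mutex> l(mu_);
    while (!stop_) {
      // Sleep for one period unless stopped or manually triggered.
      cv_.wait_for(l, period_, [this] { return stop_ || triggered_; });
      if (stop_) break;
      triggered_ = false;
      l.unlock();
      task_();  // run the task without holding the lock
      l.lock();
      ++runs_;
    }
  }

  std::function<void()> task_;
  std::chrono::milliseconds period_;
  mutable std::mutex mu_;
  std::condition_variable cv_;
  bool stop_ = false;
  bool triggered_ = false;
  int runs_ = 0;
  std::thread thread_;
};
```

With something of this shape, an RPC handler would only need to call TriggerNow() and return, so the caller is not blocked while the rebalance runs.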
> Add TimerThread to ThreadPool to support a category of problem
> --------------------------------------------------------------
>
> Key: KUDU-3364
> URL: https://issues.apache.org/jira/browse/KUDU-3364
> Project: Kudu
> Issue Type: New Feature
> Reporter: shenxingwuying
> Assignee: shenxingwuying
> Priority: Minor
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> h1. Scenarios
> In general, I am talking about a category of problem.
> Kudu has a number of periodic tasks and automatically triggered scheduling tasks,
> for example automatic rebalancing of cluster data, some GC tasks, and compaction tasks.
> They are implemented with a kudu Thread, or maybe std::thread or ThreadPool; the actual task is scheduled periodically or triggered by an internal strategy.
> All of this is internal, so we cannot do anything about it from outside.
> In fact, we need a method under our control to trigger these kinds of actions.
> Some scenarios are significant.
> Examples are below:
>
> h2. data rebalance
> There are two ways to rebalance:
> 1. enable auto rebalance
> 2. use rebalance tool 1.14 before.
> The two ways may conflict through racing operations, because the rebalance tool's logic is a little complex and runs in the tool, while auto rebalance runs in the master.
> In the future, auto rebalance in the master will become very stable and become the main way to rebalance data. At the same time, admin operators need an external way to trigger a rebalance, just like auto rebalance does.
> But for now, auto rebalance runs in a thread on a fixed time period.
> We could add an API to MasterService, but that API would be synchronous and would cost a lot; we need an asynchronous method to trigger the rebalance.
> h2. auto compaction
> Another example is auto compaction.
> I have found that the compaction strategy is not always effective, so we may need a method, controlled by admin users, to trigger compaction.
> If we can do a RowSetInCompaction, we do not need to restart the kudu cluster.
> h1. My Solution
> Add a timer to ThreadPool. The timer is a worker thread that submits tasks to the specified thread at the scheduled time.
> We can restrict this so that only a SERIAL ThreadPoolToken can enable the TimerThread.
> The pseudo code below expresses my intention:
> {code:java}
> // code placeholder
> class TimerThread {
>   class Task {
>     ThreadPoolToken* token;
>     std::function<void()> f;
>   };
>
>   void Schedule(Task task, int delay_ms) {
>     tasks_.insert(...);
>   }
>   void RunLoop() {
>     while (...) {
>       SleepFor(100ms);
>       tasks = FindTasks();  // tasks that are due
>       for (auto task : tasks) {
>         token = task.token;
>         token->Submit(task.f);
>         tasks_.erase...
>       }
>     }
>   }
>   scoped_refptr<Thread> thread_;
>   std::multimap<MonoTime, Task> tasks_;
> };
> class ThreadPool {
>   ...
>   TimerThread* timer_;
>   ...
> };
> class ThreadPoolToken {
>   void Scheduler();
> };
> {code}
> This scheme is compatible with the existing ThreadPool, because the timer is nullptr by default.
> For periodic tasks, we can use a control ThreadPool with a timer to refactor some code and make it clearer, avoiding the past problem of too many separate single threads.
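The pseudo code in the quoted description could be sketched in standard C++ roughly as follows. This is only an illustration under assumptions: Submitter is a hypothetical stand-in for ThreadPoolToken::Submit, std::chrono::steady_clock replaces MonoTime, and the loop waits on a condition variable until the earliest deadline instead of polling every 100ms. None of it is existing Kudu code.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <map>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for the proposed TimerThread (not Kudu code).
class TimerThread {
 public:
  using Clock = std::chrono::steady_clock;
  using Closure = std::function<void()>;
  // Stand-in for ThreadPoolToken::Submit: hands a closure to some executor.
  using Submitter = std::function<void(Closure)>;

  TimerThread() : thread_([this] { RunLoop(); }) {}

  ~TimerThread() {
    {
      std::lock_guard<std::mutex> l(mu_);
      stop_ = true;
    }
    cv_.notify_all();
    thread_.join();
  }

  // Pass `f` to `submit` once `delay` has elapsed.
  void Schedule(Submitter submit, Closure f, std::chrono::milliseconds delay) {
    assert(submit && f);
    {
      std::lock_guard<std::mutex> l(mu_);
      tasks_.emplace(Clock::now() + delay,
                     Task{std::move(submit), std::move(f)});
    }
    cv_.notify_all();  // the earliest deadline may have changed
  }

 private:
  struct Task {
    Submitter submit;
    Closure f;
  };

  void RunLoop() {
    std::unique_lock<std::mutex> l(mu_);
    while (!stop_) {
      if (tasks_.empty()) {
        cv_.wait(l);  // woken by Schedule() or the destructor
        continue;
      }
      const Clock::time_point deadline = tasks_.begin()->first;
      if (Clock::now() < deadline) {
        cv_.wait_until(l, deadline);  // re-check everything after waking
        continue;
      }
      // The earliest task is due: remove it, then submit without the lock.
      Task task = std::move(tasks_.begin()->second);
      tasks_.erase(tasks_.begin());
      l.unlock();
      task.submit(std::move(task.f));
      l.lock();
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  std::multimap<Clock::time_point, Task> tasks_;  // ordered by deadline
  bool stop_ = false;
  std::thread thread_;  // declared last so it starts after the other members
};
```

The timer thread only forwards closures and never runs them itself, so a slow task delays other timers only if the supplied submitter executes it inline. Tasks still pending when the timer is destroyed are dropped in this sketch.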
--
This message was sent by Atlassian Jira
(v8.20.7#820007)