Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/04/28 16:05:24 UTC

[GitHub] [incubator-druid] sascha-coenen commented on issue #6993: [Proposal] Dynamic prioritization and laning

sascha-coenen commented on issue #6993: [Proposal] Dynamic prioritization and laning
URL: https://github.com/apache/incubator-druid/issues/6993#issuecomment-487392418
 
 
   
   I **LOVE LOVE LOVE** the motion to work on dynamic prioritisation and laning and I also like these two terms. 
   
    Actually, I have been thinking about the same feature for a long time. I agree with your problem statement, BUT I would model the problem quite differently. I believe I have an alternative proposal which would be similarly easy to implement but would lead to superior system behaviour (more automatic fairness, more control authority for the admin, more self-healing). I think it can be combined with the existing proposal, and some parts of my proposal below can be considered optional.
   
   ---
   
    As was already stated, Druid internally has a query prioritisation mechanism, but out of the box this priority can only be specified by a client as a static value.
   
    Asking for "dynamic prioritisation" already implies that there are **several aspects that influence a query's priority**.
   
   (Some of these are intrinsic - meaning that Druid can compute them itself. Some are extrinsic - they are known only by an admin/client/user.)
   
   **(a) [extrinsic] the user context/intention** 
    - how urgent is the query: is it originating from an interactive system, or is it a report query that may take a long time?
   - how important is the user - CEO/VIP/customer/employee/backend-process
   
   **(b) [intrinsic] the query's weight** 
   - an estimation of how long it takes to execute or how much computational power is needed
   
   **(c) [intrinsic] the system state**
   - how much of the query result is already cached
   - is the system idle or under heavy load
   
   **(d) [intrinsic] the system/user's history**
    - how much strain a user has put onto the system recently. One might want to down-prioritise queries originating from a heavy hitter over time, or after a quota is reached, to give other, more modest users a fair share as well.
   
   ---
   
   **Alternative Proposal**
    Let me describe, for each of the above, how I would model it individually, and then how those pieces could be put together:
   
   **(a) "the extrinsic context"**
    To model the extrinsic input, a query should be associated with a logical token that an admin can map to certain parameterizations. I'm used to calling such a logical token a "resource-pool", although I don't consider it a good name. Basically, instead of directly setting a "priority" inside a query's context object, I propose specifying a freely chosen logical resource-pool name in the same manner:
   
   ```
   {
       "type": "topN",
       "context": {
           "resource-pool": "interactive-customer-query"
       }
   }
   ```
   
   An admin can pre-configure resource-pools and then queries can refer to them. An unknown resource pool or missing declaration could then map to a "default" resource pool. 
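
    A minimal sketch of what such a broker-side registry of admin-defined pools could look like (all names here - `ResourcePool`, `ResourcePoolRegistry`, the parameter fields - are hypothetical, not existing Druid classes):

    ```java
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical admin-defined parameterization of a logical resource pool.
    // The fields are illustrative; they correspond to the knobs listed further below
    // (base offset for laning, weighting factor, half-life for temporal discounting).
    record ResourcePool(String name, double baseWeightFactor, int laneOffset, double halfLifeMillis) {}

    class ResourcePoolRegistry {
        private final Map<String, ResourcePool> pools = new ConcurrentHashMap<>();
        private final ResourcePool defaultPool = new ResourcePool("default", 1.0, 0, 300_000.0);

        void register(ResourcePool pool) {
            pools.put(pool.name(), pool);
        }

        // An unknown resource pool or a missing declaration maps to the "default" pool.
        ResourcePool resolve(String poolName) {
            return poolName == null ? defaultPool : pools.getOrDefault(poolName, defaultPool);
        }
    }
    ```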
   
   
   **(b) "computing a query weight"**
   - given an incoming query, a broker would first determine a weight for it.
   - one could make this mechanism pluggable to have a smooth migration path.
   -- A simple strategy would assign each query the same weight.
   -- A legacy strategy could set the query weight to be equal to the query priority set in the query context object.
   -- A similarly simple strategy could look up a static query weight from a resource-pool.
   (This would then be functionally equivalent to the current way of setting priorities in the query context)
   -- Even a realistic query weight is rather straightforward and easy to implement:
    ```
        weight := #segments * #metrics * #splits * selectivity

        # stands for "number of"
        selectivity is the fraction of records that must be scanned given the filter condition.
    ```
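
    To illustrate the pluggability, here is a minimal sketch of how such strategies could be expressed (the `QueryWeightStrategy` interface and the `QueryFacts` inputs are hypothetical names, not existing Druid APIs):

    ```java
    // Hypothetical inputs to the weight computation, mirroring the formula above.
    record QueryFacts(int numSegments, int numMetrics, int numSplits, double selectivity, int legacyPriority) {}

    // Pluggable weighting strategy.
    interface QueryWeightStrategy {
        double weight(QueryFacts facts);
    }

    // Simple strategy: every query gets the same weight.
    class UniformWeightStrategy implements QueryWeightStrategy {
        @Override
        public double weight(QueryFacts facts) {
            return 1.0;
        }
    }

    // Legacy strategy: reuse the priority from the query context as the weight.
    class LegacyPriorityWeightStrategy implements QueryWeightStrategy {
        @Override
        public double weight(QueryFacts facts) {
            return facts.legacyPriority();
        }
    }

    // Realistic strategy: weight := #segments * #metrics * #splits * selectivity
    class EstimatedCostWeightStrategy implements QueryWeightStrategy {
        @Override
        public double weight(QueryFacts facts) {
            return facts.numSegments() * facts.numMetrics() * facts.numSplits() * facts.selectivity();
        }
    }
    ```

    The legacy strategy is the migration path: it reproduces today's behaviour of taking the priority from the query context.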
   
   **(c) "the system state"**
   - similar to a query weight, the system state can also be modelled as another weight that can be combined with a query-weight and other inputs to form an eventual query priority.
   - this is where Gian's proposal plugs in. 
   - (c1) "system is under heavy load"
    -- this increases the weight and ends up mapping more queries to low priorities.
   - (c2) "query results are partially cached"
    -- if a certain fraction of a query's results is already cached, this should reduce the query weight.
    -- if a broker can determine which fraction of a query is cached, it can discount the query weight so that only the uncached fraction counts:
        `query-weight := query-weight * (1 - %cached)`
    -- the broker then sends the query along with its weight to the historical, which can optionally further discount the given query weight by observing its local cache situation.
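
    A sketch of how these system-state modifiers could be folded into the weight (assumed inputs: a `loadFactor` that is 1.0 when the cluster is idle and grows under load, and a `cachedFraction` in [0, 1]):

    ```java
    // Combine the base query weight with system-state inputs.
    final class SystemStateWeighting {
        static double apply(double queryWeight, double loadFactor, double cachedFraction) {
            // (c1) heavy load inflates the weight, pushing queries toward lower priority.
            double weight = queryWeight * loadFactor;
            // (c2) cached results shrink the remaining work, so only the uncached
            // fraction of the query contributes to the weight.
            weight *= (1.0 - cachedFraction);
            return weight;
        }
    }
    ```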
   
   **(d) "the system/user's history"**
    - an additional weight can be computed based on a user's or resource-pool's query history; this weight acts as a penalty.
   - a broker could keep a temporally-discounted sum over all query-weights per connection or per resource pool (or both)
    - due to a mathematical argument, there exists only a single temporal discounting function that is rational/time-consistent: exponential discounting. This is nice twice over, because a) it means this weighting strategy does not necessarily need to be made pluggable, and b) the exponential function has computational properties that allow for a very efficient incremental computation of an "eternal running sum":
    ```
        temporal-weight := sum over i ( weight(query[i]) * exp( -(now() - time(query[i])) / tau ) )

        tau is derived from a configured half-life: tau = half-life / ln(2)
    ```
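
    The exponential form is exactly what makes the "eternal running sum" cheap: the broker only needs one accumulator and one timestamp per connection or resource pool, because decaying the whole sum by the elapsed time is equivalent to discounting every past query individually. A minimal sketch (hypothetical class; the half-life would come from the resource-pool configuration mentioned below):

    ```java
    // Exponentially discounted running sum of query weights.
    final class DiscountedUsage {
        private final double halfLifeMillis;  // configurable, e.g. per resource pool
        private double sum = 0.0;
        private long lastUpdateMillis = System.currentTimeMillis();

        DiscountedUsage(double halfLifeMillis) {
            this.halfLifeMillis = halfLifeMillis;
        }

        private double decayFactor(long nowMillis) {
            double elapsed = nowMillis - lastUpdateMillis;
            return Math.pow(0.5, elapsed / halfLifeMillis);
        }

        // Record a newly admitted query of the given weight.
        synchronized void add(double queryWeight, long nowMillis) {
            sum = sum * decayFactor(nowMillis) + queryWeight;
            lastUpdateMillis = nowMillis;
        }

        // Current temporal penalty for this connection / resource pool.
        synchronized double current(long nowMillis) {
            return sum * decayFactor(nowMillis);
        }
    }
    ```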
   
   ---
   
   **Putting those pieces together:**
   - a client annotates a query with a resource pool name like "my-pool" instead of directly specifying a static priority.
   - a broker receives the query and computes a weight
     `weight := query-weight * system-weight * temporal-discount`
    - the broker sends the query and its weight to the historicals. Alternatively, the broker can turn the weight into a priority and send that instead.
    - historicals receive the query along with its weight and either treat the weight as a priority, following their current scheduling logic, or, in a more sophisticated setup, further modulate the weight based on local conditions such as "local load" or "local cache".
    - the broker also uses the resource pool name to look up configuration attributes which can influence/modulate the weight computations, for instance:
   -- base-offsets == laning
   -- weighting factors
    -- directives as to whether to engage or ignore a certain criterion or stage in the weight computation.
   -- specification of which pluggable strategy to use.
   -- half-life for exponential temporal discounting.
   -- in short, any parameterization for the process of computing a dynamic query priority.
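
    Putting the hypothetical pieces from the earlier sketches together on the broker, the combination could look roughly like this (again an assumption-laden sketch, not a proposed final API):

    ```java
    // Hypothetical broker-side glue: resolve the pool, compute the query weight,
    // apply system state and the temporal penalty, then optionally derive a priority.
    final class DynamicPrioritizer {
        private final ResourcePoolRegistry pools;
        private final QueryWeightStrategy weightStrategy;

        DynamicPrioritizer(ResourcePoolRegistry pools, QueryWeightStrategy weightStrategy) {
            this.pools = pools;
            this.weightStrategy = weightStrategy;
        }

        double computeWeight(String poolName, QueryFacts facts, double loadFactor,
                             double cachedFraction, DiscountedUsage usage, long nowMillis) {
            ResourcePool pool = pools.resolve(poolName);
            double queryWeight = weightStrategy.weight(facts) * pool.baseWeightFactor();
            double systemWeight = SystemStateWeighting.apply(queryWeight, loadFactor, cachedFraction);
            double temporalPenalty = 1.0 + usage.current(nowMillis);
            usage.add(queryWeight, nowMillis);
            return systemWeight * temporalPenalty;
        }

        // Higher weight => lower priority; the pool's lane offset acts as a base offset (laning).
        int toPriority(double weight, ResourcePool pool) {
            return pool.laneOffset() - (int) Math.round(Math.log1p(weight));
        }
    }
    ```

    Whether the historical receives the raw weight or the derived priority is then just a question of which of the two values the broker forwards.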
   
   ---
   
    Although it might seem like a nice-to-have component, it is especially the incorporation of exponential temporal discounting that would lead to a self-healing system with automatic fairness.
    No query would ever have to be rejected. Queries might simply get such high weights that it is never their turn to execute, which might eventually time them out.
    Likewise, users pinging the system to death with massive amounts of queries will not be able to impact other, more modest users.
   
    All of the above might sound complex on paper, but even if all of it got implemented, it would still be very straightforward to do and would constitute a minor effort.
   
    There are more sophisticated solutions, which would allow a query to move from one lane to another.
   
    Lastly, although I agree that one might want to start small, I consider it useful and necessary to formulate a "true north" vision nonetheless - an ideal, fully fledged solution. One can then still implement only part of it, but it allows for validating whether that first step is well aligned with the later steps one might want to take.
