You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/03/07 22:59:50 UTC

[GitHub] [incubator-druid] egor-ryashin commented on a change in pull request #7185: Avoid many unnecessary materializations of collections of 'all segments in cluster' cardinality

egor-ryashin commented on a change in pull request #7185: Avoid many unnecessary materializations of collections of 'all segments in cluster' cardinality
URL: https://github.com/apache/incubator-druid/pull/7185#discussion_r263576057
 
 

 ##########
 File path: server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java
 ##########
 @@ -94,10 +93,29 @@
 @ManageLifecycle
 public class DruidCoordinator
 {
-  public static Comparator<DataSegment> SEGMENT_COMPARATOR = Ordering.from(Comparators.intervalsByEndThenStart())
-                                                                     .onResultOf(DataSegment::getInterval)
-                                                                     .compound(Ordering.<DataSegment>natural())
-                                                                     .reverse();
+  /**
+   * This comparator orders "freshest" segments first, i. e. segments with most recent intervals.
+   *
+   * It is used in historical nodes' {@link LoadQueuePeon}s to make historicals load more recent segment first.
+   *
+   * It is also used in {@link DruidCoordinatorRuntimeParams} for {@link
+   * DruidCoordinatorRuntimeParams#getAvailableSegments()} - a collection of segments to be considered during some
+   * coordinator run for different {@link DruidCoordinatorHelper}s. The order matters only for {@link
+   * DruidCoordinatorRuleRunner}, which tries to apply the rules while iterating the segments in the order imposed by
+   * this comparator. In {@link LoadRule} the throttling limit may be hit (via {@link ReplicationThrottler}; see
+   * {@link CoordinatorDynamicConfig#getReplicationThrottleLimit()}). So before we potentially hit this limit, we want
+   * to schedule loading the more recent segments (among all of those that need to be loaded).
+   *
+   * In both {@link LoadQueuePeon}s and {@link DruidCoordinatorRuleRunner}, we want to load more recent segments first
+   * because presumably they are queried more often and contain are more important data for users, so if the Druid
+   * cluster has availability problems and struggling to make all segments available immediately, at least we try to
+   * make more "important" (more recent) segments available as soon as possible.
+   */
+  static Comparator<DataSegment> SEGMENT_COMPARATOR_RECENT_FIRST = Ordering
 
 Review comment:
   Not `final`?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org