You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/06/04 08:06:37 UTC

[GitHub] [druid] clintropolis commented on a change in pull request #9965: API to verify a datasource has the latest ingested data

clintropolis commented on a change in pull request #9965:
URL: https://github.com/apache/druid/pull/9965#discussion_r434946314



##########
File path: server/src/main/java/org/apache/druid/server/http/DataSourcesResource.java
##########
@@ -391,6 +393,43 @@ public Response getServedSegmentsInInterval(
     return getServedSegmentsInInterval(dataSourceName, full != null, theInterval::contains);
   }
 
+  @GET
+  @Path("/{dataSourceName}/loadstatus")
+  @Produces(MediaType.APPLICATION_JSON)
+  @ResourceFilters(DatasourceResourceFilter.class)
+  public Response getDatasourceLoadstatus(
+      @PathParam("dataSourceName") String dataSourceName,
+      @QueryParam("interval") @Nullable final String interval,
+      @QueryParam("firstCheck") @Nullable final Boolean firstCheck
+  )
+  {
+    if (serverInventoryView == null || serverInventoryView.getSegmentLoadInfos() == null) {
+      return Response.ok(ImmutableMap.of("loaded", false)).build();
+    }
+    // Force poll
+    Interval theInterval = interval == null ? Intervals.ETERNITY : Intervals.of(interval);
+    boolean requiresMetadataStorePoll = firstCheck == null ? true :firstCheck;
+
+    Optional<Iterable<DataSegment>> segments = segmentsMetadataManager.iterateAllUsedNonOvershadowedSegmentsForDatasourceInterval(
+        dataSourceName,
+        theInterval,
+        requiresMetadataStorePoll
+    );
+
+    if (!segments.isPresent()) {
+      return logAndCreateDataSourceNotFoundResponse(dataSourceName);

Review comment:
       It seems a little off that not having `serverInventoryView` or it not being initialized would return a `{"loaded":false}` but a datasource not existing would be an empty response. I think this response is fine, but maybe the `serverInventoryView == null` case should be an error response indicating that information in unavailable

##########
File path: server/src/main/java/org/apache/druid/server/http/DataSourcesResource.java
##########
@@ -391,6 +393,43 @@ public Response getServedSegmentsInInterval(
     return getServedSegmentsInInterval(dataSourceName, full != null, theInterval::contains);
   }
 
+  @GET
+  @Path("/{dataSourceName}/loadstatus")
+  @Produces(MediaType.APPLICATION_JSON)
+  @ResourceFilters(DatasourceResourceFilter.class)
+  public Response getDatasourceLoadstatus(
+      @PathParam("dataSourceName") String dataSourceName,
+      @QueryParam("interval") @Nullable final String interval,
+      @QueryParam("firstCheck") @Nullable final Boolean firstCheck

Review comment:
       I think this parameter should be something like `forceMetadataPoll` or `forceMetadataRefresh` something

##########
File path: server/src/main/java/org/apache/druid/metadata/SqlSegmentsMetadataManager.java
##########
@@ -403,11 +425,16 @@ private void awaitOrPerformDatabasePoll()
   }
 
   /**
-   * If the latest {@link DatabasePoll} is a {@link PeriodicDatabasePoll}, or an {@link OnDemandDatabasePoll} that is
-   * made not longer than {@link #periodicPollDelay} from now, awaits for it and returns true; returns false otherwise,
-   * meaning that a new on-demand database poll should be initiated.
+   * This method returns true without waiting for database poll if the latest {@link DatabasePoll} is a
+   * {@link PeriodicDatabasePoll} that has completed it's first poll, or an {@link OnDemandDatabasePoll} that is
+   * made not longer than {@link #periodicPollDelay} from current time.
+   * This method does wait untill completion for if the latest {@link DatabasePoll} is a
+   * {@link PeriodicDatabasePoll} that has not completed it's first poll, or an {@link OnDemandDatabasePoll} that is
+   * alrady in the process of polling the database.
+   * This means that any method using this check can read from snapshot that is
+   * up to {@link SqlSegmentsMetadataManager#periodicPollDelay} old.
    */
-  private boolean awaitLatestDatabasePoll()
+  private boolean useLatestSnapshotIfWithinDelay()

Review comment:
       nit: i think the old name was better, and still works with the new method being named `forceOrWaitOngoingDatabasePoll`

##########
File path: server/src/main/java/org/apache/druid/metadata/SegmentsMetadataManager.java
##########
@@ -113,6 +114,18 @@ int markAsUsedNonOvershadowedSegments(String dataSource, Set<String> segmentIds)
    */
   Iterable<DataSegment> iterateAllUsedSegments();
 
+  /**
+   * Returns an iterable to go over all used and non-overshadowed segments of given data sources over given interval.
+   * The order in which segments are iterated is unspecified.
+   * If {@param requiresLatest} is true then a force metadatastore poll will be triggered. This can cause a longer
+   * response time but will ensure that the latest segment information (at the time this method is called) is returned.
+   * If {@param requiresLatest} is false then segment information from stale snapshot of up to the last periodic poll
+   * period {@link SqlSegmentsMetadataManager#periodicPollDelay} will be used.
+   */
+  Optional<Iterable<DataSegment>> iterateAllUsedNonOvershadowedSegmentsForDatasourceInterval(String datasource,

Review comment:
       nit: formatting seems strange here, i suggest: 
   ```
     Optional<Iterable<DataSegment>> iterateAllUsedNonOvershadowedSegmentsForDatasourceInterval(
         String datasource,
         Interval interval,
         boolean requiresLatest
     );
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org