You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/05/01 01:25:43 UTC

[GitHub] [incubator-druid] jon-wei commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table

jon-wei commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r279989631
 
 

 ##########
 File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java
 ##########
 @@ -159,14 +162,61 @@ public Response getDatabaseSegments(
     }
     final Stream<DataSegment> metadataSegments = dataSourceStream.flatMap(t -> t.getSegments().stream());
 
-    final Function<DataSegment, Iterable<ResourceAction>> raGenerator = segment -> Collections.singletonList(
-        AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource()));
+    if (includeOvershadowedStatus != null) {
+      final Iterable<SegmentWithOvershadowedStatus> authorizedSegments = findAuthorizedSegmentWithOvershadowedStatus(
+          req,
+          druidDataSources,
+          metadataSegments
+      );
+      Response.ResponseBuilder builder = Response.status(Response.Status.OK);
+      return builder.entity(authorizedSegments).build();
+    } else {
+
+      final Function<DataSegment, Iterable<ResourceAction>> raGenerator = segment -> Collections.singletonList(
+          AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource()));
+
+      final Iterable<DataSegment> authorizedSegments = AuthorizationUtils.filterAuthorizedResources(
+          req,
+          metadataSegments::iterator,
+          raGenerator,
+          authorizerMapper
+      );
+
+      Response.ResponseBuilder builder = Response.status(Response.Status.OK);
+      return builder.entity(authorizedSegments).build();
+    }
+  }
 
-    final Iterable<DataSegment> authorizedSegments =
-        AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper);
+  private Iterable<SegmentWithOvershadowedStatus> findAuthorizedSegmentWithOvershadowedStatus(
+      HttpServletRequest req,
+      Collection<ImmutableDruidDataSource> druidDataSources,
+      Stream<DataSegment> metadataSegments
+  )
+  {
+    // It's fine to add all overshadowed segments to a single collection because only
+    // a small fraction of the segments in the cluster are expected to be overshadowed,
+    // so building this collection shouldn't generate a lot of garbage.
+    final Set<DataSegment> overshadowedSegments = new HashSet<>();
+    for (ImmutableDruidDataSource dataSource : druidDataSources) {
+      overshadowedSegments.addAll(ImmutableDruidDataSource.determineOvershadowedSegments(dataSource.getSegments()));
 
 Review comment:
   I haven't really formed an opinion on DataSegment mutability presently, but I think @leventov's suggestion for lazily computing the overshadowed view at most once per SQLSegmentMetadataManager poll() and sharing that view with the metadata retrieval APIs and the coordinator balancing logic makes a lot of sense. 
   
   > Because the current design doesn't seem reasonable to me at this point. (So there won't be much difference from as if you just do the implementation right in this PR, but if you wish you can separate in two PRs.)
   
   I agree with making the adjustment to the overshadowed view computation as an immediate follow on, though I think a separate PR is a bit better:
   - The coordinator balancing logic is a pretty "core" part of the system, and I feel like it would be better to change that in a separate PR that calls attention more explicitly to that/isolates that change more
   - This PR is getting a bit long, a little tedious to navigate

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org