You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/05/13 12:17:33 UTC

[GitHub] [incubator-druid] leventov opened a new pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource

leventov opened a new pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource
URL: https://github.com/apache/incubator-druid/pull/7653
 
 
   This PR is a large portion of #7306 PR extracted, but avoiding renaming classes so that Git doesn't lose track of files' history.
   
   ### Description
   
   #### Decouple `MetadataSegmentManager.start`/`stop` and `startPollingDatabasePeriodically`/`stopPollingDatabasePeriodically`.
   
    - `Lifecycle`'s `start()` now doesn't start polling the database periodically. 
    - `stopPollingDatabasePeriodically()` doesn't shutdown the executor, only the polling task. Next `startPollingDatabasePeriodically()` reuses the executor.
   
   #### `SQLMetadataSegmentManager`: periodic and on-demand database polls
   Made the `MetadataSegmentManager`'s interface simpler and more robust by not exposing "periodic polls" implementation detail. Instead, when the Coordinator is not leading (hence `MetadataSegmentManager` is not polling the database periodically) and somebody (e. g. `DataSourcesResource`) queries the data from `MetadataSegmentManager`, an on-demand database poll is initiated under the hoods (if the previous on-demand poll was long time ago).
   
   This change reworks the change #7447 by @gianm.
   
   #### `DataSourcesResource` return code changes: returning numChangedSegments: N and segmentStateChanged: boolean
   
   POST `/druid/coordinator/v1/datasources/{dataSourceName}`
   DELETE `/druid/coordinator/v1/datasources/{dataSourceName}`
   POST `/druid/coordinator/v1/datasources/{dataSourceName}/markUnused`
   POST `/druid/coordinator/v1/datasources/{dataSourceName}/markUsed`
   
   Now return a JSON object of the form `{"numChangedSegments": N}` instead of 204 (empty response) when no segments were changed. On the other hand, 500 (server error) is not returned instead of 204 (empty response).
   
   There is a separate question of whether all endpoints in `DataSourcesResource` should return 404 on non-existent data source: see #7652. This aspect is not changed in this PR.
   
   POST `/druid/coordinator/v1/datasources/{dataSourceName}/segments/{segmentId}`
   DELETE `/druid/coordinator/v1/datasources/{dataSourceName}/segments/{segmentId}`
   
   Now return a JSON object of the form `{"segmentStateChanged": true/false}`.
   
   The change originates from this discussion: https://github.com/apache/incubator-druid/pull/7306#discussion_r275633562
   
   #### REST API change: GET `/druid/coordinator/v1/metadata/datasources?includeDisabled` -> `includeUnused`
   `includeDisabled` is still accepted, but a warning is emitted in a log. See changes in `DataSourcesResource` class.
   
   FYI added a method `JettyUtils.getQueryParam()` to facilitate such parameter renames. (That's why this PR has `Compatibility` tag.)
   
   #### `MetadataSegmentManager`: API refactor
   
    - `enableDataSource` -> `markAsUsedAllNonOvershadowedSegmentsInDataSource`
    - `enableSegments(dataSource, interval)` -> `markAsUsedNonOvershadowedSegmentsInInterval`
    - `enableSegments(dataSource, Collection segmentIds)` -> `markAsUsedNonOvershadowedSegments(dataSource, Set segmentIds) throws UnknownSegmentIdException`
    - `enableSegment` -> `markSegmentAsUsed`
    - `removeDataSource` -> `markAsUnusedAllSegmentsInDataSource`
    - `disableSegments(dataSource, interval)` -> `markAsUnusedSegmentsInInterval`
    - `disableSegments(dataSource, Collection segmentIds)` -> `markSegmentsAsUnused(dataSource, Set segmentIds) throws UnknownSegmentIdException`
    - `removeSegment` -> `markSegmentAsUnused`
    - `getDataSource` -> `prepareImmutableDataSourceWithUsedSegments`
    - `getDataSources` -> `prepareImmutableDataSourcesWithAllUsedSegments`
    - `iterateAllSegments` -> `iterateAllUsedSegments`
    - `getAllDataSourceNames` -> `retrieveAllDataSourceNames`
   
   #### `SQLMetadataSegmentManager`: remove data from `dataSources` in `markAsUnusedSegmentsInInterval` and `markSegmentsAsUnused`
   
   Fixed the logic added in #7490 to match the behavior of `markAsUnusedAllSegmentsInDataSource`. FYI @dampcake.
   
   #### Optimizations
    - Fixed needlessly expensive `ImmutableDruidDataSource.hashCode()`, see https://github.com/apache/incubator-druid/pull/6901#discussion_r265741002.
    - Removed needless `toImmutableDataSource()` (hence garbage) in `DataSourcesResource.getDataSource()`.
   
   <hr>
   
   This PR is `Incompatible` because it changes return codes of REST methods.
   
   This PR is `Development Blocker` because it blocks #7306.
   
   <hr>
   
   For reviewers: key changed classes are `SQLMetadataSegmentManager` and `DataSourcesResource`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org