You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "klsince (via GitHub)" <gi...@apache.org> on 2024/02/20 23:09:29 UTC
[PR] Enter segment preloading at partition level [pinot]
klsince opened a new pull request, #12451:
URL: https://github.com/apache/pinot/pull/12451
Currently, we trigger the segment preloading only when creating the tableDataMgr on a server, which is less flexible than triggering at the partition level. For example, when moving a partition to a server already hosting some other partitions, the current triggering logic at table level won't preload segments for the new partition.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
Re: [PR] Enter segment preloading at partition level [pinot]
Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang merged PR #12451:
URL: https://github.com/apache/pinot/pull/12451
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
Re: [PR] Enter segment preloading at partition level [pinot]
Posted by "gortiz (via GitHub)" <gi...@apache.org>.
gortiz commented on code in PR #12451:
URL: https://github.com/apache/pinot/pull/12451#discussion_r1514496150
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BaseTableUpsertMetadataManager.java:
##########
@@ -55,20 +39,11 @@ public abstract class BaseTableUpsertMetadataManager implements TableUpsertMetad
private static final Logger LOGGER = LoggerFactory.getLogger(BaseTableUpsertMetadataManager.class);
protected String _tableNameWithType;
- protected TableDataManager _tableDataManager;
- protected HelixManager _helixManager;
- protected ExecutorService _segmentPreloadExecutor;
Review Comment:
these are all breaking changes. Third party code may be extending this class and using these attributes
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
Re: [PR] Enter segment preloading at partition level [pinot]
Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on code in PR #12451:
URL: https://github.com/apache/pinot/pull/12451#discussion_r1498371551
##########
pinot-common/src/main/java/org/apache/pinot/common/utils/SegmentUtils.java:
##########
@@ -49,6 +49,24 @@ public static Integer getRealtimeSegmentPartitionId(String segmentName, String r
ZKMetadataProvider.getSegmentZKMetadata(helixManager.getHelixPropertyStore(), realtimeTableName, segmentName);
Preconditions.checkState(segmentZKMetadata != null,
"Failed to find segment ZK metadata for segment: %s of table: %s", segmentName, realtimeTableName);
+ return getRealtimeSegmentPartitionId(segmentZKMetadata, partitionColumn);
+ }
+
+ @Nullable
+ public static Integer getRealtimeSegmentPartitionId(String segmentName, SegmentZKMetadata segmentZKMetadata,
+ @Nullable String partitionColumn) {
+ // A fast path if the segmentName is an LLC segment name: get the partition id from the name directly
+ LLCSegmentName llcSegmentName = LLCSegmentName.of(segmentName);
+ if (llcSegmentName != null) {
+ return llcSegmentName.getPartitionGroupId();
+ }
+ // Otherwise, retrieve the partition id from the segment zk metadata.
+ return getRealtimeSegmentPartitionId(segmentZKMetadata, partitionColumn);
+ }
+
+ @Nullable
+ public static Integer getRealtimeSegmentPartitionId(SegmentZKMetadata segmentZKMetadata,
Review Comment:
(minor) Make this private? We probably don't want user to directly use this method?
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##########
@@ -102,6 +117,12 @@ public abstract class BasePartitionUpsertMetadataManager implements PartitionUps
// Initialize with 1 pending operation to indicate the metadata manager can take more operations
private int _numPendingOperations = 1;
private boolean _closed;
+ // The lock and boolean flag below ensure only one thread can start preloading and preloading happens only once.
+ private final Lock _preloadLock = new ReentrantLock();
+ private volatile boolean _isPreloaded = false;
Review Comment:
Why do we need 2 flags? Can we initialize `_isPreloading` based on whether preload is enabled?
##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeTableDataManager.java:
##########
@@ -452,6 +460,12 @@ public void addSegment(String segmentName, IndexLoadingConfig indexLoadingConfig
_serverMetrics.addValueToTableGauge(_tableNameWithType, ServerGauge.SEGMENT_COUNT, 1L);
}
+ private boolean isReadyToPreload() {
Review Comment:
Rename this to `isUpsertPreloadEnabled()`. We don't need to check both upsert and this, checking only this should be enough.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
Re: [PR] Enter segment preloading at partition level [pinot]
Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince commented on code in PR #12451:
URL: https://github.com/apache/pinot/pull/12451#discussion_r1509628827
##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeTableDataManager.java:
##########
@@ -359,6 +357,12 @@ public boolean isPartialUpsertEnabled() {
&& _tableUpsertMetadataManager.getUpsertMode() == UpsertConfig.Mode.PARTIAL;
}
+ private boolean isUpsertPreloadEnabled() {
+ UpsertConfig upsertConfig = _tableConfig.getUpsertConfig();
+ return _tableUpsertMetadataManager != null && _segmentPreloadExecutor != null && upsertConfig != null
Review Comment:
Good question. I followed the method call below used to create partition mgr, as that's the entry point for preloading. And this is only called in `addSegment()` or `handleUpsert()` methods. So afaik, we should be good to check this at the beginning of the addSegment() method.
```
... _tableUpsertMetadataManager.getOrCreatePartitionManager(partitionId);
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
Re: [PR] Enter segment preloading at partition level [pinot]
Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on code in PR #12451:
URL: https://github.com/apache/pinot/pull/12451#discussion_r1505162788
##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeTableDataManager.java:
##########
@@ -359,6 +357,12 @@ public boolean isPartialUpsertEnabled() {
&& _tableUpsertMetadataManager.getUpsertMode() == UpsertConfig.Mode.PARTIAL;
}
+ private boolean isUpsertPreloadEnabled() {
+ UpsertConfig upsertConfig = _tableConfig.getUpsertConfig();
+ return _tableUpsertMetadataManager != null && _segmentPreloadExecutor != null && upsertConfig != null
Review Comment:
Not introduced in this PR, but do we have checks in other places? Basically we don't want it to skip preloading silently
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/PartitionUpsertMetadataManager.java:
##########
@@ -64,6 +68,14 @@ public interface PartitionUpsertMetadataManager extends Closeable {
*/
void addSegment(ImmutableSegment segment);
+ /**
+ * Preload segments for the table partition. Segments can be added differently during preloading.
+ */
+ void preloadSegments(TableDataManager tableDataManager, IndexLoadingConfig indexLoadingConfig,
Review Comment:
Suggest only passing in `indexLoadingConfig`. Other properties can be set during initialization. We can consider only setting `tableDataManager` during initialization, and add getters in `TableDataManager` to get `helixManager` and `segmentPreloadExecutor`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
Re: [PR] Enter segment preloading at partition level [pinot]
Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #12451:
URL: https://github.com/apache/pinot/pull/12451#issuecomment-1955449569
## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/12451?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
Attention: `50 lines` in your changes are missing coverage. Please review.
> Comparison is base [(`8c395ff`)](https://app.codecov.io/gh/apache/pinot/commit/8c395ff0788099cdb2e36b768e0694853262b942?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) 61.73% compared to head [(`cb76026`)](https://app.codecov.io/gh/apache/pinot/pull/12451?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) 61.56%.
| [Files](https://app.codecov.io/gh/apache/pinot/pull/12451?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Patch % | Lines |
|---|---|---|
| [...cal/upsert/BasePartitionUpsertMetadataManager.java](https://app.codecov.io/gh/apache/pinot/pull/12451?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC91cHNlcnQvQmFzZVBhcnRpdGlvblVwc2VydE1ldGFkYXRhTWFuYWdlci5qYXZh) | 63.75% | [24 Missing and 5 partials :warning: ](https://app.codecov.io/gh/apache/pinot/pull/12451?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) |
| [...ata/manager/realtime/RealtimeTableDataManager.java](https://app.codecov.io/gh/apache/pinot/pull/12451?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvUmVhbHRpbWVUYWJsZURhdGFNYW5hZ2VyLmphdmE=) | 0.00% | [15 Missing and 1 partial :warning: ](https://app.codecov.io/gh/apache/pinot/pull/12451?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) |
| [...va/org/apache/pinot/common/utils/SegmentUtils.java](https://app.codecov.io/gh/apache/pinot/pull/12451?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvU2VnbWVudFV0aWxzLmphdmE=) | 0.00% | [5 Missing :warning: ](https://app.codecov.io/gh/apache/pinot/pull/12451?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) |
<details><summary>Additional details and impacted files</summary>
```diff
@@ Coverage Diff @@
## master #12451 +/- ##
============================================
- Coverage 61.73% 61.56% -0.18%
Complexity 207 207
============================================
Files 2436 2436
Lines 133173 133196 +23
Branches 20631 20635 +4
============================================
- Hits 82220 81996 -224
- Misses 44907 45146 +239
- Partials 6046 6054 +8
```
| [Flag](https://app.codecov.io/gh/apache/pinot/pull/12451/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
|---|---|---|
| [custom-integration1](https://app.codecov.io/gh/apache/pinot/pull/12451/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `<0.01% <0.00%> (ø)` | |
| [integration](https://app.codecov.io/gh/apache/pinot/pull/12451/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `<0.01% <0.00%> (-0.01%)` | :arrow_down: |
| [integration1](https://app.codecov.io/gh/apache/pinot/pull/12451/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | |
| [integration2](https://app.codecov.io/gh/apache/pinot/pull/12451/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `0.00% <0.00%> (ø)` | |
| [java-11](https://app.codecov.io/gh/apache/pinot/pull/12451/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `0.00% <0.00%> (-61.69%)` | :arrow_down: |
| [java-21](https://app.codecov.io/gh/apache/pinot/pull/12451/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `61.56% <50.49%> (-0.06%)` | :arrow_down: |
| [skip-bytebuffers-false](https://app.codecov.io/gh/apache/pinot/pull/12451/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `34.82% <0.00%> (-26.88%)` | :arrow_down: |
| [skip-bytebuffers-true](https://app.codecov.io/gh/apache/pinot/pull/12451/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `27.70% <50.49%> (-33.89%)` | :arrow_down: |
| [temurin](https://app.codecov.io/gh/apache/pinot/pull/12451/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `61.56% <50.49%> (-0.18%)` | :arrow_down: |
| [unittests](https://app.codecov.io/gh/apache/pinot/pull/12451/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `61.55% <50.49%> (-0.18%)` | :arrow_down: |
| [unittests1](https://app.codecov.io/gh/apache/pinot/pull/12451/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `46.65% <0.00%> (-0.20%)` | :arrow_down: |
| [unittests2](https://app.codecov.io/gh/apache/pinot/pull/12451/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `27.70% <50.49%> (-0.04%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
</details>
[:umbrella: View full report in Codecov by Sentry](https://app.codecov.io/gh/apache/pinot/pull/12451?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
:loudspeaker: Have feedback on the report? [Share it here](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
Re: [PR] Enter segment preloading at partition level [pinot]
Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince commented on code in PR #12451:
URL: https://github.com/apache/pinot/pull/12451#discussion_r1498426332
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##########
@@ -102,6 +117,12 @@ public abstract class BasePartitionUpsertMetadataManager implements PartitionUps
// Initialize with 1 pending operation to indicate the metadata manager can take more operations
private int _numPendingOperations = 1;
private boolean _closed;
+ // The lock and boolean flag below ensure only one thread can start preloading and preloading happens only once.
+ private final Lock _preloadLock = new ReentrantLock();
+ private volatile boolean _isPreloaded = false;
Review Comment:
simplified, by setting _isPreloading to true if preloading is enabled, otherwise false
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
Re: [PR] Enter segment preloading at partition level [pinot]
Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince commented on code in PR #12451:
URL: https://github.com/apache/pinot/pull/12451#discussion_r1514938270
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BaseTableUpsertMetadataManager.java:
##########
@@ -55,20 +39,11 @@ public abstract class BaseTableUpsertMetadataManager implements TableUpsertMetad
private static final Logger LOGGER = LoggerFactory.getLogger(BaseTableUpsertMetadataManager.class);
protected String _tableNameWithType;
- protected TableDataManager _tableDataManager;
- protected HelixManager _helixManager;
- protected ExecutorService _segmentPreloadExecutor;
Review Comment:
those were added when adding the preloading logic but was made at table level. Those got reverted in this PR to move the preloading logic to table partition level for more flexibility and more explicit/clear locking logic.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org