Posted to commits@pinot.apache.org by "Jackie-Jiang (via GitHub)" <gi...@apache.org> on 2023/06/28 21:25:59 UTC

[GitHub] [pinot] Jackie-Jiang commented on a diff in pull request #10927: [WIP] improve disk read for partial upsert handler

Jackie-Jiang commented on code in PR #10927:
URL: https://github.com/apache/pinot/pull/10927#discussion_r1245782927


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ConcurrentMapPartitionUpsertMetadataManager.java:
##########
@@ -254,7 +256,9 @@ protected void doAddRecord(MutableSegment segment, RecordInfo recordInfo) {
   @Override
   protected GenericRow doUpdateRecord(GenericRow record, RecordInfo recordInfo) {
     assert _partialUpsertHandler != null;
-    AtomicReference<GenericRow> previousRecordReference = new AtomicReference<>();
+    AtomicReference<GenericRow> mergedRowReference = new AtomicReference<>();

Review Comment:
   The merge can happen in place, so we don't really need to create this reference. We can modify `record` directly.
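
   A minimal sketch of the in-place idea, not the actual patch. It assumes a plain `Map<String, Object>` as a stand-in for `GenericRow`; `InPlaceMergeSketch`, `mergeInPlace` and `add` are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.BinaryOperator;

class InPlaceMergeSketch {
  // Hypothetical merger for illustration: sums integer values, keeps the new value if there is no previous one.
  static Object add(Object previous, Object current) {
    if (previous == null) {
      return current;
    }
    return (Integer) previous + (Integer) current;
  }

  // Writes each merged value straight back into `record`, so no separate
  // holder (such as an AtomicReference to a merged row) is needed.
  static void mergeInPlace(Map<String, Object> record, Map<String, Object> previous,
      Set<String> primaryKeyColumns, BinaryOperator<Object> merger) {
    for (Map.Entry<String, Object> entry : record.entrySet()) {
      if (!primaryKeyColumns.contains(entry.getKey())) {
        // setValue is not a structural modification, so it is safe during iteration.
        entry.setValue(merger.apply(previous.get(entry.getKey()), entry.getValue()));
      }
    }
  }

  public static void main(String[] args) {
    Map<String, Object> previous = new HashMap<>();
    previous.put("id", 1);
    previous.put("count", 10);
    Map<String, Object> record = new HashMap<>();
    record.put("id", 1);
    record.put("count", 5);
    mergeInPlace(record, previous, Set.of("id"), InPlaceMergeSketch::add);
    System.out.println(record.get("count")); // prints 15
  }
}
```

   The caller keeps using the same `record` object after the call, which is why the extra reference becomes unnecessary.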



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ConcurrentMapPartitionUpsertMetadataManager.java:
##########
@@ -268,13 +272,28 @@ protected GenericRow doUpdateRecord(GenericRow record, RecordInfo recordInfo) {
             int currentDocId = recordLocation.getDocId();
             if (currentQueryableDocIds == null || currentQueryableDocIds.contains(currentDocId)) {
               _reuse.clear();

Review Comment:
   Is `_reuse` no longer used?



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ConcurrentMapPartitionUpsertMetadataManager.java:
##########
@@ -268,13 +272,28 @@ protected GenericRow doUpdateRecord(GenericRow record, RecordInfo recordInfo) {
             int currentDocId = recordLocation.getDocId();
             if (currentQueryableDocIds == null || currentQueryableDocIds.contains(currentDocId)) {
               _reuse.clear();
-              previousRecordReference.set(currentSegment.getRecord(currentDocId, _reuse));
+              for (String column: record.getFieldToValueMap().keySet()) {
+                PinotSegmentColumnReader pinotSegmentColumnReader =
+                    new PinotSegmentColumnReader(recordLocation.getSegment(), column);

Review Comment:
   We also need to close the reader after reading the value.
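
   One way to guarantee the close is try-with-resources. The sketch below assumes the reader can be used as an `AutoCloseable` resource; `ColumnReader` here is a hypothetical stand-in backed by an array, not the real `PinotSegmentColumnReader`.

```java
class CloseReaderSketch {
  // Tracks how many stand-in readers are currently open, to make leaks visible.
  static int openReaders = 0;

  // Hypothetical stand-in for a per-column segment reader.
  static final class ColumnReader implements AutoCloseable {
    private final int[] _values;

    ColumnReader(int[] values) {
      _values = values;
      openReaders++;
    }

    int getValue(int docId) {
      return _values[docId];
    }

    @Override
    public void close() {
      openReaders--;
    }
  }

  // The reader is closed when the try block exits, whether normally or via an exception.
  static int readValue(int[] columnValues, int docId) {
    try (ColumnReader reader = new ColumnReader(columnValues)) {
      return reader.getValue(docId);
    }
  }

  public static void main(String[] args) {
    System.out.println(readValue(new int[] {7, 8, 9}, 1)); // prints 8
    System.out.println(openReaders); // prints 0
  }
}
```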



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ConcurrentMapPartitionUpsertMetadataManager.java:
##########
@@ -268,13 +272,28 @@ protected GenericRow doUpdateRecord(GenericRow record, RecordInfo recordInfo) {
             int currentDocId = recordLocation.getDocId();
             if (currentQueryableDocIds == null || currentQueryableDocIds.contains(currentDocId)) {
               _reuse.clear();
-              previousRecordReference.set(currentSegment.getRecord(currentDocId, _reuse));
+              for (String column: record.getFieldToValueMap().keySet()) {
+                PinotSegmentColumnReader pinotSegmentColumnReader =
+                    new PinotSegmentColumnReader(recordLocation.getSegment(), column);

Review Comment:
   We should create `PinotSegmentColumnReader` only when we actually need to read the value. In override mode, if the value exists, we don't need to create it.
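
   A rough sketch of deferring the read, assuming override semantics where a non-null new value wins and the previous value is only a fallback. `LazyReaderSketch`, `readPreviousValue` and `overrideMerge` are hypothetical names; the counter simulates how often a reader would be created.

```java
import java.util.function.Supplier;

class LazyReaderSketch {
  // Counts simulated reader creations, i.e. disk reads we would like to avoid.
  static int readersCreated = 0;

  // Hypothetical: stands in for creating a column reader and reading the previous value.
  static Object readPreviousValue(Object storedValue) {
    readersCreated++;
    return storedValue;
  }

  // The supplier is invoked only when the new value is missing,
  // so no reader is created when the new value already decides the result.
  static Object overrideMerge(Object newValue, Supplier<Object> previousValue) {
    return newValue != null ? newValue : previousValue.get();
  }

  public static void main(String[] args) {
    System.out.println(overrideMerge("new", () -> readPreviousValue("old"))); // prints new
    System.out.println(readersCreated); // prints 0
    System.out.println(overrideMerge(null, () -> readPreviousValue("old"))); // prints old
    System.out.println(readersCreated); // prints 1
  }
}
```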



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/PartialUpsertHandler.java:
##########
@@ -90,4 +90,31 @@ public GenericRow merge(GenericRow previousRecord, GenericRow newRecord) {
     }
     return newRecord;
   }
+
+  protected PartialUpsertMerger getMergerForColumn(String column) {
+    return _column2Mergers.getOrDefault(column, _defaultPartialUpsertMerger);
+  }
+
+  public GenericRow merge(String column, Object previousValue, GenericRow newRecord) {
+    if (!_primaryKeyColumns.contains(column)) {

Review Comment:
   These checks should be performed earlier
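
   One possible reading of this, as a sketch only: filter out the primary key columns before any previous value is read, so `merge` is never called for them. `EarlyCheckSketch` and `columnsToMerge` are hypothetical names, and the column lists are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class EarlyCheckSketch {
  // Decides up front which columns need a previous value at all,
  // instead of checking inside the per-column merge call.
  static List<String> columnsToMerge(List<String> recordColumns, Set<String> primaryKeyColumns) {
    List<String> result = new ArrayList<>();
    for (String column : recordColumns) {
      if (!primaryKeyColumns.contains(column)) {
        result.add(column);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> columns = columnsToMerge(List.of("id", "count", "name"), Set.of("id"));
    System.out.println(columns); // prints [count, name]
  }
}
```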



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

