You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by "rohangarg (via GitHub)" <gi...@apache.org> on 2023/04/14 14:44:02 UTC

[GitHub] [druid] rohangarg opened a new pull request, #14089: Log merge and push timings for PartialGenericSegmentMergeTask

rohangarg opened a new pull request, #14089:
URL: https://github.com/apache/druid/pull/14089

   Adds logging for merging of partial segments and also the time taken to push the merged segment to deep storage for `PartialGenericSegmentMergeTask`. Similar logging is being for other merge tasks as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #14089: Log merge and push timings for PartialGenericSegmentMergeTask

Posted by "cryptoe (via GitHub)" <gi...@apache.org>.
cryptoe commented on code in PR #14089:
URL: https://github.com/apache/druid/pull/14089#discussion_r1167354619


##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/PartialSegmentMergeTask.java:
##########
@@ -294,7 +300,20 @@ private Set<DataSegment> mergeAndPushSegments(
             exception -> !(exception instanceof NullPointerException) && exception instanceof Exception,
             5
         );
+        long pushFinishTime = System.nanoTime();
         pushedSegments.add(segment);
+        LOG.info(
+            "Segment[%s] of %,d bytes "
+            + "built from %d partial segment(s) in %,dms; "
+            + "pushed to deep storage in %,dms. "
+            + "Load spec is: %s",
+            segment.getId(),
+            segment.getSize(),
+            segmentFilesToMerge.size(),
+            (mergeFinishTime - startTime) / 1000000,
+            (pushFinishTime - mergeFinishTime) / 1000000,
+            jsonMapper.writeValueAsString(segment.getLoadSpec())

Review Comment:
   Should we also log the interval ?



##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/PartialSegmentMergeTask.java:
##########
@@ -294,7 +300,20 @@ private Set<DataSegment> mergeAndPushSegments(
             exception -> !(exception instanceof NullPointerException) && exception instanceof Exception,
             5
         );
+        long pushFinishTime = System.nanoTime();
         pushedSegments.add(segment);
+        LOG.info(
+            "Segment[%s] of %,d bytes "
+            + "built from %d partial segment(s) in %,dms; "
+            + "pushed to deep storage in %,dms. "
+            + "Load spec is: %s",
+            segment.getId(),
+            segment.getSize(),
+            segmentFilesToMerge.size(),
+            (mergeFinishTime - startTime) / 1000000,
+            (pushFinishTime - mergeFinishTime) / 1000000,
+            jsonMapper.writeValueAsString(segment.getLoadSpec())

Review Comment:
   Why do we want to log the loadspec ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] rohangarg commented on a diff in pull request #14089: Log merge and push timings for PartialGenericSegmentMergeTask

Posted by "rohangarg (via GitHub)" <gi...@apache.org>.
rohangarg commented on code in PR #14089:
URL: https://github.com/apache/druid/pull/14089#discussion_r1169544602


##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/PartialSegmentMergeTask.java:
##########
@@ -266,6 +267,12 @@ private Set<DataSegment> mergeAndPushSegments(
             persistDir,
             0
         );
+        long mergeFinishTime = System.nanoTime();
+        LOG.info("Merging [%d] input segment(s) for interval [%s] took [%,d]ms.",

Review Comment:
   ```suggestion
           LOG.info("Merged [%d] input segment(s) for interval [%s] in [%,d]ms.",
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] rohangarg commented on a diff in pull request #14089: Log merge and push timings for PartialGenericSegmentMergeTask

Posted by "rohangarg (via GitHub)" <gi...@apache.org>.
rohangarg commented on code in PR #14089:
URL: https://github.com/apache/druid/pull/14089#discussion_r1168686220


##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/PartialSegmentMergeTask.java:
##########
@@ -266,6 +271,7 @@ private Set<DataSegment> mergeAndPushSegments(
             persistDir,
             0
         );
+        long mergeFinishTime = System.nanoTime();

Review Comment:
   Added a message upon completion of merge stage + fixed the message to add all interpolated values as per convention + removed objectMapper usage in the log.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] imply-cheddar commented on a diff in pull request #14089: Log merge and push timings for PartialGenericSegmentMergeTask

Posted by "imply-cheddar (via GitHub)" <gi...@apache.org>.
imply-cheddar commented on code in PR #14089:
URL: https://github.com/apache/druid/pull/14089#discussion_r1169543455


##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/PartialSegmentMergeTask.java:
##########
@@ -266,6 +267,12 @@ private Set<DataSegment> mergeAndPushSegments(
             persistDir,
             0
         );
+        long mergeFinishTime = System.nanoTime();
+        LOG.info("Merging [%d] input segment(s) for interval [%s] took [%,d]ms.",

Review Comment:
   Sorry for the really nitty nit, but this message initially made me think that it was in the future rather than past tense.  What do you think about making this
   
   ```
   "Merged [%,d] input segments for interval [%s] in [%,d]ms"
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] rohangarg commented on a diff in pull request #14089: Log merge and push timings for PartialGenericSegmentMergeTask

Posted by "rohangarg (via GitHub)" <gi...@apache.org>.
rohangarg commented on code in PR #14089:
URL: https://github.com/apache/druid/pull/14089#discussion_r1168340191


##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/PartialSegmentMergeTask.java:
##########
@@ -294,7 +300,20 @@ private Set<DataSegment> mergeAndPushSegments(
             exception -> !(exception instanceof NullPointerException) && exception instanceof Exception,
             5
         );
+        long pushFinishTime = System.nanoTime();
         pushedSegments.add(segment);
+        LOG.info(
+            "Segment[%s] of %,d bytes "
+            + "built from %d partial segment(s) in %,dms; "
+            + "pushed to deep storage in %,dms. "
+            + "Load spec is: %s",
+            segment.getId(),
+            segment.getSize(),
+            segmentFilesToMerge.size(),
+            (mergeFinishTime - startTime) / 1000000,
+            (pushFinishTime - mergeFinishTime) / 1000000,
+            jsonMapper.writeValueAsString(segment.getLoadSpec())

Review Comment:
   The load is logged at all other places, so this keeps the parity about the deep storage segment info.
   The interval is logged as a part of `SegmentId` 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] rohangarg merged pull request #14089: Log merge and push timings for PartialGenericSegmentMergeTask

Posted by "rohangarg (via GitHub)" <gi...@apache.org>.
rohangarg merged PR #14089:
URL: https://github.com/apache/druid/pull/14089


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] imply-cheddar commented on a diff in pull request #14089: Log merge and push timings for PartialGenericSegmentMergeTask

Posted by "imply-cheddar (via GitHub)" <gi...@apache.org>.
imply-cheddar commented on code in PR #14089:
URL: https://github.com/apache/druid/pull/14089#discussion_r1168371060


##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/PartialSegmentMergeTask.java:
##########
@@ -266,6 +271,7 @@ private Set<DataSegment> mergeAndPushSegments(
             persistDir,
             0
         );
+        long mergeFinishTime = System.nanoTime();

Review Comment:
   It might be nice to add a message here stating that the merge completed.  It can be nice to see it in its own line when you are watching logs for progress.



##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/PartialSegmentMergeTask.java:
##########
@@ -294,7 +300,20 @@ private Set<DataSegment> mergeAndPushSegments(
             exception -> !(exception instanceof NullPointerException) && exception instanceof Exception,
             5
         );
+        long pushFinishTime = System.nanoTime();
         pushedSegments.add(segment);
+        LOG.info(
+            "Segment[%s] of %,d bytes "
+            + "built from %d partial segment(s) in %,dms; "
+            + "pushed to deep storage in %,dms. "
+            + "Load spec is: %s",
+            segment.getId(),
+            segment.getSize(),
+            segmentFilesToMerge.size(),
+            (mergeFinishTime - startTime) / 1000000,
+            (pushFinishTime - mergeFinishTime) / 1000000,
+            jsonMapper.writeValueAsString(segment.getLoadSpec())

Review Comment:
   I don't think it's worth passing in an `ObjectMapper` for it.  `.toString()` is hopefully good enough.  So, please remove the `ObjectMapper` from the constructor and just toString the thing.
   
   Also, this log message can be better.  You can see a write-up of the idea here: https://github.com/apache/druid/blob/master/dev/style-conventions.md
   
   Perhaps something like 
   
   `"built segment [%s] ([%,d] input segments in [%,d] ms) and pushed ([%,d] ms) to deep storage [%s]"`
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org