Posted to commits@pinot.apache.org by "robertzych (via GitHub)" <gi...@apache.org> on 2023/07/10 20:04:06 UTC

[GitHub] [pinot] robertzych commented on a diff in pull request #10463: Segment compaction for upsert real-time tables

robertzych commented on code in PR #10463:
URL: https://github.com/apache/pinot/pull/10463#discussion_r1258849592


##########
pinot-core/src/main/java/org/apache/pinot/core/common/MinionConstants.java:
##########
@@ -136,4 +136,18 @@ public static class SegmentGenerationAndPushTask {
     public static final String CONFIG_NUMBER_CONCURRENT_TASKS_PER_INSTANCE =
         "SegmentGenerationAndPushTask.numConcurrentTasksPerInstance";
   }
+
+  public static class UpsertCompactionTask {
+    public static final String TASK_TYPE = "UpsertCompactionTask";
+    /**
+     * The time period to wait before picking segments for this task
+     * e.g. if set to "2d", no task will be scheduled for a time window younger than 2 days
+     */
+    public static final String BUFFER_TIME_PERIOD_KEY = "bufferTimePeriod";
+    /**
+     * The maximum percent of old records allowed for a completed segment.
+     * e.g. if the percent surpasses 30, then the segment will be compacted
+     */
+    public static final String INVALID_RECORDS_THRESHOLD_PERCENT = "invalidRecordsThresholdPercent";

Review Comment:
   @Jackie-Jiang @snleee Some users may want to use a percentage and a record count at the same time. For example, they may start with a percentage and add a record count later to prevent compaction of smaller segments. Using both together could also guard against a record count that was set too low: the percentage threshold would still have to be exceeded, so it would prevent compaction and compensate for the small record count config.
   
   How does keeping percentage as required but adding record count as an optional config sound?
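
To make the proposal concrete, here is a minimal sketch of how the two thresholds could combine. It is only an illustration under assumptions: the class name, the `shouldCompact` helper, and the optional `invalidRecordsThresholdCount` parameter are hypothetical and not part of this PR. The percentage check uses "surpasses" semantics, as in the Javadoc above, and the sketch assumes the record count would be compared the same way.

```java
import java.util.OptionalLong;

// Hypothetical sketch, not code from this PR.
final class CompactionThresholdSketch {
  private CompactionThresholdSketch() {
  }

  /**
   * Returns true when a completed segment would qualify for compaction.
   * The percentage threshold is always required; the record count threshold
   * (an assumed, optional config) only applies when it is present.
   */
  static boolean shouldCompact(long totalRecords, long invalidRecords,
      double invalidRecordsThresholdPercent, OptionalLong invalidRecordsThresholdCount) {
    if (totalRecords <= 0) {
      return false;
    }
    double invalidPercent = 100.0 * invalidRecords / totalRecords;
    // Required check: the percentage of invalid records must surpass the threshold.
    if (invalidPercent <= invalidRecordsThresholdPercent) {
      return false;
    }
    // Optional check: when configured, the invalid record count must surpass it too.
    return !invalidRecordsThresholdCount.isPresent()
        || invalidRecords > invalidRecordsThresholdCount.getAsLong();
  }
}
```

With both thresholds set, a small segment that is 40% invalid but holds only 40 invalid records would be skipped when the count threshold is 1000, and a large segment with a very low count threshold would still be skipped unless it also surpasses the percentage.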



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

