You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/11/30 07:56:48 UTC

[GitHub] [pinot] vvivekiyer opened a new pull request, #9875: Update cardinality when converting raw column to dict based

vvivekiyer opened a new pull request, #9875:
URL: https://github.com/apache/pinot/pull/9875

   Fixes the issue #9874
   
   Adding tests is still pending.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on a diff in pull request #9875: Update cardinality when converting raw column to dict based

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on code in PR #9875:
URL: https://github.com/apache/pinot/pull/9875#discussion_r1041377162


##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/BaseClusterIntegrationTestSet.java:
##########
@@ -606,6 +608,37 @@ public void testReset(TableType tableType)
     }, 30_000L, "Failed to wait for all segments come back online");
   }
 
+  public String reloadTableAndValidateResponse(String tableName, TableType tableType, boolean forceDownload)
+      throws IOException {
+    String jobId = null;
+    String response =
+        sendPostRequest(_controllerRequestURLBuilder.forTableReload(tableName, tableType, forceDownload), null);
+    String tableNameWithType = TableNameBuilder.forType(tableType).tableNameWithType(tableName);
+    JsonNode tableLevelDetails =
+        JsonUtils.stringToJsonNode(StringEscapeUtils.unescapeJava(response.split(": ")[1])).get(tableNameWithType);
+    String isZKWriteSuccess = tableLevelDetails.get("reloadJobMetaZKStorageStatus").asText();
+    assertEquals("SUCCESS", isZKWriteSuccess);

Review Comment:
   (minor, not introduced in this PR)
   ```suggestion
       assertEquals(isZKWriteSuccess, "SUCCESS");
   ```



##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/BaseClusterIntegrationTestSet.java:
##########
@@ -606,6 +608,37 @@ public void testReset(TableType tableType)
     }, 30_000L, "Failed to wait for all segments come back online");
   }
 
+  public String reloadTableAndValidateResponse(String tableName, TableType tableType, boolean forceDownload)
+      throws IOException {
+    String jobId = null;
+    String response =
+        sendPostRequest(_controllerRequestURLBuilder.forTableReload(tableName, tableType, forceDownload), null);
+    String tableNameWithType = TableNameBuilder.forType(tableType).tableNameWithType(tableName);
+    JsonNode tableLevelDetails =
+        JsonUtils.stringToJsonNode(StringEscapeUtils.unescapeJava(response.split(": ")[1])).get(tableNameWithType);
+    String isZKWriteSuccess = tableLevelDetails.get("reloadJobMetaZKStorageStatus").asText();
+    assertEquals("SUCCESS", isZKWriteSuccess);
+    jobId = tableLevelDetails.get("reloadJobId").asText();

Review Comment:
   (minor) Remove the declaration on line 613
   ```suggestion
       String jobId = tableLevelDetails.get("reloadJobId").asText();
   ```



##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/LLCRealtimeClusterIntegrationTest.java:
##########
@@ -206,7 +206,7 @@ public void setUp()
   @Override
   public void tearDown()
       throws Exception {
-    FileUtils.deleteDirectory(new File(CONSUMER_DIRECTORY));
+     FileUtils.deleteDirectory(new File(CONSUMER_DIRECTORY));

Review Comment:
   (nit) revert



##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/BaseClusterIntegrationTestSet.java:
##########
@@ -606,6 +608,37 @@ public void testReset(TableType tableType)
     }, 30_000L, "Failed to wait for all segments come back online");
   }
 
+  public String reloadTableAndValidateResponse(String tableName, TableType tableType, boolean forceDownload)
+      throws IOException {
+    String jobId = null;
+    String response =
+        sendPostRequest(_controllerRequestURLBuilder.forTableReload(tableName, tableType, forceDownload), null);
+    String tableNameWithType = TableNameBuilder.forType(tableType).tableNameWithType(tableName);
+    JsonNode tableLevelDetails =
+        JsonUtils.stringToJsonNode(StringEscapeUtils.unescapeJava(response.split(": ")[1])).get(tableNameWithType);
+    String isZKWriteSuccess = tableLevelDetails.get("reloadJobMetaZKStorageStatus").asText();
+    assertEquals("SUCCESS", isZKWriteSuccess);
+    jobId = tableLevelDetails.get("reloadJobId").asText();
+    String jobStatusResponse = sendGetRequest(_controllerRequestURLBuilder.forControllerJobStatus(jobId));
+    JsonNode jobStatus = JsonUtils.stringToJsonNode(jobStatusResponse);
+
+    // Validate all fields are present
+    assertEquals(jobStatus.get("metadata").get("jobId").asText(), jobId);
+    assertEquals(jobStatus.get("metadata").get("jobType").asText(), "RELOAD_ALL_SEGMENTS");
+    assertEquals(jobStatus.get("metadata").get("tableName").asText(), tableNameWithType);
+    return jobId;
+  }
+
+  public boolean isReloadJobCompleted(String reloadJobId)

Review Comment:
   We can remove the `OfflineClusterIntegrationTest.validateReloadJobSuccess()` and replace all usage with `assertTrue(isReloadJobCompleted(jobId)`. Currently we didn't really validate if the reload is done



##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/LLCRealtimeClusterIntegrationTest.java:
##########
@@ -277,6 +277,48 @@ public void testReload()
     testReload(false);
   }
 
+  @Test
+  public void testEnableDisableDictionaryWithReload()
+      throws Exception {
+    String query = "SELECT * FROM myTable WHERE ActualElapsedTime > 0";
+    JsonNode queryResponse = postQuery(query);
+    long numTotalDocs = queryResponse.get("totalDocs").asLong();
+
+    // Enable dictionary.
+    TableConfig tableConfig = getRealtimeTableConfig();
+    tableConfig.getIndexingConfig().getNoDictionaryColumns().remove("ActualElapsedTime");
+    updateTableConfig(tableConfig);
+    String enableDictReloadId = reloadTableAndValidateResponse(getTableName(), TableType.REALTIME, false);
+    TestUtils.waitForCondition(aVoid -> {
+      try {
+        return isReloadJobCompleted(enableDictReloadId);
+      } catch (Exception e) {
+        throw new RuntimeException(e);
+      }
+    }, 60_000L, "Failed to reload.");
+
+    queryResponse = postQuery(query);
+    long numTotalDocsAfterReload = queryResponse.get("totalDocs").asLong();
+    assertEquals(numTotalDocs, numTotalDocsAfterReload);

Review Comment:
   We want to validate if total doc and query response is always correct during the reload. We can also pick a query that can return different metadata based on whether the column is dictionary encoded to ensure the changing index actually happens. E.g. if we pick a value that doesn't exist, `SELECT COUNT(*) FROM myTable WHERE ActualElapsedTime = <val>` will have docs scanned in filter for raw index, but no scan for dictionary



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] vvivekiyer commented on a diff in pull request #9875: Update cardinality when converting raw column to dict based

Posted by GitBox <gi...@apache.org>.
vvivekiyer commented on code in PR #9875:
URL: https://github.com/apache/pinot/pull/9875#discussion_r1042899409


##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/LLCRealtimeClusterIntegrationTest.java:
##########
@@ -277,6 +277,48 @@ public void testReload()
     testReload(false);
   }
 
+  @Test
+  public void testEnableDisableDictionaryWithReload()
+      throws Exception {
+    String query = "SELECT * FROM myTable WHERE ActualElapsedTime > 0";
+    JsonNode queryResponse = postQuery(query);
+    long numTotalDocs = queryResponse.get("totalDocs").asLong();
+
+    // Enable dictionary.
+    TableConfig tableConfig = getRealtimeTableConfig();
+    tableConfig.getIndexingConfig().getNoDictionaryColumns().remove("ActualElapsedTime");
+    updateTableConfig(tableConfig);
+    String enableDictReloadId = reloadTableAndValidateResponse(getTableName(), TableType.REALTIME, false);
+    TestUtils.waitForCondition(aVoid -> {
+      try {
+        return isReloadJobCompleted(enableDictReloadId);
+      } catch (Exception e) {
+        throw new RuntimeException(e);
+      }
+    }, 60_000L, "Failed to reload.");
+
+    queryResponse = postQuery(query);
+    long numTotalDocsAfterReload = queryResponse.get("totalDocs").asLong();
+    assertEquals(numTotalDocs, numTotalDocsAfterReload);

Review Comment:
   Oh good point. Added this test. 
   Just want to mention I enabled both dictionary and inverted index because "docs scanned in filter" is zero only if inverted index is added.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #9875: Update cardinality when converting raw column to dict based

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #9875:
URL: https://github.com/apache/pinot/pull/9875#issuecomment-1331817358

   # [Codecov](https://codecov.io/gh/apache/pinot/pull/9875?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#9875](https://codecov.io/gh/apache/pinot/pull/9875?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (4c3186f) into [master](https://codecov.io/gh/apache/pinot/commit/ca4c16cc616a977349996717f0c16c37e5e22b3f?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ca4c16c) will **decrease** coverage by `6.18%`.
   > The diff coverage is `92.00%`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #9875      +/-   ##
   ============================================
   - Coverage     70.37%   64.19%   -6.19%     
   + Complexity     5468     4986     -482     
   ============================================
     Files          1973     1920      -53     
     Lines        105815   103393    -2422     
     Branches      16032    15744     -288     
   ============================================
   - Hits          74469    66368    -8101     
   - Misses        26161    32241    +6080     
   + Partials       5185     4784     -401     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `?` | |
   | integration2 | `?` | |
   | unittests1 | `67.83% <92.00%> (+0.03%)` | :arrow_up: |
   | unittests2 | `15.79% <0.00%> (-0.04%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/9875?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ocal/segment/index/loader/ForwardIndexHandler.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9Gb3J3YXJkSW5kZXhIYW5kbGVyLmphdmE=) | `85.75% <92.00%> (+0.13%)` | :arrow_up: |
   | [...va/org/apache/pinot/common/config/NettyConfig.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vY29uZmlnL05ldHR5Q29uZmlnLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/pinot/common/metrics/MinionMeter.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9NaW5pb25NZXRlci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...g/apache/pinot/common/metrics/ControllerMeter.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Db250cm9sbGVyTWV0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../apache/pinot/common/metrics/BrokerQueryPhase.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Ccm9rZXJRdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../apache/pinot/common/metrics/MinionQueryPhase.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9NaW5pb25RdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ache/pinot/server/access/AccessControlFactory.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VydmVyL2FjY2Vzcy9BY2Nlc3NDb250cm9sRmFjdG9yeS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...he/pinot/common/messages/SegmentReloadMessage.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWVzc2FnZXMvU2VnbWVudFJlbG9hZE1lc3NhZ2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...he/pinot/common/messages/TableDeletionMessage.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWVzc2FnZXMvVGFibGVEZWxldGlvbk1lc3NhZ2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...pinot/core/data/manager/realtime/TimerService.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvVGltZXJTZXJ2aWNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [441 more](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang merged pull request #9875: Update cardinality when converting raw column to dict based

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang merged PR #9875:
URL: https://github.com/apache/pinot/pull/9875


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] vvivekiyer commented on pull request #9875: Update cardinality when converting raw column to dict based

Posted by GitBox <gi...@apache.org>.
vvivekiyer commented on PR #9875:
URL: https://github.com/apache/pinot/pull/9875#issuecomment-1334721514

   @Jackie-Jiang @siddharthteotia  please review. Ideally, would like to add a unit test but it wasn't straightforward. I'm trying to see if I can add some sort of testing in `LLCRealtimeClusterIntegrationTest`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on pull request #9875: Update cardinality when converting raw column to dict based

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on PR #9875:
URL: https://github.com/apache/pinot/pull/9875#issuecomment-1335672091

   @vvivekiyer Adding a test in `LLCRealtimeClusterIntegrationTest` should be relatively easy. You can pick a column with dictionary and a column without dictionary, then switch the encoding, reload and check, switch them back, reload and check


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org