You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/11/30 07:56:48 UTC
[GitHub] [pinot] vvivekiyer opened a new pull request, #9875: Update cardinality when converting raw column to dict based
vvivekiyer opened a new pull request, #9875:
URL: https://github.com/apache/pinot/pull/9875
Fixes the issue #9874
Adding tests is still pending.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] Jackie-Jiang commented on a diff in pull request #9875: Update cardinality when converting raw column to dict based
Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on code in PR #9875:
URL: https://github.com/apache/pinot/pull/9875#discussion_r1041377162
##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/BaseClusterIntegrationTestSet.java:
##########
@@ -606,6 +608,37 @@ public void testReset(TableType tableType)
}, 30_000L, "Failed to wait for all segments come back online");
}
+ public String reloadTableAndValidateResponse(String tableName, TableType tableType, boolean forceDownload)
+ throws IOException {
+ String jobId = null;
+ String response =
+ sendPostRequest(_controllerRequestURLBuilder.forTableReload(tableName, tableType, forceDownload), null);
+ String tableNameWithType = TableNameBuilder.forType(tableType).tableNameWithType(tableName);
+ JsonNode tableLevelDetails =
+ JsonUtils.stringToJsonNode(StringEscapeUtils.unescapeJava(response.split(": ")[1])).get(tableNameWithType);
+ String isZKWriteSuccess = tableLevelDetails.get("reloadJobMetaZKStorageStatus").asText();
+ assertEquals("SUCCESS", isZKWriteSuccess);
Review Comment:
(minor, not introduced in this PR)
```suggestion
assertEquals(isZKWriteSuccess, "SUCCESS");
```
##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/BaseClusterIntegrationTestSet.java:
##########
@@ -606,6 +608,37 @@ public void testReset(TableType tableType)
}, 30_000L, "Failed to wait for all segments come back online");
}
+ public String reloadTableAndValidateResponse(String tableName, TableType tableType, boolean forceDownload)
+ throws IOException {
+ String jobId = null;
+ String response =
+ sendPostRequest(_controllerRequestURLBuilder.forTableReload(tableName, tableType, forceDownload), null);
+ String tableNameWithType = TableNameBuilder.forType(tableType).tableNameWithType(tableName);
+ JsonNode tableLevelDetails =
+ JsonUtils.stringToJsonNode(StringEscapeUtils.unescapeJava(response.split(": ")[1])).get(tableNameWithType);
+ String isZKWriteSuccess = tableLevelDetails.get("reloadJobMetaZKStorageStatus").asText();
+ assertEquals("SUCCESS", isZKWriteSuccess);
+ jobId = tableLevelDetails.get("reloadJobId").asText();
Review Comment:
(minor) Remove the declaration on line 613
```suggestion
String jobId = tableLevelDetails.get("reloadJobId").asText();
```
##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/LLCRealtimeClusterIntegrationTest.java:
##########
@@ -206,7 +206,7 @@ public void setUp()
@Override
public void tearDown()
throws Exception {
- FileUtils.deleteDirectory(new File(CONSUMER_DIRECTORY));
+ FileUtils.deleteDirectory(new File(CONSUMER_DIRECTORY));
Review Comment:
(nit) revert
##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/BaseClusterIntegrationTestSet.java:
##########
@@ -606,6 +608,37 @@ public void testReset(TableType tableType)
}, 30_000L, "Failed to wait for all segments come back online");
}
+ public String reloadTableAndValidateResponse(String tableName, TableType tableType, boolean forceDownload)
+ throws IOException {
+ String jobId = null;
+ String response =
+ sendPostRequest(_controllerRequestURLBuilder.forTableReload(tableName, tableType, forceDownload), null);
+ String tableNameWithType = TableNameBuilder.forType(tableType).tableNameWithType(tableName);
+ JsonNode tableLevelDetails =
+ JsonUtils.stringToJsonNode(StringEscapeUtils.unescapeJava(response.split(": ")[1])).get(tableNameWithType);
+ String isZKWriteSuccess = tableLevelDetails.get("reloadJobMetaZKStorageStatus").asText();
+ assertEquals("SUCCESS", isZKWriteSuccess);
+ jobId = tableLevelDetails.get("reloadJobId").asText();
+ String jobStatusResponse = sendGetRequest(_controllerRequestURLBuilder.forControllerJobStatus(jobId));
+ JsonNode jobStatus = JsonUtils.stringToJsonNode(jobStatusResponse);
+
+ // Validate all fields are present
+ assertEquals(jobStatus.get("metadata").get("jobId").asText(), jobId);
+ assertEquals(jobStatus.get("metadata").get("jobType").asText(), "RELOAD_ALL_SEGMENTS");
+ assertEquals(jobStatus.get("metadata").get("tableName").asText(), tableNameWithType);
+ return jobId;
+ }
+
+ public boolean isReloadJobCompleted(String reloadJobId)
Review Comment:
We can remove the `OfflineClusterIntegrationTest.validateReloadJobSuccess()` and replace all usage with `assertTrue(isReloadJobCompleted(jobId)`. Currently we didn't really validate if the reload is done
##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/LLCRealtimeClusterIntegrationTest.java:
##########
@@ -277,6 +277,48 @@ public void testReload()
testReload(false);
}
+ @Test
+ public void testEnableDisableDictionaryWithReload()
+ throws Exception {
+ String query = "SELECT * FROM myTable WHERE ActualElapsedTime > 0";
+ JsonNode queryResponse = postQuery(query);
+ long numTotalDocs = queryResponse.get("totalDocs").asLong();
+
+ // Enable dictionary.
+ TableConfig tableConfig = getRealtimeTableConfig();
+ tableConfig.getIndexingConfig().getNoDictionaryColumns().remove("ActualElapsedTime");
+ updateTableConfig(tableConfig);
+ String enableDictReloadId = reloadTableAndValidateResponse(getTableName(), TableType.REALTIME, false);
+ TestUtils.waitForCondition(aVoid -> {
+ try {
+ return isReloadJobCompleted(enableDictReloadId);
+ } catch (Exception e) {
+ throw new RuntimeException(e);
+ }
+ }, 60_000L, "Failed to reload.");
+
+ queryResponse = postQuery(query);
+ long numTotalDocsAfterReload = queryResponse.get("totalDocs").asLong();
+ assertEquals(numTotalDocs, numTotalDocsAfterReload);
Review Comment:
We want to validate if total doc and query response is always correct during the reload. We can also pick a query that can return different metadata based on whether the column is dictionary encoded to ensure the changing index actually happens. E.g. if we pick a value that doesn't exist, `SELECT COUNT(*) FROM myTable WHERE ActualElapsedTime = <val>` will have docs scanned in filter for raw index, but no scan for dictionary
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] vvivekiyer commented on a diff in pull request #9875: Update cardinality when converting raw column to dict based
Posted by GitBox <gi...@apache.org>.
vvivekiyer commented on code in PR #9875:
URL: https://github.com/apache/pinot/pull/9875#discussion_r1042899409
##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/LLCRealtimeClusterIntegrationTest.java:
##########
@@ -277,6 +277,48 @@ public void testReload()
testReload(false);
}
+ @Test
+ public void testEnableDisableDictionaryWithReload()
+ throws Exception {
+ String query = "SELECT * FROM myTable WHERE ActualElapsedTime > 0";
+ JsonNode queryResponse = postQuery(query);
+ long numTotalDocs = queryResponse.get("totalDocs").asLong();
+
+ // Enable dictionary.
+ TableConfig tableConfig = getRealtimeTableConfig();
+ tableConfig.getIndexingConfig().getNoDictionaryColumns().remove("ActualElapsedTime");
+ updateTableConfig(tableConfig);
+ String enableDictReloadId = reloadTableAndValidateResponse(getTableName(), TableType.REALTIME, false);
+ TestUtils.waitForCondition(aVoid -> {
+ try {
+ return isReloadJobCompleted(enableDictReloadId);
+ } catch (Exception e) {
+ throw new RuntimeException(e);
+ }
+ }, 60_000L, "Failed to reload.");
+
+ queryResponse = postQuery(query);
+ long numTotalDocsAfterReload = queryResponse.get("totalDocs").asLong();
+ assertEquals(numTotalDocs, numTotalDocsAfterReload);
Review Comment:
Oh good point. Added this test.
Just want to mention I enabled both dictionary and inverted index because "docs scanned in filter" is zero only if inverted index is added.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] codecov-commenter commented on pull request #9875: Update cardinality when converting raw column to dict based
Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #9875:
URL: https://github.com/apache/pinot/pull/9875#issuecomment-1331817358
# [Codecov](https://codecov.io/gh/apache/pinot/pull/9875?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#9875](https://codecov.io/gh/apache/pinot/pull/9875?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (4c3186f) into [master](https://codecov.io/gh/apache/pinot/commit/ca4c16cc616a977349996717f0c16c37e5e22b3f?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ca4c16c) will **decrease** coverage by `6.18%`.
> The diff coverage is `92.00%`.
```diff
@@ Coverage Diff @@
## master #9875 +/- ##
============================================
- Coverage 70.37% 64.19% -6.19%
+ Complexity 5468 4986 -482
============================================
Files 1973 1920 -53
Lines 105815 103393 -2422
Branches 16032 15744 -288
============================================
- Hits 74469 66368 -8101
- Misses 26161 32241 +6080
+ Partials 5185 4784 -401
```
| Flag | Coverage Δ | |
|---|---|---|
| integration1 | `?` | |
| integration2 | `?` | |
| unittests1 | `67.83% <92.00%> (+0.03%)` | :arrow_up: |
| unittests2 | `15.79% <0.00%> (-0.04%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/pinot/pull/9875?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...ocal/segment/index/loader/ForwardIndexHandler.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9Gb3J3YXJkSW5kZXhIYW5kbGVyLmphdmE=) | `85.75% <92.00%> (+0.13%)` | :arrow_up: |
| [...va/org/apache/pinot/common/config/NettyConfig.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vY29uZmlnL05ldHR5Q29uZmlnLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/pinot/common/metrics/MinionMeter.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9NaW5pb25NZXRlci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...g/apache/pinot/common/metrics/ControllerMeter.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Db250cm9sbGVyTWV0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../apache/pinot/common/metrics/BrokerQueryPhase.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Ccm9rZXJRdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../apache/pinot/common/metrics/MinionQueryPhase.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9NaW5pb25RdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ache/pinot/server/access/AccessControlFactory.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VydmVyL2FjY2Vzcy9BY2Nlc3NDb250cm9sRmFjdG9yeS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...he/pinot/common/messages/SegmentReloadMessage.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWVzc2FnZXMvU2VnbWVudFJlbG9hZE1lc3NhZ2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...he/pinot/common/messages/TableDeletionMessage.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWVzc2FnZXMvVGFibGVEZWxldGlvbk1lc3NhZ2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...pinot/core/data/manager/realtime/TimerService.java](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvVGltZXJTZXJ2aWNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [441 more](https://codecov.io/gh/apache/pinot/pull/9875/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
:mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] Jackie-Jiang merged pull request #9875: Update cardinality when converting raw column to dict based
Posted by GitBox <gi...@apache.org>.
Jackie-Jiang merged PR #9875:
URL: https://github.com/apache/pinot/pull/9875
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] vvivekiyer commented on pull request #9875: Update cardinality when converting raw column to dict based
Posted by GitBox <gi...@apache.org>.
vvivekiyer commented on PR #9875:
URL: https://github.com/apache/pinot/pull/9875#issuecomment-1334721514
@Jackie-Jiang @siddharthteotia please review. Ideally, would like to add a unit test but it wasn't straightforward. I'm trying to see if I can add some sort of testing in `LLCRealtimeClusterIntegrationTest`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] Jackie-Jiang commented on pull request #9875: Update cardinality when converting raw column to dict based
Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on PR #9875:
URL: https://github.com/apache/pinot/pull/9875#issuecomment-1335672091
@vvivekiyer Adding a test in `LLCRealtimeClusterIntegrationTest` should be relatively easy. You can pick a column with dictionary and a column without dictionary, then switch the encoding, reload and check, switch them back, reload and check
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org