You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/11/04 21:00:17 UTC

[GitHub] [pinot] walterddr opened a new pull request, #9733: show table metadata info in aggregate index size form

walterddr opened a new pull request, #9733:
URL: https://github.com/apache/pinot/pull/9733

   follow up on #9712 .
   
   This PR
   - show column index size if available instead of just YES and NO
   - show aggregated column index size (avg) on table metadata API


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
walterddr commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1016095488


##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/TablesResource.java:
##########
@@ -256,6 +258,13 @@ public String getSegmentMetadata(
               int maxNumMultiValues = columnMetadata.getMaxNumberOfMultiValues();
               maxNumMultiValuesMap.merge(column, (double) maxNumMultiValues, Double::sum);
             }
+            for (Map.Entry<ColumnIndexType, Long> entry : columnMetadata.getIndexSizeMap().entrySet()) {
+              Map<String, Double> columnIndexSizes = columnIndexSizesMap.getOrDefault(column, new HashMap<>());
+              Double indexSize = columnIndexSizes.getOrDefault(entry.getKey().getIndexName(), 0d);
+              indexSize += entry.getValue();
+              columnIndexSizes.put(entry.getKey().getIndexName(), indexSize);

Review Comment:
   👍 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] GSharayu commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
GSharayu commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1015758480


##########
pinot-common/src/main/java/org/apache/pinot/common/restlet/resources/TableMetadataInfo.java:
##########
@@ -42,20 +42,23 @@ public class TableMetadataInfo {
   private final Map<String, Double> _columnLengthMap;
   private final Map<String, Double> _columnCardinalityMap;
   private final Map<String, Double> _maxNumMultiValuesMap;
+  private final Map<String, Map<String, Double>> _columnIndexSizeMap;
 
   @JsonCreator
   public TableMetadataInfo(@JsonProperty("tableName") String tableName,
       @JsonProperty("diskSizeInBytes") long sizeInBytes, @JsonProperty("numSegments") long numSegments,
       @JsonProperty("numRows") long numRows, @JsonProperty("columnLengthMap") Map<String, Double> columnLengthMap,
       @JsonProperty("columnCardinalityMap") Map<String, Double> columnCardinalityMap,
-      @JsonProperty("maxNumMultiValuesMap") Map<String, Double> maxNumMultiValuesMap) {
+      @JsonProperty("maxNumMultiValuesMap") Map<String, Double> maxNumMultiValuesMap,
+      @JsonProperty("columnIndexSizeMap") Map<String, Map<String, Double>> columnIndexSizeMap) {

Review Comment:
   Can we make this Nullable property?



##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/TablesResource.java:
##########
@@ -256,6 +258,13 @@ public String getSegmentMetadata(
               int maxNumMultiValues = columnMetadata.getMaxNumberOfMultiValues();
               maxNumMultiValuesMap.merge(column, (double) maxNumMultiValues, Double::sum);
             }
+            for (Map.Entry<ColumnIndexType, Long> entry : columnMetadata.getIndexSizeMap().entrySet()) {
+              Map<String, Double> columnIndexSizes = columnIndexSizesMap.getOrDefault(column, new HashMap<>());
+              Double indexSize = columnIndexSizes.getOrDefault(entry.getKey().getIndexName(), 0d);
+              indexSize += entry.getValue();
+              columnIndexSizes.put(entry.getKey().getIndexName(), indexSize);

Review Comment:
   nit: we can extract the index name once



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1018498327


##########
pinot-server/src/test/java/org/apache/pinot/server/api/TablesResourceTest.java:
##########
@@ -99,6 +101,38 @@ public void getSegments()
     Assert.assertEquals(response.getStatus(), Response.Status.NOT_FOUND.getStatusCode());
   }
 
+  @Test
+  public void getTableMetadata()
+      throws Exception {
+    String tableMetadataPath = "/tables/" + TableNameBuilder.REALTIME.tableNameWithType(TABLE_NAME) + "/metadata";

Review Comment:
   Can we add a test for this API for both REALTIME and OFFLINE



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
walterddr commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1014860375


##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/SegmentMetadataFetcher.java:
##########
@@ -108,48 +111,55 @@ private static Map<String, Map<String, String>> getIndexesForSegmentColumns(Segm
    * Helper to loop through column datasource to create a index map as follows for each column:
    * {<"bloom-filter", "YES">, <"dictionary", "NO">}
    */
-  private static Map<String, String> getColumnIndexes(DataSource dataSource) {
+  private static Map<String, String> getColumnIndexes(DataSource dataSource, ColumnMetadata columnMetadata) {
     Map<String, String> indexStatus = new LinkedHashMap<>();
     if (Objects.isNull(dataSource.getBloomFilter())) {
       indexStatus.put(BLOOM_FILTER, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(BLOOM_FILTER, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.BLOOM_FILTER);
+      indexStatus.put(BLOOM_FILTER, indexSize == null ? INDEX_AVAILABLE : String.valueOf(indexSize));
     }
 
     if (Objects.isNull(dataSource.getDictionary())) {
       indexStatus.put(DICTIONARY, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(DICTIONARY, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.DICTIONARY);

Review Comment:
   AVAILABLE as it is configured but not yet generated (or the stats are not reflecting the current size).
   which is the current behavior of the REST API
   
   maybe a better string would be `AVAILABLE, SIZE UNKNOWN`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
walterddr commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1015816168


##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/SegmentMetadataFetcher.java:
##########
@@ -108,48 +111,55 @@ private static Map<String, Map<String, String>> getIndexesForSegmentColumns(Segm
    * Helper to loop through column datasource to create a index map as follows for each column:
    * {<"bloom-filter", "YES">, <"dictionary", "NO">}
    */
-  private static Map<String, String> getColumnIndexes(DataSource dataSource) {
+  private static Map<String, String> getColumnIndexes(DataSource dataSource, ColumnMetadata columnMetadata) {
     Map<String, String> indexStatus = new LinkedHashMap<>();
     if (Objects.isNull(dataSource.getBloomFilter())) {
       indexStatus.put(BLOOM_FILTER, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(BLOOM_FILTER, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.BLOOM_FILTER);
+      indexStatus.put(BLOOM_FILTER, indexSize == null ? INDEX_AVAILABLE : String.valueOf(indexSize));
     }
 
     if (Objects.isNull(dataSource.getDictionary())) {
       indexStatus.put(DICTIONARY, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(DICTIONARY, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.DICTIONARY);

Review Comment:
   ah I see the confusion. so together with #9712 .
   we have 4 APIs touched 
   - `/tables/{tableName}/segments/{segmentName}/metadata` (modified in #9712)
     - `/segments/{tableName}/metadata` is also affected by the same code change 
   - `/tables/{tableName}/metadata` (current PR modified this)
   - `/segments/{tableNameWithType}/{segmentName}` (this PR used to modify this but not anymore, b/c i found out that it is also used by the UI and it breaks the UI)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
walterddr commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1016095838


##########
pinot-common/src/main/java/org/apache/pinot/common/restlet/resources/TableMetadataInfo.java:
##########
@@ -42,20 +42,23 @@ public class TableMetadataInfo {
   private final Map<String, Double> _columnLengthMap;
   private final Map<String, Double> _columnCardinalityMap;
   private final Map<String, Double> _maxNumMultiValuesMap;
+  private final Map<String, Map<String, Double>> _columnIndexSizeMap;
 
   @JsonCreator
   public TableMetadataInfo(@JsonProperty("tableName") String tableName,
       @JsonProperty("diskSizeInBytes") long sizeInBytes, @JsonProperty("numSegments") long numSegments,
       @JsonProperty("numRows") long numRows, @JsonProperty("columnLengthMap") Map<String, Double> columnLengthMap,
       @JsonProperty("columnCardinalityMap") Map<String, Double> columnCardinalityMap,
-      @JsonProperty("maxNumMultiValuesMap") Map<String, Double> maxNumMultiValuesMap) {
+      @JsonProperty("maxNumMultiValuesMap") Map<String, Double> maxNumMultiValuesMap,
+      @JsonProperty("columnIndexSizeMap") Map<String, Map<String, Double>> columnIndexSizeMap) {

Review Comment:
   columnIndexSize map will not be null. it will be empty if there's no index scheme. but unless a pinot table contains all non-dictionary column at least it will show the forward index size and dictionary size



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
walterddr commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1014860375


##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/SegmentMetadataFetcher.java:
##########
@@ -108,48 +111,55 @@ private static Map<String, Map<String, String>> getIndexesForSegmentColumns(Segm
    * Helper to loop through column datasource to create a index map as follows for each column:
    * {<"bloom-filter", "YES">, <"dictionary", "NO">}
    */
-  private static Map<String, String> getColumnIndexes(DataSource dataSource) {
+  private static Map<String, String> getColumnIndexes(DataSource dataSource, ColumnMetadata columnMetadata) {
     Map<String, String> indexStatus = new LinkedHashMap<>();
     if (Objects.isNull(dataSource.getBloomFilter())) {
       indexStatus.put(BLOOM_FILTER, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(BLOOM_FILTER, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.BLOOM_FILTER);
+      indexStatus.put(BLOOM_FILTER, indexSize == null ? INDEX_AVAILABLE : String.valueOf(indexSize));
     }
 
     if (Objects.isNull(dataSource.getDictionary())) {
       indexStatus.put(DICTIONARY, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(DICTIONARY, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.DICTIONARY);

Review Comment:
   AVAILABLE as it is configured but not yet generated (or the stats are not reflecting the current size).
   which is the current behavior of the REST API



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
walterddr commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1015816168


##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/SegmentMetadataFetcher.java:
##########
@@ -108,48 +111,55 @@ private static Map<String, Map<String, String>> getIndexesForSegmentColumns(Segm
    * Helper to loop through column datasource to create a index map as follows for each column:
    * {<"bloom-filter", "YES">, <"dictionary", "NO">}
    */
-  private static Map<String, String> getColumnIndexes(DataSource dataSource) {
+  private static Map<String, String> getColumnIndexes(DataSource dataSource, ColumnMetadata columnMetadata) {
     Map<String, String> indexStatus = new LinkedHashMap<>();
     if (Objects.isNull(dataSource.getBloomFilter())) {
       indexStatus.put(BLOOM_FILTER, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(BLOOM_FILTER, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.BLOOM_FILTER);
+      indexStatus.put(BLOOM_FILTER, indexSize == null ? INDEX_AVAILABLE : String.valueOf(indexSize));
     }
 
     if (Objects.isNull(dataSource.getDictionary())) {
       indexStatus.put(DICTIONARY, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(DICTIONARY, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.DICTIONARY);

Review Comment:
   ah I see the confusion. so together with #9712 .
   we have 4 APIs touched 
   - `/tables/{tableName}/segments/{segmentName}/metadata` (modified in #9712)
     - `/segments/{tableName}/metadata` is also affected
   - `/tables/{tableName}/metadata` (current PR modified this)
   - `/segments/{tableNameWithType}/{segmentName}` (this PR used to modify this but not anymore, b/c i found out that it is also used by the UI and it breaks the UI)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1015759165


##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/SegmentMetadataFetcher.java:
##########
@@ -108,48 +111,55 @@ private static Map<String, Map<String, String>> getIndexesForSegmentColumns(Segm
    * Helper to loop through column datasource to create a index map as follows for each column:
    * {<"bloom-filter", "YES">, <"dictionary", "NO">}
    */
-  private static Map<String, String> getColumnIndexes(DataSource dataSource) {
+  private static Map<String, String> getColumnIndexes(DataSource dataSource, ColumnMetadata columnMetadata) {
     Map<String, String> indexStatus = new LinkedHashMap<>();
     if (Objects.isNull(dataSource.getBloomFilter())) {
       indexStatus.put(BLOOM_FILTER, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(BLOOM_FILTER, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.BLOOM_FILTER);
+      indexStatus.put(BLOOM_FILTER, indexSize == null ? INDEX_AVAILABLE : String.valueOf(indexSize));
     }
 
     if (Objects.isNull(dataSource.getDictionary())) {
       indexStatus.put(DICTIONARY, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(DICTIONARY, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.DICTIONARY);

Review Comment:
   hmm this seems confusing. I am ok with following up separately. Didn't get a chance to review the previous PR so not sure of the full context.
   
   If the API is called after the segments have been loaded / mapped on the server, then all indexes should exist right.
   
   So may be the response could be something like
   
   - <some> INDEX available in all segments with avg / aggregated size <blah>
   - <some> INDEX available on N segments with avg / aggregated size <blah>
   - <some> INDEX not available / size unknown on M segments
   
   I am guessing size not available will happen when server has not downloaded + loaded all segments and / or there is some bug ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
walterddr commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1015816168


##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/SegmentMetadataFetcher.java:
##########
@@ -108,48 +111,55 @@ private static Map<String, Map<String, String>> getIndexesForSegmentColumns(Segm
    * Helper to loop through column datasource to create a index map as follows for each column:
    * {<"bloom-filter", "YES">, <"dictionary", "NO">}
    */
-  private static Map<String, String> getColumnIndexes(DataSource dataSource) {
+  private static Map<String, String> getColumnIndexes(DataSource dataSource, ColumnMetadata columnMetadata) {
     Map<String, String> indexStatus = new LinkedHashMap<>();
     if (Objects.isNull(dataSource.getBloomFilter())) {
       indexStatus.put(BLOOM_FILTER, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(BLOOM_FILTER, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.BLOOM_FILTER);
+      indexStatus.put(BLOOM_FILTER, indexSize == null ? INDEX_AVAILABLE : String.valueOf(indexSize));
     }
 
     if (Objects.isNull(dataSource.getDictionary())) {
       indexStatus.put(DICTIONARY, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(DICTIONARY, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.DICTIONARY);

Review Comment:
   ah I see the confusion. so together with #9712 .
   we have 3 APIs touched 
   - `/tables/{tableName}/segments/{segmentName}/metadata` (modified in #9712)
     - `/segments/{tableName}/metadata` is also affected by the same code change 
   - `/tables/{tableName}/metadata` (current PR modified this)
   
   The changes I removed is for: 
   - `/tables/{tableName}/segments/{segmentName}/metadata` to also include the index info in not just the column map, but in the index map.
   This is because i found out that it is also used by the UI and it breaks the UI, so we will no longer change this endpoint until we figure out the questions you posted above



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #9733:
URL: https://github.com/apache/pinot/pull/9733#issuecomment-1304311217

   # [Codecov](https://codecov.io/gh/apache/pinot/pull/9733?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#9733](https://codecov.io/gh/apache/pinot/pull/9733?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (974e5cd) into [master](https://codecov.io/gh/apache/pinot/commit/9898b913e03592200e5743c61b1c18972901346a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9898b91) will **decrease** coverage by `2.44%`.
   > The diff coverage is `100.00%`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #9733      +/-   ##
   ============================================
   - Coverage     70.04%   67.60%   -2.45%     
   - Complexity     4907     5178     +271     
   ============================================
     Files          1951     1448     -503     
     Lines        104512    75836   -28676     
     Branches      15824    12076    -3748     
   ============================================
   - Hits          73210    51272   -21938     
   + Misses        26175    20904    -5271     
   + Partials       5127     3660    -1467     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `?` | |
   | integration2 | `?` | |
   | unittests1 | `67.60% <100.00%> (+0.02%)` | :arrow_up: |
   | unittests2 | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/9733?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ot/common/restlet/resources/TableMetadataInfo.java](https://codecov.io/gh/apache/pinot/pull/9733/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vcmVzdGxldC9yZXNvdXJjZXMvVGFibGVNZXRhZGF0YUluZm8uamF2YQ==) | `100.00% <100.00%> (ø)` | |
   | [...va/org/apache/pinot/core/routing/RoutingTable.java](https://codecov.io/gh/apache/pinot/pull/9733/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9yb3V0aW5nL1JvdXRpbmdUYWJsZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...va/org/apache/pinot/common/config/NettyConfig.java](https://codecov.io/gh/apache/pinot/pull/9733/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vY29uZmlnL05ldHR5Q29uZmlnLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/pinot/common/metrics/MinionMeter.java](https://codecov.io/gh/apache/pinot/pull/9733/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9NaW5pb25NZXRlci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...g/apache/pinot/common/metrics/ControllerMeter.java](https://codecov.io/gh/apache/pinot/pull/9733/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Db250cm9sbGVyTWV0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../apache/pinot/common/metrics/BrokerQueryPhase.java](https://codecov.io/gh/apache/pinot/pull/9733/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Ccm9rZXJRdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../apache/pinot/common/metrics/MinionQueryPhase.java](https://codecov.io/gh/apache/pinot/pull/9733/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9NaW5pb25RdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ache/pinot/server/access/AccessControlFactory.java](https://codecov.io/gh/apache/pinot/pull/9733/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VydmVyL2FjY2Vzcy9BY2Nlc3NDb250cm9sRmFjdG9yeS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...he/pinot/common/messages/SegmentReloadMessage.java](https://codecov.io/gh/apache/pinot/pull/9733/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWVzc2FnZXMvU2VnbWVudFJlbG9hZE1lc3NhZ2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...he/pinot/common/messages/TableDeletionMessage.java](https://codecov.io/gh/apache/pinot/pull/9733/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWVzc2FnZXMvVGFibGVEZWxldGlvbk1lc3NhZ2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [744 more](https://codecov.io/gh/apache/pinot/pull/9733/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr merged pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
walterddr merged PR #9733:
URL: https://github.com/apache/pinot/pull/9733


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
walterddr commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1015660598


##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/SegmentMetadataFetcher.java:
##########
@@ -108,48 +111,55 @@ private static Map<String, Map<String, String>> getIndexesForSegmentColumns(Segm
    * Helper to loop through column datasource to create a index map as follows for each column:
    * {<"bloom-filter", "YES">, <"dictionary", "NO">}
    */
-  private static Map<String, String> getColumnIndexes(DataSource dataSource) {
+  private static Map<String, String> getColumnIndexes(DataSource dataSource, ColumnMetadata columnMetadata) {
     Map<String, String> indexStatus = new LinkedHashMap<>();
     if (Objects.isNull(dataSource.getBloomFilter())) {
       indexStatus.put(BLOOM_FILTER, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(BLOOM_FILTER, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.BLOOM_FILTER);
+      indexStatus.put(BLOOM_FILTER, indexSize == null ? INDEX_AVAILABLE : String.valueOf(indexSize));
     }
 
     if (Objects.isNull(dataSource.getDictionary())) {
       indexStatus.put(DICTIONARY, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(DICTIONARY, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.DICTIONARY);

Review Comment:
   we have to follow up with this in a separate PR. as this API is also used by the UI and it is currently not rendering the numbers correctly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
walterddr commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1018506209


##########
pinot-server/src/test/java/org/apache/pinot/server/api/TablesResourceTest.java:
##########
@@ -99,6 +101,38 @@ public void getSegments()
     Assert.assertEquals(response.getStatus(), Response.Status.NOT_FOUND.getStatusCode());
   }
 
+  @Test
+  public void getTableMetadata()
+      throws Exception {
+    String tableMetadataPath = "/tables/" + TableNameBuilder.REALTIME.tableNameWithType(TABLE_NAME) + "/metadata";

Review Comment:
   wilco



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
walterddr commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1015816168


##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/SegmentMetadataFetcher.java:
##########
@@ -108,48 +111,55 @@ private static Map<String, Map<String, String>> getIndexesForSegmentColumns(Segm
    * Helper to loop through column datasource to create a index map as follows for each column:
    * {<"bloom-filter", "YES">, <"dictionary", "NO">}
    */
-  private static Map<String, String> getColumnIndexes(DataSource dataSource) {
+  private static Map<String, String> getColumnIndexes(DataSource dataSource, ColumnMetadata columnMetadata) {
     Map<String, String> indexStatus = new LinkedHashMap<>();
     if (Objects.isNull(dataSource.getBloomFilter())) {
       indexStatus.put(BLOOM_FILTER, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(BLOOM_FILTER, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.BLOOM_FILTER);
+      indexStatus.put(BLOOM_FILTER, indexSize == null ? INDEX_AVAILABLE : String.valueOf(indexSize));
     }
 
     if (Objects.isNull(dataSource.getDictionary())) {
       indexStatus.put(DICTIONARY, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(DICTIONARY, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.DICTIONARY);

Review Comment:
   ah I see the confusion. so together with #9712 .
   we have 3 APIs touched 
   - `/tables/{tableName}/segments/{segmentName}/metadata` (modified in #9712)
     - `/segments/{tableName}/metadata` is also affected by the same code change 
   - `/tables/{tableName}/metadata` (current PR modified this)
   
   The changes I removed is for: 
   - `/tables/{tableName}/segments/{segmentName}/metadata` to also include the index info in not just the column map, but in the index map.
   This is because i found out that it is also used by the UI and it breaks the UI, so we will not make this change until the UI issue is figured out. 
   
   to answer your question: 
   - the response will not have any avg stats, b/c the problem occurs on a segment level not the table level. 
   - this is an additional feature, so wont affect any current behavior, thus we can also decided to not follow up
   - the data is already returned, the only question is whether we want to organized them by 
     - column-->index-->size (done in #9712), or 
     - index-->column-->size (not done)
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1014722904


##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/SegmentMetadataFetcher.java:
##########
@@ -108,48 +111,55 @@ private static Map<String, Map<String, String>> getIndexesForSegmentColumns(Segm
    * Helper to loop through column datasource to create a index map as follows for each column:
    * {<"bloom-filter", "YES">, <"dictionary", "NO">}
    */
-  private static Map<String, String> getColumnIndexes(DataSource dataSource) {
+  private static Map<String, String> getColumnIndexes(DataSource dataSource, ColumnMetadata columnMetadata) {
     Map<String, String> indexStatus = new LinkedHashMap<>();
     if (Objects.isNull(dataSource.getBloomFilter())) {
       indexStatus.put(BLOOM_FILTER, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(BLOOM_FILTER, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.BLOOM_FILTER);
+      indexStatus.put(BLOOM_FILTER, indexSize == null ? INDEX_AVAILABLE : String.valueOf(indexSize));
     }
 
     if (Objects.isNull(dataSource.getDictionary())) {
       indexStatus.put(DICTIONARY, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(DICTIONARY, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.DICTIONARY);

Review Comment:
   > `indexSize == null ? INDEX_AVAILABLE`
   
   Not sure I follow. Why put AVAILABLE when size is null ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
walterddr commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1015816168


##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/SegmentMetadataFetcher.java:
##########
@@ -108,48 +111,55 @@ private static Map<String, Map<String, String>> getIndexesForSegmentColumns(Segm
    * Helper to loop through column datasource to create a index map as follows for each column:
    * {<"bloom-filter", "YES">, <"dictionary", "NO">}
    */
-  private static Map<String, String> getColumnIndexes(DataSource dataSource) {
+  private static Map<String, String> getColumnIndexes(DataSource dataSource, ColumnMetadata columnMetadata) {
     Map<String, String> indexStatus = new LinkedHashMap<>();
     if (Objects.isNull(dataSource.getBloomFilter())) {
       indexStatus.put(BLOOM_FILTER, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(BLOOM_FILTER, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.BLOOM_FILTER);
+      indexStatus.put(BLOOM_FILTER, indexSize == null ? INDEX_AVAILABLE : String.valueOf(indexSize));
     }
 
     if (Objects.isNull(dataSource.getDictionary())) {
       indexStatus.put(DICTIONARY, INDEX_NOT_AVAILABLE);
     } else {
-      indexStatus.put(DICTIONARY, INDEX_AVAILABLE);
+      Long indexSize = columnMetadata.getIndexSizeMap().get(ColumnIndexType.DICTIONARY);

Review Comment:
   ah I see the confusion. so together with #9712 .
   we have 4 APIs touched 
   - `/tables/{tableName}/segments/{segmentName}/metadata` (modified in #9712)
     - `/segments/{tableName}/metadata` is also affected by the same code change 
   - `/tables/{tableName}/metadata` (current PR modified this)
   - `/segments/{tableNameWithType}/{segmentName}` (this PR used to modify this but not anymore, b/c i found out that it is also used by the UI and it breaks the UI, so we will no longer change this endpoint until we figure out the questions you posted above)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #9733: show table metadata info in aggregate index size form

Posted by GitBox <gi...@apache.org>.
walterddr commented on code in PR #9733:
URL: https://github.com/apache/pinot/pull/9733#discussion_r1016095838


##########
pinot-common/src/main/java/org/apache/pinot/common/restlet/resources/TableMetadataInfo.java:
##########
@@ -42,20 +42,23 @@ public class TableMetadataInfo {
   private final Map<String, Double> _columnLengthMap;
   private final Map<String, Double> _columnCardinalityMap;
   private final Map<String, Double> _maxNumMultiValuesMap;
+  private final Map<String, Map<String, Double>> _columnIndexSizeMap;
 
   @JsonCreator
   public TableMetadataInfo(@JsonProperty("tableName") String tableName,
       @JsonProperty("diskSizeInBytes") long sizeInBytes, @JsonProperty("numSegments") long numSegments,
       @JsonProperty("numRows") long numRows, @JsonProperty("columnLengthMap") Map<String, Double> columnLengthMap,
       @JsonProperty("columnCardinalityMap") Map<String, Double> columnCardinalityMap,
-      @JsonProperty("maxNumMultiValuesMap") Map<String, Double> maxNumMultiValuesMap) {
+      @JsonProperty("maxNumMultiValuesMap") Map<String, Double> maxNumMultiValuesMap,
+      @JsonProperty("columnIndexSizeMap") Map<String, Map<String, Double>> columnIndexSizeMap) {

Review Comment:
   columnIndexSize map will not be null. it will be empty if there's no index scheme. but unless a pinot table contains all non-dictionary columns, at least it will show the forward index size and dictionary size for some columns



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org