You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/02/03 22:43:10 UTC

[GitHub] [incubator-pinot] icefury71 opened a new pull request #6541: - Adding new validation for Json, TEXT indexing

icefury71 opened a new pull request #6541:
URL: https://github.com/apache/incubator-pinot/pull/6541


    - Adding semantic validation for indexing and field config
   
   ## Description
   Related to #5942 . This PR adds some missing config validation for range, varLength, Json, TEXT, FST indexing config
   
   ## Upgrade Notes
   Does this PR prevent a zero down-time upgrade?
   No
   Does this PR fix a zero-downtime upgrade introduced earlier?
   No
   Does this PR otherwise need attention when creating release notes?
   No


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-io commented on pull request #6541: - Adding new validation for Json, TEXT indexing

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #6541:
URL: https://github.com/apache/incubator-pinot/pull/6541#issuecomment-775519172


   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6541?src=pr&el=h1) Report
   > Merging [#6541](https://codecov.io/gh/apache/incubator-pinot/pull/6541?src=pr&el=desc) (1803632) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/1beaab59b73f26c4e35f3b9bc856b03806cddf5a?el=desc) (1beaab5) will **decrease** coverage by `1.39%`.
   > The diff coverage is `58.04%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/6541/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz)](https://codecov.io/gh/apache/incubator-pinot/pull/6541?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #6541      +/-   ##
   ==========================================
   - Coverage   66.44%   65.05%   -1.40%     
   ==========================================
     Files        1075     1339     +264     
     Lines       54773    65898   +11125     
     Branches     8168     9626    +1458     
   ==========================================
   + Hits        36396    42872    +6476     
   - Misses      15700    19960    +4260     
   - Partials     2677     3066     +389     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests | `65.05% <58.04%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/6541?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...e/pinot/broker/api/resources/PinotBrokerDebug.java](https://codecov.io/gh/apache/incubator-pinot/pull/6541/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYXBpL3Jlc291cmNlcy9QaW5vdEJyb2tlckRlYnVnLmphdmE=) | `0.00% <0.00%> (-79.32%)` | :arrow_down: |
   | [...ot/broker/broker/AllowAllAccessControlFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6541/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL0FsbG93QWxsQWNjZXNzQ29udHJvbEZhY3RvcnkuamF2YQ==) | `71.42% <ø> (-28.58%)` | :arrow_down: |
   | [.../helix/BrokerUserDefinedMessageHandlerFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6541/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL2hlbGl4L0Jyb2tlclVzZXJEZWZpbmVkTWVzc2FnZUhhbmRsZXJGYWN0b3J5LmphdmE=) | `33.96% <0.00%> (-32.71%)` | :arrow_down: |
   | [...ker/routing/instanceselector/InstanceSelector.java](https://codecov.io/gh/apache/incubator-pinot/pull/6541/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvcm91dGluZy9pbnN0YW5jZXNlbGVjdG9yL0luc3RhbmNlU2VsZWN0b3IuamF2YQ==) | `100.00% <ø> (ø)` | |
   | [...ava/org/apache/pinot/client/AbstractResultSet.java](https://codecov.io/gh/apache/incubator-pinot/pull/6541/diff?src=pr&el=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0Fic3RyYWN0UmVzdWx0U2V0LmphdmE=) | `66.66% <0.00%> (+9.52%)` | :arrow_up: |
   | [.../main/java/org/apache/pinot/client/Connection.java](https://codecov.io/gh/apache/incubator-pinot/pull/6541/diff?src=pr&el=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0Nvbm5lY3Rpb24uamF2YQ==) | `35.55% <0.00%> (-13.29%)` | :arrow_down: |
   | [...inot/client/JsonAsyncHttpPinotClientTransport.java](https://codecov.io/gh/apache/incubator-pinot/pull/6541/diff?src=pr&el=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0pzb25Bc3luY0h0dHBQaW5vdENsaWVudFRyYW5zcG9ydC5qYXZh) | `10.90% <0.00%> (-51.10%)` | :arrow_down: |
   | [...not/common/assignment/InstancePartitionsUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/6541/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vYXNzaWdubWVudC9JbnN0YW5jZVBhcnRpdGlvbnNVdGlscy5qYXZh) | `73.80% <ø> (+0.63%)` | :arrow_up: |
   | [...common/config/tuner/NoOpTableTableConfigTuner.java](https://codecov.io/gh/apache/incubator-pinot/pull/6541/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vY29uZmlnL3R1bmVyL05vT3BUYWJsZVRhYmxlQ29uZmlnVHVuZXIuamF2YQ==) | `100.00% <ø> (ø)` | |
   | [...ot/common/config/tuner/RealTimeAutoIndexTuner.java](https://codecov.io/gh/apache/incubator-pinot/pull/6541/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vY29uZmlnL3R1bmVyL1JlYWxUaW1lQXV0b0luZGV4VHVuZXIuamF2YQ==) | `100.00% <ø> (ø)` | |
   | ... and [1193 more](https://codecov.io/gh/apache/incubator-pinot/pull/6541/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6541?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6541?src=pr&el=footer). Last update [174a77b...1803632](https://codecov.io/gh/apache/incubator-pinot/pull/6541?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] icefury71 commented on a change in pull request #6541: - Adding new validation for Json, TEXT indexing

Posted by GitBox <gi...@apache.org>.
icefury71 commented on a change in pull request #6541:
URL: https://github.com/apache/incubator-pinot/pull/6541#discussion_r574179544



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java
##########
@@ -474,11 +478,48 @@ private static void validateIndexingConfig(@Nullable IndexingConfig indexingConf
       Preconditions.checkState(schema.getFieldSpecFor(columnName) != null,
           "Column Name " + columnName + " defined in " + configName + " must be a valid column defined in the schema");
     }
+
+    // Range index semantic validation
+    // Range index can be defined on numeric columns and any column with a dictionary
+    if (indexingConfig.getRangeIndexColumns() != null) {
+      for (String rangeIndexCol : indexingConfig.getRangeIndexColumns()) {
+        Preconditions.checkState(
+            schema.getFieldSpecFor(rangeIndexCol).getDataType().isNumeric() || !noDictionaryColumnsSet
+                .contains(rangeIndexCol),
+            "Cannot create a range index on non-numeric/no-dictionary column " + rangeIndexCol);
+      }
+    }
+
+    // Var length dictionary semantic validation
+    if (indexingConfig.getVarLengthDictionaryColumns() != null) {
+      for (String varLenDictCol : indexingConfig.getVarLengthDictionaryColumns()) {
+        FieldSpec varLenDictFieldSpec = schema.getFieldSpecFor(varLenDictCol);
+        switch (varLenDictFieldSpec.getDataType()) {
+          case STRING:
+          case BYTES:
+            continue;
+          default:
+            throw new IllegalStateException(
+                "var length dictionary can only be created for columns of type STRING and BYTES. Invalid for column "

Review comment:
       I suppose multi-value columns should be ok here - not really related to the encoding. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] icefury71 commented on a change in pull request #6541: Adding new validation for Json, TEXT indexing

Posted by GitBox <gi...@apache.org>.
icefury71 commented on a change in pull request #6541:
URL: https://github.com/apache/incubator-pinot/pull/6541#discussion_r586740573



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java
##########
@@ -488,18 +527,37 @@ private static void validateFieldConfigList(@Nullable List<FieldConfig> fieldCon
 
     for (FieldConfig fieldConfig : fieldConfigList) {
       String columnName = fieldConfig.getName();
-      Preconditions.checkState(schema.getFieldSpecFor(columnName) != null,
+      FieldSpec fieldConfigColSpec = schema.getFieldSpecFor(columnName);
+      Preconditions.checkState(fieldConfigColSpec != null,
           "Column Name " + columnName + " defined in field config list must be a valid column defined in the schema");
 
-      if (fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY &&
-          indexingConfigs.getNoDictionaryColumns() != null) {
-        Preconditions.checkArgument(!indexingConfigs.getNoDictionaryColumns().contains(columnName),
-            "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+      List<String> noDictionaryColumns = indexingConfigs.getNoDictionaryColumns();
+      switch (fieldConfig.getEncodingType()) {
+        case RAW:
+          Preconditions.checkState(noDictionaryColumns != null && noDictionaryColumns.contains(columnName),
+              "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+          break;
+        case DICTIONARY:
+          if (noDictionaryColumns != null) {
+            Preconditions.checkArgument(!noDictionaryColumns.contains(columnName),
+                "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+          }
       }
-      // FST Index is only available on dictionary encoded columns.
-      if (fieldConfig.getIndexType() == FieldConfig.IndexType.FST) {
-        Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY,
-            "FST Index is only enabled on dictionary encoded columns");
+
+      switch (fieldConfig.getIndexType()) {
+        case FST:
+          Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY,
+              "FST Index is only enabled on dictionary encoded columns");
+          Preconditions.checkState(
+              fieldConfigColSpec.isSingleValueField() && fieldConfigColSpec.getDataType() == FieldSpec.DataType.STRING,
+              "FST Index is only supported for single value string columns");
+          break;
+        case TEXT:
+          Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.RAW,

Review comment:
       @siddharthteotia the check doesn't exist as you can see : https://github.com/apache/incubator-pinot/blob/master/pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java#L550




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] icefury71 commented on a change in pull request #6541: - Adding new validation for Json, TEXT indexing

Posted by GitBox <gi...@apache.org>.
icefury71 commented on a change in pull request #6541:
URL: https://github.com/apache/incubator-pinot/pull/6541#discussion_r574179684



##########
File path: pinot-core/src/test/java/org/apache/pinot/core/util/TableConfigUtilsTest.java
##########
@@ -870,6 +916,69 @@ public void testValidateIndexingConfig() {
     } catch (Exception e) {
       // expected
     }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME)
+        .setJsonIndexColumns(Arrays.asList("non-existent-column")).build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for non existent column in Json index config");
+    } catch (Exception e) {
+      // expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME)
+        .setJsonIndexColumns(Arrays.asList("intCol")).build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for Json index defined on non string column");
+    } catch (Exception e) {
+      // expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME)
+        .setJsonIndexColumns(Arrays.asList("multiValCol")).build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for Json index defined on multi-value column");
+    } catch (Exception e) {
+      // expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME).setRangeIndexColumns(columnList).
+        build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+    } catch (Exception e) {
+      Assert.fail("Should work for range index defined on dictionary encoded string column");
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME).setRangeIndexColumns(columnList)
+        .setNoDictionaryColumns(columnList).
+            build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for range index defined on non numeric/no-dictionary column");
+    } catch (Exception e) {
+      // Expected

Review comment:
       This is just for the unit test - I followed the same pattern as for others. 

##########
File path: pinot-core/src/test/java/org/apache/pinot/core/util/TableConfigUtilsTest.java
##########
@@ -870,6 +916,69 @@ public void testValidateIndexingConfig() {
     } catch (Exception e) {
       // expected
     }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME)
+        .setJsonIndexColumns(Arrays.asList("non-existent-column")).build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for non existent column in Json index config");
+    } catch (Exception e) {
+      // expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME)
+        .setJsonIndexColumns(Arrays.asList("intCol")).build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for Json index defined on non string column");
+    } catch (Exception e) {
+      // expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME)
+        .setJsonIndexColumns(Arrays.asList("multiValCol")).build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for Json index defined on multi-value column");
+    } catch (Exception e) {
+      // expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME).setRangeIndexColumns(columnList).
+        build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+    } catch (Exception e) {
+      Assert.fail("Should work for range index defined on dictionary encoded string column");
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME).setRangeIndexColumns(columnList)
+        .setNoDictionaryColumns(columnList).
+            build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for range index defined on non numeric/no-dictionary column");
+    } catch (Exception e) {
+      // Expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME).setVarLengthDictionaryColumns(Arrays.asList("intCol")).
+        build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for Var length dictionary defined for non string/bytes column");
+    } catch (Exception e) {
+      // expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME).setJsonIndexColumns(Arrays.asList("multiValCol")).
+        build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for Json Index defined on a multi value column");
+    } catch (Exception e) {
+      // expected

Review comment:
       Same comment as above




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6541: - Adding new validation for Json, TEXT indexing

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #6541:
URL: https://github.com/apache/incubator-pinot/pull/6541#discussion_r572484772



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java
##########
@@ -474,11 +478,46 @@ private static void validateIndexingConfig(@Nullable IndexingConfig indexingConf
       Preconditions.checkState(schema.getFieldSpecFor(columnName) != null,
           "Column Name " + columnName + " defined in " + configName + " must be a valid column defined in the schema");
     }
+
+    // Range index semantic validation
+    // Range index can be defined on numeric columns and any column with a dictionary
+    if (indexingConfig.getRangeIndexColumns() != null) {
+      for (String rangeIndexCol : indexingConfig.getRangeIndexColumns()) {
+        Preconditions.checkState(
+            schema.getFieldSpecFor(rangeIndexCol).getDataType().isNumeric() || !noDictionaryColumnsSet
+                .contains(rangeIndexCol),
+            "Cannot create a range index on non-numeric/no-dictionary column " + rangeIndexCol);
+      }
+    }
+
+    // Var length dictionary semantic validation
+    if (indexingConfig.getVarLengthDictionaryColumns() != null) {
+      for (String varLenDictCol : indexingConfig.getVarLengthDictionaryColumns()) {
+        FieldSpec varLenDictFieldSpec = schema.getFieldSpecFor(varLenDictCol);
+        switch (varLenDictFieldSpec.getDataType()) {
+          case STRING:
+          case BYTES:
+            continue;
+          default:
+            throw new IllegalStateException(
+                "var length dictionary can only be created for columns of type STRING and BYTES. Invalid for column "
+                    + varLenDictCol);
+        }
+      }
+    }
+
+    if (indexingConfig.getJsonIndexColumns() != null) {

Review comment:
       Json index can only be applied to SV string column




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #6541: Adding new validation for Json, TEXT indexing

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #6541:
URL: https://github.com/apache/incubator-pinot/pull/6541#discussion_r586720500



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java
##########
@@ -488,18 +529,35 @@ private static void validateFieldConfigList(@Nullable List<FieldConfig> fieldCon
 
     for (FieldConfig fieldConfig : fieldConfigList) {
       String columnName = fieldConfig.getName();
-      Preconditions.checkState(schema.getFieldSpecFor(columnName) != null,
+      FieldSpec fieldConfigColSpec = schema.getFieldSpecFor(columnName);
+      Preconditions.checkState(fieldConfigColSpec != null,
           "Column Name " + columnName + " defined in field config list must be a valid column defined in the schema");
 
-      if (fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY &&
-          indexingConfigs.getNoDictionaryColumns() != null) {
-        Preconditions.checkArgument(!indexingConfigs.getNoDictionaryColumns().contains(columnName),
-            "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+      List<String> noDictionaryColumns = indexingConfigs.getNoDictionaryColumns();
+      switch (fieldConfig.getEncodingType()) {
+        case RAW:
+          Preconditions.checkState(noDictionaryColumns != null && noDictionaryColumns.contains(columnName),

Review comment:
       This check seems to be a needless one, given that we are in the case `RAW`. @icefury71 the new `FieldConfig` is where we want to move things, and eventually drop arrays in table config like `invertedIndexColumns`, and `noDictionaryColumns` etc.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #6541: Adding new validation for Json, TEXT indexing

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #6541:
URL: https://github.com/apache/incubator-pinot/pull/6541#discussion_r586713743



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java
##########
@@ -488,18 +527,37 @@ private static void validateFieldConfigList(@Nullable List<FieldConfig> fieldCon
 
     for (FieldConfig fieldConfig : fieldConfigList) {
       String columnName = fieldConfig.getName();
-      Preconditions.checkState(schema.getFieldSpecFor(columnName) != null,
+      FieldSpec fieldConfigColSpec = schema.getFieldSpecFor(columnName);
+      Preconditions.checkState(fieldConfigColSpec != null,
           "Column Name " + columnName + " defined in field config list must be a valid column defined in the schema");
 
-      if (fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY &&
-          indexingConfigs.getNoDictionaryColumns() != null) {
-        Preconditions.checkArgument(!indexingConfigs.getNoDictionaryColumns().contains(columnName),
-            "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+      List<String> noDictionaryColumns = indexingConfigs.getNoDictionaryColumns();
+      switch (fieldConfig.getEncodingType()) {
+        case RAW:
+          Preconditions.checkState(noDictionaryColumns != null && noDictionaryColumns.contains(columnName),
+              "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+          break;
+        case DICTIONARY:
+          if (noDictionaryColumns != null) {
+            Preconditions.checkArgument(!noDictionaryColumns.contains(columnName),
+                "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+          }
       }
-      // FST Index is only available on dictionary encoded columns.
-      if (fieldConfig.getIndexType() == FieldConfig.IndexType.FST) {
-        Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY,
-            "FST Index is only enabled on dictionary encoded columns");
+
+      switch (fieldConfig.getIndexType()) {
+        case FST:
+          Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY,
+              "FST Index is only enabled on dictionary encoded columns");
+          Preconditions.checkState(
+              fieldConfigColSpec.isSingleValueField() && fieldConfigColSpec.getDataType() == FieldSpec.DataType.STRING,
+              "FST Index is only supported for single value string columns");
+          break;
+        case TEXT:
+          Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.RAW,

Review comment:
       @icefury71  looks like this check was added back in another PR https://github.com/apache/incubator-pinot/pull/6620
   Text index is supported on dictionary columns 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #6541: Adding new validation for Json, TEXT indexing

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #6541:
URL: https://github.com/apache/incubator-pinot/pull/6541#discussion_r586713857



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java
##########
@@ -488,18 +527,37 @@ private static void validateFieldConfigList(@Nullable List<FieldConfig> fieldCon
 
     for (FieldConfig fieldConfig : fieldConfigList) {
       String columnName = fieldConfig.getName();
-      Preconditions.checkState(schema.getFieldSpecFor(columnName) != null,
+      FieldSpec fieldConfigColSpec = schema.getFieldSpecFor(columnName);
+      Preconditions.checkState(fieldConfigColSpec != null,
           "Column Name " + columnName + " defined in field config list must be a valid column defined in the schema");
 
-      if (fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY &&
-          indexingConfigs.getNoDictionaryColumns() != null) {
-        Preconditions.checkArgument(!indexingConfigs.getNoDictionaryColumns().contains(columnName),
-            "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+      List<String> noDictionaryColumns = indexingConfigs.getNoDictionaryColumns();
+      switch (fieldConfig.getEncodingType()) {
+        case RAW:
+          Preconditions.checkState(noDictionaryColumns != null && noDictionaryColumns.contains(columnName),
+              "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+          break;
+        case DICTIONARY:
+          if (noDictionaryColumns != null) {
+            Preconditions.checkArgument(!noDictionaryColumns.contains(columnName),
+                "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+          }
       }
-      // FST Index is only available on dictionary encoded columns.
-      if (fieldConfig.getIndexType() == FieldConfig.IndexType.FST) {
-        Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY,
-            "FST Index is only enabled on dictionary encoded columns");
+
+      switch (fieldConfig.getIndexType()) {
+        case FST:
+          Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY,
+              "FST Index is only enabled on dictionary encoded columns");
+          Preconditions.checkState(
+              fieldConfigColSpec.isSingleValueField() && fieldConfigColSpec.getDataType() == FieldSpec.DataType.STRING,
+              "FST Index is only supported for single value string columns");
+          break;
+        case TEXT:
+          Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.RAW,

Review comment:
       cc @jtao15 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mayankshriv commented on a change in pull request #6541: - Adding new validation for Json, TEXT indexing

Posted by GitBox <gi...@apache.org>.
mayankshriv commented on a change in pull request #6541:
URL: https://github.com/apache/incubator-pinot/pull/6541#discussion_r574093157



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java
##########
@@ -474,11 +478,48 @@ private static void validateIndexingConfig(@Nullable IndexingConfig indexingConf
       Preconditions.checkState(schema.getFieldSpecFor(columnName) != null,
           "Column Name " + columnName + " defined in " + configName + " must be a valid column defined in the schema");
     }
+
+    // Range index semantic validation
+    // Range index can be defined on numeric columns and any column with a dictionary
+    if (indexingConfig.getRangeIndexColumns() != null) {
+      for (String rangeIndexCol : indexingConfig.getRangeIndexColumns()) {
+        Preconditions.checkState(
+            schema.getFieldSpecFor(rangeIndexCol).getDataType().isNumeric() || !noDictionaryColumnsSet
+                .contains(rangeIndexCol),
+            "Cannot create a range index on non-numeric/no-dictionary column " + rangeIndexCol);
+      }
+    }
+
+    // Var length dictionary semantic validation
+    if (indexingConfig.getVarLengthDictionaryColumns() != null) {
+      for (String varLenDictCol : indexingConfig.getVarLengthDictionaryColumns()) {
+        FieldSpec varLenDictFieldSpec = schema.getFieldSpecFor(varLenDictCol);
+        switch (varLenDictFieldSpec.getDataType()) {
+          case STRING:
+          case BYTES:
+            continue;
+          default:
+            throw new IllegalStateException(
+                "var length dictionary can only be created for columns of type STRING and BYTES. Invalid for column "

Review comment:
       What about MV columns, do we support var length for MV columns?
   
   Also, s/var length/Var length

##########
File path: pinot-core/src/test/java/org/apache/pinot/core/util/TableConfigUtilsTest.java
##########
@@ -870,6 +916,69 @@ public void testValidateIndexingConfig() {
     } catch (Exception e) {
       // expected
     }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME)
+        .setJsonIndexColumns(Arrays.asList("non-existent-column")).build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for non existent column in Json index config");
+    } catch (Exception e) {
+      // expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME)
+        .setJsonIndexColumns(Arrays.asList("intCol")).build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for Json index defined on non string column");
+    } catch (Exception e) {
+      // expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME)
+        .setJsonIndexColumns(Arrays.asList("multiValCol")).build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for Json index defined on multi-value column");
+    } catch (Exception e) {
+      // expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME).setRangeIndexColumns(columnList).
+        build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+    } catch (Exception e) {
+      Assert.fail("Should work for range index defined on dictionary encoded string column");
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME).setRangeIndexColumns(columnList)
+        .setNoDictionaryColumns(columnList).
+            build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for range index defined on non numeric/no-dictionary column");
+    } catch (Exception e) {
+      // Expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME).setVarLengthDictionaryColumns(Arrays.asList("intCol")).
+        build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for Var length dictionary defined for non string/bytes column");
+    } catch (Exception e) {
+      // expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME).setJsonIndexColumns(Arrays.asList("multiValCol")).
+        build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for Json Index defined on a multi value column");
+    } catch (Exception e) {
+      // expected

Review comment:
       Same here.

##########
File path: pinot-core/src/test/java/org/apache/pinot/core/util/TableConfigUtilsTest.java
##########
@@ -870,6 +916,69 @@ public void testValidateIndexingConfig() {
     } catch (Exception e) {
       // expected
     }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME)
+        .setJsonIndexColumns(Arrays.asList("non-existent-column")).build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for non existent column in Json index config");
+    } catch (Exception e) {
+      // expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME)
+        .setJsonIndexColumns(Arrays.asList("intCol")).build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for Json index defined on non string column");
+    } catch (Exception e) {
+      // expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME)
+        .setJsonIndexColumns(Arrays.asList("multiValCol")).build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for Json index defined on multi-value column");
+    } catch (Exception e) {
+      // expected
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME).setRangeIndexColumns(columnList).
+        build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+    } catch (Exception e) {
+      Assert.fail("Should work for range index defined on dictionary encoded string column");
+    }
+
+    tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME).setRangeIndexColumns(columnList)
+        .setNoDictionaryColumns(columnList).
+            build();
+    try {
+      TableConfigUtils.validate(tableConfig, schema);
+      Assert.fail("Should fail for range index defined on non numeric/no-dictionary column");
+    } catch (Exception e) {
+      // Expected

Review comment:
       Swallowing exception?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] npawar merged pull request #6541: Adding new validation for Json, TEXT indexing

Posted by GitBox <gi...@apache.org>.
npawar merged pull request #6541:
URL: https://github.com/apache/incubator-pinot/pull/6541


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] icefury71 commented on a change in pull request #6541: Adding new validation for Json, TEXT indexing

Posted by GitBox <gi...@apache.org>.
icefury71 commented on a change in pull request #6541:
URL: https://github.com/apache/incubator-pinot/pull/6541#discussion_r586738428



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java
##########
@@ -488,18 +527,37 @@ private static void validateFieldConfigList(@Nullable List<FieldConfig> fieldCon
 
     for (FieldConfig fieldConfig : fieldConfigList) {
       String columnName = fieldConfig.getName();
-      Preconditions.checkState(schema.getFieldSpecFor(columnName) != null,
+      FieldSpec fieldConfigColSpec = schema.getFieldSpecFor(columnName);
+      Preconditions.checkState(fieldConfigColSpec != null,
           "Column Name " + columnName + " defined in field config list must be a valid column defined in the schema");
 
-      if (fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY &&
-          indexingConfigs.getNoDictionaryColumns() != null) {
-        Preconditions.checkArgument(!indexingConfigs.getNoDictionaryColumns().contains(columnName),
-            "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+      List<String> noDictionaryColumns = indexingConfigs.getNoDictionaryColumns();
+      switch (fieldConfig.getEncodingType()) {
+        case RAW:
+          Preconditions.checkState(noDictionaryColumns != null && noDictionaryColumns.contains(columnName),
+              "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+          break;
+        case DICTIONARY:
+          if (noDictionaryColumns != null) {
+            Preconditions.checkArgument(!noDictionaryColumns.contains(columnName),
+                "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+          }
       }
-      // FST Index is only available on dictionary encoded columns.
-      if (fieldConfig.getIndexType() == FieldConfig.IndexType.FST) {
-        Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY,
-            "FST Index is only enabled on dictionary encoded columns");
+
+      switch (fieldConfig.getIndexType()) {
+        case FST:
+          Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY,
+              "FST Index is only enabled on dictionary encoded columns");
+          Preconditions.checkState(
+              fieldConfigColSpec.isSingleValueField() && fieldConfigColSpec.getDataType() == FieldSpec.DataType.STRING,
+              "FST Index is only supported for single value string columns");
+          break;
+        case TEXT:
+          Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.RAW,

Review comment:
       @siddharthteotia are you sure it was #6620 ? It doesn't even touch TableConfigUtils




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] icefury71 commented on a change in pull request #6541: - Adding new validation for Json, TEXT indexing

Posted by GitBox <gi...@apache.org>.
icefury71 commented on a change in pull request #6541:
URL: https://github.com/apache/incubator-pinot/pull/6541#discussion_r573286455



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java
##########
@@ -488,18 +527,37 @@ private static void validateFieldConfigList(@Nullable List<FieldConfig> fieldCon
 
     for (FieldConfig fieldConfig : fieldConfigList) {
       String columnName = fieldConfig.getName();
-      Preconditions.checkState(schema.getFieldSpecFor(columnName) != null,
+      FieldSpec fieldConfigColSpec = schema.getFieldSpecFor(columnName);
+      Preconditions.checkState(fieldConfigColSpec != null,
           "Column Name " + columnName + " defined in field config list must be a valid column defined in the schema");
 
-      if (fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY &&
-          indexingConfigs.getNoDictionaryColumns() != null) {
-        Preconditions.checkArgument(!indexingConfigs.getNoDictionaryColumns().contains(columnName),
-            "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+      List<String> noDictionaryColumns = indexingConfigs.getNoDictionaryColumns();
+      switch (fieldConfig.getEncodingType()) {
+        case RAW:
+          Preconditions.checkState(noDictionaryColumns != null && noDictionaryColumns.contains(columnName),
+              "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+          break;
+        case DICTIONARY:
+          if (noDictionaryColumns != null) {
+            Preconditions.checkArgument(!noDictionaryColumns.contains(columnName),
+                "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+          }
       }
-      // FST Index is only available on dictionary encoded columns.
-      if (fieldConfig.getIndexType() == FieldConfig.IndexType.FST) {
-        Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY,
-            "FST Index is only enabled on dictionary encoded columns");
+
+      switch (fieldConfig.getIndexType()) {
+        case FST:
+          Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY,
+              "FST Index is only enabled on dictionary encoded columns");
+          Preconditions.checkState(
+              fieldConfigColSpec.isSingleValueField() && fieldConfigColSpec.getDataType() == FieldSpec.DataType.STRING,
+              "FST Index is only supported for single value string columns");
+          break;
+        case TEXT:
+          Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.RAW,

Review comment:
       That's my bad - removing this check. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] icefury71 commented on a change in pull request #6541: - Adding new validation for Json, TEXT indexing

Posted by GitBox <gi...@apache.org>.
icefury71 commented on a change in pull request #6541:
URL: https://github.com/apache/incubator-pinot/pull/6541#discussion_r573287248



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java
##########
@@ -488,18 +527,37 @@ private static void validateFieldConfigList(@Nullable List<FieldConfig> fieldCon
 
     for (FieldConfig fieldConfig : fieldConfigList) {
       String columnName = fieldConfig.getName();
-      Preconditions.checkState(schema.getFieldSpecFor(columnName) != null,
+      FieldSpec fieldConfigColSpec = schema.getFieldSpecFor(columnName);
+      Preconditions.checkState(fieldConfigColSpec != null,
           "Column Name " + columnName + " defined in field config list must be a valid column defined in the schema");
 
-      if (fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY &&
-          indexingConfigs.getNoDictionaryColumns() != null) {
-        Preconditions.checkArgument(!indexingConfigs.getNoDictionaryColumns().contains(columnName),
-            "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+      List<String> noDictionaryColumns = indexingConfigs.getNoDictionaryColumns();
+      switch (fieldConfig.getEncodingType()) {
+        case RAW:
+          Preconditions.checkState(noDictionaryColumns != null && noDictionaryColumns.contains(columnName),
+              "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+          break;
+        case DICTIONARY:
+          if (noDictionaryColumns != null) {
+            Preconditions.checkArgument(!noDictionaryColumns.contains(columnName),
+                "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+          }
       }
-      // FST Index is only available on dictionary encoded columns.
-      if (fieldConfig.getIndexType() == FieldConfig.IndexType.FST) {
-        Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY,
-            "FST Index is only enabled on dictionary encoded columns");
+
+      switch (fieldConfig.getIndexType()) {
+        case FST:
+          Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY,
+              "FST Index is only enabled on dictionary encoded columns");
+          Preconditions.checkState(
+              fieldConfigColSpec.isSingleValueField() && fieldConfigColSpec.getDataType() == FieldSpec.DataType.STRING,
+              "FST Index is only supported for single value string columns");
+          break;
+        case TEXT:
+          Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.RAW,

Review comment:
       My bad - removing this extra check




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #6541: - Adding new validation for Json, TEXT indexing

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #6541:
URL: https://github.com/apache/incubator-pinot/pull/6541#discussion_r572510770



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java
##########
@@ -488,18 +527,37 @@ private static void validateFieldConfigList(@Nullable List<FieldConfig> fieldCon
 
     for (FieldConfig fieldConfig : fieldConfigList) {
       String columnName = fieldConfig.getName();
-      Preconditions.checkState(schema.getFieldSpecFor(columnName) != null,
+      FieldSpec fieldConfigColSpec = schema.getFieldSpecFor(columnName);
+      Preconditions.checkState(fieldConfigColSpec != null,
           "Column Name " + columnName + " defined in field config list must be a valid column defined in the schema");
 
-      if (fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY &&
-          indexingConfigs.getNoDictionaryColumns() != null) {
-        Preconditions.checkArgument(!indexingConfigs.getNoDictionaryColumns().contains(columnName),
-            "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+      List<String> noDictionaryColumns = indexingConfigs.getNoDictionaryColumns();
+      switch (fieldConfig.getEncodingType()) {
+        case RAW:
+          Preconditions.checkState(noDictionaryColumns != null && noDictionaryColumns.contains(columnName),
+              "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+          break;
+        case DICTIONARY:
+          if (noDictionaryColumns != null) {
+            Preconditions.checkArgument(!noDictionaryColumns.contains(columnName),
+                "FieldConfig encoding type is different from indexingConfig for column: " + columnName);
+          }
       }
-      // FST Index is only available on dictionary encoded columns.
-      if (fieldConfig.getIndexType() == FieldConfig.IndexType.FST) {
-        Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY,
-            "FST Index is only enabled on dictionary encoded columns");
+
+      switch (fieldConfig.getIndexType()) {
+        case FST:
+          Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.DICTIONARY,
+              "FST Index is only enabled on dictionary encoded columns");
+          Preconditions.checkState(
+              fieldConfigColSpec.isSingleValueField() && fieldConfigColSpec.getDataType() == FieldSpec.DataType.STRING,
+              "FST Index is only supported for single value string columns");
+          break;
+        case TEXT:
+          Preconditions.checkArgument(fieldConfig.getEncodingType() == FieldConfig.EncodingType.RAW,

Review comment:
       Text index is supported on dictionary encoded SV string columns. Why add this check?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org