You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/05/05 05:59:17 UTC

[GitHub] [incubator-pinot] amrishlal opened a new pull request #6878: JSON column datatype support.

amrishlal opened a new pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878


   ## Description
   This PR adds support for `JSON` column type. `JSON` column type can contain only valid JSONObject. `JSON` column type supports conversion to and from `STRING`.
   
   A new property (`storageFormat`) was added to column metadata to record the current storage format of any given column type. This was done to allow for backward compatibility and segment data migration if the storage format of a column type is changed in future. By default, the storage format of all column types is set to `original` in segment metadata.properties file (`column.jsoncolumn.storageFormat = original`). 
   
   Columns of type `JSON` can be indexed using `JsonIndex`. Queries using `JSON_EXTRACT_SCALAR` and `JSON_MATCH` functions can run against a `JSON` column type. Test cases in `JsonDatatypeTest` are used to validate functionality.
   
   ## Upgrade Notes
   Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)
   * [ ] Yes (Please label as **<code>backward-incompat</code>**, and complete the section below on Release Notes)
   
   Does this PR fix a zero-downtime upgrade introduced earlier?
   * [ ] Yes (Please label this as **<code>backward-incompat</code>**, and complete the section below on Release Notes)
   
   Does this PR otherwise need attention when creating release notes? Things to consider:
   - New configuration options
   - Deprecation of configurations
   - Signature changes to public methods/interfaces
   - New plugins added or old plugins removed
   * [ ] Yes (Please label this PR as **<code>release-notes</code>** and complete the section on Release Notes)
   ## Release Notes
   <!-- If you have tagged this as either backward-incompat or release-notes,
   you MUST add text here that you would like to see appear in release notes of the
   next release. -->
   
   <!-- If you have a series of commits adding or enabling a feature, then
   add this section only in final commit that marks the feature completed.
   Refer to earlier release notes to see examples of text.
   -->
   ## Documentation
   <!-- If you have introduced a new feature or configuration, please add it to the documentation as well.
   See https://docs.pinot.apache.org/developers/developers-and-contributors/update-document
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628589659



##########
File path: pinot-core/src/test/java/org/apache/pinot/queries/JsonDatatypeTest.java
##########
@@ -0,0 +1,308 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.queries;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.blocks.IntermediateResultsBlock;
+import org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader;
+import org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl;
+import org.apache.pinot.segment.local.segment.index.loader.IndexLoadingConfig;
+import org.apache.pinot.segment.local.segment.readers.GenericRowRecordReader;
+import org.apache.pinot.segment.spi.ImmutableSegment;
+import org.apache.pinot.segment.spi.IndexSegment;
+import org.apache.pinot.segment.spi.creator.SegmentGeneratorConfig;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TableType;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.Schema;
+import org.apache.pinot.spi.data.readers.GenericRow;
+import org.apache.pinot.spi.utils.ReadMode;
+import org.apache.pinot.spi.utils.builder.TableConfigBuilder;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+
+/**
+ * Test cases verifying query evaluation against column of type JSON.
+ */
+public class JsonDatatypeTest extends BaseQueriesTest {

Review comment:
       That's supposed to be the Test case for V1 JSON_MATCH. I think we should keep it (although don't need to update it anymore as the V1 JSON_MATCH is deprecated). If in future we remove V1 JSON_MATCH, then we can remove JsonMatchPredicateTest. In the meantime I think it would be a good idea to modify the name (as suggested in your PR) to avoid any confusion.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628365082



##########
File path: pinot-core/src/test/java/org/apache/pinot/queries/JsonDatatypeTest.java
##########
@@ -0,0 +1,308 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.queries;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.blocks.IntermediateResultsBlock;
+import org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader;
+import org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl;
+import org.apache.pinot.segment.local.segment.index.loader.IndexLoadingConfig;
+import org.apache.pinot.segment.local.segment.readers.GenericRowRecordReader;
+import org.apache.pinot.segment.spi.ImmutableSegment;
+import org.apache.pinot.segment.spi.IndexSegment;
+import org.apache.pinot.segment.spi.creator.SegmentGeneratorConfig;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TableType;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.Schema;
+import org.apache.pinot.spi.data.readers.GenericRow;
+import org.apache.pinot.spi.utils.ReadMode;
+import org.apache.pinot.spi.utils.builder.TableConfigBuilder;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+
+/**
+ * Test cases verifying query evaluation against column of type JSON.
+ */
+public class JsonDatatypeTest extends BaseQueriesTest {
+  private static final File INDEX_DIR = new File(FileUtils.getTempDirectory(), "JsonMatchPredicateTest");
+  private static final String RAW_TABLE_NAME = "testTable";
+  private static final String SEGMENT_NAME = "testSegment";
+  private static final int NUM_RECORDS = 10;
+
+  private static final String INT_COLUMN = "intColumn";
+  private static final String JSON_COLUMN = "jsonColumn";
+  private static final String STRING_COLUMN = "stringColumn";
+  private static final Schema SCHEMA =
+      new Schema.SchemaBuilder().addSingleValueDimension(INT_COLUMN, FieldSpec.DataType.INT)
+          .addSingleValueDimension(JSON_COLUMN, FieldSpec.DataType.JSON)
+          .addSingleValueDimension(STRING_COLUMN, FieldSpec.DataType.STRING).build();
+  private static final TableConfig TABLE_CONFIG =
+      new TableConfigBuilder(TableType.OFFLINE).setTableName(RAW_TABLE_NAME).build();
+
+  private IndexSegment _indexSegment;
+  private List<IndexSegment> _indexSegments;
+
+  @Override
+  protected String getFilter() {
+    return "";
+  }
+
+  @Override
+  protected IndexSegment getIndexSegment() {
+    return _indexSegment;
+  }
+
+  GenericRow createRecord(int intValue, String stringValue, String jsonValue) {
+    GenericRow record = new GenericRow();
+    record.putValue(INT_COLUMN, intValue);
+    record.putValue(STRING_COLUMN, stringValue);
+    record.putValue(JSON_COLUMN, jsonValue);
+
+    return record;
+  }
+
+  @Override
+  protected List<IndexSegment> getIndexSegments() {
+    return _indexSegments;
+  }
+
+  @BeforeClass
+  public void setUp()
+      throws Exception {
+    FileUtils.deleteDirectory(INDEX_DIR);
+
+    List<GenericRow> records = new ArrayList<>(NUM_RECORDS);
+    records.add(createRecord(1, "daffy duck",
+        "{\"name\": {\"first\": \"daffy\", \"last\": \"duck\"}, \"id\": 101, \"data\": [\"a\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(2, "mickey mouse",
+        "{\"name\": {\"first\": \"mickey\", \"last\": \"mouse\"}, \"id\": 111, \"data\": [\"e\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(3, "donald duck",
+        "{\"name\": {\"first\": \"donald\", \"last\": \"duck\"}, \"id\": 121, \"data\": [\"f\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(4, "scrooge mcduck",
+        "{\"name\": {\"first\": \"scrooge\", \"last\": \"mcduck\"}, \"id\": 131, \"data\": [\"g\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(5, "minnie mouse",
+        "{\"name\": {\"first\": \"minnie\", \"last\": \"mouse\"}, \"id\": 141, \"data\": [\"h\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(6, "daisy duck",
+        "{\"name\": {\"first\": \"daisy\", \"last\": \"duck\"}, \"id\": 161.5, \"data\": [\"i\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(7, "pluto dog",
+        "{\"name\": {\"first\": \"pluto\", \"last\": \"dog\"}, \"id\": 161, \"data\": [\"j\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(8, "goofy dwag",
+        "{\"name\": {\"first\": \"goofy\", \"last\": \"dwag\"}, \"id\": 171, \"data\": [\"k\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(9, "ludwik von drake",
+        "{\"name\": {\"first\": \"ludwik\", \"last\": \"von drake\"}, \"id\": 181, \"data\": [\"l\", \"b\", \"c\", \"d\"]}"));
+
+    List<String> jsonIndexColumns = new ArrayList<>();
+    jsonIndexColumns.add("jsonColumn");
+    TABLE_CONFIG.getIndexingConfig().setJsonIndexColumns(jsonIndexColumns);
+    SegmentGeneratorConfig segmentGeneratorConfig = new SegmentGeneratorConfig(TABLE_CONFIG, SCHEMA);
+    segmentGeneratorConfig.setTableName(RAW_TABLE_NAME);
+    segmentGeneratorConfig.setSegmentName(SEGMENT_NAME);
+    segmentGeneratorConfig.setOutDir(INDEX_DIR.getPath());
+
+    SegmentIndexCreationDriverImpl driver = new SegmentIndexCreationDriverImpl();
+    driver.init(segmentGeneratorConfig, new GenericRowRecordReader(records));
+    driver.build();
+
+    IndexLoadingConfig indexLoadingConfig = new IndexLoadingConfig();
+    indexLoadingConfig.setTableConfig(TABLE_CONFIG);
+    indexLoadingConfig.setJsonIndexColumns(new HashSet<String>(jsonIndexColumns));
+    indexLoadingConfig.setReadMode(ReadMode.mmap);
+
+    ImmutableSegment immutableSegment =
+        ImmutableSegmentLoader.load(new File(INDEX_DIR, SEGMENT_NAME), indexLoadingConfig);
+    _indexSegment = immutableSegment;
+    _indexSegments = Arrays.asList(immutableSegment, immutableSegment);
+  }
+
+  /** Verify result column type of a simple select query against JSON column */
+  @Test
+  public void testSimpleSelectOnJsonColumn() {
+    try {
+      Operator operator = getOperatorForSqlQuery("select jsonColumn FROM testTable");
+      IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+      Collection<Object[]> rows = block.getSelectionResult();
+      Assert.assertEquals(rows.size(), 9);
+      Assert.assertEquals(block.getDataSchema().getColumnDataType(0), DataSchema.ColumnDataType.JSON);
+    } catch (IllegalStateException ise) {
+      Assert.assertTrue(true);
+    }
+  }
+
+  /** Test filtering on string value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithStringFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select intColumn, json_extract_scalar(jsonColumn, '$.name.last', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.name.first', 'STRING') = 'daffy'");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Object[] row = iterator.next();
+    Assert.assertEquals(row[0], 1);
+    Assert.assertEquals(row[1], "duck");
+  }
+
+  /** Test filtering on number value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithNumericIntFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.last', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'INT') = 171");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "dwag");
+  }
+
+  /** Test filtering on float value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithNumericFloatFilter() {
+
+    // query to retrieve result as INT
+    Operator operator1 = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.id', 'INT') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'FLOAT') = 161.5");
+    IntermediateResultsBlock block1 = (IntermediateResultsBlock) operator1.nextBlock();
+    Collection<Object[]> rows1 = block1.getSelectionResult();
+    Assert.assertEquals(rows1.size(), 1);
+
+    Iterator<Object[]> iterator1 = rows1.iterator();
+    Assert.assertTrue(iterator1.hasNext());
+    Assert.assertEquals(iterator1.next()[0], 161);
+
+    // query to retrieve result as DOUBLE
+    Operator operator2 = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.id', 'DOUBLE') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'FLOAT') = 161.5");
+    IntermediateResultsBlock block2 = (IntermediateResultsBlock) operator2.nextBlock();
+    Collection<Object[]> rows2 = block2.getSelectionResult();
+    Assert.assertEquals(rows2.size(), 1);
+
+    Iterator<Object[]> iterator2 = rows2.iterator();
+    Assert.assertTrue(iterator2.hasNext());
+    Assert.assertEquals(iterator2.next()[0], 161.5d);
+  }
+
+  /** Retrieve JSON array after filtering on string value associated with  JSON key*/
+  @Test
+  public void testExtractScalarArrayWithStringFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.data', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.name.first', 'STRING') = 'daffy'");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "[\"a\",\"b\",\"c\",\"d\"]");
+  }
+
+  /** Test filtering on string value within a JSON array*/
+  @Test
+  public void testExtractScalarWithArrayFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.data[0]', 'STRING') IN ('i', 'k')");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 2);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daisy");
+    Assert.assertEquals(iterator.next()[0], "goofy");
+  }
+
+  @Test
+  public void testJsonMatchWithoutIndex() {
+    try {
+      Operator operator = getOperatorForSqlQuery(
+          "select json_extract_scalar(stringColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(stringColumn, 'id=101')");
+      Assert.assertTrue(false);
+    } catch (IllegalStateException ise) {
+      Assert.assertTrue(true);
+    }
+  }
+
+  @Test
+  public void testJsonMatchAtLevel1WithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, 'id=101')");
+
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daffy");
+  }
+
+  @Test
+  public void testJsonMatchAtLevel2WithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, 'name.first = ''daffy''')");
+

Review comment:
       This test filters on name.first using json index and extracts it as well. Can we please add another test (or queries in the same test) where we filter on something (using json index / json match) and extract something else using json_extract_scalar ?
   
   Also, let's try going a bit deeper than level 2 in tests.  For non array fields, we should be able to go to any depth in the current json index implementation




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628351204



##########
File path: pinot-core/src/test/java/org/apache/pinot/queries/JsonDatatypeTest.java
##########
@@ -0,0 +1,293 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.queries;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.blocks.IntermediateResultsBlock;
+import org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader;
+import org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl;
+import org.apache.pinot.segment.local.segment.index.loader.IndexLoadingConfig;
+import org.apache.pinot.segment.local.segment.readers.GenericRowRecordReader;
+import org.apache.pinot.segment.spi.ImmutableSegment;
+import org.apache.pinot.segment.spi.IndexSegment;
+import org.apache.pinot.segment.spi.creator.SegmentGeneratorConfig;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TableType;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.Schema;
+import org.apache.pinot.spi.data.readers.GenericRow;
+import org.apache.pinot.spi.utils.ReadMode;
+import org.apache.pinot.spi.utils.builder.TableConfigBuilder;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+
+/**
+ * Test cases verifying query evaluation against column of type JSON.
+ */
+public class JsonDatatypeTest extends BaseQueriesTest {
+  private static final File INDEX_DIR = new File(FileUtils.getTempDirectory(), "JsonMatchPredicateTest");
+  private static final String RAW_TABLE_NAME = "testTable";
+  private static final String SEGMENT_NAME = "testSegment";
+  private static final int NUM_RECORDS = 10;
+
+  private static final String INT_COLUMN = "intColumn";
+  private static final String JSON_COLUMN = "jsonColumn";
+  private static final String STRING_COLUMN = "stringColumn";
+  private static final Schema SCHEMA =
+      new Schema.SchemaBuilder().addSingleValueDimension(INT_COLUMN, FieldSpec.DataType.INT)
+          .addSingleValueDimension(JSON_COLUMN, FieldSpec.DataType.JSON)
+          .addSingleValueDimension(STRING_COLUMN, FieldSpec.DataType.STRING).build();
+  private static final TableConfig TABLE_CONFIG =
+      new TableConfigBuilder(TableType.OFFLINE).setTableName(RAW_TABLE_NAME).build();
+
+  private IndexSegment _indexSegment;
+  private List<IndexSegment> _indexSegments;
+
+  @Override
+  protected String getFilter() {
+    return "";
+  }
+
+  @Override
+  protected IndexSegment getIndexSegment() {
+    return _indexSegment;
+  }
+
+  GenericRow createRecord(int intValue, String stringValue, String jsonValue) {
+    GenericRow record = new GenericRow();
+    record.putValue(INT_COLUMN, intValue);
+    record.putValue(STRING_COLUMN, stringValue);
+    record.putValue(JSON_COLUMN, jsonValue);
+
+    return record;
+  }
+
+  @Override
+  protected List<IndexSegment> getIndexSegments() {
+    return _indexSegments;
+  }
+
+  @BeforeClass
+  public void setUp()
+      throws Exception {
+    FileUtils.deleteDirectory(INDEX_DIR);
+
+    List<GenericRow> records = new ArrayList<>(NUM_RECORDS);
+    records.add(createRecord(1, "daffy duck",
+        "{\"name\": {\"first\": \"daffy\", \"last\": \"duck\"}, \"id\": 101, \"data\": [\"a\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(2, "mickey mouse",
+        "{\"name\": {\"first\": \"mickey\", \"last\": \"mouse\"}, \"id\": 111, \"data\": [\"e\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(3, "donald duck",
+        "{\"name\": {\"first\": \"donald\", \"last\": \"duck\"}, \"id\": 121, \"data\": [\"f\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(4, "scrooge mcduck",
+        "{\"name\": {\"first\": \"scrooge\", \"last\": \"mcduck\"}, \"id\": 131, \"data\": [\"g\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(5, "minnie mouse",
+        "{\"name\": {\"first\": \"minnie\", \"last\": \"mouse\"}, \"id\": 141, \"data\": [\"h\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(6, "daisy duck",
+        "{\"name\": {\"first\": \"daisy\", \"last\": \"duck\"}, \"id\": 161.5, \"data\": [\"i\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(7, "pluto dog",
+        "{\"name\": {\"first\": \"pluto\", \"last\": \"dog\"}, \"id\": 161, \"data\": [\"j\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(8, "goofy dwag",
+        "{\"name\": {\"first\": \"goofy\", \"last\": \"dwag\"}, \"id\": 171, \"data\": [\"k\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(9, "ludwik von drake",
+        "{\"name\": {\"first\": \"ludwik\", \"last\": \"von drake\"}, \"id\": 181, \"data\": [\"l\", \"b\", \"c\", \"d\"]}"));
+
+    List<String> jsonIndexColumns = new ArrayList<>();
+    jsonIndexColumns.add("jsonColumn");
+    TABLE_CONFIG.getIndexingConfig().setJsonIndexColumns(jsonIndexColumns);
+    SegmentGeneratorConfig segmentGeneratorConfig = new SegmentGeneratorConfig(TABLE_CONFIG, SCHEMA);
+    segmentGeneratorConfig.setTableName(RAW_TABLE_NAME);
+    segmentGeneratorConfig.setSegmentName(SEGMENT_NAME);
+    segmentGeneratorConfig.setOutDir(INDEX_DIR.getPath());
+
+    SegmentIndexCreationDriverImpl driver = new SegmentIndexCreationDriverImpl();
+    driver.init(segmentGeneratorConfig, new GenericRowRecordReader(records));
+    driver.build();
+
+    IndexLoadingConfig indexLoadingConfig = new IndexLoadingConfig();
+    indexLoadingConfig.setTableConfig(TABLE_CONFIG);
+    indexLoadingConfig.setJsonIndexColumns(new HashSet<String>(jsonIndexColumns));
+    indexLoadingConfig.setReadMode(ReadMode.mmap);
+
+    ImmutableSegment immutableSegment =
+        ImmutableSegmentLoader.load(new File(INDEX_DIR, SEGMENT_NAME), indexLoadingConfig);
+    _indexSegment = immutableSegment;
+    _indexSegments = Arrays.asList(immutableSegment, immutableSegment);
+  }
+
+  /** Test filtering on string value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithStringFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select intColumn, json_extract_scalar(jsonColumn, '$.name.last', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.name.first', 'STRING') = 'daffy'");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Object[] row = iterator.next();
+    Assert.assertEquals(row[0], 1);
+    Assert.assertEquals(row[1], "duck");
+  }
+
+  /** Test filtering on number value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithNumericIntFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.last', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'INT') = 171");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "dwag");
+  }
+
+  /** Test filtering on float value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithNumericFloatFilter() {
+
+    // query to retrieve result as INT
+    Operator operator1 = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.id', 'INT') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'FLOAT') = 161.5");
+    IntermediateResultsBlock block1 = (IntermediateResultsBlock) operator1.nextBlock();
+    Collection<Object[]> rows1 = block1.getSelectionResult();
+    Assert.assertEquals(rows1.size(), 1);
+
+    Iterator<Object[]> iterator1 = rows1.iterator();
+    Assert.assertTrue(iterator1.hasNext());
+    Assert.assertEquals(iterator1.next()[0], 161);
+
+    // query to retrieve result as DOUBLE
+    Operator operator2 = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.id', 'DOUBLE') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'FLOAT') = 161.5");
+    IntermediateResultsBlock block2 = (IntermediateResultsBlock) operator2.nextBlock();
+    Collection<Object[]> rows2 = block2.getSelectionResult();
+    Assert.assertEquals(rows2.size(), 1);
+
+    Iterator<Object[]> iterator2 = rows2.iterator();
+    Assert.assertTrue(iterator2.hasNext());
+    Assert.assertEquals(iterator2.next()[0], 161.5d);
+  }
+
+  /** Retrieve JSON array after filtering on string value associated with  JSON key*/
+  @Test
+  public void testExtractScalarArrayWithStringFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.data', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.name.first', 'STRING') = 'daffy'");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "[\"a\",\"b\",\"c\",\"d\"]");
+  }
+
+  /** Test filtering on string value within a JSON array*/
+  @Test
+  public void testExtractScalarWithArrayFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.data[0]', 'STRING') IN ('i', 'k')");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 2);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daisy");
+    Assert.assertEquals(iterator.next()[0], "goofy");
+  }
+
+  @Test
+  public void testJsonMatchWithoutIndex() {
+    try {
+      Operator operator = getOperatorForSqlQuery(
+          "select json_extract_scalar(stringColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(stringColumn, 'id=101')");
+      Assert.assertTrue(false);
+    } catch (IllegalStateException ise) {
+      Assert.assertTrue(true);
+    }
+  }
+
+  @Test
+  public void testJsonMatchAtLevel1WithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, 'id=101')");
+
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daffy");
+  }
+
+  @Test
+  public void testJsonMatchAtLevel2WithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, 'name.first = ''daffy''')");
+
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daffy");
+  }
+
+  @Test
+  public void testJsonMatchArrayWithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, '\"data[0]\" IN (''k'', ''j'')')");
+
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 2);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "pluto");
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "goofy");
+  }
+

Review comment:
       Done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia merged pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
siddharthteotia merged pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628351400



##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/data/FieldSpec.java
##########
@@ -52,6 +52,7 @@
   public static final Integer DEFAULT_DIMENSION_NULL_VALUE_OF_BOOLEAN = 0;
   public static final Long DEFAULT_DIMENSION_NULL_VALUE_OF_TIMESTAMP = 0L;
   public static final String DEFAULT_DIMENSION_NULL_VALUE_OF_STRING = "null";
+  public static final String DEFAULT_DIMENSION_NULL_VALUE_OF_JSON = "{}";

Review comment:
       Changed to `"null"`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r627775093



##########
File path: pinot-common/src/main/java/org/apache/pinot/common/function/FunctionUtils.java
##########
@@ -31,6 +31,9 @@
   private FunctionUtils() {
   }
 
+  // TODO: Do we need to create a class for handling JSON. It doesn't seem like this is needed since JSON type directly

Review comment:
       Removed comment. I think String is good enough for now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628488701



##########
File path: pinot-core/src/test/java/org/apache/pinot/queries/JsonDatatypeTest.java
##########
@@ -0,0 +1,308 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.queries;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.blocks.IntermediateResultsBlock;
+import org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader;
+import org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl;
+import org.apache.pinot.segment.local.segment.index.loader.IndexLoadingConfig;
+import org.apache.pinot.segment.local.segment.readers.GenericRowRecordReader;
+import org.apache.pinot.segment.spi.ImmutableSegment;
+import org.apache.pinot.segment.spi.IndexSegment;
+import org.apache.pinot.segment.spi.creator.SegmentGeneratorConfig;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TableType;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.Schema;
+import org.apache.pinot.spi.data.readers.GenericRow;
+import org.apache.pinot.spi.utils.ReadMode;
+import org.apache.pinot.spi.utils.builder.TableConfigBuilder;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+
+/**
+ * Test cases verifying query evaluation against column of type JSON.
+ */
+public class JsonDatatypeTest extends BaseQueriesTest {
+  private static final File INDEX_DIR = new File(FileUtils.getTempDirectory(), "JsonMatchPredicateTest");
+  private static final String RAW_TABLE_NAME = "testTable";
+  private static final String SEGMENT_NAME = "testSegment";
+  private static final int NUM_RECORDS = 10;
+
+  private static final String INT_COLUMN = "intColumn";
+  private static final String JSON_COLUMN = "jsonColumn";
+  private static final String STRING_COLUMN = "stringColumn";
+  private static final Schema SCHEMA =
+      new Schema.SchemaBuilder().addSingleValueDimension(INT_COLUMN, FieldSpec.DataType.INT)
+          .addSingleValueDimension(JSON_COLUMN, FieldSpec.DataType.JSON)
+          .addSingleValueDimension(STRING_COLUMN, FieldSpec.DataType.STRING).build();
+  private static final TableConfig TABLE_CONFIG =
+      new TableConfigBuilder(TableType.OFFLINE).setTableName(RAW_TABLE_NAME).build();
+
+  private IndexSegment _indexSegment;
+  private List<IndexSegment> _indexSegments;
+
+  @Override
+  protected String getFilter() {
+    return "";
+  }
+
+  @Override
+  protected IndexSegment getIndexSegment() {
+    return _indexSegment;
+  }
+
+  GenericRow createRecord(int intValue, String stringValue, String jsonValue) {
+    GenericRow record = new GenericRow();
+    record.putValue(INT_COLUMN, intValue);
+    record.putValue(STRING_COLUMN, stringValue);
+    record.putValue(JSON_COLUMN, jsonValue);
+
+    return record;
+  }
+
+  @Override
+  protected List<IndexSegment> getIndexSegments() {
+    return _indexSegments;
+  }
+
+  @BeforeClass
+  public void setUp()
+      throws Exception {
+    FileUtils.deleteDirectory(INDEX_DIR);
+
+    List<GenericRow> records = new ArrayList<>(NUM_RECORDS);
+    records.add(createRecord(1, "daffy duck",
+        "{\"name\": {\"first\": \"daffy\", \"last\": \"duck\"}, \"id\": 101, \"data\": [\"a\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(2, "mickey mouse",
+        "{\"name\": {\"first\": \"mickey\", \"last\": \"mouse\"}, \"id\": 111, \"data\": [\"e\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(3, "donald duck",
+        "{\"name\": {\"first\": \"donald\", \"last\": \"duck\"}, \"id\": 121, \"data\": [\"f\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(4, "scrooge mcduck",
+        "{\"name\": {\"first\": \"scrooge\", \"last\": \"mcduck\"}, \"id\": 131, \"data\": [\"g\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(5, "minnie mouse",
+        "{\"name\": {\"first\": \"minnie\", \"last\": \"mouse\"}, \"id\": 141, \"data\": [\"h\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(6, "daisy duck",
+        "{\"name\": {\"first\": \"daisy\", \"last\": \"duck\"}, \"id\": 161.5, \"data\": [\"i\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(7, "pluto dog",
+        "{\"name\": {\"first\": \"pluto\", \"last\": \"dog\"}, \"id\": 161, \"data\": [\"j\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(8, "goofy dwag",
+        "{\"name\": {\"first\": \"goofy\", \"last\": \"dwag\"}, \"id\": 171, \"data\": [\"k\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(9, "ludwik von drake",
+        "{\"name\": {\"first\": \"ludwik\", \"last\": \"von drake\"}, \"id\": 181, \"data\": [\"l\", \"b\", \"c\", \"d\"]}"));
+
+    List<String> jsonIndexColumns = new ArrayList<>();
+    jsonIndexColumns.add("jsonColumn");
+    TABLE_CONFIG.getIndexingConfig().setJsonIndexColumns(jsonIndexColumns);
+    SegmentGeneratorConfig segmentGeneratorConfig = new SegmentGeneratorConfig(TABLE_CONFIG, SCHEMA);
+    segmentGeneratorConfig.setTableName(RAW_TABLE_NAME);
+    segmentGeneratorConfig.setSegmentName(SEGMENT_NAME);
+    segmentGeneratorConfig.setOutDir(INDEX_DIR.getPath());
+
+    SegmentIndexCreationDriverImpl driver = new SegmentIndexCreationDriverImpl();
+    driver.init(segmentGeneratorConfig, new GenericRowRecordReader(records));
+    driver.build();
+
+    IndexLoadingConfig indexLoadingConfig = new IndexLoadingConfig();
+    indexLoadingConfig.setTableConfig(TABLE_CONFIG);
+    indexLoadingConfig.setJsonIndexColumns(new HashSet<String>(jsonIndexColumns));
+    indexLoadingConfig.setReadMode(ReadMode.mmap);
+
+    ImmutableSegment immutableSegment =
+        ImmutableSegmentLoader.load(new File(INDEX_DIR, SEGMENT_NAME), indexLoadingConfig);
+    _indexSegment = immutableSegment;
+    _indexSegments = Arrays.asList(immutableSegment, immutableSegment);
+  }
+
+  /** Verify result column type of a simple select query against JSON column */
+  @Test
+  public void testSimpleSelectOnJsonColumn() {
+    try {
+      Operator operator = getOperatorForSqlQuery("select jsonColumn FROM testTable");
+      IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+      Collection<Object[]> rows = block.getSelectionResult();
+      Assert.assertEquals(rows.size(), 9);
+      Assert.assertEquals(block.getDataSchema().getColumnDataType(0), DataSchema.ColumnDataType.JSON);
+    } catch (IllegalStateException ise) {
+      Assert.assertTrue(true);
+    }
+  }
+
+  /** Test filtering on string value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithStringFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select intColumn, json_extract_scalar(jsonColumn, '$.name.last', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.name.first', 'STRING') = 'daffy'");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Object[] row = iterator.next();
+    Assert.assertEquals(row[0], 1);
+    Assert.assertEquals(row[1], "duck");
+  }
+
+  /** Test filtering on number value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithNumericIntFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.last', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'INT') = 171");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "dwag");
+  }
+
+  /** Test filtering on float value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithNumericFloatFilter() {
+
+    // query to retrieve result as INT
+    Operator operator1 = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.id', 'INT') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'FLOAT') = 161.5");
+    IntermediateResultsBlock block1 = (IntermediateResultsBlock) operator1.nextBlock();
+    Collection<Object[]> rows1 = block1.getSelectionResult();
+    Assert.assertEquals(rows1.size(), 1);
+
+    Iterator<Object[]> iterator1 = rows1.iterator();
+    Assert.assertTrue(iterator1.hasNext());
+    Assert.assertEquals(iterator1.next()[0], 161);
+
+    // query to retrieve result as DOUBLE
+    Operator operator2 = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.id', 'DOUBLE') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'FLOAT') = 161.5");
+    IntermediateResultsBlock block2 = (IntermediateResultsBlock) operator2.nextBlock();
+    Collection<Object[]> rows2 = block2.getSelectionResult();
+    Assert.assertEquals(rows2.size(), 1);
+
+    Iterator<Object[]> iterator2 = rows2.iterator();
+    Assert.assertTrue(iterator2.hasNext());
+    Assert.assertEquals(iterator2.next()[0], 161.5d);
+  }
+
+  /** Retrieve JSON array after filtering on string value associated with  JSON key*/
+  @Test
+  public void testExtractScalarArrayWithStringFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.data', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.name.first', 'STRING') = 'daffy'");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "[\"a\",\"b\",\"c\",\"d\"]");
+  }
+
+  /** Test filtering on string value within a JSON array*/
+  @Test
+  public void testExtractScalarWithArrayFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.data[0]', 'STRING') IN ('i', 'k')");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 2);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daisy");
+    Assert.assertEquals(iterator.next()[0], "goofy");
+  }
+
+  @Test
+  public void testJsonMatchWithoutIndex() {
+    try {
+      Operator operator = getOperatorForSqlQuery(
+          "select json_extract_scalar(stringColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(stringColumn, 'id=101')");
+      Assert.assertTrue(false);
+    } catch (IllegalStateException ise) {
+      Assert.assertTrue(true);
+    }
+  }
+
+  @Test
+  public void testJsonMatchAtLevel1WithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, 'id=101')");
+
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daffy");
+  }
+
+  @Test
+  public void testJsonMatchAtLevel2WithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, 'name.first = ''daffy''')");
+

Review comment:
       Will update test cases once Jackie's PR (https://github.com/apache/incubator-pinot/pull/6877/files) on V2 JSON_MATCH function is merged. Added TODO for now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628358014



##########
File path: pinot-common/src/test/java/org/apache/pinot/common/utils/PinotDataTypeTest.java
##########
@@ -102,43 +127,64 @@ public void testObject() {
     assertTrue(OBJECT.toBoolean(new NumberObject("123")));
     assertEquals(OBJECT.toTimestamp(new NumberObject("123")).getTime(), 123L);
     assertEquals(OBJECT.toString(new NumberObject("123")), "123");
+    assertEquals(OBJECT.toJson(getGenericTestObject()), "{\"bytes\":\"AAE=\",\"map\":{\"key1\":\"value\",\"key2\":null,\"array\":[-5.4,4,\"2\"]},\"timestamp\":1620324238610}");
     assertEquals(OBJECT_ARRAY.getSingleValueType(), OBJECT);
   }
 
+  private static Object getGenericTestObject() {
+    Map<String, Object> map1 = new HashMap<>();
+    map1.put("array", Arrays.asList(-5.4,4, "2"));
+    map1.put("key1", "value");
+    map1.put("key2", null);
+
+    Map<String, Object> map2 = new HashMap<>();
+    map2.put("map", map1);
+    map2.put("bytes", new byte[]{0,1});
+    map2.put("timestamp", Timestamp.valueOf("2021-05-06 11:03:58.61"));
+
+    return map2;
+  }
+
   @Test
   public void testInvalidConversion() {
     for (PinotDataType sourceType : values()) {
       if (sourceType.isSingleValue() && sourceType != STRING && sourceType != BYTES) {
-        assertInvalidConversion(sourceType, BYTES);
+        assertInvalidConversion(null, sourceType, BYTES, UnsupportedOperationException.class);

Review comment:
       These are original failure case test cases and I agree that they don't seem to be doing much, but here I am just refactoring the code to allow for testing conversion of invalid strings to JSON.
   
   `assertInvalidConversion("xyz", STRING, JSON, RuntimeException.class);`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-commenter edited a comment on pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#issuecomment-832453091


   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#6878](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7978e7c) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/bb2519494a15a3125dd509286d60ffd124f91c37?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (bb25194) will **decrease** coverage by `8.05%`.
   > The diff coverage is `71.42%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/6878/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #6878      +/-   ##
   ============================================
   - Coverage     73.55%   65.49%   -8.06%     
     Complexity       12       12              
   ============================================
     Files          1423     1423              
     Lines         70029    70064      +35     
     Branches      10119    10120       +1     
   ============================================
   - Hits          51509    45891    -5618     
   - Misses        15103    20892    +5789     
   + Partials       3417     3281     -136     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | integration | `?` | `?` | |
   | unittests | `65.49% <71.42%> (-0.01%)` | `12.00 <0.00> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...ator/transform/function/CastTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQ2FzdFRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `56.00% <0.00%> (-6.50%)` | `0.00 <0.00> (ø)` | |
   | [...rc/main/java/org/apache/pinot/spi/data/Schema.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9TY2hlbWEuamF2YQ==) | `75.00% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...ain/java/org/apache/pinot/spi/utils/JsonUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvSnNvblV0aWxzLmphdmE=) | `69.53% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...a/org/apache/pinot/common/utils/PinotDataType.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvUGlub3REYXRhVHlwZS5qYXZh) | `67.26% <70.83%> (+0.16%)` | `0.00 <0.00> (ø)` | |
   | [...java/org/apache/pinot/common/utils/DataSchema.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvRGF0YVNjaGVtYS5qYXZh) | `74.13% <75.00%> (-0.87%)` | `0.00 <0.00> (ø)` | |
   | [...ator/transform/function/BaseTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQmFzZVRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `36.55% <100.00%> (-2.19%)` | `0.00 <0.00> (ø)` | |
   | [.../columnminmaxvalue/ColumnMinMaxValueGenerator.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9jb2x1bW5taW5tYXh2YWx1ZS9Db2x1bW5NaW5NYXhWYWx1ZUdlbmVyYXRvci5qYXZh) | `63.38% <100.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | [...main/java/org/apache/pinot/spi/data/FieldSpec.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9GaWVsZFNwZWMuamF2YQ==) | `83.33% <100.00%> (+0.27%)` | `0.00 <0.00> (ø)` | |
   | [...a/org/apache/pinot/minion/metrics/MinionMeter.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtbWluaW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9taW5pb24vbWV0cmljcy9NaW5pb25NZXRlci5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [.../apache/pinot/common/metrics/BrokerQueryPhase.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Ccm9rZXJRdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | ... and [348 more](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9b87787...7978e7c](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628358171



##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -538,6 +540,64 @@ public String convert(Object value, PinotDataType sourceType) {
     }
   },
 
+  JSON {
+    @Override
+    public int toInt(Object value) {
+      return Integer.parseInt(value.toString().trim());
+    }
+
+    @Override
+    public long toLong(Object value) {
+      return Long.parseLong(value.toString().trim());
+    }
+
+    @Override
+    public float toFloat(Object value) {
+      return Float.parseFloat(value.toString().trim());
+    }
+
+    @Override
+    public double toDouble(Object value) {
+      return Double.parseDouble(value.toString().trim());
+    }
+
+    @Override
+    public boolean toBoolean(Object value) {
+      return Boolean.parseBoolean(value.toString().trim());
+    }
+
+    @Override
+    public Timestamp toTimestamp(Object value) {
+      return (value instanceof Long) ? new Timestamp(((Long) value).longValue())
+          : Timestamp.valueOf(value.toString().trim());
+    }
+
+    @Override
+    public String toString(Object value) {
+      return value.toString();
+    }
+
+    @Override
+    public byte[] toBytes(Object value) {
+      if (value instanceof String) {
+        // Assume that input JSON string value is Base64 encoded.
+        try {
+          return Base64.getDecoder().decode(((String) value).getBytes("UTF-8"));

Review comment:
       Fixed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628358409



##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -538,6 +540,64 @@ public String convert(Object value, PinotDataType sourceType) {
     }
   },
 
+  JSON {
+    @Override
+    public int toInt(Object value) {
+      return Integer.parseInt(value.toString().trim());
+    }
+
+    @Override
+    public long toLong(Object value) {
+      return Long.parseLong(value.toString().trim());
+    }
+
+    @Override
+    public float toFloat(Object value) {
+      return Float.parseFloat(value.toString().trim());
+    }
+
+    @Override
+    public double toDouble(Object value) {
+      return Double.parseDouble(value.toString().trim());
+    }
+
+    @Override
+    public boolean toBoolean(Object value) {
+      return Boolean.parseBoolean(value.toString().trim());
+    }
+
+    @Override
+    public Timestamp toTimestamp(Object value) {
+      return (value instanceof Long) ? new Timestamp(((Long) value).longValue())
+          : Timestamp.valueOf(value.toString().trim());
+    }
+
+    @Override
+    public String toString(Object value) {
+      return value.toString();
+    }
+
+    @Override
+    public byte[] toBytes(Object value) {
+      if (value instanceof String) {

Review comment:
       Fixed.

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -538,6 +540,64 @@ public String convert(Object value, PinotDataType sourceType) {
     }
   },
 
+  JSON {
+    @Override
+    public int toInt(Object value) {
+      return Integer.parseInt(value.toString().trim());
+    }
+
+    @Override
+    public long toLong(Object value) {
+      return Long.parseLong(value.toString().trim());
+    }
+
+    @Override
+    public float toFloat(Object value) {
+      return Float.parseFloat(value.toString().trim());

Review comment:
       Fixed.

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -538,6 +540,64 @@ public String convert(Object value, PinotDataType sourceType) {
     }
   },
 
+  JSON {
+    @Override
+    public int toInt(Object value) {
+      return Integer.parseInt(value.toString().trim());
+    }
+
+    @Override
+    public long toLong(Object value) {
+      return Long.parseLong(value.toString().trim());
+    }
+
+    @Override
+    public float toFloat(Object value) {
+      return Float.parseFloat(value.toString().trim());
+    }
+
+    @Override
+    public double toDouble(Object value) {
+      return Double.parseDouble(value.toString().trim());
+    }
+
+    @Override
+    public boolean toBoolean(Object value) {
+      return Boolean.parseBoolean(value.toString().trim());
+    }
+
+    @Override
+    public Timestamp toTimestamp(Object value) {
+      return (value instanceof Long) ? new Timestamp(((Long) value).longValue())

Review comment:
       Fixed.

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -717,7 +777,8 @@ public String toString(Object value) {
   OBJECT_ARRAY;
 
   /**
-   * NOTE: override toInt(), toLong(), toFloat(), toDouble(), toBoolean(), toTimestamp(), toString() and toBytes() for single-value types.
+   * NOTE: override toInt(), toLong(), toFloat(), toDouble(), toBoolean(), toTimestamp(), toString(), toJson(), and

Review comment:
       Fixed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-commenter edited a comment on pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#issuecomment-832453091


   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#6878](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b55f306) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/bb2519494a15a3125dd509286d60ffd124f91c37?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (bb25194) will **decrease** coverage by `8.06%`.
   > The diff coverage is `71.42%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/6878/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #6878      +/-   ##
   ============================================
   - Coverage     73.55%   65.49%   -8.07%     
     Complexity       12       12              
   ============================================
     Files          1423     1423              
     Lines         70029    70062      +33     
     Branches      10119    10120       +1     
   ============================================
   - Hits          51509    45884    -5625     
   - Misses        15103    20895    +5792     
   + Partials       3417     3283     -134     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | integration | `?` | `?` | |
   | unittests | `65.49% <71.42%> (-0.01%)` | `12.00 <0.00> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...ator/transform/function/CastTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQ2FzdFRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `56.00% <0.00%> (-6.50%)` | `0.00 <0.00> (ø)` | |
   | [...rc/main/java/org/apache/pinot/spi/data/Schema.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9TY2hlbWEuamF2YQ==) | `75.00% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...ain/java/org/apache/pinot/spi/utils/JsonUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvSnNvblV0aWxzLmphdmE=) | `69.53% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...a/org/apache/pinot/common/utils/PinotDataType.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvUGlub3REYXRhVHlwZS5qYXZh) | `67.26% <70.83%> (+0.16%)` | `0.00 <0.00> (ø)` | |
   | [...java/org/apache/pinot/common/utils/DataSchema.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvRGF0YVNjaGVtYS5qYXZh) | `74.13% <75.00%> (-0.87%)` | `0.00 <0.00> (ø)` | |
   | [...ator/transform/function/BaseTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQmFzZVRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `36.55% <100.00%> (-2.19%)` | `0.00 <0.00> (ø)` | |
   | [.../columnminmaxvalue/ColumnMinMaxValueGenerator.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9jb2x1bW5taW5tYXh2YWx1ZS9Db2x1bW5NaW5NYXhWYWx1ZUdlbmVyYXRvci5qYXZh) | `63.38% <100.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | [...main/java/org/apache/pinot/spi/data/FieldSpec.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9GaWVsZFNwZWMuamF2YQ==) | `83.33% <100.00%> (+0.27%)` | `0.00 <0.00> (ø)` | |
   | [...a/org/apache/pinot/minion/metrics/MinionMeter.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtbWluaW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9taW5pb24vbWV0cmljcy9NaW5pb25NZXRlci5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [.../apache/pinot/common/metrics/BrokerQueryPhase.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Ccm9rZXJRdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | ... and [346 more](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9b87787...b55f306](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628488178



##########
File path: pinot-core/src/test/java/org/apache/pinot/queries/JsonDatatypeTest.java
##########
@@ -0,0 +1,308 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.queries;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.blocks.IntermediateResultsBlock;
+import org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader;
+import org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl;
+import org.apache.pinot.segment.local.segment.index.loader.IndexLoadingConfig;
+import org.apache.pinot.segment.local.segment.readers.GenericRowRecordReader;
+import org.apache.pinot.segment.spi.ImmutableSegment;
+import org.apache.pinot.segment.spi.IndexSegment;
+import org.apache.pinot.segment.spi.creator.SegmentGeneratorConfig;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TableType;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.Schema;
+import org.apache.pinot.spi.data.readers.GenericRow;
+import org.apache.pinot.spi.utils.ReadMode;
+import org.apache.pinot.spi.utils.builder.TableConfigBuilder;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+
+/**
+ * Test cases verifying query evaluation against column of type JSON.
+ */
+public class JsonDatatypeTest extends BaseQueriesTest {
+  private static final File INDEX_DIR = new File(FileUtils.getTempDirectory(), "JsonMatchPredicateTest");
+  private static final String RAW_TABLE_NAME = "testTable";
+  private static final String SEGMENT_NAME = "testSegment";
+  private static final int NUM_RECORDS = 10;
+
+  private static final String INT_COLUMN = "intColumn";
+  private static final String JSON_COLUMN = "jsonColumn";
+  private static final String STRING_COLUMN = "stringColumn";
+  private static final Schema SCHEMA =
+      new Schema.SchemaBuilder().addSingleValueDimension(INT_COLUMN, FieldSpec.DataType.INT)
+          .addSingleValueDimension(JSON_COLUMN, FieldSpec.DataType.JSON)
+          .addSingleValueDimension(STRING_COLUMN, FieldSpec.DataType.STRING).build();
+  private static final TableConfig TABLE_CONFIG =
+      new TableConfigBuilder(TableType.OFFLINE).setTableName(RAW_TABLE_NAME).build();
+
+  private IndexSegment _indexSegment;
+  private List<IndexSegment> _indexSegments;
+
+  @Override
+  protected String getFilter() {
+    return "";
+  }
+
+  @Override
+  protected IndexSegment getIndexSegment() {
+    return _indexSegment;
+  }
+
+  GenericRow createRecord(int intValue, String stringValue, String jsonValue) {
+    GenericRow record = new GenericRow();
+    record.putValue(INT_COLUMN, intValue);
+    record.putValue(STRING_COLUMN, stringValue);
+    record.putValue(JSON_COLUMN, jsonValue);
+
+    return record;
+  }
+
+  @Override
+  protected List<IndexSegment> getIndexSegments() {
+    return _indexSegments;
+  }
+
+  @BeforeClass
+  public void setUp()
+      throws Exception {
+    FileUtils.deleteDirectory(INDEX_DIR);
+
+    List<GenericRow> records = new ArrayList<>(NUM_RECORDS);
+    records.add(createRecord(1, "daffy duck",
+        "{\"name\": {\"first\": \"daffy\", \"last\": \"duck\"}, \"id\": 101, \"data\": [\"a\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(2, "mickey mouse",
+        "{\"name\": {\"first\": \"mickey\", \"last\": \"mouse\"}, \"id\": 111, \"data\": [\"e\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(3, "donald duck",
+        "{\"name\": {\"first\": \"donald\", \"last\": \"duck\"}, \"id\": 121, \"data\": [\"f\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(4, "scrooge mcduck",
+        "{\"name\": {\"first\": \"scrooge\", \"last\": \"mcduck\"}, \"id\": 131, \"data\": [\"g\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(5, "minnie mouse",
+        "{\"name\": {\"first\": \"minnie\", \"last\": \"mouse\"}, \"id\": 141, \"data\": [\"h\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(6, "daisy duck",
+        "{\"name\": {\"first\": \"daisy\", \"last\": \"duck\"}, \"id\": 161.5, \"data\": [\"i\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(7, "pluto dog",
+        "{\"name\": {\"first\": \"pluto\", \"last\": \"dog\"}, \"id\": 161, \"data\": [\"j\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(8, "goofy dwag",
+        "{\"name\": {\"first\": \"goofy\", \"last\": \"dwag\"}, \"id\": 171, \"data\": [\"k\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(9, "ludwik von drake",
+        "{\"name\": {\"first\": \"ludwik\", \"last\": \"von drake\"}, \"id\": 181, \"data\": [\"l\", \"b\", \"c\", \"d\"]}"));
+
+    List<String> jsonIndexColumns = new ArrayList<>();
+    jsonIndexColumns.add("jsonColumn");
+    TABLE_CONFIG.getIndexingConfig().setJsonIndexColumns(jsonIndexColumns);
+    SegmentGeneratorConfig segmentGeneratorConfig = new SegmentGeneratorConfig(TABLE_CONFIG, SCHEMA);
+    segmentGeneratorConfig.setTableName(RAW_TABLE_NAME);
+    segmentGeneratorConfig.setSegmentName(SEGMENT_NAME);
+    segmentGeneratorConfig.setOutDir(INDEX_DIR.getPath());
+
+    SegmentIndexCreationDriverImpl driver = new SegmentIndexCreationDriverImpl();
+    driver.init(segmentGeneratorConfig, new GenericRowRecordReader(records));
+    driver.build();
+
+    IndexLoadingConfig indexLoadingConfig = new IndexLoadingConfig();
+    indexLoadingConfig.setTableConfig(TABLE_CONFIG);
+    indexLoadingConfig.setJsonIndexColumns(new HashSet<String>(jsonIndexColumns));

Review comment:
       Added following TODO: `Update these test cases to: 1) use V2 JSON_MATCH function, 2) use multi-dimensional JSON array addressing, do json_extract_scalar on a column other than the JSON_MATCH column, and 4) query deeper levels of nesting.`
   
   I think all of these test cases will need to be adjusted to use V2 JSON_MATCH function (https://github.com/apache/incubator-pinot/pull/6877/files). We probably shouldn't claim support for V1 JSON_MATCH with new JSON datatype.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r627775240



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/common/DataFetcher.java
##########
@@ -285,7 +285,7 @@ void readIntValues(int[] docIds, int length, int[] valueBuffer) {
         _reader.readDictIds(docIds, length, dictIdBuffer, readerContext);
         _dictionary.readIntValues(dictIdBuffer, length, valueBuffer);
       } else {
-        switch (_reader.getValueType()) {
+        switch (_reader.getValueType().getStoredType()) {

Review comment:
       Fixed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628354318



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/transform/function/CastTransformFunction.java
##########
@@ -130,6 +132,9 @@ public TransformResultMetadata getResultMetadata() {
             _stringValuesSV[i] = new Timestamp(longValuesSV[i]).toString();
           }
           break;
+        case JSON:

Review comment:
       Removed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-commenter edited a comment on pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#issuecomment-832453091


   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#6878](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b6d8798) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/1c09b7866b4d1c8267847f464e9c4618787b5a15?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (1c09b78) will **increase** coverage by `8.13%`.
   > The diff coverage is `69.23%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/6878/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #6878      +/-   ##
   ============================================
   + Coverage     65.48%   73.61%   +8.13%     
     Complexity       12       12              
   ============================================
     Files          1421     1423       +2     
     Lines         69980    70066      +86     
     Branches      10112    10122      +10     
   ============================================
   + Hits          45825    51580    +5755     
   + Misses        20874    15071    -5803     
   - Partials       3281     3415     +134     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | integration | `43.41% <10.25%> (?)` | `7.00 <0.00> (?)` | |
   | unittests | `65.51% <69.23%> (+0.02%)` | `12.00 <0.00> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...ator/transform/function/CastTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQ2FzdFRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `58.82% <0.00%> (+0.49%)` | `0.00 <0.00> (ø)` | |
   | [...e/pinot/core/query/reduce/HavingFilterHandler.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9yZWR1Y2UvSGF2aW5nRmlsdGVySGFuZGxlci5qYXZh) | `89.13% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...rc/main/java/org/apache/pinot/spi/data/Schema.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9TY2hlbWEuamF2YQ==) | `75.00% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...ain/java/org/apache/pinot/spi/utils/JsonUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvSnNvblV0aWxzLmphdmE=) | `69.53% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...java/org/apache/pinot/common/utils/DataSchema.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvRGF0YVNjaGVtYS5qYXZh) | `74.56% <50.00%> (+0.44%)` | `0.00 <0.00> (ø)` | |
   | [...a/org/apache/pinot/common/utils/PinotDataType.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvUGlub3REYXRhVHlwZS5qYXZh) | `67.55% <74.07%> (+0.45%)` | `0.00 <0.00> (ø)` | |
   | [...ator/transform/function/BaseTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQmFzZVRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `38.90% <100.00%> (+2.51%)` | `0.00 <0.00> (ø)` | |
   | [.../columnminmaxvalue/ColumnMinMaxValueGenerator.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9jb2x1bW5taW5tYXh2YWx1ZS9Db2x1bW5NaW5NYXhWYWx1ZUdlbmVyYXRvci5qYXZh) | `63.38% <100.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | [...main/java/org/apache/pinot/spi/data/FieldSpec.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9GaWVsZFNwZWMuamF2YQ==) | `83.33% <100.00%> (+0.27%)` | `0.00 <0.00> (ø)` | |
   | [...he/pinot/spi/data/readers/BaseRecordExtractor.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9yZWFkZXJzL0Jhc2VSZWNvcmRFeHRyYWN0b3IuamF2YQ==) | `66.07% <0.00%> (-3.58%)` | `0.00% <0.00%> (ø%)` | |
   | ... and [354 more](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [1c09b78...b6d8798](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628354108



##########
File path: pinot-core/src/test/java/org/apache/pinot/core/transport/QueryRoutingTest.java
##########
@@ -211,7 +211,7 @@ public void testServerDown()
     assertEquals(serverResponse.getResponseSize(), 0);
     assertEquals(serverResponse.getDeserializationTimeMs(), 0);
     // Query should early terminate
-    assertTrue(System.currentTimeMillis() - startTimeMs < 1000);
+    assertTrue(System.currentTimeMillis() - startTimeMs < 1010);

Review comment:
       Will put it in a separate PR. Test case fails every once a while because it needs a little extra time over 1000 ms timeout to validate the timeout. Flakyness may be due to a bug.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628354108



##########
File path: pinot-core/src/test/java/org/apache/pinot/core/transport/QueryRoutingTest.java
##########
@@ -211,7 +211,7 @@ public void testServerDown()
     assertEquals(serverResponse.getResponseSize(), 0);
     assertEquals(serverResponse.getDeserializationTimeMs(), 0);
     // Query should early terminate
-    assertTrue(System.currentTimeMillis() - startTimeMs < 1000);
+    assertTrue(System.currentTimeMillis() - startTimeMs < 1010);

Review comment:
       Will put it in a separate PR. Test case fails every once a while because it needs a little extra time over 1000 ms timeout to validate the timeout.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628450306



##########
File path: pinot-core/src/test/java/org/apache/pinot/core/transport/QueryRoutingTest.java
##########
@@ -211,7 +211,7 @@ public void testServerDown()
     assertEquals(serverResponse.getResponseSize(), 0);
     assertEquals(serverResponse.getDeserializationTimeMs(), 0);
     // Query should early terminate
-    assertTrue(System.currentTimeMillis() - startTimeMs < 1000);
+    assertTrue(System.currentTimeMillis() - startTimeMs < 1010);

Review comment:
       See flaky test PR at https://github.com/apache/incubator-pinot/pull/6893. I will revert the change from here once the flaky test PR is merged.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628354478



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/transform/function/CastTransformFunction.java
##########
@@ -76,6 +76,8 @@ public void init(List<TransformFunction> arguments, Map<String, DataSource> data
         case "VARCHAR":
           _resultMetadata = STRING_SV_NO_DICTIONARY_METADATA;
           break;
+        case "JSON":
+          _resultMetadata = JSON_SV_NO_DICTIONARY_METADATA;

Review comment:
       Fixed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-commenter edited a comment on pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#issuecomment-832453091


   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#6878](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (4bfae70) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/1c09b7866b4d1c8267847f464e9c4618787b5a15?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (1c09b78) will **increase** coverage by `8.05%`.
   > The diff coverage is `28.40%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/6878/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #6878      +/-   ##
   ============================================
   + Coverage     65.48%   73.54%   +8.05%     
     Complexity       12       12              
   ============================================
     Files          1421     1421              
     Lines         69980    70042      +62     
     Branches      10112    10115       +3     
   ============================================
   + Hits          45825    51510    +5685     
   + Misses        20874    15119    -5755     
   - Partials       3281     3413     +132     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | integration | `43.40% <13.63%> (?)` | `7.00 <0.00> (?)` | |
   | unittests | `65.43% <28.40%> (-0.06%)` | `12.00 <0.00> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...rg/apache/pinot/common/function/FunctionUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZnVuY3Rpb24vRnVuY3Rpb25VdGlscy5qYXZh) | `97.72% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...java/org/apache/pinot/core/common/DataFetcher.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9jb21tb24vRGF0YUZldGNoZXIuamF2YQ==) | `89.95% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | [...ter/predicate/EqualsPredicateEvaluatorFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9maWx0ZXIvcHJlZGljYXRlL0VxdWFsc1ByZWRpY2F0ZUV2YWx1YXRvckZhY3RvcnkuamF2YQ==) | `82.53% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [.../filter/predicate/InPredicateEvaluatorFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9maWx0ZXIvcHJlZGljYXRlL0luUHJlZGljYXRlRXZhbHVhdG9yRmFjdG9yeS5qYXZh) | `78.21% <ø> (+6.93%)` | `0.00 <0.00> (ø)` | |
   | [.../predicate/NotEqualsPredicateEvaluatorFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9maWx0ZXIvcHJlZGljYXRlL05vdEVxdWFsc1ByZWRpY2F0ZUV2YWx1YXRvckZhY3RvcnkuamF2YQ==) | `77.92% <ø> (+18.18%)` | `0.00 <0.00> (ø)` | |
   | [...lter/predicate/NotInPredicateEvaluatorFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9maWx0ZXIvcHJlZGljYXRlL05vdEluUHJlZGljYXRlRXZhbHVhdG9yRmFjdG9yeS5qYXZh) | `79.09% <ø> (+12.72%)` | `0.00 <0.00> (ø)` | |
   | [...lter/predicate/RangePredicateEvaluatorFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9maWx0ZXIvcHJlZGljYXRlL1JhbmdlUHJlZGljYXRlRXZhbHVhdG9yRmFjdG9yeS5qYXZh) | `82.62% <ø> (+14.55%)` | `0.00 <0.00> (ø)` | |
   | [...ator/transform/function/CastTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQ2FzdFRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `58.82% <0.00%> (+0.49%)` | `0.00 <0.00> (ø)` | |
   | [...e/pinot/core/query/reduce/HavingFilterHandler.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9yZWR1Y2UvSGF2aW5nRmlsdGVySGFuZGxlci5qYXZh) | `89.13% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...egment/local/segment/creator/impl/V1Constants.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9WMUNvbnN0YW50cy5qYXZh) | `12.50% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | ... and [357 more](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [1c09b78...4bfae70](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628354233



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/reduce/HavingFilterHandler.java
##########
@@ -154,6 +154,7 @@ public boolean isMatch(Object[] row) {
         case TIMESTAMP:
           return _predicateEvaluator.applySV(((Timestamp) value).getTime());
         case STRING:
+        case JSON:

Review comment:
       Removed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628366484



##########
File path: pinot-core/src/test/java/org/apache/pinot/queries/JsonDatatypeTest.java
##########
@@ -0,0 +1,308 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.queries;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.blocks.IntermediateResultsBlock;
+import org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader;
+import org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl;
+import org.apache.pinot.segment.local.segment.index.loader.IndexLoadingConfig;
+import org.apache.pinot.segment.local.segment.readers.GenericRowRecordReader;
+import org.apache.pinot.segment.spi.ImmutableSegment;
+import org.apache.pinot.segment.spi.IndexSegment;
+import org.apache.pinot.segment.spi.creator.SegmentGeneratorConfig;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TableType;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.Schema;
+import org.apache.pinot.spi.data.readers.GenericRow;
+import org.apache.pinot.spi.utils.ReadMode;
+import org.apache.pinot.spi.utils.builder.TableConfigBuilder;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+
+/**
+ * Test cases verifying query evaluation against column of type JSON.
+ */
+public class JsonDatatypeTest extends BaseQueriesTest {
+  private static final File INDEX_DIR = new File(FileUtils.getTempDirectory(), "JsonMatchPredicateTest");
+  private static final String RAW_TABLE_NAME = "testTable";
+  private static final String SEGMENT_NAME = "testSegment";
+  private static final int NUM_RECORDS = 10;
+
+  private static final String INT_COLUMN = "intColumn";
+  private static final String JSON_COLUMN = "jsonColumn";
+  private static final String STRING_COLUMN = "stringColumn";
+  private static final Schema SCHEMA =
+      new Schema.SchemaBuilder().addSingleValueDimension(INT_COLUMN, FieldSpec.DataType.INT)
+          .addSingleValueDimension(JSON_COLUMN, FieldSpec.DataType.JSON)
+          .addSingleValueDimension(STRING_COLUMN, FieldSpec.DataType.STRING).build();
+  private static final TableConfig TABLE_CONFIG =
+      new TableConfigBuilder(TableType.OFFLINE).setTableName(RAW_TABLE_NAME).build();
+
+  private IndexSegment _indexSegment;
+  private List<IndexSegment> _indexSegments;
+
+  @Override
+  protected String getFilter() {
+    return "";
+  }
+
+  @Override
+  protected IndexSegment getIndexSegment() {
+    return _indexSegment;
+  }
+
+  GenericRow createRecord(int intValue, String stringValue, String jsonValue) {
+    GenericRow record = new GenericRow();
+    record.putValue(INT_COLUMN, intValue);
+    record.putValue(STRING_COLUMN, stringValue);
+    record.putValue(JSON_COLUMN, jsonValue);
+
+    return record;
+  }
+
+  @Override
+  protected List<IndexSegment> getIndexSegments() {
+    return _indexSegments;
+  }
+
+  @BeforeClass
+  public void setUp()
+      throws Exception {
+    FileUtils.deleteDirectory(INDEX_DIR);
+
+    List<GenericRow> records = new ArrayList<>(NUM_RECORDS);
+    records.add(createRecord(1, "daffy duck",
+        "{\"name\": {\"first\": \"daffy\", \"last\": \"duck\"}, \"id\": 101, \"data\": [\"a\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(2, "mickey mouse",
+        "{\"name\": {\"first\": \"mickey\", \"last\": \"mouse\"}, \"id\": 111, \"data\": [\"e\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(3, "donald duck",
+        "{\"name\": {\"first\": \"donald\", \"last\": \"duck\"}, \"id\": 121, \"data\": [\"f\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(4, "scrooge mcduck",
+        "{\"name\": {\"first\": \"scrooge\", \"last\": \"mcduck\"}, \"id\": 131, \"data\": [\"g\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(5, "minnie mouse",
+        "{\"name\": {\"first\": \"minnie\", \"last\": \"mouse\"}, \"id\": 141, \"data\": [\"h\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(6, "daisy duck",
+        "{\"name\": {\"first\": \"daisy\", \"last\": \"duck\"}, \"id\": 161.5, \"data\": [\"i\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(7, "pluto dog",
+        "{\"name\": {\"first\": \"pluto\", \"last\": \"dog\"}, \"id\": 161, \"data\": [\"j\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(8, "goofy dwag",
+        "{\"name\": {\"first\": \"goofy\", \"last\": \"dwag\"}, \"id\": 171, \"data\": [\"k\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(9, "ludwik von drake",
+        "{\"name\": {\"first\": \"ludwik\", \"last\": \"von drake\"}, \"id\": 181, \"data\": [\"l\", \"b\", \"c\", \"d\"]}"));
+
+    List<String> jsonIndexColumns = new ArrayList<>();
+    jsonIndexColumns.add("jsonColumn");
+    TABLE_CONFIG.getIndexingConfig().setJsonIndexColumns(jsonIndexColumns);
+    SegmentGeneratorConfig segmentGeneratorConfig = new SegmentGeneratorConfig(TABLE_CONFIG, SCHEMA);
+    segmentGeneratorConfig.setTableName(RAW_TABLE_NAME);
+    segmentGeneratorConfig.setSegmentName(SEGMENT_NAME);
+    segmentGeneratorConfig.setOutDir(INDEX_DIR.getPath());
+
+    SegmentIndexCreationDriverImpl driver = new SegmentIndexCreationDriverImpl();
+    driver.init(segmentGeneratorConfig, new GenericRowRecordReader(records));
+    driver.build();
+
+    IndexLoadingConfig indexLoadingConfig = new IndexLoadingConfig();
+    indexLoadingConfig.setTableConfig(TABLE_CONFIG);
+    indexLoadingConfig.setJsonIndexColumns(new HashSet<String>(jsonIndexColumns));
+    indexLoadingConfig.setReadMode(ReadMode.mmap);
+
+    ImmutableSegment immutableSegment =
+        ImmutableSegmentLoader.load(new File(INDEX_DIR, SEGMENT_NAME), indexLoadingConfig);
+    _indexSegment = immutableSegment;
+    _indexSegments = Arrays.asList(immutableSegment, immutableSegment);
+  }
+
+  /** Verify result column type of a simple select query against JSON column */
+  @Test
+  public void testSimpleSelectOnJsonColumn() {
+    try {
+      Operator operator = getOperatorForSqlQuery("select jsonColumn FROM testTable");
+      IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+      Collection<Object[]> rows = block.getSelectionResult();
+      Assert.assertEquals(rows.size(), 9);
+      Assert.assertEquals(block.getDataSchema().getColumnDataType(0), DataSchema.ColumnDataType.JSON);
+    } catch (IllegalStateException ise) {
+      Assert.assertTrue(true);
+    }
+  }
+
+  /** Test filtering on string value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithStringFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select intColumn, json_extract_scalar(jsonColumn, '$.name.last', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.name.first', 'STRING') = 'daffy'");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Object[] row = iterator.next();
+    Assert.assertEquals(row[0], 1);
+    Assert.assertEquals(row[1], "duck");
+  }
+
+  /** Test filtering on number value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithNumericIntFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.last', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'INT') = 171");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "dwag");
+  }
+
+  /** Test filtering on float value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithNumericFloatFilter() {
+
+    // query to retrieve result as INT
+    Operator operator1 = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.id', 'INT') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'FLOAT') = 161.5");
+    IntermediateResultsBlock block1 = (IntermediateResultsBlock) operator1.nextBlock();
+    Collection<Object[]> rows1 = block1.getSelectionResult();
+    Assert.assertEquals(rows1.size(), 1);
+
+    Iterator<Object[]> iterator1 = rows1.iterator();
+    Assert.assertTrue(iterator1.hasNext());
+    Assert.assertEquals(iterator1.next()[0], 161);
+
+    // query to retrieve result as DOUBLE
+    Operator operator2 = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.id', 'DOUBLE') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'FLOAT') = 161.5");
+    IntermediateResultsBlock block2 = (IntermediateResultsBlock) operator2.nextBlock();
+    Collection<Object[]> rows2 = block2.getSelectionResult();
+    Assert.assertEquals(rows2.size(), 1);
+
+    Iterator<Object[]> iterator2 = rows2.iterator();
+    Assert.assertTrue(iterator2.hasNext());
+    Assert.assertEquals(iterator2.next()[0], 161.5d);
+  }
+
+  /** Retrieve JSON array after filtering on string value associated with  JSON key*/
+  @Test
+  public void testExtractScalarArrayWithStringFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.data', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.name.first', 'STRING') = 'daffy'");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "[\"a\",\"b\",\"c\",\"d\"]");
+  }
+
+  /** Test filtering on string value within a JSON array*/
+  @Test
+  public void testExtractScalarWithArrayFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.data[0]', 'STRING') IN ('i', 'k')");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 2);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daisy");
+    Assert.assertEquals(iterator.next()[0], "goofy");
+  }
+
+  @Test
+  public void testJsonMatchWithoutIndex() {
+    try {
+      Operator operator = getOperatorForSqlQuery(
+          "select json_extract_scalar(stringColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(stringColumn, 'id=101')");
+      Assert.assertTrue(false);
+    } catch (IllegalStateException ise) {
+      Assert.assertTrue(true);
+    }
+  }
+
+  @Test
+  public void testJsonMatchAtLevel1WithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, 'id=101')");
+
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daffy");
+  }
+
+  @Test
+  public void testJsonMatchAtLevel2WithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, 'name.first = ''daffy''')");
+
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daffy");
+  }
+

Review comment:
       Can we add tests for GROUP BY on json fields ?
   
   I think until flattening support (PRs from Uber) is implemented or GROUP BY support for unflattened JSON data is there, the only way to GROUP BY is using json_extract_scalar transform function. If we don't have tests for them, let's add them here




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628493077



##########
File path: pinot-core/src/test/java/org/apache/pinot/queries/JsonDatatypeTest.java
##########
@@ -0,0 +1,308 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.queries;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.blocks.IntermediateResultsBlock;
+import org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader;
+import org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl;
+import org.apache.pinot.segment.local.segment.index.loader.IndexLoadingConfig;
+import org.apache.pinot.segment.local.segment.readers.GenericRowRecordReader;
+import org.apache.pinot.segment.spi.ImmutableSegment;
+import org.apache.pinot.segment.spi.IndexSegment;
+import org.apache.pinot.segment.spi.creator.SegmentGeneratorConfig;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TableType;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.Schema;
+import org.apache.pinot.spi.data.readers.GenericRow;
+import org.apache.pinot.spi.utils.ReadMode;
+import org.apache.pinot.spi.utils.builder.TableConfigBuilder;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+
+/**
+ * Test cases verifying query evaluation against column of type JSON.
+ */
+public class JsonDatatypeTest extends BaseQueriesTest {
+  private static final File INDEX_DIR = new File(FileUtils.getTempDirectory(), "JsonMatchPredicateTest");
+  private static final String RAW_TABLE_NAME = "testTable";
+  private static final String SEGMENT_NAME = "testSegment";
+  private static final int NUM_RECORDS = 10;
+
+  private static final String INT_COLUMN = "intColumn";
+  private static final String JSON_COLUMN = "jsonColumn";
+  private static final String STRING_COLUMN = "stringColumn";
+  private static final Schema SCHEMA =
+      new Schema.SchemaBuilder().addSingleValueDimension(INT_COLUMN, FieldSpec.DataType.INT)
+          .addSingleValueDimension(JSON_COLUMN, FieldSpec.DataType.JSON)
+          .addSingleValueDimension(STRING_COLUMN, FieldSpec.DataType.STRING).build();
+  private static final TableConfig TABLE_CONFIG =
+      new TableConfigBuilder(TableType.OFFLINE).setTableName(RAW_TABLE_NAME).build();
+
+  private IndexSegment _indexSegment;
+  private List<IndexSegment> _indexSegments;
+
+  @Override
+  protected String getFilter() {
+    return "";
+  }
+
+  @Override
+  protected IndexSegment getIndexSegment() {
+    return _indexSegment;
+  }
+
+  GenericRow createRecord(int intValue, String stringValue, String jsonValue) {
+    GenericRow record = new GenericRow();
+    record.putValue(INT_COLUMN, intValue);
+    record.putValue(STRING_COLUMN, stringValue);
+    record.putValue(JSON_COLUMN, jsonValue);
+
+    return record;
+  }
+
+  @Override
+  protected List<IndexSegment> getIndexSegments() {
+    return _indexSegments;
+  }
+
+  @BeforeClass
+  public void setUp()
+      throws Exception {
+    FileUtils.deleteDirectory(INDEX_DIR);
+
+    List<GenericRow> records = new ArrayList<>(NUM_RECORDS);
+    records.add(createRecord(1, "daffy duck",
+        "{\"name\": {\"first\": \"daffy\", \"last\": \"duck\"}, \"id\": 101, \"data\": [\"a\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(2, "mickey mouse",
+        "{\"name\": {\"first\": \"mickey\", \"last\": \"mouse\"}, \"id\": 111, \"data\": [\"e\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(3, "donald duck",
+        "{\"name\": {\"first\": \"donald\", \"last\": \"duck\"}, \"id\": 121, \"data\": [\"f\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(4, "scrooge mcduck",
+        "{\"name\": {\"first\": \"scrooge\", \"last\": \"mcduck\"}, \"id\": 131, \"data\": [\"g\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(5, "minnie mouse",
+        "{\"name\": {\"first\": \"minnie\", \"last\": \"mouse\"}, \"id\": 141, \"data\": [\"h\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(6, "daisy duck",
+        "{\"name\": {\"first\": \"daisy\", \"last\": \"duck\"}, \"id\": 161.5, \"data\": [\"i\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(7, "pluto dog",
+        "{\"name\": {\"first\": \"pluto\", \"last\": \"dog\"}, \"id\": 161, \"data\": [\"j\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(8, "goofy dwag",
+        "{\"name\": {\"first\": \"goofy\", \"last\": \"dwag\"}, \"id\": 171, \"data\": [\"k\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(9, "ludwik von drake",
+        "{\"name\": {\"first\": \"ludwik\", \"last\": \"von drake\"}, \"id\": 181, \"data\": [\"l\", \"b\", \"c\", \"d\"]}"));
+
+    List<String> jsonIndexColumns = new ArrayList<>();
+    jsonIndexColumns.add("jsonColumn");
+    TABLE_CONFIG.getIndexingConfig().setJsonIndexColumns(jsonIndexColumns);
+    SegmentGeneratorConfig segmentGeneratorConfig = new SegmentGeneratorConfig(TABLE_CONFIG, SCHEMA);
+    segmentGeneratorConfig.setTableName(RAW_TABLE_NAME);
+    segmentGeneratorConfig.setSegmentName(SEGMENT_NAME);
+    segmentGeneratorConfig.setOutDir(INDEX_DIR.getPath());
+
+    SegmentIndexCreationDriverImpl driver = new SegmentIndexCreationDriverImpl();
+    driver.init(segmentGeneratorConfig, new GenericRowRecordReader(records));
+    driver.build();
+
+    IndexLoadingConfig indexLoadingConfig = new IndexLoadingConfig();
+    indexLoadingConfig.setTableConfig(TABLE_CONFIG);
+    indexLoadingConfig.setJsonIndexColumns(new HashSet<String>(jsonIndexColumns));
+    indexLoadingConfig.setReadMode(ReadMode.mmap);
+
+    ImmutableSegment immutableSegment =
+        ImmutableSegmentLoader.load(new File(INDEX_DIR, SEGMENT_NAME), indexLoadingConfig);
+    _indexSegment = immutableSegment;
+    _indexSegments = Arrays.asList(immutableSegment, immutableSegment);
+  }
+
+  /** Verify result column type of a simple select query against JSON column */
+  @Test
+  public void testSimpleSelectOnJsonColumn() {
+    try {
+      Operator operator = getOperatorForSqlQuery("select jsonColumn FROM testTable");
+      IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+      Collection<Object[]> rows = block.getSelectionResult();
+      Assert.assertEquals(rows.size(), 9);
+      Assert.assertEquals(block.getDataSchema().getColumnDataType(0), DataSchema.ColumnDataType.JSON);
+    } catch (IllegalStateException ise) {
+      Assert.assertTrue(true);
+    }
+  }
+
+  /** Test filtering on string value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithStringFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select intColumn, json_extract_scalar(jsonColumn, '$.name.last', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.name.first', 'STRING') = 'daffy'");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Object[] row = iterator.next();
+    Assert.assertEquals(row[0], 1);
+    Assert.assertEquals(row[1], "duck");
+  }
+
+  /** Test filtering on number value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithNumericIntFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.last', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'INT') = 171");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "dwag");
+  }
+
+  /** Test filtering on float value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithNumericFloatFilter() {
+
+    // query to retrieve result as INT
+    Operator operator1 = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.id', 'INT') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'FLOAT') = 161.5");
+    IntermediateResultsBlock block1 = (IntermediateResultsBlock) operator1.nextBlock();
+    Collection<Object[]> rows1 = block1.getSelectionResult();
+    Assert.assertEquals(rows1.size(), 1);
+
+    Iterator<Object[]> iterator1 = rows1.iterator();
+    Assert.assertTrue(iterator1.hasNext());
+    Assert.assertEquals(iterator1.next()[0], 161);
+
+    // query to retrieve result as DOUBLE
+    Operator operator2 = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.id', 'DOUBLE') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'FLOAT') = 161.5");
+    IntermediateResultsBlock block2 = (IntermediateResultsBlock) operator2.nextBlock();
+    Collection<Object[]> rows2 = block2.getSelectionResult();
+    Assert.assertEquals(rows2.size(), 1);
+
+    Iterator<Object[]> iterator2 = rows2.iterator();
+    Assert.assertTrue(iterator2.hasNext());
+    Assert.assertEquals(iterator2.next()[0], 161.5d);
+  }
+
+  /** Retrieve JSON array after filtering on string value associated with  JSON key*/
+  @Test
+  public void testExtractScalarArrayWithStringFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.data', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.name.first', 'STRING') = 'daffy'");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "[\"a\",\"b\",\"c\",\"d\"]");
+  }
+
+  /** Test filtering on string value within a JSON array*/
+  @Test
+  public void testExtractScalarWithArrayFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.data[0]', 'STRING') IN ('i', 'k')");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 2);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daisy");
+    Assert.assertEquals(iterator.next()[0], "goofy");
+  }
+
+  @Test
+  public void testJsonMatchWithoutIndex() {
+    try {
+      Operator operator = getOperatorForSqlQuery(
+          "select json_extract_scalar(stringColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(stringColumn, 'id=101')");
+      Assert.assertTrue(false);
+    } catch (IllegalStateException ise) {
+      Assert.assertTrue(true);
+    }
+  }
+
+  @Test
+  public void testJsonMatchAtLevel1WithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, 'id=101')");
+
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daffy");
+  }
+
+  @Test
+  public void testJsonMatchAtLevel2WithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, 'name.first = ''daffy''')");
+
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daffy");
+  }
+

Review comment:
       Will update test cases once Jackie's PR (#6877) on V2 JSON_MATCH function is merged. Added TODO for now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628585502



##########
File path: pinot-core/src/test/java/org/apache/pinot/queries/JsonDatatypeTest.java
##########
@@ -0,0 +1,308 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.queries;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.blocks.IntermediateResultsBlock;
+import org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader;
+import org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl;
+import org.apache.pinot.segment.local.segment.index.loader.IndexLoadingConfig;
+import org.apache.pinot.segment.local.segment.readers.GenericRowRecordReader;
+import org.apache.pinot.segment.spi.ImmutableSegment;
+import org.apache.pinot.segment.spi.IndexSegment;
+import org.apache.pinot.segment.spi.creator.SegmentGeneratorConfig;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TableType;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.Schema;
+import org.apache.pinot.spi.data.readers.GenericRow;
+import org.apache.pinot.spi.utils.ReadMode;
+import org.apache.pinot.spi.utils.builder.TableConfigBuilder;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+
+/**
+ * Test cases verifying query evaluation against column of type JSON.
+ */
+public class JsonDatatypeTest extends BaseQueriesTest {

Review comment:
       This looks the same as the current `JsonMatchPredicateTest`, let's just modify that to use `JSON` type




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r627878000



##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -538,6 +540,64 @@ public String convert(Object value, PinotDataType sourceType) {
     }
   },
 
+  JSON {
+    @Override
+    public int toInt(Object value) {
+      return Integer.parseInt(value.toString().trim());
+    }
+
+    @Override
+    public long toLong(Object value) {
+      return Long.parseLong(value.toString().trim());
+    }
+
+    @Override
+    public float toFloat(Object value) {
+      return Float.parseFloat(value.toString().trim());

Review comment:
       No need to trim, same for double (refer to the comments for STRING)

##########
File path: pinot-core/src/test/java/org/apache/pinot/core/transport/QueryRoutingTest.java
##########
@@ -211,7 +211,7 @@ public void testServerDown()
     assertEquals(serverResponse.getResponseSize(), 0);
     assertEquals(serverResponse.getDeserializationTimeMs(), 0);
     // Query should early terminate
-    assertTrue(System.currentTimeMillis() - startTimeMs < 1000);
+    assertTrue(System.currentTimeMillis() - startTimeMs < 1010);

Review comment:
       Seems unrelated. If we find this test is flaky, maybe having another PR to fix it?

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -538,6 +540,64 @@ public String convert(Object value, PinotDataType sourceType) {
     }
   },
 
+  JSON {
+    @Override
+    public int toInt(Object value) {
+      return Integer.parseInt(value.toString().trim());
+    }
+
+    @Override
+    public long toLong(Object value) {
+      return Long.parseLong(value.toString().trim());
+    }
+
+    @Override
+    public float toFloat(Object value) {
+      return Float.parseFloat(value.toString().trim());
+    }
+
+    @Override
+    public double toDouble(Object value) {
+      return Double.parseDouble(value.toString().trim());
+    }
+
+    @Override
+    public boolean toBoolean(Object value) {
+      return Boolean.parseBoolean(value.toString().trim());
+    }
+
+    @Override
+    public Timestamp toTimestamp(Object value) {
+      return (value instanceof Long) ? new Timestamp(((Long) value).longValue())

Review comment:
       Value won't be long here. You may use the same way to handle this as STRING type

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -538,6 +540,64 @@ public String convert(Object value, PinotDataType sourceType) {
     }
   },
 
+  JSON {
+    @Override
+    public int toInt(Object value) {
+      return Integer.parseInt(value.toString().trim());
+    }
+
+    @Override
+    public long toLong(Object value) {
+      return Long.parseLong(value.toString().trim());
+    }
+
+    @Override
+    public float toFloat(Object value) {
+      return Float.parseFloat(value.toString().trim());
+    }
+
+    @Override
+    public double toDouble(Object value) {
+      return Double.parseDouble(value.toString().trim());
+    }
+
+    @Override
+    public boolean toBoolean(Object value) {
+      return Boolean.parseBoolean(value.toString().trim());
+    }
+
+    @Override
+    public Timestamp toTimestamp(Object value) {
+      return (value instanceof Long) ? new Timestamp(((Long) value).longValue())
+          : Timestamp.valueOf(value.toString().trim());
+    }
+
+    @Override
+    public String toString(Object value) {
+      return value.toString();
+    }
+
+    @Override
+    public byte[] toBytes(Object value) {
+      if (value instanceof String) {
+        // Assume that input JSON string value is Base64 encoded.
+        try {
+          return Base64.getDecoder().decode(((String) value).getBytes("UTF-8"));

Review comment:
       You can directly decode string here
   ```suggestion
             return Base64.getDecoder().decode(value.toString());
   ```

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -538,6 +540,64 @@ public String convert(Object value, PinotDataType sourceType) {
     }
   },
 
+  JSON {
+    @Override
+    public int toInt(Object value) {
+      return Integer.parseInt(value.toString().trim());
+    }
+
+    @Override
+    public long toLong(Object value) {
+      return Long.parseLong(value.toString().trim());
+    }
+
+    @Override
+    public float toFloat(Object value) {
+      return Float.parseFloat(value.toString().trim());
+    }
+
+    @Override
+    public double toDouble(Object value) {
+      return Double.parseDouble(value.toString().trim());
+    }
+
+    @Override
+    public boolean toBoolean(Object value) {
+      return Boolean.parseBoolean(value.toString().trim());
+    }
+
+    @Override
+    public Timestamp toTimestamp(Object value) {
+      return (value instanceof Long) ? new Timestamp(((Long) value).longValue())
+          : Timestamp.valueOf(value.toString().trim());
+    }
+
+    @Override
+    public String toString(Object value) {
+      return value.toString();
+    }
+
+    @Override
+    public byte[] toBytes(Object value) {
+      if (value instanceof String) {

Review comment:
       Use `value.toString()` here to be consistent with other methods? Value should always be of string type here

##########
File path: pinot-common/src/test/java/org/apache/pinot/common/utils/PinotDataTypeTest.java
##########
@@ -102,43 +127,64 @@ public void testObject() {
     assertTrue(OBJECT.toBoolean(new NumberObject("123")));
     assertEquals(OBJECT.toTimestamp(new NumberObject("123")).getTime(), 123L);
     assertEquals(OBJECT.toString(new NumberObject("123")), "123");
+    assertEquals(OBJECT.toJson(getGenericTestObject()), "{\"bytes\":\"AAE=\",\"map\":{\"key1\":\"value\",\"key2\":null,\"array\":[-5.4,4,\"2\"]},\"timestamp\":1620324238610}");
     assertEquals(OBJECT_ARRAY.getSingleValueType(), OBJECT);
   }
 
+  private static Object getGenericTestObject() {
+    Map<String, Object> map1 = new HashMap<>();
+    map1.put("array", Arrays.asList(-5.4,4, "2"));
+    map1.put("key1", "value");
+    map1.put("key2", null);
+
+    Map<String, Object> map2 = new HashMap<>();
+    map2.put("map", map1);
+    map2.put("bytes", new byte[]{0,1});
+    map2.put("timestamp", Timestamp.valueOf("2021-05-06 11:03:58.61"));
+
+    return map2;
+  }
+
   @Test
   public void testInvalidConversion() {
     for (PinotDataType sourceType : values()) {
       if (sourceType.isSingleValue() && sourceType != STRING && sourceType != BYTES) {
-        assertInvalidConversion(sourceType, BYTES);
+        assertInvalidConversion(null, sourceType, BYTES, UnsupportedOperationException.class);

Review comment:
       Why are we testing `null` here? `convert()` should never be called on `null` values

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -538,6 +540,64 @@ public String convert(Object value, PinotDataType sourceType) {
     }
   },
 
+  JSON {
+    @Override
+    public int toInt(Object value) {
+      return Integer.parseInt(value.toString().trim());
+    }
+
+    @Override
+    public long toLong(Object value) {
+      return Long.parseLong(value.toString().trim());
+    }
+
+    @Override
+    public float toFloat(Object value) {
+      return Float.parseFloat(value.toString().trim());
+    }
+
+    @Override
+    public double toDouble(Object value) {
+      return Double.parseDouble(value.toString().trim());
+    }
+
+    @Override
+    public boolean toBoolean(Object value) {
+      return Boolean.parseBoolean(value.toString().trim());
+    }
+
+    @Override
+    public Timestamp toTimestamp(Object value) {
+      return (value instanceof Long) ? new Timestamp(((Long) value).longValue())
+          : Timestamp.valueOf(value.toString().trim());
+    }
+
+    @Override
+    public String toString(Object value) {
+      return value.toString();
+    }
+
+    @Override
+    public byte[] toBytes(Object value) {
+      if (value instanceof String) {
+        // Assume that input JSON string value is Base64 encoded.

Review comment:
       Add some comment here on why using base64 decoder here (byte[] will be serialized using base64 in `toJson()`)

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -717,7 +777,8 @@ public String toString(Object value) {
   OBJECT_ARRAY;
 
   /**
-   * NOTE: override toInt(), toLong(), toFloat(), toDouble(), toBoolean(), toTimestamp(), toString() and toBytes() for single-value types.
+   * NOTE: override toInt(), toLong(), toFloat(), toDouble(), toBoolean(), toTimestamp(), toString(), toJson(), and

Review comment:
       (nit) revert?

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/transform/function/CastTransformFunction.java
##########
@@ -130,6 +132,9 @@ public TransformResultMetadata getResultMetadata() {
             _stringValuesSV[i] = new Timestamp(longValuesSV[i]).toString();
           }
           break;
+        case JSON:

Review comment:
       No need to add this case here. `JSON` type doesn't require extra conversion

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/transform/function/CastTransformFunction.java
##########
@@ -76,6 +76,8 @@ public void init(List<TransformFunction> arguments, Map<String, DataSource> data
         case "VARCHAR":
           _resultMetadata = STRING_SV_NO_DICTIONARY_METADATA;
           break;
+        case "JSON":
+          _resultMetadata = JSON_SV_NO_DICTIONARY_METADATA;

Review comment:
       break;

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/reduce/HavingFilterHandler.java
##########
@@ -154,6 +154,7 @@ public boolean isMatch(Object[] row) {
         case TIMESTAMP:
           return _predicateEvaluator.applySV(((Timestamp) value).getTime());
         case STRING:
+        case JSON:

Review comment:
       We won't reach here as `JSON` is not supported for predicate evaluator yet

##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/data/FieldSpec.java
##########
@@ -52,6 +52,7 @@
   public static final Integer DEFAULT_DIMENSION_NULL_VALUE_OF_BOOLEAN = 0;
   public static final Long DEFAULT_DIMENSION_NULL_VALUE_OF_TIMESTAMP = 0L;
   public static final String DEFAULT_DIMENSION_NULL_VALUE_OF_STRING = "null";
+  public static final String DEFAULT_DIMENSION_NULL_VALUE_OF_JSON = "{}";

Review comment:
       `"null"` might be better?

##########
File path: pinot-core/src/test/java/org/apache/pinot/queries/JsonDatatypeTest.java
##########
@@ -0,0 +1,293 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.queries;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.blocks.IntermediateResultsBlock;
+import org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader;
+import org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl;
+import org.apache.pinot.segment.local.segment.index.loader.IndexLoadingConfig;
+import org.apache.pinot.segment.local.segment.readers.GenericRowRecordReader;
+import org.apache.pinot.segment.spi.ImmutableSegment;
+import org.apache.pinot.segment.spi.IndexSegment;
+import org.apache.pinot.segment.spi.creator.SegmentGeneratorConfig;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TableType;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.Schema;
+import org.apache.pinot.spi.data.readers.GenericRow;
+import org.apache.pinot.spi.utils.ReadMode;
+import org.apache.pinot.spi.utils.builder.TableConfigBuilder;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+
+/**
+ * Test cases verifying query evaluation against column of type JSON.
+ */
+public class JsonDatatypeTest extends BaseQueriesTest {
+  private static final File INDEX_DIR = new File(FileUtils.getTempDirectory(), "JsonMatchPredicateTest");
+  private static final String RAW_TABLE_NAME = "testTable";
+  private static final String SEGMENT_NAME = "testSegment";
+  private static final int NUM_RECORDS = 10;
+
+  private static final String INT_COLUMN = "intColumn";
+  private static final String JSON_COLUMN = "jsonColumn";
+  private static final String STRING_COLUMN = "stringColumn";
+  private static final Schema SCHEMA =
+      new Schema.SchemaBuilder().addSingleValueDimension(INT_COLUMN, FieldSpec.DataType.INT)
+          .addSingleValueDimension(JSON_COLUMN, FieldSpec.DataType.JSON)
+          .addSingleValueDimension(STRING_COLUMN, FieldSpec.DataType.STRING).build();
+  private static final TableConfig TABLE_CONFIG =
+      new TableConfigBuilder(TableType.OFFLINE).setTableName(RAW_TABLE_NAME).build();
+
+  private IndexSegment _indexSegment;
+  private List<IndexSegment> _indexSegments;
+
+  @Override
+  protected String getFilter() {
+    return "";
+  }
+
+  @Override
+  protected IndexSegment getIndexSegment() {
+    return _indexSegment;
+  }
+
+  GenericRow createRecord(int intValue, String stringValue, String jsonValue) {
+    GenericRow record = new GenericRow();
+    record.putValue(INT_COLUMN, intValue);
+    record.putValue(STRING_COLUMN, stringValue);
+    record.putValue(JSON_COLUMN, jsonValue);
+
+    return record;
+  }
+
+  @Override
+  protected List<IndexSegment> getIndexSegments() {
+    return _indexSegments;
+  }
+
+  @BeforeClass
+  public void setUp()
+      throws Exception {
+    FileUtils.deleteDirectory(INDEX_DIR);
+
+    List<GenericRow> records = new ArrayList<>(NUM_RECORDS);
+    records.add(createRecord(1, "daffy duck",
+        "{\"name\": {\"first\": \"daffy\", \"last\": \"duck\"}, \"id\": 101, \"data\": [\"a\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(2, "mickey mouse",
+        "{\"name\": {\"first\": \"mickey\", \"last\": \"mouse\"}, \"id\": 111, \"data\": [\"e\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(3, "donald duck",
+        "{\"name\": {\"first\": \"donald\", \"last\": \"duck\"}, \"id\": 121, \"data\": [\"f\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(4, "scrooge mcduck",
+        "{\"name\": {\"first\": \"scrooge\", \"last\": \"mcduck\"}, \"id\": 131, \"data\": [\"g\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(5, "minnie mouse",
+        "{\"name\": {\"first\": \"minnie\", \"last\": \"mouse\"}, \"id\": 141, \"data\": [\"h\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(6, "daisy duck",
+        "{\"name\": {\"first\": \"daisy\", \"last\": \"duck\"}, \"id\": 161.5, \"data\": [\"i\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(7, "pluto dog",
+        "{\"name\": {\"first\": \"pluto\", \"last\": \"dog\"}, \"id\": 161, \"data\": [\"j\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(8, "goofy dwag",
+        "{\"name\": {\"first\": \"goofy\", \"last\": \"dwag\"}, \"id\": 171, \"data\": [\"k\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(9, "ludwik von drake",
+        "{\"name\": {\"first\": \"ludwik\", \"last\": \"von drake\"}, \"id\": 181, \"data\": [\"l\", \"b\", \"c\", \"d\"]}"));
+
+    List<String> jsonIndexColumns = new ArrayList<>();
+    jsonIndexColumns.add("jsonColumn");
+    TABLE_CONFIG.getIndexingConfig().setJsonIndexColumns(jsonIndexColumns);
+    SegmentGeneratorConfig segmentGeneratorConfig = new SegmentGeneratorConfig(TABLE_CONFIG, SCHEMA);
+    segmentGeneratorConfig.setTableName(RAW_TABLE_NAME);
+    segmentGeneratorConfig.setSegmentName(SEGMENT_NAME);
+    segmentGeneratorConfig.setOutDir(INDEX_DIR.getPath());
+
+    SegmentIndexCreationDriverImpl driver = new SegmentIndexCreationDriverImpl();
+    driver.init(segmentGeneratorConfig, new GenericRowRecordReader(records));
+    driver.build();
+
+    IndexLoadingConfig indexLoadingConfig = new IndexLoadingConfig();
+    indexLoadingConfig.setTableConfig(TABLE_CONFIG);
+    indexLoadingConfig.setJsonIndexColumns(new HashSet<String>(jsonIndexColumns));
+    indexLoadingConfig.setReadMode(ReadMode.mmap);
+
+    ImmutableSegment immutableSegment =
+        ImmutableSegmentLoader.load(new File(INDEX_DIR, SEGMENT_NAME), indexLoadingConfig);
+    _indexSegment = immutableSegment;
+    _indexSegments = Arrays.asList(immutableSegment, immutableSegment);
+  }
+
+  /** Test filtering on string value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithStringFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select intColumn, json_extract_scalar(jsonColumn, '$.name.last', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.name.first', 'STRING') = 'daffy'");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Object[] row = iterator.next();
+    Assert.assertEquals(row[0], 1);
+    Assert.assertEquals(row[1], "duck");
+  }
+
+  /** Test filtering on number value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithNumericIntFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.last', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'INT') = 171");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "dwag");
+  }
+
+  /** Test filtering on float value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithNumericFloatFilter() {
+
+    // query to retrieve result as INT
+    Operator operator1 = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.id', 'INT') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'FLOAT') = 161.5");
+    IntermediateResultsBlock block1 = (IntermediateResultsBlock) operator1.nextBlock();
+    Collection<Object[]> rows1 = block1.getSelectionResult();
+    Assert.assertEquals(rows1.size(), 1);
+
+    Iterator<Object[]> iterator1 = rows1.iterator();
+    Assert.assertTrue(iterator1.hasNext());
+    Assert.assertEquals(iterator1.next()[0], 161);
+
+    // query to retrieve result as DOUBLE
+    Operator operator2 = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.id', 'DOUBLE') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'FLOAT') = 161.5");
+    IntermediateResultsBlock block2 = (IntermediateResultsBlock) operator2.nextBlock();
+    Collection<Object[]> rows2 = block2.getSelectionResult();
+    Assert.assertEquals(rows2.size(), 1);
+
+    Iterator<Object[]> iterator2 = rows2.iterator();
+    Assert.assertTrue(iterator2.hasNext());
+    Assert.assertEquals(iterator2.next()[0], 161.5d);
+  }
+
+  /** Retrieve JSON array after filtering on string value associated with  JSON key*/
+  @Test
+  public void testExtractScalarArrayWithStringFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.data', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.name.first', 'STRING') = 'daffy'");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "[\"a\",\"b\",\"c\",\"d\"]");
+  }
+
+  /** Test filtering on string value within a JSON array*/
+  @Test
+  public void testExtractScalarWithArrayFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.data[0]', 'STRING') IN ('i', 'k')");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 2);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daisy");
+    Assert.assertEquals(iterator.next()[0], "goofy");
+  }
+
+  @Test
+  public void testJsonMatchWithoutIndex() {
+    try {
+      Operator operator = getOperatorForSqlQuery(
+          "select json_extract_scalar(stringColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(stringColumn, 'id=101')");
+      Assert.assertTrue(false);
+    } catch (IllegalStateException ise) {
+      Assert.assertTrue(true);
+    }
+  }
+
+  @Test
+  public void testJsonMatchAtLevel1WithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, 'id=101')");
+
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daffy");
+  }
+
+  @Test
+  public void testJsonMatchAtLevel2WithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, 'name.first = ''daffy''')");
+
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daffy");
+  }
+
+  @Test
+  public void testJsonMatchArrayWithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, '\"data[0]\" IN (''k'', ''j'')')");
+
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 2);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "pluto");
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "goofy");
+  }
+

Review comment:
       Add a test for selecting a json column, and check the data schema of the response




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r627773862



##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -538,6 +600,58 @@ public String convert(Object value, PinotDataType sourceType) {
     }
   },
 
+  JSON {
+    @Override
+    public int toInt(Object value) {
+      throw new UnsupportedOperationException("Cannot convert value from JSON to INT");
+    }
+
+    @Override
+    public long toLong(Object value) {
+      throw new UnsupportedOperationException("Cannot convert value from JSON to LONG");
+    }
+
+    @Override
+    public float toFloat(Object value) {
+      throw new UnsupportedOperationException("Cannot convert value from JSON to FLOAT");
+    }
+
+    @Override
+    public double toDouble(Object value) {
+      throw new UnsupportedOperationException("Cannot convert value from JSON to DOUBLE");
+    }
+
+    @Override
+    public boolean toBoolean(Object value) {
+      throw new UnsupportedOperationException("Cannot convert value from JSON to BOOLEAN");
+    }
+
+    @Override
+    public Timestamp toTimestamp(Object value) {
+      throw new UnsupportedOperationException("Cannot convert value from JSON to TIMESTAMP");
+    }
+
+    @Override
+    public String toString(Object value) {
+      return value.toString();
+    }
+
+    @Override
+    public String toJson(Object value) {
+      return value.toString();
+    }
+
+    @Override
+    public byte[] toBytes(Object value) {
+      throw new UnsupportedOperationException("Cannot convert value from JSON to BYTES");
+    }
+
+    @Override
+    public String convert(Object value, PinotDataType sourceType) {
+      return sourceType.toString(value);

Review comment:
       Fixed.

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -538,6 +600,58 @@ public String convert(Object value, PinotDataType sourceType) {
     }
   },
 
+  JSON {
+    @Override
+    public int toInt(Object value) {

Review comment:
       Fixed.

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -748,6 +874,20 @@ public String toString(Object value) {
     return getSingleValueType().toString(toObjectArray(value)[0]);
   }
 
+  public String toJson(Object value) {
+    String jsonString = getSingleValueType().toString(toObjectArray(value)[0]);
+
+    // Check if the string is valid JSON.
+    JsonNode jsonNode = null;
+    try {
+      jsonNode = JsonUtils.stringToJsonNode(jsonString);
+    } catch (IOException ioe) {
+      throw new UnsupportedOperationException("Cannot convert value from STRING to JSON");
+    }
+
+    return jsonNode.toString();
+  }

Review comment:
       Fixed.

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -986,7 +1126,7 @@ public PinotDataType getSingleValueType() {
   /**
    * Returns the {@link PinotDataType} for the given {@link FieldSpec} for data ingestion purpose. Returns object array
    * type for multi-valued types.
-   * TODO: Add MV support for BOOLEAN, TIMESTAMP, BYTES
+   * TODO: Add MV support for BOOLEAN, TIMESTAMP, BYTES, JSON

Review comment:
       Fixed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r627776375



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/predicate/EqualsPredicateEvaluatorFactory.java
##########
@@ -72,6 +72,7 @@ public static BaseRawValueBasedPredicateEvaluator newRawValueBasedEvaluator(EqPr
       case TIMESTAMP:
         return new LongRawValueBasedEqPredicateEvaluator(TimestampUtils.toMillisSinceEpoch(value));
       case STRING:
+      case JSON:

Review comment:
       Reverting all PredicateEvaluator changes. We may need to support EQUALS and NOT_EQUALS, but it doesn't need to be done with this PR.

##########
File path: pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/inv/json/BaseJsonIndexCreator.java
##########
@@ -87,7 +89,15 @@
   @Override
   public void add(String jsonString)
       throws IOException {
-    addFlattenedRecords(JsonUtils.flatten(JsonUtils.stringToJsonNode(jsonString)));
+    JsonNode jsonNode = JsonUtils.stringToJsonNode(jsonString);

Review comment:
       Fixed.

##########
File path: pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java
##########
@@ -633,6 +633,7 @@ public static void addColumnMetadataInfo(PropertiesConfiguration properties, Str
     properties.setProperty(getKeyFor(column, TOTAL_DOCS), String.valueOf(totalDocs));
     DataType dataType = fieldSpec.getDataType();
     properties.setProperty(getKeyFor(column, DATA_TYPE), String.valueOf(dataType));
+    properties.setProperty(getKeyFor(column, STORAGE_FORMAT), V1Constants.Str.STORAGE_FORMAT_DEFAULT_VALUE);

Review comment:
       Removed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-commenter commented on pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#issuecomment-832453091


   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#6878](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (4bfae70) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/1c09b7866b4d1c8267847f464e9c4618787b5a15?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (1c09b78) will **decrease** coverage by `0.05%`.
   > The diff coverage is `28.40%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/6878/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #6878      +/-   ##
   ============================================
   - Coverage     65.48%   65.43%   -0.06%     
     Complexity       12       12              
   ============================================
     Files          1421     1421              
     Lines         69980    70042      +62     
     Branches      10112    10115       +3     
   ============================================
   + Hits          45825    45829       +4     
   - Misses        20874    20926      +52     
   - Partials       3281     3287       +6     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | unittests | `65.43% <28.40%> (-0.06%)` | `12.00 <0.00> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...rg/apache/pinot/common/function/FunctionUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZnVuY3Rpb24vRnVuY3Rpb25VdGlscy5qYXZh) | `97.72% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...java/org/apache/pinot/core/common/DataFetcher.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9jb21tb24vRGF0YUZldGNoZXIuamF2YQ==) | `89.95% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | [...ter/predicate/EqualsPredicateEvaluatorFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9maWx0ZXIvcHJlZGljYXRlL0VxdWFsc1ByZWRpY2F0ZUV2YWx1YXRvckZhY3RvcnkuamF2YQ==) | `82.53% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [.../filter/predicate/InPredicateEvaluatorFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9maWx0ZXIvcHJlZGljYXRlL0luUHJlZGljYXRlRXZhbHVhdG9yRmFjdG9yeS5qYXZh) | `71.28% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [.../predicate/NotEqualsPredicateEvaluatorFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9maWx0ZXIvcHJlZGljYXRlL05vdEVxdWFsc1ByZWRpY2F0ZUV2YWx1YXRvckZhY3RvcnkuamF2YQ==) | `59.74% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...lter/predicate/NotInPredicateEvaluatorFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9maWx0ZXIvcHJlZGljYXRlL05vdEluUHJlZGljYXRlRXZhbHVhdG9yRmFjdG9yeS5qYXZh) | `66.36% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...lter/predicate/RangePredicateEvaluatorFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9maWx0ZXIvcHJlZGljYXRlL1JhbmdlUHJlZGljYXRlRXZhbHVhdG9yRmFjdG9yeS5qYXZh) | `68.07% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...ator/transform/function/CastTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQ2FzdFRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `54.90% <0.00%> (-3.44%)` | `0.00 <0.00> (ø)` | |
   | [...e/pinot/core/query/reduce/HavingFilterHandler.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9yZWR1Y2UvSGF2aW5nRmlsdGVySGFuZGxlci5qYXZh) | `89.13% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...egment/local/segment/creator/impl/V1Constants.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9WMUNvbnN0YW50cy5qYXZh) | `12.50% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | ... and [20 more](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [1c09b78...4bfae70](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-commenter edited a comment on pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#issuecomment-832453091


   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#6878](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5b70574) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/bb2519494a15a3125dd509286d60ffd124f91c37?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (bb25194) will **decrease** coverage by `30.17%`.
   > The diff coverage is `23.07%`.
   
   > :exclamation: Current head 5b70574 differs from pull request most recent head 7978e7c. Consider uploading reports for the commit 7978e7c to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/6878/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #6878       +/-   ##
   =============================================
   - Coverage     73.55%   43.38%   -30.18%     
   + Complexity       12        7        -5     
   =============================================
     Files          1423     1423               
     Lines         70029    70064       +35     
     Branches      10119    10120        +1     
   =============================================
   - Hits          51509    30395    -21114     
   - Misses        15103    37064    +21961     
   + Partials       3417     2605      -812     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | integration | `43.38% <23.07%> (+<0.01%)` | `7.00 <0.00> (ø)` | |
   | unittests | `?` | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...roker/requesthandler/BaseBrokerRequestHandler.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvcmVxdWVzdGhhbmRsZXIvQmFzZUJyb2tlclJlcXVlc3RIYW5kbGVyLmphdmE=) | `66.85% <ø> (-5.76%)` | `0.00 <0.00> (ø)` | |
   | [...ator/transform/function/CastTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQ2FzdFRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `26.00% <0.00%> (-36.50%)` | `0.00 <0.00> (ø)` | |
   | [...main/java/org/apache/pinot/spi/data/FieldSpec.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9GaWVsZFNwZWMuamF2YQ==) | `0.00% <0.00%> (-83.07%)` | `0.00 <0.00> (ø)` | |
   | [...rc/main/java/org/apache/pinot/spi/data/Schema.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9TY2hlbWEuamF2YQ==) | `0.00% <ø> (-75.00%)` | `0.00 <0.00> (ø)` | |
   | [...ain/java/org/apache/pinot/spi/utils/JsonUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvSnNvblV0aWxzLmphdmE=) | `0.00% <ø> (-69.54%)` | `0.00 <0.00> (ø)` | |
   | [...a/org/apache/pinot/common/utils/PinotDataType.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvUGlub3REYXRhVHlwZS5qYXZh) | `18.45% <4.16%> (-48.65%)` | `0.00 <0.00> (ø)` | |
   | [...java/org/apache/pinot/common/utils/DataSchema.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvRGF0YVNjaGVtYS5qYXZh) | `42.67% <25.00%> (-32.33%)` | `0.00 <0.00> (ø)` | |
   | [...org/apache/pinot/sql/parsers/CalciteSqlParser.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcWwvcGFyc2Vycy9DYWxjaXRlU3FsUGFyc2VyLmphdmE=) | `73.47% <47.05%> (-19.13%)` | `0.00 <0.00> (ø)` | |
   | [...ator/transform/function/BaseTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQmFzZVRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `14.09% <100.00%> (-24.65%)` | `0.00 <0.00> (ø)` | |
   | [.../columnminmaxvalue/ColumnMinMaxValueGenerator.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9jb2x1bW5taW5tYXh2YWx1ZS9Db2x1bW5NaW5NYXhWYWx1ZUdlbmVyYXRvci5qYXZh) | `52.11% <100.00%> (-11.27%)` | `0.00 <0.00> (ø)` | |
   | ... and [902 more](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9b87787...7978e7c](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-commenter edited a comment on pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#issuecomment-832453091


   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#6878](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2cae511) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/1c09b7866b4d1c8267847f464e9c4618787b5a15?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (1c09b78) will **decrease** coverage by `22.06%`.
   > The diff coverage is `10.25%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/6878/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #6878       +/-   ##
   =============================================
   - Coverage     65.48%   43.41%   -22.07%     
   + Complexity       12        7        -5     
   =============================================
     Files          1421     1421               
     Lines         69980    70043       +63     
     Branches      10112    10118        +6     
   =============================================
   - Hits          45825    30412    -15413     
   - Misses        20874    37032    +16158     
   + Partials       3281     2599      -682     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | integration | `43.41% <10.25%> (?)` | `7.00 <0.00> (?)` | |
   | unittests | `?` | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...ator/transform/function/CastTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQ2FzdFRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `25.49% <0.00%> (-32.85%)` | `0.00 <0.00> (ø)` | |
   | [...e/pinot/core/query/reduce/HavingFilterHandler.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9yZWR1Y2UvSGF2aW5nRmlsdGVySGFuZGxlci5qYXZh) | `78.26% <ø> (-10.87%)` | `0.00 <0.00> (ø)` | |
   | [...main/java/org/apache/pinot/spi/data/FieldSpec.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9GaWVsZFNwZWMuamF2YQ==) | `0.00% <0.00%> (-83.07%)` | `0.00 <0.00> (ø)` | |
   | [...rc/main/java/org/apache/pinot/spi/data/Schema.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9TY2hlbWEuamF2YQ==) | `0.00% <ø> (-75.00%)` | `0.00 <0.00> (ø)` | |
   | [...ain/java/org/apache/pinot/spi/utils/JsonUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvSnNvblV0aWxzLmphdmE=) | `0.00% <ø> (-69.54%)` | `0.00 <0.00> (ø)` | |
   | [...a/org/apache/pinot/common/utils/PinotDataType.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvUGlub3REYXRhVHlwZS5qYXZh) | `18.28% <3.70%> (-48.81%)` | `0.00 <0.00> (ø)` | |
   | [...java/org/apache/pinot/common/utils/DataSchema.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvRGF0YVNjaGVtYS5qYXZh) | `42.67% <25.00%> (-31.46%)` | `0.00 <0.00> (ø)` | |
   | [...ator/transform/function/BaseTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQmFzZVRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `14.09% <100.00%> (-22.29%)` | `0.00 <0.00> (ø)` | |
   | [.../columnminmaxvalue/ColumnMinMaxValueGenerator.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9jb2x1bW5taW5tYXh2YWx1ZS9Db2x1bW5NaW5NYXhWYWx1ZUdlbmVyYXRvci5qYXZh) | `52.11% <100.00%> (-11.27%)` | `0.00 <0.00> (ø)` | |
   | [...c/main/java/org/apache/pinot/common/tier/Tier.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdGllci9UaWVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | ... and [1074 more](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [1c09b78...2cae511](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628488178



##########
File path: pinot-core/src/test/java/org/apache/pinot/queries/JsonDatatypeTest.java
##########
@@ -0,0 +1,308 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.queries;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.blocks.IntermediateResultsBlock;
+import org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader;
+import org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl;
+import org.apache.pinot.segment.local.segment.index.loader.IndexLoadingConfig;
+import org.apache.pinot.segment.local.segment.readers.GenericRowRecordReader;
+import org.apache.pinot.segment.spi.ImmutableSegment;
+import org.apache.pinot.segment.spi.IndexSegment;
+import org.apache.pinot.segment.spi.creator.SegmentGeneratorConfig;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TableType;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.Schema;
+import org.apache.pinot.spi.data.readers.GenericRow;
+import org.apache.pinot.spi.utils.ReadMode;
+import org.apache.pinot.spi.utils.builder.TableConfigBuilder;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+
+/**
+ * Test cases verifying query evaluation against column of type JSON.
+ */
+public class JsonDatatypeTest extends BaseQueriesTest {
+  private static final File INDEX_DIR = new File(FileUtils.getTempDirectory(), "JsonMatchPredicateTest");
+  private static final String RAW_TABLE_NAME = "testTable";
+  private static final String SEGMENT_NAME = "testSegment";
+  private static final int NUM_RECORDS = 10;
+
+  private static final String INT_COLUMN = "intColumn";
+  private static final String JSON_COLUMN = "jsonColumn";
+  private static final String STRING_COLUMN = "stringColumn";
+  private static final Schema SCHEMA =
+      new Schema.SchemaBuilder().addSingleValueDimension(INT_COLUMN, FieldSpec.DataType.INT)
+          .addSingleValueDimension(JSON_COLUMN, FieldSpec.DataType.JSON)
+          .addSingleValueDimension(STRING_COLUMN, FieldSpec.DataType.STRING).build();
+  private static final TableConfig TABLE_CONFIG =
+      new TableConfigBuilder(TableType.OFFLINE).setTableName(RAW_TABLE_NAME).build();
+
+  private IndexSegment _indexSegment;
+  private List<IndexSegment> _indexSegments;
+
+  @Override
+  protected String getFilter() {
+    return "";
+  }
+
+  @Override
+  protected IndexSegment getIndexSegment() {
+    return _indexSegment;
+  }
+
+  GenericRow createRecord(int intValue, String stringValue, String jsonValue) {
+    GenericRow record = new GenericRow();
+    record.putValue(INT_COLUMN, intValue);
+    record.putValue(STRING_COLUMN, stringValue);
+    record.putValue(JSON_COLUMN, jsonValue);
+
+    return record;
+  }
+
+  @Override
+  protected List<IndexSegment> getIndexSegments() {
+    return _indexSegments;
+  }
+
+  @BeforeClass
+  public void setUp()
+      throws Exception {
+    FileUtils.deleteDirectory(INDEX_DIR);
+
+    List<GenericRow> records = new ArrayList<>(NUM_RECORDS);
+    records.add(createRecord(1, "daffy duck",
+        "{\"name\": {\"first\": \"daffy\", \"last\": \"duck\"}, \"id\": 101, \"data\": [\"a\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(2, "mickey mouse",
+        "{\"name\": {\"first\": \"mickey\", \"last\": \"mouse\"}, \"id\": 111, \"data\": [\"e\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(3, "donald duck",
+        "{\"name\": {\"first\": \"donald\", \"last\": \"duck\"}, \"id\": 121, \"data\": [\"f\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(4, "scrooge mcduck",
+        "{\"name\": {\"first\": \"scrooge\", \"last\": \"mcduck\"}, \"id\": 131, \"data\": [\"g\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(5, "minnie mouse",
+        "{\"name\": {\"first\": \"minnie\", \"last\": \"mouse\"}, \"id\": 141, \"data\": [\"h\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(6, "daisy duck",
+        "{\"name\": {\"first\": \"daisy\", \"last\": \"duck\"}, \"id\": 161.5, \"data\": [\"i\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(7, "pluto dog",
+        "{\"name\": {\"first\": \"pluto\", \"last\": \"dog\"}, \"id\": 161, \"data\": [\"j\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(8, "goofy dwag",
+        "{\"name\": {\"first\": \"goofy\", \"last\": \"dwag\"}, \"id\": 171, \"data\": [\"k\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(9, "ludwik von drake",
+        "{\"name\": {\"first\": \"ludwik\", \"last\": \"von drake\"}, \"id\": 181, \"data\": [\"l\", \"b\", \"c\", \"d\"]}"));
+
+    List<String> jsonIndexColumns = new ArrayList<>();
+    jsonIndexColumns.add("jsonColumn");
+    TABLE_CONFIG.getIndexingConfig().setJsonIndexColumns(jsonIndexColumns);
+    SegmentGeneratorConfig segmentGeneratorConfig = new SegmentGeneratorConfig(TABLE_CONFIG, SCHEMA);
+    segmentGeneratorConfig.setTableName(RAW_TABLE_NAME);
+    segmentGeneratorConfig.setSegmentName(SEGMENT_NAME);
+    segmentGeneratorConfig.setOutDir(INDEX_DIR.getPath());
+
+    SegmentIndexCreationDriverImpl driver = new SegmentIndexCreationDriverImpl();
+    driver.init(segmentGeneratorConfig, new GenericRowRecordReader(records));
+    driver.build();
+
+    IndexLoadingConfig indexLoadingConfig = new IndexLoadingConfig();
+    indexLoadingConfig.setTableConfig(TABLE_CONFIG);
+    indexLoadingConfig.setJsonIndexColumns(new HashSet<String>(jsonIndexColumns));

Review comment:
       Added following TODO: `Update these test cases to: 1) use V2 JSON_MATCH function, 2) use multi-dimensional JSON array addressing, do json_extract_scalar on a column other than the JSON_MATCH column, and 4) query deeper levels of nesting.`
   
   I think all of these test cases will need to be adjusted to use V2 JSON_MATCH function (https://github.com/apache/incubator-pinot/pull/6877). We probably shouldn't claim support for V1 JSON_MATCH with new JSON datatype.

##########
File path: pinot-core/src/test/java/org/apache/pinot/queries/JsonDatatypeTest.java
##########
@@ -0,0 +1,308 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.queries;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.blocks.IntermediateResultsBlock;
+import org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader;
+import org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl;
+import org.apache.pinot.segment.local.segment.index.loader.IndexLoadingConfig;
+import org.apache.pinot.segment.local.segment.readers.GenericRowRecordReader;
+import org.apache.pinot.segment.spi.ImmutableSegment;
+import org.apache.pinot.segment.spi.IndexSegment;
+import org.apache.pinot.segment.spi.creator.SegmentGeneratorConfig;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TableType;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.Schema;
+import org.apache.pinot.spi.data.readers.GenericRow;
+import org.apache.pinot.spi.utils.ReadMode;
+import org.apache.pinot.spi.utils.builder.TableConfigBuilder;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+
+/**
+ * Test cases verifying query evaluation against column of type JSON.
+ */
+public class JsonDatatypeTest extends BaseQueriesTest {
+  private static final File INDEX_DIR = new File(FileUtils.getTempDirectory(), "JsonMatchPredicateTest");
+  private static final String RAW_TABLE_NAME = "testTable";
+  private static final String SEGMENT_NAME = "testSegment";
+  private static final int NUM_RECORDS = 10;
+
+  private static final String INT_COLUMN = "intColumn";
+  private static final String JSON_COLUMN = "jsonColumn";
+  private static final String STRING_COLUMN = "stringColumn";
+  private static final Schema SCHEMA =
+      new Schema.SchemaBuilder().addSingleValueDimension(INT_COLUMN, FieldSpec.DataType.INT)
+          .addSingleValueDimension(JSON_COLUMN, FieldSpec.DataType.JSON)
+          .addSingleValueDimension(STRING_COLUMN, FieldSpec.DataType.STRING).build();
+  private static final TableConfig TABLE_CONFIG =
+      new TableConfigBuilder(TableType.OFFLINE).setTableName(RAW_TABLE_NAME).build();
+
+  private IndexSegment _indexSegment;
+  private List<IndexSegment> _indexSegments;
+
+  @Override
+  protected String getFilter() {
+    return "";
+  }
+
+  @Override
+  protected IndexSegment getIndexSegment() {
+    return _indexSegment;
+  }
+
+  GenericRow createRecord(int intValue, String stringValue, String jsonValue) {
+    GenericRow record = new GenericRow();
+    record.putValue(INT_COLUMN, intValue);
+    record.putValue(STRING_COLUMN, stringValue);
+    record.putValue(JSON_COLUMN, jsonValue);
+
+    return record;
+  }
+
+  @Override
+  protected List<IndexSegment> getIndexSegments() {
+    return _indexSegments;
+  }
+
+  @BeforeClass
+  public void setUp()
+      throws Exception {
+    FileUtils.deleteDirectory(INDEX_DIR);
+
+    List<GenericRow> records = new ArrayList<>(NUM_RECORDS);
+    records.add(createRecord(1, "daffy duck",
+        "{\"name\": {\"first\": \"daffy\", \"last\": \"duck\"}, \"id\": 101, \"data\": [\"a\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(2, "mickey mouse",
+        "{\"name\": {\"first\": \"mickey\", \"last\": \"mouse\"}, \"id\": 111, \"data\": [\"e\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(3, "donald duck",
+        "{\"name\": {\"first\": \"donald\", \"last\": \"duck\"}, \"id\": 121, \"data\": [\"f\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(4, "scrooge mcduck",
+        "{\"name\": {\"first\": \"scrooge\", \"last\": \"mcduck\"}, \"id\": 131, \"data\": [\"g\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(5, "minnie mouse",
+        "{\"name\": {\"first\": \"minnie\", \"last\": \"mouse\"}, \"id\": 141, \"data\": [\"h\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(6, "daisy duck",
+        "{\"name\": {\"first\": \"daisy\", \"last\": \"duck\"}, \"id\": 161.5, \"data\": [\"i\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(7, "pluto dog",
+        "{\"name\": {\"first\": \"pluto\", \"last\": \"dog\"}, \"id\": 161, \"data\": [\"j\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(8, "goofy dwag",
+        "{\"name\": {\"first\": \"goofy\", \"last\": \"dwag\"}, \"id\": 171, \"data\": [\"k\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(9, "ludwik von drake",
+        "{\"name\": {\"first\": \"ludwik\", \"last\": \"von drake\"}, \"id\": 181, \"data\": [\"l\", \"b\", \"c\", \"d\"]}"));
+
+    List<String> jsonIndexColumns = new ArrayList<>();
+    jsonIndexColumns.add("jsonColumn");
+    TABLE_CONFIG.getIndexingConfig().setJsonIndexColumns(jsonIndexColumns);
+    SegmentGeneratorConfig segmentGeneratorConfig = new SegmentGeneratorConfig(TABLE_CONFIG, SCHEMA);
+    segmentGeneratorConfig.setTableName(RAW_TABLE_NAME);
+    segmentGeneratorConfig.setSegmentName(SEGMENT_NAME);
+    segmentGeneratorConfig.setOutDir(INDEX_DIR.getPath());
+
+    SegmentIndexCreationDriverImpl driver = new SegmentIndexCreationDriverImpl();
+    driver.init(segmentGeneratorConfig, new GenericRowRecordReader(records));
+    driver.build();
+
+    IndexLoadingConfig indexLoadingConfig = new IndexLoadingConfig();
+    indexLoadingConfig.setTableConfig(TABLE_CONFIG);
+    indexLoadingConfig.setJsonIndexColumns(new HashSet<String>(jsonIndexColumns));
+    indexLoadingConfig.setReadMode(ReadMode.mmap);
+
+    ImmutableSegment immutableSegment =
+        ImmutableSegmentLoader.load(new File(INDEX_DIR, SEGMENT_NAME), indexLoadingConfig);
+    _indexSegment = immutableSegment;
+    _indexSegments = Arrays.asList(immutableSegment, immutableSegment);
+  }
+
+  /** Verify result column type of a simple select query against JSON column */
+  @Test
+  public void testSimpleSelectOnJsonColumn() {
+    try {
+      Operator operator = getOperatorForSqlQuery("select jsonColumn FROM testTable");
+      IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+      Collection<Object[]> rows = block.getSelectionResult();
+      Assert.assertEquals(rows.size(), 9);
+      Assert.assertEquals(block.getDataSchema().getColumnDataType(0), DataSchema.ColumnDataType.JSON);
+    } catch (IllegalStateException ise) {
+      Assert.assertTrue(true);
+    }
+  }
+
+  /** Test filtering on string value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithStringFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select intColumn, json_extract_scalar(jsonColumn, '$.name.last', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.name.first', 'STRING') = 'daffy'");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Object[] row = iterator.next();
+    Assert.assertEquals(row[0], 1);
+    Assert.assertEquals(row[1], "duck");
+  }
+
+  /** Test filtering on number value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithNumericIntFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.last', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'INT') = 171");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "dwag");
+  }
+
+  /** Test filtering on float value associated with  JSON key*/
+  @Test
+  public void testExtractScalarWithNumericFloatFilter() {
+
+    // query to retrieve result as INT
+    Operator operator1 = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.id', 'INT') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'FLOAT') = 161.5");
+    IntermediateResultsBlock block1 = (IntermediateResultsBlock) operator1.nextBlock();
+    Collection<Object[]> rows1 = block1.getSelectionResult();
+    Assert.assertEquals(rows1.size(), 1);
+
+    Iterator<Object[]> iterator1 = rows1.iterator();
+    Assert.assertTrue(iterator1.hasNext());
+    Assert.assertEquals(iterator1.next()[0], 161);
+
+    // query to retrieve result as DOUBLE
+    Operator operator2 = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.id', 'DOUBLE') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.id', 'FLOAT') = 161.5");
+    IntermediateResultsBlock block2 = (IntermediateResultsBlock) operator2.nextBlock();
+    Collection<Object[]> rows2 = block2.getSelectionResult();
+    Assert.assertEquals(rows2.size(), 1);
+
+    Iterator<Object[]> iterator2 = rows2.iterator();
+    Assert.assertTrue(iterator2.hasNext());
+    Assert.assertEquals(iterator2.next()[0], 161.5d);
+  }
+
+  /** Retrieve JSON array after filtering on string value associated with  JSON key*/
+  @Test
+  public void testExtractScalarArrayWithStringFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.data', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.name.first', 'STRING') = 'daffy'");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "[\"a\",\"b\",\"c\",\"d\"]");
+  }
+
+  /** Test filtering on string value within a JSON array*/
+  @Test
+  public void testExtractScalarWithArrayFilter() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_extract_scalar(jsonColumn, '$.data[0]', 'STRING') IN ('i', 'k')");
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 2);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daisy");
+    Assert.assertEquals(iterator.next()[0], "goofy");
+  }
+
+  @Test
+  public void testJsonMatchWithoutIndex() {
+    try {
+      Operator operator = getOperatorForSqlQuery(
+          "select json_extract_scalar(stringColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(stringColumn, 'id=101')");
+      Assert.assertTrue(false);
+    } catch (IllegalStateException ise) {
+      Assert.assertTrue(true);
+    }
+  }
+
+  @Test
+  public void testJsonMatchAtLevel1WithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, 'id=101')");
+
+    IntermediateResultsBlock block = (IntermediateResultsBlock) operator.nextBlock();
+    Collection<Object[]> rows = block.getSelectionResult();
+    Assert.assertEquals(rows.size(), 1);
+
+    Iterator<Object[]> iterator = rows.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Assert.assertEquals(iterator.next()[0], "daffy");
+  }
+
+  @Test
+  public void testJsonMatchAtLevel2WithIndex() {
+    Operator operator = getOperatorForSqlQuery(
+        "select json_extract_scalar(jsonColumn, '$.name.first', 'STRING') FROM testTable WHERE json_match(jsonColumn, 'name.first = ''daffy''')");
+

Review comment:
       Will update test cases once Jackie's PR (https://github.com/apache/incubator-pinot/pull/6877) on V2 JSON_MATCH function is merged. Added TODO for now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-commenter edited a comment on pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#issuecomment-832453091


   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#6878](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b6d8798) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/1c09b7866b4d1c8267847f464e9c4618787b5a15?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (1c09b78) will **increase** coverage by `0.02%`.
   > The diff coverage is `69.23%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/6878/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #6878      +/-   ##
   ============================================
   + Coverage     65.48%   65.51%   +0.02%     
     Complexity       12       12              
   ============================================
     Files          1421     1423       +2     
     Lines         69980    70066      +86     
     Branches      10112    10122      +10     
   ============================================
   + Hits          45825    45901      +76     
   - Misses        20874    20885      +11     
   + Partials       3281     3280       -1     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | unittests | `65.51% <69.23%> (+0.02%)` | `12.00 <0.00> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...ator/transform/function/CastTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQ2FzdFRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `54.90% <0.00%> (-3.44%)` | `0.00 <0.00> (ø)` | |
   | [...e/pinot/core/query/reduce/HavingFilterHandler.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9yZWR1Y2UvSGF2aW5nRmlsdGVySGFuZGxlci5qYXZh) | `89.13% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...rc/main/java/org/apache/pinot/spi/data/Schema.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9TY2hlbWEuamF2YQ==) | `75.00% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...ain/java/org/apache/pinot/spi/utils/JsonUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvSnNvblV0aWxzLmphdmE=) | `69.53% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [...java/org/apache/pinot/common/utils/DataSchema.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvRGF0YVNjaGVtYS5qYXZh) | `73.70% <50.00%> (-0.42%)` | `0.00 <0.00> (ø)` | |
   | [...a/org/apache/pinot/common/utils/PinotDataType.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvUGlub3REYXRhVHlwZS5qYXZh) | `67.55% <74.07%> (+0.45%)` | `0.00 <0.00> (ø)` | |
   | [...ator/transform/function/BaseTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQmFzZVRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `36.55% <100.00%> (+0.16%)` | `0.00 <0.00> (ø)` | |
   | [.../columnminmaxvalue/ColumnMinMaxValueGenerator.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9jb2x1bW5taW5tYXh2YWx1ZS9Db2x1bW5NaW5NYXhWYWx1ZUdlbmVyYXRvci5qYXZh) | `63.38% <100.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | [...main/java/org/apache/pinot/spi/data/FieldSpec.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9GaWVsZFNwZWMuamF2YQ==) | `83.33% <100.00%> (+0.27%)` | `0.00 <0.00> (ø)` | |
   | [...he/pinot/spi/data/readers/BaseRecordExtractor.java](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9yZWFkZXJzL0Jhc2VSZWNvcmRFeHRyYWN0b3IuamF2YQ==) | `66.07% <0.00%> (-3.58%)` | `0.00% <0.00%> (ø%)` | |
   | ... and [19 more](https://codecov.io/gh/apache/incubator-pinot/pull/6878/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [1c09b78...b6d8798](https://codecov.io/gh/apache/incubator-pinot/pull/6878?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628362469



##########
File path: pinot-core/src/test/java/org/apache/pinot/queries/JsonDatatypeTest.java
##########
@@ -0,0 +1,308 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.queries;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.blocks.IntermediateResultsBlock;
+import org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader;
+import org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl;
+import org.apache.pinot.segment.local.segment.index.loader.IndexLoadingConfig;
+import org.apache.pinot.segment.local.segment.readers.GenericRowRecordReader;
+import org.apache.pinot.segment.spi.ImmutableSegment;
+import org.apache.pinot.segment.spi.IndexSegment;
+import org.apache.pinot.segment.spi.creator.SegmentGeneratorConfig;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TableType;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.Schema;
+import org.apache.pinot.spi.data.readers.GenericRow;
+import org.apache.pinot.spi.utils.ReadMode;
+import org.apache.pinot.spi.utils.builder.TableConfigBuilder;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+
+/**
+ * Test cases verifying query evaluation against column of type JSON.
+ */
+public class JsonDatatypeTest extends BaseQueriesTest {
+  private static final File INDEX_DIR = new File(FileUtils.getTempDirectory(), "JsonMatchPredicateTest");
+  private static final String RAW_TABLE_NAME = "testTable";
+  private static final String SEGMENT_NAME = "testSegment";
+  private static final int NUM_RECORDS = 10;
+
+  private static final String INT_COLUMN = "intColumn";
+  private static final String JSON_COLUMN = "jsonColumn";
+  private static final String STRING_COLUMN = "stringColumn";
+  private static final Schema SCHEMA =
+      new Schema.SchemaBuilder().addSingleValueDimension(INT_COLUMN, FieldSpec.DataType.INT)
+          .addSingleValueDimension(JSON_COLUMN, FieldSpec.DataType.JSON)
+          .addSingleValueDimension(STRING_COLUMN, FieldSpec.DataType.STRING).build();
+  private static final TableConfig TABLE_CONFIG =
+      new TableConfigBuilder(TableType.OFFLINE).setTableName(RAW_TABLE_NAME).build();
+
+  private IndexSegment _indexSegment;
+  private List<IndexSegment> _indexSegments;
+
+  @Override
+  protected String getFilter() {
+    return "";
+  }
+
+  @Override
+  protected IndexSegment getIndexSegment() {
+    return _indexSegment;
+  }
+
+  GenericRow createRecord(int intValue, String stringValue, String jsonValue) {
+    GenericRow record = new GenericRow();
+    record.putValue(INT_COLUMN, intValue);
+    record.putValue(STRING_COLUMN, stringValue);
+    record.putValue(JSON_COLUMN, jsonValue);
+
+    return record;
+  }
+
+  @Override
+  protected List<IndexSegment> getIndexSegments() {
+    return _indexSegments;
+  }
+
+  @BeforeClass
+  public void setUp()
+      throws Exception {
+    FileUtils.deleteDirectory(INDEX_DIR);
+
+    List<GenericRow> records = new ArrayList<>(NUM_RECORDS);
+    records.add(createRecord(1, "daffy duck",
+        "{\"name\": {\"first\": \"daffy\", \"last\": \"duck\"}, \"id\": 101, \"data\": [\"a\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(2, "mickey mouse",
+        "{\"name\": {\"first\": \"mickey\", \"last\": \"mouse\"}, \"id\": 111, \"data\": [\"e\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(3, "donald duck",
+        "{\"name\": {\"first\": \"donald\", \"last\": \"duck\"}, \"id\": 121, \"data\": [\"f\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(4, "scrooge mcduck",
+        "{\"name\": {\"first\": \"scrooge\", \"last\": \"mcduck\"}, \"id\": 131, \"data\": [\"g\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(5, "minnie mouse",
+        "{\"name\": {\"first\": \"minnie\", \"last\": \"mouse\"}, \"id\": 141, \"data\": [\"h\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(6, "daisy duck",
+        "{\"name\": {\"first\": \"daisy\", \"last\": \"duck\"}, \"id\": 161.5, \"data\": [\"i\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(7, "pluto dog",
+        "{\"name\": {\"first\": \"pluto\", \"last\": \"dog\"}, \"id\": 161, \"data\": [\"j\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(8, "goofy dwag",
+        "{\"name\": {\"first\": \"goofy\", \"last\": \"dwag\"}, \"id\": 171, \"data\": [\"k\", \"b\", \"c\", \"d\"]}"));
+    records.add(createRecord(9, "ludwik von drake",
+        "{\"name\": {\"first\": \"ludwik\", \"last\": \"von drake\"}, \"id\": 181, \"data\": [\"l\", \"b\", \"c\", \"d\"]}"));
+
+    List<String> jsonIndexColumns = new ArrayList<>();
+    jsonIndexColumns.add("jsonColumn");
+    TABLE_CONFIG.getIndexingConfig().setJsonIndexColumns(jsonIndexColumns);
+    SegmentGeneratorConfig segmentGeneratorConfig = new SegmentGeneratorConfig(TABLE_CONFIG, SCHEMA);
+    segmentGeneratorConfig.setTableName(RAW_TABLE_NAME);
+    segmentGeneratorConfig.setSegmentName(SEGMENT_NAME);
+    segmentGeneratorConfig.setOutDir(INDEX_DIR.getPath());
+
+    SegmentIndexCreationDriverImpl driver = new SegmentIndexCreationDriverImpl();
+    driver.init(segmentGeneratorConfig, new GenericRowRecordReader(records));
+    driver.build();
+
+    IndexLoadingConfig indexLoadingConfig = new IndexLoadingConfig();
+    indexLoadingConfig.setTableConfig(TABLE_CONFIG);
+    indexLoadingConfig.setJsonIndexColumns(new HashSet<String>(jsonIndexColumns));

Review comment:
       Can you please add a TODO here indicating that once multi-dimensional array support in JSON index is implemented, we can update these tests to cover additional scenarios ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r626835206



##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -538,6 +600,58 @@ public String convert(Object value, PinotDataType sourceType) {
     }
   },
 
+  JSON {
+    @Override
+    public int toInt(Object value) {
+      throw new UnsupportedOperationException("Cannot convert value from JSON to INT");
+    }
+
+    @Override
+    public long toLong(Object value) {
+      throw new UnsupportedOperationException("Cannot convert value from JSON to LONG");
+    }
+
+    @Override
+    public float toFloat(Object value) {
+      throw new UnsupportedOperationException("Cannot convert value from JSON to FLOAT");
+    }
+
+    @Override
+    public double toDouble(Object value) {
+      throw new UnsupportedOperationException("Cannot convert value from JSON to DOUBLE");
+    }
+
+    @Override
+    public boolean toBoolean(Object value) {
+      throw new UnsupportedOperationException("Cannot convert value from JSON to BOOLEAN");
+    }
+
+    @Override
+    public Timestamp toTimestamp(Object value) {
+      throw new UnsupportedOperationException("Cannot convert value from JSON to TIMESTAMP");
+    }
+
+    @Override
+    public String toString(Object value) {
+      return value.toString();
+    }
+
+    @Override
+    public String toJson(Object value) {
+      return value.toString();
+    }
+
+    @Override
+    public byte[] toBytes(Object value) {
+      throw new UnsupportedOperationException("Cannot convert value from JSON to BYTES");
+    }
+
+    @Override
+    public String convert(Object value, PinotDataType sourceType) {
+      return sourceType.toString(value);

Review comment:
       (Critical)
   ```suggestion
         return sourceType.toJson(value);
   ```

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -538,6 +600,58 @@ public String convert(Object value, PinotDataType sourceType) {
     }
   },
 
+  JSON {
+    @Override
+    public int toInt(Object value) {

Review comment:
       We should parse the json and convert to value if the json is a value node, or use the same method as the STRING type because value node json can be read as plain string

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -986,7 +1126,7 @@ public PinotDataType getSingleValueType() {
   /**
    * Returns the {@link PinotDataType} for the given {@link FieldSpec} for data ingestion purpose. Returns object array
    * type for multi-valued types.
-   * TODO: Add MV support for BOOLEAN, TIMESTAMP, BYTES
+   * TODO: Add MV support for BOOLEAN, TIMESTAMP, BYTES, JSON

Review comment:
       I don't think we need MV support for JSON as it can hold array

##########
File path: pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/loader/columnminmaxvalue/ColumnMinMaxValueGenerator.java
##########
@@ -89,7 +89,7 @@ private void addColumnMinMaxValueForColumn(String columnName)
     }
 
     PinotDataBuffer dictionaryBuffer = _segmentWriter.getIndexFor(columnName, ColumnIndexType.DICTIONARY);
-    DataType dataType = columnMetadata.getDataType();
+    DataType dataType = columnMetadata.getDataType().getStoredType();

Review comment:
       Good catch

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/common/DataFetcher.java
##########
@@ -285,7 +285,7 @@ void readIntValues(int[] docIds, int length, int[] valueBuffer) {
         _reader.readDictIds(docIds, length, dictIdBuffer, readerContext);
         _dictionary.readIntValues(dictIdBuffer, length, valueBuffer);
       } else {
-        switch (_reader.getValueType()) {
+        switch (_reader.getValueType().getStoredType()) {

Review comment:
       We don't need stored type lookup here as the value type of reader is already stored type

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/function/FunctionUtils.java
##########
@@ -31,6 +31,9 @@
   private FunctionUtils() {
   }
 
+  // TODO: Do we need to create a class for handling JSON. It doesn't seem like this is needed since JSON type directly

Review comment:
       Currently we use `String` to represent JSON. If we change that to use `JsonNode`, then we can add `JsonNode` here. One problem of adding `JsonNode` is that it is an abstract class, and won't match the actual class

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -748,6 +874,20 @@ public String toString(Object value) {
     return getSingleValueType().toString(toObjectArray(value)[0]);
   }
 
+  public String toJson(Object value) {
+    String jsonString = getSingleValueType().toString(toObjectArray(value)[0]);
+
+    // Check if the string is valid JSON.
+    JsonNode jsonNode = null;
+    try {
+      jsonNode = JsonUtils.stringToJsonNode(jsonString);
+    } catch (IOException ioe) {
+      throw new UnsupportedOperationException("Cannot convert value from STRING to JSON");
+    }
+
+    return jsonNode.toString();
+  }

Review comment:
       For string value, we should try to parse it as json string; for other type value, we should try to parse it as general object:
   ```suggestion
     public String toJson(Object value) {
       if (value instanceof String) {
         try {
           return JsonUtils.stringToJsonNode((String) value).toString();
         } catch (Exception e) {
           throw new RuntimeException("Caught exception while parsing string '" + value + "' as json", e);
         }
       } else {
         try {
           return JsonUtils.objectToString(value).toString();
         } catch (Exception e) {
           throw new RuntimeException("Caught exception while serializing object '" + value + "' as json", e);
         }
       }
     }
   ```

##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/PinotDataType.java
##########
@@ -748,6 +874,20 @@ public String toString(Object value) {
     return getSingleValueType().toString(toObjectArray(value)[0]);
   }
 
+  public String toJson(Object value) {
+    String jsonString = getSingleValueType().toString(toObjectArray(value)[0]);
+
+    // Check if the string is valid JSON.
+    JsonNode jsonNode = null;
+    try {
+      jsonNode = JsonUtils.stringToJsonNode(jsonString);
+    } catch (IOException ioe) {
+      throw new UnsupportedOperationException("Cannot convert value from STRING to JSON");
+    }
+
+    return jsonNode.toString();
+  }

Review comment:
       No need to override this method for different data types

##########
File path: pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java
##########
@@ -633,6 +633,7 @@ public static void addColumnMetadataInfo(PropertiesConfiguration properties, Str
     properties.setProperty(getKeyFor(column, TOTAL_DOCS), String.valueOf(totalDocs));
     DataType dataType = fieldSpec.getDataType();
     properties.setProperty(getKeyFor(column, DATA_TYPE), String.valueOf(dataType));
+    properties.setProperty(getKeyFor(column, STORAGE_FORMAT), V1Constants.Str.STORAGE_FORMAT_DEFAULT_VALUE);

Review comment:
       Let's not add the storage format for now. I don't have a clear picture on how this field is going to be used. When we want to support another storage format, we can add it then.

##########
File path: pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/inv/json/BaseJsonIndexCreator.java
##########
@@ -87,7 +89,15 @@
   @Override
   public void add(String jsonString)
       throws IOException {
-    addFlattenedRecords(JsonUtils.flatten(JsonUtils.stringToJsonNode(jsonString)));
+    JsonNode jsonNode = JsonUtils.stringToJsonNode(jsonString);

Review comment:
       We don't really need another validation here. It is already validated during the type conversion (PinotDataType)

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/predicate/EqualsPredicateEvaluatorFactory.java
##########
@@ -72,6 +72,7 @@ public static BaseRawValueBasedPredicateEvaluator newRawValueBasedEvaluator(EqPr
       case TIMESTAMP:
         return new LongRawValueBasedEqPredicateEvaluator(TimestampUtils.toMillisSinceEpoch(value));
       case STRING:
+      case JSON:

Review comment:
       Not sure if we want to allow EQ operation on `JSON` field, same for other predicates




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] amrishlal commented on a change in pull request #6878: JSON column datatype support.

Posted by GitBox <gi...@apache.org>.
amrishlal commented on a change in pull request #6878:
URL: https://github.com/apache/incubator-pinot/pull/6878#discussion_r628354108



##########
File path: pinot-core/src/test/java/org/apache/pinot/core/transport/QueryRoutingTest.java
##########
@@ -211,7 +211,7 @@ public void testServerDown()
     assertEquals(serverResponse.getResponseSize(), 0);
     assertEquals(serverResponse.getDeserializationTimeMs(), 0);
     // Query should early terminate
-    assertTrue(System.currentTimeMillis() - startTimeMs < 1000);
+    assertTrue(System.currentTimeMillis() - startTimeMs < 1010);

Review comment:
       Will put it in a separate PR. Test case fails every once a while because it needs a little extra time over 1000 ms timeout to validate the timeout. It's possible that the flakyness may be due to issues with `QueryRouter.markServerDown`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org