You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by sn...@apache.org on 2023/11/27 19:40:03 UTC

(pinot) branch revert-12032-santizeNanNegativeZeroBehaviour created (now 1ed7b01757)

This is an automated email from the ASF dual-hosted git repository.

snlee pushed a change to branch revert-12032-santizeNanNegativeZeroBehaviour
in repository https://gitbox.apache.org/repos/asf/pinot.git


      at 1ed7b01757 Revert "Added new SpecialValueTransformer and tests."

This branch includes the following new commits:

     new 15e41e1089 Revert "fixed note in SpecialValueTransformer."
     new 09b2358754 Revert "modifed test to ensure order of transformers."
     new f51e685a49 Revert "added code to remove NaN from multivalued columns and modified order of transformers."
     new 6cfec8c319 Revert "addressed comments and added test for ensuring order of transformers."
     new 04df04adf1 Revert "fix typo in comment."
     new 1ed7b01757 Revert "Added new SpecialValueTransformer and tests."

The 6 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


(pinot) 02/06: Revert "modifed test to ensure order of transformers."

Posted by sn...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

snlee pushed a commit to branch revert-12032-santizeNanNegativeZeroBehaviour
in repository https://gitbox.apache.org/repos/asf/pinot.git

commit 09b235875489d4cb24ba2d7da85f029dfb259313
Author: Seunghyun Lee <sn...@apache.org>
AuthorDate: Mon Nov 27 11:39:57 2023 -0800

    Revert "modifed test to ensure order of transformers."
    
    This reverts commit 05be5cfd9ade65d2785a412f6b0692f7ba0c0eb6.
---
 .../local/recordtransformer/SpecialValueTransformer.java |  2 +-
 .../local/recordtransformer/RecordTransformerTest.java   | 16 +++++++---------
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
index 3383803194..26df94933a 100644
--- a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
+++ b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
@@ -97,7 +97,7 @@ public class SpecialValueTransformer implements RecordTransformer {
             negativeZeroNanSanitizedValues.add(nanTransformedValue);
           }
         }
-        record.putValue(element, negativeZeroNanSanitizedValues.toArray());
+        record.putValue(element,negativeZeroNanSanitizedValues.toArray());
       } else {
         // Single-valued column.
         Object zeroTransformedValue = transformNegativeZero(value);
diff --git a/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java b/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java
index 410fa19564..2685f8b6da 100644
--- a/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java
+++ b/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java
@@ -305,9 +305,9 @@ public class RecordTransformerTest {
     Schema schema = new Schema.SchemaBuilder().addSingleValueDimension("svInt", DataType.INT)
         .addSingleValueDimension("svDouble", DataType.DOUBLE)
         .addSingleValueDimension("expressionTestColumn", DataType.INT)
-        .addSingleValueDimension("svNaN", DataType.FLOAT).addMultiValueDimension("mvNaN", DataType.FLOAT)
+        .addSingleValueDimension("svNaN", DataType.FLOAT).addMultiValueDimension("mvNaN",DataType.FLOAT)
         .addSingleValueDimension("emptyDimensionForNullValueTransformer", DataType.FLOAT)
-        .addSingleValueDimension("svStringNull", DataType.STRING)
+        .addSingleValueDimension("svStringWithNullCharacters", DataType.STRING)
         .addSingleValueDimension("indexableExtras", DataType.JSON)
         .addDateTime("timeCol", DataType.TIMESTAMP, "1:MILLISECONDS:TIMESTAMP", "1:MILLISECONDS").build();
 
@@ -326,7 +326,7 @@ public class RecordTransformerTest {
     List<RecordTransformer> currentListOfTransformers =
         CompositeTransformer.getDefaultTransformers(tableConfig, schema);
 
-    // Create a list of transformers in the original order to compare.
+    // Create a list of transformers to compare.
     List<RecordTransformer> expectedListOfTransformers =
         List.of(new ExpressionTransformer(tableConfig, schema), new FilterTransformer(tableConfig),
             new SchemaConformingTransformer(tableConfig, schema), new DataTypeTransformer(tableConfig, schema),
@@ -352,15 +352,14 @@ public class RecordTransformerTest {
 
     // Data for SpecialValue Transformer.
     record.putValue("svNaN", Float.NaN);
-    record.putValue("mvNaN", new Float[]{1.0f, Float.NaN, 2.0f});
+    record.putValue("mvNaN",new Float[]{1.0f,Float.NaN,2.0f});
 
     // Data for sanitization transformer.
-    record.putValue("svStringNull", null);
+    record.putValue("svStringWithNullCharacters", "1\0002\0003");
 
     for (int i = 0; i < NUMBER_OF_TRANSFORMERS; i++) {
-      GenericRow copyRecord = record.copy();
       GenericRow currentRecord = currentListOfTransformers.get(i).transform(record);
-      GenericRow expectedRecord = expectedListOfTransformers.get(i).transform(copyRecord);
+      GenericRow expectedRecord = expectedListOfTransformers.get(i).transform(record);
       assertEquals(currentRecord, expectedRecord);
       record = expectedRecord;
     }
@@ -653,8 +652,7 @@ public class RecordTransformerTest {
           new Float[]{0.0f, 2.0f});
       assertEquals(record.getValue("mvDoubleNaN"),
           new Double[]{0.0d, 2.0d});
-      assertEquals(new ArrayList<>(record.getNullValueFields()),
-          new ArrayList<>(Arrays.asList("svFloatNaN", "svDoubleNaN")));
+      assertEquals(new ArrayList<>(record.getNullValueFields()), new ArrayList<>(Arrays.asList("svFloatNaN","svDoubleNaN")));
     }
 
     // Test empty record


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


(pinot) 06/06: Revert "Added new SpecialValueTransformer and tests."

Posted by sn...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

snlee pushed a commit to branch revert-12032-santizeNanNegativeZeroBehaviour
in repository https://gitbox.apache.org/repos/asf/pinot.git

commit 1ed7b01757824787f63081c4d96b05fd72fea0fe
Author: Seunghyun Lee <sn...@apache.org>
AuthorDate: Mon Nov 27 11:39:57 2023 -0800

    Revert "Added new SpecialValueTransformer and tests."
    
    This reverts commit 95d4950dab166373ed9ec58555435852e5f2ffeb.
---
 .../recordtransformer/CompositeTransformer.java    | 11 +--
 .../recordtransformer/SpecialValueTransformer.java | 96 ----------------------
 .../recordtransformer/RecordTransformerTest.java   | 49 -----------
 3 files changed, 3 insertions(+), 153 deletions(-)

diff --git a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/CompositeTransformer.java b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/CompositeTransformer.java
index 50ca2a97c1..b6fb694306 100644
--- a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/CompositeTransformer.java
+++ b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/CompositeTransformer.java
@@ -68,18 +68,13 @@ public class CompositeTransformer implements RecordTransformer {
    *     Optional {@link SanitizationTransformer} after {@link NullValueTransformer} so that before sanitation, all
    *     values are non-null and follow the data types defined in the schema
    *   </li>
-   *   <li>
-   *     {@link SpecialValueTransformer} after {@link DataTypeTransformer} so that we already have the values complying
-   *      with the schema before handling special values
-   *   </li>
    * </ul>
    */
   public static List<RecordTransformer> getDefaultTransformers(TableConfig tableConfig, Schema schema) {
     return Stream.of(new ExpressionTransformer(tableConfig, schema), new FilterTransformer(tableConfig),
-            new SchemaConformingTransformer(tableConfig, schema), new DataTypeTransformer(tableConfig, schema),
-            new TimeValidationTransformer(tableConfig, schema), new NullValueTransformer(tableConfig, schema),
-            new SpecialValueTransformer(schema), new SanitizationTransformer(schema)).filter(t -> !t.isNoOp())
-        .collect(Collectors.toList());
+        new SchemaConformingTransformer(tableConfig, schema), new DataTypeTransformer(tableConfig, schema),
+        new TimeValidationTransformer(tableConfig, schema), new NullValueTransformer(tableConfig, schema),
+        new SanitizationTransformer(schema)).filter(t -> !t.isNoOp()).collect(Collectors.toList());
   }
 
   public static CompositeTransformer getDefaultTransformer(TableConfig tableConfig, Schema schema) {
diff --git a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
deleted file mode 100644
index 2bd5b9920d..0000000000
--- a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
+++ /dev/null
@@ -1,96 +0,0 @@
-/**
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *   http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied.  See the License for the
- * specific language governing permissions and limitations
- * under the License.
- */
-package org.apache.pinot.segment.local.recordtransformer;
-
-import java.util.HashSet;
-import org.apache.pinot.spi.data.FieldSpec;
-import org.apache.pinot.spi.data.FieldSpec.DataType;
-import org.apache.pinot.spi.data.Schema;
-import org.apache.pinot.spi.data.readers.GenericRow;
-
-
-/**
- * The {@code SpecialValueTransformer} class will transform special values the values to follow certain rules including:
- * <ul>
- *   <li>Negative zero (-0.0) should be converted to 0.0</li>
- *   <li>NaN should be converted to default null</li>
- * </ul>
- * <p>NOTE: should put this after the {@link DataTypeTransformer} so that all values follow the data types in
- * {@link FieldSpec}.
- */
-public class SpecialValueTransformer implements RecordTransformer {
-  private final HashSet<String> _specialValuesKeySet = new HashSet<>();
-
-  public SpecialValueTransformer(Schema schema) {
-    for (FieldSpec fieldSpec : schema.getAllFieldSpecs()) {
-      if (!fieldSpec.isVirtualColumn() && (fieldSpec.getDataType() == DataType.FLOAT
-          || fieldSpec.getDataType() == DataType.DOUBLE)) {
-        _specialValuesKeySet.add(fieldSpec.getName());
-      }
-    }
-  }
-
-  private Object transformNegativeZero(Object value) {
-    if ((value instanceof Float) && (Float.floatToRawIntBits((float) value) == Float.floatToRawIntBits(-0.0f))) {
-      value = 0.0f;
-    } else if ((value instanceof Double) && (Double.doubleToLongBits((double) value) == Double.doubleToLongBits(
-        -0.0d))) {
-      value = 0.0d;
-    }
-    return value;
-  }
-
-  private Object transformNaN(Object value) {
-    if ((value instanceof Float) && ((Float) value).isNaN()) {
-      value = FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_FLOAT;
-    } else if ((value instanceof Double) && ((Double) value).isNaN()) {
-      value = FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_DOUBLE;
-    }
-    return value;
-  }
-
-  @Override
-  public boolean isNoOp() {
-    return _specialValuesKeySet.isEmpty();
-  }
-
-  @Override
-  public GenericRow transform(GenericRow record) {
-    for (String element : _specialValuesKeySet) {
-      Object value = record.getValue(element);
-      if (value instanceof Float || value instanceof Double) {
-        // Single-valued column.
-        Object zeroTransformedValue = transformNegativeZero(value);
-        Object nanTransformedValue = transformNaN(zeroTransformedValue);
-        if (nanTransformedValue != value) {
-          record.putValue(element, nanTransformedValue);
-        }
-      } else if (value instanceof Object[]) {
-        // Multi-valued column.
-        Object[] values = (Object[]) value;
-        int numValues = values.length;
-        for (int i = 0; i < numValues; i++) {
-          values[i] = transformNegativeZero(values[i]);
-          values[i] = transformNaN(values[i]);
-        }
-      }
-    }
-    return record;
-  }
-}
diff --git a/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java b/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java
index e48b6605f5..baded1b97a 100644
--- a/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java
+++ b/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java
@@ -51,13 +51,6 @@ public class RecordTransformerTest {
       .addSingleValueDimension("svStringWithNullCharacters", DataType.STRING)
       .addSingleValueDimension("svStringWithLengthLimit", DataType.STRING)
       .addMultiValueDimension("mvString1", DataType.STRING).addMultiValueDimension("mvString2", DataType.STRING)
-      // For negative zero and NaN conversions
-      .addSingleValueDimension("svFloatNegativeZero", DataType.FLOAT)
-      .addMultiValueDimension("mvFloatNegativeZero", DataType.FLOAT)
-      .addSingleValueDimension("svDoubleNegativeZero", DataType.DOUBLE)
-      .addMultiValueDimension("mvDoubleNegativeZero", DataType.DOUBLE)
-      .addSingleValueDimension("svFloatNaN", DataType.FLOAT).addMultiValueDimension("mvFloatNaN", DataType.FLOAT)
-      .addSingleValueDimension("svDoubleNaN", DataType.DOUBLE).addMultiValueDimension("mvDoubleNaN", DataType.DOUBLE)
       .build();
   private static final TableConfig TABLE_CONFIG =
       new TableConfigBuilder(TableType.OFFLINE).setTableName("testTable").build();
@@ -89,14 +82,6 @@ public class RecordTransformerTest {
     record.putValue("mvString1", new Object[]{"123", 123, 123L, 123f, 123.0});
     record.putValue("mvString2", new Object[]{123, 123L, 123f, 123.0, "123"});
     record.putValue("svNullString", null);
-    record.putValue("svFloatNegativeZero", -0.00f);
-    record.putValue("svDoubleNegativeZero", -0.00d);
-    record.putValue("mvFloatNegativeZero", new Float[]{-0.0f, 1.0f, 0.0f, 3.0f});
-    record.putValue("mvDoubleNegativeZero", new Double[]{-0.0d, 1.0d, 0.0d, 3.0d});
-    record.putValue("svFloatNaN", Float.NaN);
-    record.putValue("svDoubleNaN", Double.NaN);
-    record.putValue("mvFloatNaN", new Float[]{-0.0f, Float.NaN, 2.0f});
-    record.putValue("mvDoubleNaN", new Double[]{-0.0d, Double.NaN, 2.0d});
     return record;
   }
 
@@ -269,28 +254,6 @@ public class RecordTransformerTest {
     }
   }
 
-  @Test
-  public void testSpecialValueTransformer() {
-    RecordTransformer transformer = new SpecialValueTransformer(SCHEMA);
-    GenericRow record = getRecord();
-    for (int i = 0; i < NUM_ROUNDS; i++) {
-      record = transformer.transform(record);
-      assertNotNull(record);
-      assertEquals(Float.floatToRawIntBits((float) record.getValue("svFloatNegativeZero")),
-          Float.floatToRawIntBits(0.0f));
-      assertEquals(Double.doubleToRawLongBits((double) record.getValue("svDoubleNegativeZero")),
-          Double.doubleToRawLongBits(0.0d));
-      assertEquals(record.getValue("mvFloatNegativeZero"), new Float[]{0.0f, 1.0f, 0.0f, 3.0f});
-      assertEquals(record.getValue("mvDoubleNegativeZero"), new Double[]{0.0d, 1.0d, 0.0d, 3.0d});
-      assertEquals(record.getValue("svFloatNaN"), FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_FLOAT);
-      assertEquals(record.getValue("svDoubleNaN"), FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_DOUBLE);
-      assertEquals(record.getValue("mvFloatNaN"),
-          new Float[]{0.0f, FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_FLOAT, 2.0f});
-      assertEquals(record.getValue("mvDoubleNaN"),
-          new Double[]{0.0d, FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_DOUBLE, 2.0d});
-    }
-  }
-
   @Test
   public void testScalarOps() {
     IngestionConfig ingestionConfig = new IngestionConfig();
@@ -567,18 +530,6 @@ public class RecordTransformerTest {
       assertEquals(record.getValue("mvString2"), new Object[]{"123", "123", "123.0", "123.0", "123"});
       assertNull(record.getValue("$virtual"));
       assertTrue(record.getNullValueFields().isEmpty());
-      assertEquals(Float.floatToRawIntBits((float) record.getValue("svFloatNegativeZero")),
-          Float.floatToRawIntBits(0.0f));
-      assertEquals(Double.doubleToRawLongBits((double) record.getValue("svDoubleNegativeZero")),
-          Double.doubleToRawLongBits(0.0d));
-      assertEquals(record.getValue("mvFloatNegativeZero"), new Float[]{0.0f, 1.0f, 0.0f, 3.0f});
-      assertEquals(record.getValue("mvDoubleNegativeZero"), new Double[]{0.0d, 1.0d, 0.0d, 3.0d});
-      assertEquals(record.getValue("svFloatNaN"), FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_FLOAT);
-      assertEquals(record.getValue("svDoubleNaN"), FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_DOUBLE);
-      assertEquals(record.getValue("mvFloatNaN"),
-          new Float[]{0.0f, FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_FLOAT, 2.0f});
-      assertEquals(record.getValue("mvDoubleNaN"),
-          new Double[]{0.0d, FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_DOUBLE, 2.0d});
     }
 
     // Test empty record


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


(pinot) 03/06: Revert "added code to remove NaN from multivalued columns and modified order of transformers."

Posted by sn...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

snlee pushed a commit to branch revert-12032-santizeNanNegativeZeroBehaviour
in repository https://gitbox.apache.org/repos/asf/pinot.git

commit f51e685a4972666866785f41d6d1a088be58be34
Author: Seunghyun Lee <sn...@apache.org>
AuthorDate: Mon Nov 27 11:39:57 2023 -0800

    Revert "added code to remove NaN from multivalued columns and modified order of transformers."
    
    This reverts commit 06e91c13a11c5a07939adf326de28e95e39f4596.
---
 .../recordtransformer/CompositeTransformer.java    |  7 +++----
 .../recordtransformer/SpecialValueTransformer.java | 18 ++++++----------
 .../recordtransformer/RecordTransformerTest.java   | 24 +++++++++-------------
 3 files changed, 19 insertions(+), 30 deletions(-)

diff --git a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/CompositeTransformer.java b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/CompositeTransformer.java
index 0ba394e9c9..50ca2a97c1 100644
--- a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/CompositeTransformer.java
+++ b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/CompositeTransformer.java
@@ -70,16 +70,15 @@ public class CompositeTransformer implements RecordTransformer {
    *   </li>
    *   <li>
    *     {@link SpecialValueTransformer} after {@link DataTypeTransformer} so that we already have the values complying
-   *      with the schema before handling special values and before {@link NullValueTransformer} so that it transforms
-   *      all the null values properly
+   *      with the schema before handling special values
    *   </li>
    * </ul>
    */
   public static List<RecordTransformer> getDefaultTransformers(TableConfig tableConfig, Schema schema) {
     return Stream.of(new ExpressionTransformer(tableConfig, schema), new FilterTransformer(tableConfig),
             new SchemaConformingTransformer(tableConfig, schema), new DataTypeTransformer(tableConfig, schema),
-            new TimeValidationTransformer(tableConfig, schema), new SpecialValueTransformer(schema),
-            new NullValueTransformer(tableConfig, schema), new SanitizationTransformer(schema)).filter(t -> !t.isNoOp())
+            new TimeValidationTransformer(tableConfig, schema), new NullValueTransformer(tableConfig, schema),
+            new SpecialValueTransformer(schema), new SanitizationTransformer(schema)).filter(t -> !t.isNoOp())
         .collect(Collectors.toList());
   }
 
diff --git a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
index 26df94933a..4cfbbc8ce9 100644
--- a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
+++ b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
@@ -18,9 +18,7 @@
  */
 package org.apache.pinot.segment.local.recordtransformer;
 
-import java.util.ArrayList;
 import java.util.HashSet;
-import java.util.List;
 import org.apache.pinot.spi.data.FieldSpec;
 import org.apache.pinot.spi.data.FieldSpec.DataType;
 import org.apache.pinot.spi.data.Schema;
@@ -67,10 +65,10 @@ public class SpecialValueTransformer implements RecordTransformer {
   private Object transformNaN(Object value) {
     if ((value instanceof Float) && ((Float) value).isNaN()) {
       LOGGER.info("Float.NaN detected, converting to default null.");
-      value = null;
+      value = FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_FLOAT;
     } else if ((value instanceof Double) && ((Double) value).isNaN()) {
       LOGGER.info("Double.NaN detected, converting to default null.");
-      value = null;
+      value = FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_DOUBLE;
     }
     return value;
   }
@@ -88,16 +86,12 @@ public class SpecialValueTransformer implements RecordTransformer {
         // Multi-valued column.
         Object[] values = (Object[]) value;
         int numValues = values.length;
-        List<Object> negativeZeroNanSanitizedValues = new ArrayList<>(numValues);
-        int numberOfElements = values.length;
-        for (Object o : values) {
-          Object zeroTransformedValue = transformNegativeZero(o);
-          Object nanTransformedValue = transformNaN(zeroTransformedValue);
-          if (nanTransformedValue != null) {
-            negativeZeroNanSanitizedValues.add(nanTransformedValue);
+        for (int i = 0; i < numValues; i++) {
+          if (values[i] != null) {
+            values[i] = transformNegativeZero(values[i]);
+            values[i] = transformNaN(values[i]);
           }
         }
-        record.putValue(element,negativeZeroNanSanitizedValues.toArray());
       } else {
         // Single-valued column.
         Object zeroTransformedValue = transformNegativeZero(value);
diff --git a/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java b/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java
index 2685f8b6da..6361963785 100644
--- a/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java
+++ b/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java
@@ -19,8 +19,6 @@
 package org.apache.pinot.segment.local.recordtransformer;
 
 import java.sql.Timestamp;
-import java.util.ArrayList;
-import java.util.Arrays;
 import java.util.Collections;
 import java.util.List;
 import java.util.concurrent.TimeUnit;
@@ -288,12 +286,12 @@ public class RecordTransformerTest {
           Double.doubleToRawLongBits(0.0d));
       assertEquals(record.getValue("mvFloatNegativeZero"), new Float[]{0.0f, 1.0f, 0.0f, 3.0f});
       assertEquals(record.getValue("mvDoubleNegativeZero"), new Double[]{0.0d, 1.0d, 0.0d, 3.0d});
-      assertNull(record.getValue("svFloatNaN"));
-      assertNull(record.getValue("svDoubleNaN"));
+      assertEquals(record.getValue("svFloatNaN"), FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_FLOAT);
+      assertEquals(record.getValue("svDoubleNaN"), FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_DOUBLE);
       assertEquals(record.getValue("mvFloatNaN"),
-          new Float[]{0.0f, 2.0f});
+          new Float[]{0.0f, FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_FLOAT, 2.0f});
       assertEquals(record.getValue("mvDoubleNaN"),
-          new Double[]{0.0d, 2.0d});
+          new Double[]{0.0d, FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_DOUBLE, 2.0d});
     }
   }
 
@@ -304,8 +302,7 @@ public class RecordTransformerTest {
     // Build Schema and ingestionConfig in such a way that all the transformers are loaded.
     Schema schema = new Schema.SchemaBuilder().addSingleValueDimension("svInt", DataType.INT)
         .addSingleValueDimension("svDouble", DataType.DOUBLE)
-        .addSingleValueDimension("expressionTestColumn", DataType.INT)
-        .addSingleValueDimension("svNaN", DataType.FLOAT).addMultiValueDimension("mvNaN",DataType.FLOAT)
+        .addSingleValueDimension("expressionTestColumn", DataType.INT).addSingleValueDimension("svNaN", DataType.FLOAT)
         .addSingleValueDimension("emptyDimensionForNullValueTransformer", DataType.FLOAT)
         .addSingleValueDimension("svStringWithNullCharacters", DataType.STRING)
         .addSingleValueDimension("indexableExtras", DataType.JSON)
@@ -330,8 +327,8 @@ public class RecordTransformerTest {
     List<RecordTransformer> expectedListOfTransformers =
         List.of(new ExpressionTransformer(tableConfig, schema), new FilterTransformer(tableConfig),
             new SchemaConformingTransformer(tableConfig, schema), new DataTypeTransformer(tableConfig, schema),
-            new TimeValidationTransformer(tableConfig, schema), new SpecialValueTransformer(schema),
-            new NullValueTransformer(tableConfig, schema), new SanitizationTransformer(schema));
+            new TimeValidationTransformer(tableConfig, schema), new NullValueTransformer(tableConfig, schema),
+            new SpecialValueTransformer(schema), new SanitizationTransformer(schema));
 
     // Check that the number of current transformers match the expected number of transformers.
     assertEquals(currentListOfTransformers.size(), NUMBER_OF_TRANSFORMERS);
@@ -352,7 +349,6 @@ public class RecordTransformerTest {
 
     // Data for SpecialValue Transformer.
     record.putValue("svNaN", Float.NaN);
-    record.putValue("mvNaN",new Float[]{1.0f,Float.NaN,2.0f});
 
     // Data for sanitization transformer.
     record.putValue("svStringWithNullCharacters", "1\0002\0003");
@@ -649,10 +645,10 @@ public class RecordTransformerTest {
       assertEquals(record.getValue("svFloatNaN"), FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_FLOAT);
       assertEquals(record.getValue("svDoubleNaN"), FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_DOUBLE);
       assertEquals(record.getValue("mvFloatNaN"),
-          new Float[]{0.0f, 2.0f});
+          new Float[]{0.0f, FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_FLOAT, 2.0f});
       assertEquals(record.getValue("mvDoubleNaN"),
-          new Double[]{0.0d, 2.0d});
-      assertEquals(new ArrayList<>(record.getNullValueFields()), new ArrayList<>(Arrays.asList("svFloatNaN","svDoubleNaN")));
+          new Double[]{0.0d, FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_DOUBLE, 2.0d});
+      assertTrue(record.getNullValueFields().isEmpty());
     }
 
     // Test empty record


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


(pinot) 01/06: Revert "fixed note in SpecialValueTransformer."

Posted by sn...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

snlee pushed a commit to branch revert-12032-santizeNanNegativeZeroBehaviour
in repository https://gitbox.apache.org/repos/asf/pinot.git

commit 15e41e1089c37d4aa352220ed11f12a1ba9d78f9
Author: Seunghyun Lee <sn...@apache.org>
AuthorDate: Mon Nov 27 11:39:57 2023 -0800

    Revert "fixed note in SpecialValueTransformer."
    
    This reverts commit f79b618141e07c140663791959d96956ad91d811.
---
 .../segment/local/recordtransformer/SpecialValueTransformer.java     | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
index 29596e0935..3383803194 100644
--- a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
+++ b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
@@ -35,9 +35,8 @@ import org.slf4j.LoggerFactory;
  *   <li>Negative zero (-0.0) should be converted to 0.0</li>
  *   <li>NaN should be converted to default null</li>
  * </ul>
- * <p>NOTE: should put this after the {@link DataTypeTransformer} so that we already have the values complying
- * with the schema before handling special values and before {@link NullValueTransformer} so that it transforms
- * all the null values properly.
+ * <p>NOTE: should put this after the {@link DataTypeTransformer} so that all values follow the data types in
+ * {@link FieldSpec}.
  */
 public class SpecialValueTransformer implements RecordTransformer {
 


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


(pinot) 04/06: Revert "addressed comments and added test for ensuring order of transformers."

Posted by sn...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

snlee pushed a commit to branch revert-12032-santizeNanNegativeZeroBehaviour
in repository https://gitbox.apache.org/repos/asf/pinot.git

commit 6cfec8c319da0547239593ac04c3de61cbdb110e
Author: Seunghyun Lee <sn...@apache.org>
AuthorDate: Mon Nov 27 11:39:57 2023 -0800

    Revert "addressed comments and added test for ensuring order of transformers."
    
    This reverts commit 2c05f8dca9db402de904c044264d54cc76858238.
---
 .../recordtransformer/SpecialValueTransformer.java | 28 +++------
 .../recordtransformer/RecordTransformerTest.java   | 72 +---------------------
 2 files changed, 10 insertions(+), 90 deletions(-)

diff --git a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
index 4cfbbc8ce9..9e55c84cf1 100644
--- a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
+++ b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
@@ -23,8 +23,6 @@ import org.apache.pinot.spi.data.FieldSpec;
 import org.apache.pinot.spi.data.FieldSpec.DataType;
 import org.apache.pinot.spi.data.Schema;
 import org.apache.pinot.spi.data.readers.GenericRow;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
 
 
 /**
@@ -37,8 +35,6 @@ import org.slf4j.LoggerFactory;
  * {@link FieldSpec}.
  */
 public class SpecialValueTransformer implements RecordTransformer {
-
-  private static final Logger LOGGER = LoggerFactory.getLogger(NullValueTransformer.class);
   private final HashSet<String> _specialValuesKeySet = new HashSet<>();
 
   public SpecialValueTransformer(Schema schema) {
@@ -52,11 +48,9 @@ public class SpecialValueTransformer implements RecordTransformer {
 
   private Object transformNegativeZero(Object value) {
     if ((value instanceof Float) && (Float.floatToRawIntBits((float) value) == Float.floatToRawIntBits(-0.0f))) {
-      LOGGER.info("-0.0f value detected, converting to 0.0.");
       value = 0.0f;
     } else if ((value instanceof Double) && (Double.doubleToLongBits((double) value) == Double.doubleToLongBits(
         -0.0d))) {
-      LOGGER.info("-0.0d value detected, converting to 0.0.");
       value = 0.0d;
     }
     return value;
@@ -64,10 +58,8 @@ public class SpecialValueTransformer implements RecordTransformer {
 
   private Object transformNaN(Object value) {
     if ((value instanceof Float) && ((Float) value).isNaN()) {
-      LOGGER.info("Float.NaN detected, converting to default null.");
       value = FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_FLOAT;
     } else if ((value instanceof Double) && ((Double) value).isNaN()) {
-      LOGGER.info("Double.NaN detected, converting to default null.");
       value = FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_DOUBLE;
     }
     return value;
@@ -82,23 +74,21 @@ public class SpecialValueTransformer implements RecordTransformer {
   public GenericRow transform(GenericRow record) {
     for (String element : _specialValuesKeySet) {
       Object value = record.getValue(element);
-      if (value instanceof Object[]) {
-        // Multi-valued column.
-        Object[] values = (Object[]) value;
-        int numValues = values.length;
-        for (int i = 0; i < numValues; i++) {
-          if (values[i] != null) {
-            values[i] = transformNegativeZero(values[i]);
-            values[i] = transformNaN(values[i]);
-          }
-        }
-      } else {
+      if (value instanceof Float || value instanceof Double) {
         // Single-valued column.
         Object zeroTransformedValue = transformNegativeZero(value);
         Object nanTransformedValue = transformNaN(zeroTransformedValue);
         if (nanTransformedValue != value) {
           record.putValue(element, nanTransformedValue);
         }
+      } else if (value instanceof Object[]) {
+        // Multi-valued column.
+        Object[] values = (Object[]) value;
+        int numValues = values.length;
+        for (int i = 0; i < numValues; i++) {
+          values[i] = transformNegativeZero(values[i]);
+          values[i] = transformNaN(values[i]);
+        }
       }
     }
     return record;
diff --git a/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java b/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java
index 6361963785..e48b6605f5 100644
--- a/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java
+++ b/pinot-segment-local/src/test/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformerTest.java
@@ -20,14 +20,11 @@ package org.apache.pinot.segment.local.recordtransformer;
 
 import java.sql.Timestamp;
 import java.util.Collections;
-import java.util.List;
 import java.util.concurrent.TimeUnit;
 import org.apache.pinot.spi.config.table.TableConfig;
 import org.apache.pinot.spi.config.table.TableType;
 import org.apache.pinot.spi.config.table.ingestion.FilterConfig;
 import org.apache.pinot.spi.config.table.ingestion.IngestionConfig;
-import org.apache.pinot.spi.config.table.ingestion.SchemaConformingTransformerConfig;
-import org.apache.pinot.spi.config.table.ingestion.TransformConfig;
 import org.apache.pinot.spi.data.DateTimeFormatSpec;
 import org.apache.pinot.spi.data.DimensionFieldSpec;
 import org.apache.pinot.spi.data.FieldSpec;
@@ -72,7 +69,6 @@ public class RecordTransformerTest {
 
   // Transform multiple times should return the same result
   private static final int NUM_ROUNDS = 5;
-  private static final int NUMBER_OF_TRANSFORMERS = 8;
 
   private static GenericRow getRecord() {
     GenericRow record = new GenericRow();
@@ -295,72 +291,6 @@ public class RecordTransformerTest {
     }
   }
 
-  @Test
-  public void testOrderForTransformers() {
-    // This test checks that the specified order is maintained for different transformers.
-
-    // Build Schema and ingestionConfig in such a way that all the transformers are loaded.
-    Schema schema = new Schema.SchemaBuilder().addSingleValueDimension("svInt", DataType.INT)
-        .addSingleValueDimension("svDouble", DataType.DOUBLE)
-        .addSingleValueDimension("expressionTestColumn", DataType.INT).addSingleValueDimension("svNaN", DataType.FLOAT)
-        .addSingleValueDimension("emptyDimensionForNullValueTransformer", DataType.FLOAT)
-        .addSingleValueDimension("svStringWithNullCharacters", DataType.STRING)
-        .addSingleValueDimension("indexableExtras", DataType.JSON)
-        .addDateTime("timeCol", DataType.TIMESTAMP, "1:MILLISECONDS:TIMESTAMP", "1:MILLISECONDS").build();
-
-    IngestionConfig ingestionConfig = new IngestionConfig();
-    TableConfig tableConfig =
-        new TableConfigBuilder(TableType.OFFLINE).setTableName("testTable").setIngestionConfig(ingestionConfig)
-            .setTimeColumnName("timeCol").build();
-    ingestionConfig.setFilterConfig(new FilterConfig("svInt = 123 AND svDouble <= 200"));
-    ingestionConfig.setTransformConfigs(List.of(new TransformConfig("expressionTestColumn", "plus(x,10)")));
-    ingestionConfig.setSchemaConformingTransformerConfig(
-        new SchemaConformingTransformerConfig("indexableExtras", null, null, null));
-    ingestionConfig.setRowTimeValueCheck(true);
-    ingestionConfig.setContinueOnError(false);
-
-    // Get the list of transformers.
-    List<RecordTransformer> currentListOfTransformers =
-        CompositeTransformer.getDefaultTransformers(tableConfig, schema);
-
-    // Create a list of transformers to compare.
-    List<RecordTransformer> expectedListOfTransformers =
-        List.of(new ExpressionTransformer(tableConfig, schema), new FilterTransformer(tableConfig),
-            new SchemaConformingTransformer(tableConfig, schema), new DataTypeTransformer(tableConfig, schema),
-            new TimeValidationTransformer(tableConfig, schema), new NullValueTransformer(tableConfig, schema),
-            new SpecialValueTransformer(schema), new SanitizationTransformer(schema));
-
-    // Check that the number of current transformers match the expected number of transformers.
-    assertEquals(currentListOfTransformers.size(), NUMBER_OF_TRANSFORMERS);
-
-    GenericRow record = new GenericRow();
-
-    // Data for expression Transformer.
-    record.putValue("expressionTestColumn", 100);
-
-    // Data for filter transformer.
-    record.putValue("svDouble", 123d);
-
-    // Data for DataType Transformer.
-    record.putValue("svInt", (byte) 123);
-
-    // Data for TimeValidation transformer.
-    record.putValue("timeCol", System.currentTimeMillis());
-
-    // Data for SpecialValue Transformer.
-    record.putValue("svNaN", Float.NaN);
-
-    // Data for sanitization transformer.
-    record.putValue("svStringWithNullCharacters", "1\0002\0003");
-
-    for (int i = 0; i < NUMBER_OF_TRANSFORMERS; i++) {
-      GenericRow currentRecord = currentListOfTransformers.get(i).transform(record);
-      GenericRow expectedRecord = expectedListOfTransformers.get(i).transform(record);
-      assertEquals(currentRecord, expectedRecord);
-      record = expectedRecord;
-    }
-  }
-
   @Test
   public void testScalarOps() {
     IngestionConfig ingestionConfig = new IngestionConfig();
@@ -636,6 +566,7 @@ public class RecordTransformerTest {
       assertEquals(record.getValue("mvString1"), new Object[]{"123", "123", "123", "123.0", "123.0"});
       assertEquals(record.getValue("mvString2"), new Object[]{"123", "123", "123.0", "123.0", "123"});
       assertNull(record.getValue("$virtual"));
+      assertTrue(record.getNullValueFields().isEmpty());
       assertEquals(Float.floatToRawIntBits((float) record.getValue("svFloatNegativeZero")),
           Float.floatToRawIntBits(0.0f));
       assertEquals(Double.doubleToRawLongBits((double) record.getValue("svDoubleNegativeZero")),
@@ -648,7 +579,6 @@ public class RecordTransformerTest {
           new Float[]{0.0f, FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_FLOAT, 2.0f});
       assertEquals(record.getValue("mvDoubleNaN"),
           new Double[]{0.0d, FieldSpec.DEFAULT_DIMENSION_NULL_VALUE_OF_DOUBLE, 2.0d});
-      assertTrue(record.getNullValueFields().isEmpty());
     }
 
     // Test empty record


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


(pinot) 05/06: Revert "fix typo in comment."

Posted by sn...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

snlee pushed a commit to branch revert-12032-santizeNanNegativeZeroBehaviour
in repository https://gitbox.apache.org/repos/asf/pinot.git

commit 04df04adf17e0d9291f4c95fd5fb65982d248de4
Author: Seunghyun Lee <sn...@apache.org>
AuthorDate: Mon Nov 27 11:39:57 2023 -0800

    Revert "fix typo in comment."
    
    This reverts commit 5c8f43ead1912b4f78a3ec91630fbeb0073698ab.
---
 .../pinot/segment/local/recordtransformer/SpecialValueTransformer.java  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
index 9e55c84cf1..2bd5b9920d 100644
--- a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
+++ b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/SpecialValueTransformer.java
@@ -26,7 +26,7 @@ import org.apache.pinot.spi.data.readers.GenericRow;
 
 
 /**
- * The {@code SpecialValueTransformer} class will transform special values according to the following rules:
+ * The {@code SpecialValueTransformer} class will transform special values the values to follow certain rules including:
  * <ul>
  *   <li>Negative zero (-0.0) should be converted to 0.0</li>
  *   <li>NaN should be converted to default null</li>


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org