You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/09/02 03:38:46 UTC

[GitHub] [pinot] Jackie-Jiang commented on a diff in pull request #9320: Allow ingestion of errored records with incorrect datatype

Jackie-Jiang commented on code in PR #9320:
URL: https://github.com/apache/pinot/pull/9320#discussion_r961260316


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/DataTypeTransformer.java:
##########
@@ -40,14 +47,24 @@
  */
 @SuppressWarnings("rawtypes")
 public class DataTypeTransformer implements RecordTransformer {
+  private static final Logger LOGGER = LoggerFactory.getLogger(DataTypeTransformer.class);
+
   private final Map<String, PinotDataType> _dataTypes = new HashMap<>();
+  private boolean _useDefaultValueOnError;

Review Comment:
   Let's make 2 configs for these 2 things: `continueOnError` and `validateTimeValue`
   
   Suggest changing `useDefaultValueOnError` to `continueOnError` because it can be used for other ingestion errors, and we don't want to tie it to the implementation



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/DataTypeTransformer.java:
##########
@@ -40,14 +47,24 @@
  */
 @SuppressWarnings("rawtypes")
 public class DataTypeTransformer implements RecordTransformer {
+  private static final Logger LOGGER = LoggerFactory.getLogger(DataTypeTransformer.class);
+
   private final Map<String, PinotDataType> _dataTypes = new HashMap<>();
+  private boolean _useDefaultValueOnError;
+  private DateTimeFieldSpec _timeColumnSpec;
 
-  public DataTypeTransformer(Schema schema) {
+  public DataTypeTransformer(TableConfig tableConfig, Schema schema) {
     for (FieldSpec fieldSpec : schema.getAllFieldSpecs()) {
       if (!fieldSpec.isVirtualColumn()) {
         _dataTypes.put(fieldSpec.getName(), PinotDataType.getPinotDataTypeForIngestion(fieldSpec));
       }
     }
+
+    _useDefaultValueOnError = tableConfig.getIndexingConfig().useDefaultValueOnError();

Review Comment:
   Suggest making it inside the `IngestionConfig`



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/DataTypeTransformer.java:
##########
@@ -40,14 +47,24 @@
  */
 @SuppressWarnings("rawtypes")
 public class DataTypeTransformer implements RecordTransformer {
+  private static final Logger LOGGER = LoggerFactory.getLogger(DataTypeTransformer.class);
+
   private final Map<String, PinotDataType> _dataTypes = new HashMap<>();
+  private boolean _useDefaultValueOnError;

Review Comment:
   These fields can be `final`



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/DataTypeTransformer.java:
##########
@@ -40,14 +47,24 @@
  */
 @SuppressWarnings("rawtypes")
 public class DataTypeTransformer implements RecordTransformer {
+  private static final Logger LOGGER = LoggerFactory.getLogger(DataTypeTransformer.class);
+
   private final Map<String, PinotDataType> _dataTypes = new HashMap<>();
+  private boolean _useDefaultValueOnError;
+  private DateTimeFieldSpec _timeColumnSpec;

Review Comment:
   We may store the `_timeColumnName` and `_timeFormatSpec` for better performance



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org