Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/06/26 19:11:12 UTC

[GitHub] [incubator-druid] jon-wei opened a new issue #7970: Consider adding support for a pre-transform filter for transform specs

URL: https://github.com/apache/incubator-druid/issues/7970
 
 
   A user asked the following on the mailing list: https://groups.google.com/d/msg/druid-user/5QfVuff8MJw/65w9qsZ_BQAJ
   
   > I've noticed that in ingestion, when specifying the filter section in transformSpec, I can only filter on fields that are in the dimensions list. Even if the field is present in the raw data, if it does not appear in the dimensions list the filter won't consider that field.
   > This creates a situation where I need to add a field to the data source's dimensions even though I do not care about that field apart from filtering purposes.
   >
   > How can we work around this?
   
   Transform spec filters are currently applied after the transforms:
   
   ```
   /**
    * Transforms an input row, or returns null if the row should be filtered out.
    *
    * @param row the input row
    */
   @Nullable
   public InputRow transform(@Nullable final InputRow row)
   {
     if (row == null) {
       return null;
     }

     final InputRow transformedRow;

     if (transforms.isEmpty()) {
       transformedRow = row;
     } else {
       transformedRow = new TransformedInputRow(row, transforms);
     }

     if (valueMatcher != null) {
       rowSupplierForValueMatcher.set(transformedRow);
       if (!valueMatcher.matches()) {
         return null;
       }
     }

     return transformedRow;
   }
   ```
   
   If pre-transform filters were supported, a user could filter on a raw input column and then apply a transform that nulls out that column, so that it is not written to the final segments. This would cover cases where the column is needed only for filtering and the user does not want to keep it.
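
As a rough illustration of that ordering, here is a minimal sketch in which the filter runs on the raw row before any transform is applied. This is not Druid code: the class `PreTransformFilterSketch`, its `preFilter` field, and the use of plain maps in place of `InputRow` are all assumptions made for the example, and the `env` and `page` columns are invented sample data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical stand-in for a transformer; rows are plain maps here, not Druid's InputRow.
final class PreTransformFilterSketch
{
  // Hypothetical pre-transform filter: evaluated against the raw row.
  private final Predicate<Map<String, Object>> preFilter;
  // Output column name -> function computing its value from the raw row.
  private final Map<String, Function<Map<String, Object>, Object>> transforms;

  PreTransformFilterSketch(
      Predicate<Map<String, Object>> preFilter,
      Map<String, Function<Map<String, Object>, Object>> transforms
  )
  {
    this.preFilter = preFilter;
    this.transforms = transforms;
  }

  /** Returns the transformed row, or null if the row is null or rejected by the pre-filter. */
  Map<String, Object> transform(Map<String, Object> row)
  {
    if (row == null) {
      return null;
    }
    // Filter first, while the raw column values are still visible.
    if (preFilter != null && !preFilter.test(row)) {
      return null;
    }
    // Then apply the transforms; each one overwrites or adds a column.
    Map<String, Object> transformed = new HashMap<>(row);
    for (Map.Entry<String, Function<Map<String, Object>, Object>> e : transforms.entrySet()) {
      transformed.put(e.getKey(), e.getValue().apply(row));
    }
    return transformed;
  }

  public static void main(String[] args)
  {
    Map<String, Function<Map<String, Object>, Object>> transforms = new HashMap<>();
    // Null out the filter-only column so it carries no value downstream.
    transforms.put("env", r -> null);

    PreTransformFilterSketch t = new PreTransformFilterSketch(
        r -> "prod".equals(r.get("env")),
        transforms
    );

    Map<String, Object> kept = new HashMap<>();
    kept.put("env", "prod");
    kept.put("page", "home");

    Map<String, Object> dropped = new HashMap<>();
    dropped.put("env", "test");
    dropped.put("page", "home");

    System.out.println(t.transform(kept));
    System.out.println(t.transform(dropped));
  }
}
```

In this sketch the row with `env` equal to `prod` passes the filter and comes out with `env` set to null, while the other row is dropped. With the current post-transform ordering, the same nulling transform would hide the value from the filter.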
   
   Maybe there are better ways to support that use case as well.
   
   
   
   
   
