Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/03/31 00:19:59 UTC

[GitHub] [druid] suneet-s opened a new issue #9589: TransformSpec for firehoses appear to perform the operation twice

URL: https://github.com/apache/druid/issues/9589
 
 
   ### Affected Version
   
   Tested in 0.18
   
   ### Description
   
   I am writing integration tests for transform specs and noticed that when a transform spec is used with a parser, each transform is applied twice. See the ingestion spec below.
   
   You can reproduce this by symlinking `/resources` to `$DRUID_CODEBASE/integration-tests/src/test/resources`.
   
   ```
   {
       "type": "index",
       "spec": {
           "dataSchema": {
               "dataSource": "wiki-tests-2",
               "metricsSpec": [
                   {
                       "type": "count",
                       "name": "count"
                   },
                   {
                       "type": "doubleSum",
                       "name": "added",
                       "fieldName": "added"
                   },
                   {
                       "type": "doubleSum",
                       "name": "triple-added",
                       "fieldName": "triple-added"
                   },
                   {
                       "type": "doubleSum",
                       "name": "deleted",
                       "fieldName": "deleted"
                   },
                   {
                       "type": "doubleSum",
                       "name": "delta",
                       "fieldName": "delta"
                   },
                   {
                       "name": "thetaSketch",
                       "type": "thetaSketch",
                       "fieldName": "user"
                   },
                   {
                       "name": "quantilesDoublesSketch",
                       "type": "quantilesDoublesSketch",
                       "fieldName": "delta"
                   },
                   {
                       "name": "HLLSketchBuild",
                       "type": "HLLSketchBuild",
                       "fieldName": "user"
                   }
               ],
               "granularitySpec": {
                   "segmentGranularity": "DAY",
                   "queryGranularity": "second",
                   "intervals" : [ "2013-08-31/2013-09-02" ]
               },
               "parser": {
                   "parseSpec": {
                       "format" : "json",
                       "timestampSpec": {
                           "column": "timestamp"
                       },
                       "dimensionsSpec": {
                           "dimensions": [
                               "page",
                               "language",
                               "user",
                               "unpatrolled",
                               "newPage",
                               "robot",
                               "anonymous",
                               "namespace",
                               "continent",
                               "country",
                               "region",
                               "city"
                           ]
                       }
                   }
               },
               "transformSpec": {
                   "transforms": [
                       {
                           "type": "expression",
                           "name": "language",
                           "expression": "concat('l-', language)"
                       },
                       {
                           "type": "expression",
                           "name": "triple-added",
                           "expression": "added * 3"
                       }
                   ]
               }
           },
           "ioConfig": {
               "type": "index",
               "firehose": {
                   "type": "local",
                   "baseDir": "/resources/data/batch_index",
                   "filter": "wikipedia_index_data*"
               }
           },
           "tuningConfig": {
               "type": "index",
               "maxRowsPerSegment": 10
           }
       }
   }
   ```
   
   However, if you switch to the new format (`inputSource`/`inputFormat` instead of firehoses), each transform is applied once, as expected.
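   
   For comparison, here is a rough sketch of how the `ioConfig` above might look in the new format. It is not taken from the original spec: it assumes the `local` input source accepts the same `baseDir` and `filter` values as the `local` firehose, and that the `json` input format needs no further options for this data. In this format the `timestampSpec` and `dimensionsSpec` would also move out of `parser.parseSpec` and sit directly under `dataSchema`, so check the details against the native batch ingestion docs for your Druid version.
   
   ```json
   {
       "ioConfig": {
           "type": "index",
           "inputSource": {
               "type": "local",
               "baseDir": "/resources/data/batch_index",
               "filter": "wikipedia_index_data*"
           },
           "inputFormat": {
               "type": "json"
           }
       }
   }
   ```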
