Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/06/16 16:43:49 UTC

[GitHub] [druid] JackDavidson edited a comment on issue #10011: Druid SQL Differing Results When Grouping By 1 Column vs 1 Col With Dummy Column

JackDavidson edited a comment on issue #10011:
URL: https://github.com/apache/druid/issues/10011#issuecomment-644881093


   While we are at it, here is a very similar case: two queries that ought to return the same results but don't.
   
   Ingestion spec for the data source (note that `listDelimiter` makes `c1` a multi-value column):
   
   ```
   {
     "type": "index_parallel",
     "spec": {
       "ioConfig": {
         "type": "index_parallel",
         "inputSource": {
           "type": "inline",
           "data": "a|b,c\nd|b,e"
         },
         "inputFormat": {
           "type": "csv",
           "columns": [
             "c1",
             "c2"
           ],
           "listDelimiter": "|"
         }
       },
       "tuningConfig": {
         "type": "index_parallel",
         "partitionsSpec": {
           "type": "dynamic"
         }
       },
       "dataSchema": {
         "dataSource": "inline_data_2",
         "granularitySpec": {
           "type": "uniform",
           "queryGranularity": "HOUR",
           "rollup": true,
           "segmentGranularity": "HOUR"
         },
         "timestampSpec": {
           "column": "!!!_no_such_column_!!!",
           "missingValue": "2010-01-01T00:00:00Z"
         },
         "dimensionsSpec": {
           "dimensions": [
             "c1",
             "c2"
           ]
         },
         "metricsSpec": [
           {
             "name": "count",
             "type": "count"
           }
         ]
       }
     }
   }
   ```
   
   ```
   SELECT c1, c2
   FROM inline_data_2
   GROUP BY GROUPING SETS ( (c1, c2), () )
   HAVING c1 = 'b'
   ```
   vs
   ```
   SELECT c1, c2
   FROM inline_data_2
   GROUP BY c1, c2
   HAVING c1 = 'b'
   ```
   
   The first query, with the grouping sets, applies the HAVING clause to the results of the GROUP BY, so it returns only rows where c1 = 'b'. The second query does not: it also returns rows with other values of c1. Since HAVING is supposed to be evaluated after the GROUP BY, I would expect both queries to return only rows where c1 = 'b'.
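
To make the difference concrete, here is a small Python simulation of what might be going on. It is not Druid code, and it rests on two assumptions. First, that grouping on a multi-value column produces one group per value, while a row-level filter keeps a row when any of its values match. Second, and this is only a guess that nothing here confirms, that in the plain GROUP BY query the HAVING condition ends up being evaluated as a filter before the grouping. The function names are hypothetical and exist only for this sketch.

```python
from collections import Counter

# The two inline rows from the ingestion spec; "|" makes c1 multi-valued.
ROWS = [(["a", "b"], "c"), (["d", "b"], "e")]


def group_by_c1_c2(rows):
    """Group on (c1, c2), emitting one group per value of a multi-value c1."""
    groups = Counter()
    for c1_values, c2 in rows:
        for c1 in c1_values:
            groups[(c1, c2)] += 1
    return groups


def having_after_group(rows, wanted):
    """Filter the grouped output: only groups whose c1 equals `wanted` survive."""
    return sorted(key for key in group_by_c1_c2(rows) if key[0] == wanted)


def filter_before_group(rows, wanted):
    """Assumed pushdown: keep whole rows where any c1 value matches, then group."""
    kept = [row for row in rows if wanted in row[0]]
    return sorted(group_by_c1_c2(kept))


if __name__ == "__main__":
    print(having_after_group(ROWS, "b"))
    print(filter_before_group(ROWS, "b"))
```

Under those assumptions, filtering after the grouping leaves only the two groups with c1 = 'b', while filtering before it keeps both input rows and therefore all four groups, which would match the difference between the two queries.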
   
   By the way, I'm running into all this because I'm trying to get results without a bunch of aggregations that I don't need. Does anyone know how to reliably get Druid SQL to apply filters to the final results?

