Posted to issues@calcite.apache.org by "Julian Hyde (JIRA)" <ji...@apache.org> on 2017/01/18 22:25:26 UTC

[jira] [Created] (CALCITE-1591) Druid adapter: Use "groupBy" query with extractionFn for time dimension

Julian Hyde created CALCITE-1591:
------------------------------------

             Summary: Druid adapter: Use "groupBy" query with extractionFn for time dimension
                 Key: CALCITE-1591
                 URL: https://issues.apache.org/jira/browse/CALCITE-1591
             Project: Calcite
          Issue Type: Bug
            Reporter: Julian Hyde
            Assignee: Julian Hyde


For queries that aggregate on the time dimension, or on a function of it such as {{FLOOR(__time TO DAY)}}, as of the fix for CALCITE-1579 we generate a "groupBy" query that neither sorts nor applies a limit. It would be better (in the sense that Druid does more of the work and Calcite does less) if we used an extractionFn to create a dimension that we can sort on.

In CALCITE-1578, [~nishantbangarwa] gives the following example query:

{code}
{
  "queryType": "groupBy",
  "dataSource": "druid_tpcds_ss_sold_time_subset",
  "granularity": "ALL",
  "dimensions": [
    "i_brand_id",
    {
      "type" : "extraction",
      "dimension" : "__time",
      "outputName" :  "year",
      "extractionFn" : {
        "type" : "timeFormat",
        "granularity" : "YEAR"
      }
    }
  ],
  "limitSpec": {
    "type": "default",
    "limit": 10,
    "columns": [
      {
        "dimension": "$f3",
        "direction": "ascending"
      }
    ]
  },
  "aggregations": [
    {
      "type": "longMax",
      "name": "$f2",
      "fieldName": "ss_quantity"
    },
    {
      "type": "doubleSum",
      "name": "$f3",
      "fieldName": "ss_wholesale_cost"
    }
  ],
  "intervals": [
    "1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"
  ]
}
{code}

and for {{DruidAdapterIT.testGroupByDaySortDescLimit}}, [~bslim] suggests

{code}
{
  "queryType": "groupBy",
  "dataSource": "foodmart",
  "granularity": "all",
  "dimensions": [
    "brand_name",
    {
      "type": "extraction",
      "dimension": "__time",
      "outputName": "day",
      "extractionFn": {
        "type": "timeFormat",
        "granularity": "DAY"
      }
    }
  ],
  "aggregations": [
    {
      "type": "longSum",
      "name": "S",
      "fieldName": "unit_sales"
    }
  ],
  "limitSpec": {
    "type": "default",
    "limit": 30,
    "columns": [
      {
        "dimension": "S",
        "direction": "descending"
      }
    ]
  }
}
{code}
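
As a rough illustration of the shape of these queries, the following Python sketch assembles a "groupBy" query whose time dimension is expressed through an extractionFn, with sort and limit pushed into the limitSpec. It is not the adapter's implementation (the Druid adapter itself is written in Java). The helper names {{time_extraction_dimension}} and {{group_by_query}} are hypothetical, and the descending sort direction is an assumption taken from the test name.

```python
import json


def time_extraction_dimension(output_name, granularity):
    """Hypothetical helper: an 'extraction' dimension over Druid's __time column."""
    return {
        "type": "extraction",
        "dimension": "__time",
        "outputName": output_name,
        "extractionFn": {"type": "timeFormat", "granularity": granularity},
    }


def group_by_query(data_source, dimensions, aggregations,
                   order_by, direction, limit, intervals=None):
    """Hypothetical helper: a 'groupBy' query that lets Druid sort and limit."""
    query = {
        "queryType": "groupBy",
        "dataSource": data_source,
        "granularity": "all",
        "dimensions": dimensions,
        "aggregations": aggregations,
        "limitSpec": {
            "type": "default",
            "limit": limit,
            "columns": [{"dimension": order_by, "direction": direction}],
        },
    }
    # Intervals are only added when the caller supplies them; none are invented here.
    if intervals is not None:
        query["intervals"] = intervals
    return query


# Mirrors the second example above: group by brand_name and day, sum unit_sales.
query = group_by_query(
    data_source="foodmart",
    dimensions=["brand_name", time_extraction_dimension("day", "DAY")],
    aggregations=[{"type": "longSum", "name": "S", "fieldName": "unit_sales"}],
    order_by="S",
    direction="descending",  # assumption, based on the "SortDescLimit" test name
    limit=30,
)

print(json.dumps(query, indent=2))
```

Because the day bucket is an ordinary dimension here rather than the query-level granularity, the limitSpec can order and truncate across all (brand_name, day) groups at once.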



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)