You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@kylin.apache.org by lk_hadoop <lk...@163.com> on 2019/06/11 03:20:02 UTC

why query will be push down when query a dimension table's dimension values

hi , all:

My model json string is :

{
  "uuid": "c28014c1-7dae-6900-6264-8794b683ffa7",
  "last_modified": 1560153343761,
  "version": "2.6.1.0",
  "name": "scrm_model",
  "owner": "ADMIN",
  "is_draft": false,
  "description": "###",
  "fact_table": "GJST.SH_FETCH_SALE_BASE_FACT_ALL_NEW",
  "lookups": [
    {
      "table": "TEST.MEMBERSHIP_PRECISE_SELLING_EXTEND_V4",
      "kind": "FACT",
      "alias": "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4",
      "join": {
        "type": "inner",
        "primary_key": [
          "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4.CARD_ID"
        ],
        "foreign_key": [
          "SH_FETCH_SALE_BASE_FACT_ALL_NEW.CARD_ID"
        ]
      }
    }
  ],
  "dimensions": [
    {
      "table": "SH_FETCH_SALE_BASE_FACT_ALL_NEW",
      "columns": [
        "DATES",
        "CARD_ID",
        "TGOODS_ID",
        "ENT_NAME",
        "ORG_NAME",
        "DATA_FROM",
        "GOODS_NAME",
        "ORG_NO",
        "ATC1_NEW",
        "ATC2_NEW",
        "ATC3_NEW",
        "ATC4_NEW"
      ]
    },
    {
      "table": "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4",
      "columns": [
        "CARD_ID",
        "USER_ID",
        "SEX",
        "AGE",
        "BIRTHDAYS",
        "NAME",
        "NICK_NAME",
        "IS_SUBSCRIBE_WX",
        "IS_RECEIVE_CARD",
        "SUBSCRIBE_TIME",
        "SUBSCRIBE_STORE",
        "ACTIVATE_TIME",
        "ACTIVATE_STORE",
        "FIRST_BUY_DATE",
        "RECENT_CONSUME_DATE",
        "RECENT_CONSUMPTION_INTERVAL_DAY",
        "GAOXUEYA_BUYS",
        "GAOXUEYA_FLAG",
        "GAOXUEZHI_BUYS",
        "GAOXUEZHI_FLAG",
        "TANGNIAOBING_BUYS",
        "TANGNIAOBING_FLAG",
        "TOTAL_POINTS",
        "REMAINDER_POINTS",
        "TOTAL_COUPONS_NUMBER",
        "AVAILABLE_COUPONS_NUMBER",
        "TOTAL_USE_COUPONS_NUMBER",
        "MAINTAIN_NUMBERS",
        "MAINTAIN_TYPE",
        "MARKET_PROGRAM",
        "RECENT_MAINTAIN_INTERVAL_DAY",
        "BELONG_STORE",
        "BUSINESS_ID"
      ]
    }
  ],
  "metrics": [
    "SH_FETCH_SALE_BASE_FACT_ALL_NEW.PAID_IN_AMT",
    "SH_FETCH_SALE_BASE_FACT_ALL_NEW.TBILL_CODE",
    "SH_FETCH_SALE_BASE_FACT_ALL_NEW.PROFIT"
  ],
  "filter_condition": "SH_FETCH_SALE_BASE_FACT_ALL_NEW.data_from_new <>'' and SH_FETCH_SALE_BASE_FACT_ALL_NEW.card_id is not null",
  "partition_desc": {
    "partition_date_column": "SH_FETCH_SALE_BASE_FACT_ALL_NEW.CDT",
    "partition_time_column": null,
    "partition_date_start": 0,
    "partition_date_format": "yyyy-MM-dd",
    "partition_time_format": "HH:mm:ss",
    "partition_type": "APPEND",
    "partition_condition_builder": "org.apache.kylin.metadata.model.PartitionDesc$DefaultPartitionConditionBuilder"
  },
  "capacity": "MEDIUM"
}

when I want to see one dimension table's dimension contain what kind of values , I sent a query like : 
SELECT `MEMBERSHIP_PRECISE_SELLING_EXTEND_V4`.`SEX` FROM `TEST`.`MEMBERSHIP_PRECISE_SELLING_EXTEND_V4` `MEMBERSHIP_PRECISE_SELLING_EXTEND_V4`  GROUP BY `MEMBERSHIP_PRECISE_SELLING_EXTEND_V4`.`SEX`
and the query will be push down to query hive table, if I want to run the query to search hbase I must write sql like this:
SELECT "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4"."SEX" 
FROM "GJST"."SH_FETCH_SALE_BASE_FACT_ALL_NEW" "SH_FETCH_SALE_BASE_FACT_ALL_NEW" 
INNER JOIN "TEST"."MEMBERSHIP_PRECISE_SELLING_EXTEND_V4" MEMBERSHIP_PRECISE_SELLING_EXTEND_V4 ON MEMBERSHIP_PRECISE_SELLING_EXTEND_V4.CARD_ID = SH_FETCH_SALE_BASE_FACT_ALL_NEW.CARD_ID 
GROUP BY "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4"."SEX"

from the model's json string we can see that the lookup table's kind also is "fact" and there is no derived column in the cube,I want to konw why kylin will push down the query when directly query the dimension table.

Thanks.
2019-06-11


lk_hadoop 

Re: why query will be push down when query a dimension table's dimension values

Posted by ShaoFeng Shi <sh...@apache.org>.
What's the Kylin version? I remember there is an enhancement which has
supported such query.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofengshi@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscribe@kylin.apache.org
Join Kylin dev mail group: dev-subscribe@kylin.apache.org




lk_hadoop <lk...@163.com> 于2019年6月11日周二 上午11:20写道:

> hi , all:
>
> My model json string is :
>
>
> {
>   "uuid": "c28014c1-7dae-6900-6264-8794b683ffa7",
>   "last_modified": 1560153343761,
>   "version": "2.6.1.0",
>   "name": "scrm_model",
>   "owner": "ADMIN",
>   "is_draft": false,
>   "description": "###",
>   "fact_table": "GJST.SH_FETCH_SALE_BASE_FACT_ALL_NEW",
>   "lookups": [
>     {
>       "table": "TEST.MEMBERSHIP_PRECISE_SELLING_EXTEND_V4",
>       "kind": "FACT",
>       "alias": "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4",
>       "join": {
>         "type": "inner",
>         "primary_key": [
>           "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4.CARD_ID"
>         ],
>         "foreign_key": [
>           "SH_FETCH_SALE_BASE_FACT_ALL_NEW.CARD_ID"
>         ]
>       }
>     }
>   ],
>   "dimensions": [
>     {
>       "table": "SH_FETCH_SALE_BASE_FACT_ALL_NEW",
>       "columns": [
>         "DATES",
>         "CARD_ID",
>         "TGOODS_ID",
>         "ENT_NAME",
>         "ORG_NAME",
>         "DATA_FROM",
>         "GOODS_NAME",
>         "ORG_NO",
>         "ATC1_NEW",
>         "ATC2_NEW",
>         "ATC3_NEW",
>         "ATC4_NEW"
>       ]
>     },
>     {
>       "table": "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4",
>       "columns": [
>         "CARD_ID",
>         "USER_ID",
>         "SEX",
>         "AGE",
>         "BIRTHDAYS",
>         "NAME",
>         "NICK_NAME",
>         "IS_SUBSCRIBE_WX",
>         "IS_RECEIVE_CARD",
>         "SUBSCRIBE_TIME",
>         "SUBSCRIBE_STORE",
>         "ACTIVATE_TIME",
>         "ACTIVATE_STORE",
>         "FIRST_BUY_DATE",
>         "RECENT_CONSUME_DATE",
>         "RECENT_CONSUMPTION_INTERVAL_DAY",
>         "GAOXUEYA_BUYS",
>         "GAOXUEYA_FLAG",
>         "GAOXUEZHI_BUYS",
>         "GAOXUEZHI_FLAG",
>         "TANGNIAOBING_BUYS",
>         "TANGNIAOBING_FLAG",
>         "TOTAL_POINTS",
>         "REMAINDER_POINTS",
>         "TOTAL_COUPONS_NUMBER",
>         "AVAILABLE_COUPONS_NUMBER",
>         "TOTAL_USE_COUPONS_NUMBER",
>         "MAINTAIN_NUMBERS",
>         "MAINTAIN_TYPE",
>         "MARKET_PROGRAM",
>         "RECENT_MAINTAIN_INTERVAL_DAY",
>         "BELONG_STORE",
>         "BUSINESS_ID"
>       ]
>     }
>   ],
>   "metrics": [
>     "SH_FETCH_SALE_BASE_FACT_ALL_NEW.PAID_IN_AMT",
>     "SH_FETCH_SALE_BASE_FACT_ALL_NEW.TBILL_CODE",
>     "SH_FETCH_SALE_BASE_FACT_ALL_NEW.PROFIT"
>   ],
>   "filter_condition": "SH_FETCH_SALE_BASE_FACT_ALL_NEW.data_from_new <>'' and SH_FETCH_SALE_BASE_FACT_ALL_NEW.card_id is not null",
>   "partition_desc": {
>     "partition_date_column": "SH_FETCH_SALE_BASE_FACT_ALL_NEW.CDT",
>     "partition_time_column": null,
>     "partition_date_start": 0,
>     "partition_date_format": "yyyy-MM-dd",
>     "partition_time_format": "HH:mm:ss",
>     "partition_type": "APPEND",
>     "partition_condition_builder": "org.apache.kylin.metadata.model.PartitionDesc$DefaultPartitionConditionBuilder"
>   },
>   "capacity": "MEDIUM"
> }
>
>
> when I want to see one dimension table's dimension contain what kind of
> values , I sent a query like :
> SELECT `MEMBERSHIP_PRECISE_SELLING_EXTEND_V4`.`SEX` FROM
> `TEST`.`MEMBERSHIP_PRECISE_SELLING_EXTEND_V4`
> `MEMBERSHIP_PRECISE_SELLING_EXTEND_V4`  GROUP BY
> `MEMBERSHIP_PRECISE_SELLING_EXTEND_V4`.`SEX`
> and the query will be push down to query hive table, if I want to run the
> query to search hbase I must write sql like this:
> SELECT "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4"."SEX"
> FROM "GJST"."SH_FETCH_SALE_BASE_FACT_ALL_NEW"
> "SH_FETCH_SALE_BASE_FACT_ALL_NEW"
> INNER JOIN "TEST"."MEMBERSHIP_PRECISE_SELLING_EXTEND_V4"
> MEMBERSHIP_PRECISE_SELLING_EXTEND_V4 ON
> MEMBERSHIP_PRECISE_SELLING_EXTEND_V4.CARD_ID =
> SH_FETCH_SALE_BASE_FACT_ALL_NEW.CARD_ID
> GROUP BY "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4"."SEX"
>
> from the model's json string we can see that the lookup table's kind also
> is "fact" and there is no derived column in the cube,I want to konw
> why kylin will push down the query when directly query the dimension table.
>
> Thanks.
> 2019-06-11
> ------------------------------
> lk_hadoop
>

Re: why query will be push down when query a dimension table's dimension values

Posted by ShaoFeng Shi <sh...@apache.org>.
What's the Kylin version? I remember there is an enhancement which has
supported such query.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofengshi@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscribe@kylin.apache.org
Join Kylin dev mail group: dev-subscribe@kylin.apache.org




lk_hadoop <lk...@163.com> 于2019年6月11日周二 上午11:20写道:

> hi , all:
>
> My model json string is :
>
>
> {
>   "uuid": "c28014c1-7dae-6900-6264-8794b683ffa7",
>   "last_modified": 1560153343761,
>   "version": "2.6.1.0",
>   "name": "scrm_model",
>   "owner": "ADMIN",
>   "is_draft": false,
>   "description": "###",
>   "fact_table": "GJST.SH_FETCH_SALE_BASE_FACT_ALL_NEW",
>   "lookups": [
>     {
>       "table": "TEST.MEMBERSHIP_PRECISE_SELLING_EXTEND_V4",
>       "kind": "FACT",
>       "alias": "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4",
>       "join": {
>         "type": "inner",
>         "primary_key": [
>           "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4.CARD_ID"
>         ],
>         "foreign_key": [
>           "SH_FETCH_SALE_BASE_FACT_ALL_NEW.CARD_ID"
>         ]
>       }
>     }
>   ],
>   "dimensions": [
>     {
>       "table": "SH_FETCH_SALE_BASE_FACT_ALL_NEW",
>       "columns": [
>         "DATES",
>         "CARD_ID",
>         "TGOODS_ID",
>         "ENT_NAME",
>         "ORG_NAME",
>         "DATA_FROM",
>         "GOODS_NAME",
>         "ORG_NO",
>         "ATC1_NEW",
>         "ATC2_NEW",
>         "ATC3_NEW",
>         "ATC4_NEW"
>       ]
>     },
>     {
>       "table": "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4",
>       "columns": [
>         "CARD_ID",
>         "USER_ID",
>         "SEX",
>         "AGE",
>         "BIRTHDAYS",
>         "NAME",
>         "NICK_NAME",
>         "IS_SUBSCRIBE_WX",
>         "IS_RECEIVE_CARD",
>         "SUBSCRIBE_TIME",
>         "SUBSCRIBE_STORE",
>         "ACTIVATE_TIME",
>         "ACTIVATE_STORE",
>         "FIRST_BUY_DATE",
>         "RECENT_CONSUME_DATE",
>         "RECENT_CONSUMPTION_INTERVAL_DAY",
>         "GAOXUEYA_BUYS",
>         "GAOXUEYA_FLAG",
>         "GAOXUEZHI_BUYS",
>         "GAOXUEZHI_FLAG",
>         "TANGNIAOBING_BUYS",
>         "TANGNIAOBING_FLAG",
>         "TOTAL_POINTS",
>         "REMAINDER_POINTS",
>         "TOTAL_COUPONS_NUMBER",
>         "AVAILABLE_COUPONS_NUMBER",
>         "TOTAL_USE_COUPONS_NUMBER",
>         "MAINTAIN_NUMBERS",
>         "MAINTAIN_TYPE",
>         "MARKET_PROGRAM",
>         "RECENT_MAINTAIN_INTERVAL_DAY",
>         "BELONG_STORE",
>         "BUSINESS_ID"
>       ]
>     }
>   ],
>   "metrics": [
>     "SH_FETCH_SALE_BASE_FACT_ALL_NEW.PAID_IN_AMT",
>     "SH_FETCH_SALE_BASE_FACT_ALL_NEW.TBILL_CODE",
>     "SH_FETCH_SALE_BASE_FACT_ALL_NEW.PROFIT"
>   ],
>   "filter_condition": "SH_FETCH_SALE_BASE_FACT_ALL_NEW.data_from_new <>'' and SH_FETCH_SALE_BASE_FACT_ALL_NEW.card_id is not null",
>   "partition_desc": {
>     "partition_date_column": "SH_FETCH_SALE_BASE_FACT_ALL_NEW.CDT",
>     "partition_time_column": null,
>     "partition_date_start": 0,
>     "partition_date_format": "yyyy-MM-dd",
>     "partition_time_format": "HH:mm:ss",
>     "partition_type": "APPEND",
>     "partition_condition_builder": "org.apache.kylin.metadata.model.PartitionDesc$DefaultPartitionConditionBuilder"
>   },
>   "capacity": "MEDIUM"
> }
>
>
> when I want to see one dimension table's dimension contain what kind of
> values , I sent a query like :
> SELECT `MEMBERSHIP_PRECISE_SELLING_EXTEND_V4`.`SEX` FROM
> `TEST`.`MEMBERSHIP_PRECISE_SELLING_EXTEND_V4`
> `MEMBERSHIP_PRECISE_SELLING_EXTEND_V4`  GROUP BY
> `MEMBERSHIP_PRECISE_SELLING_EXTEND_V4`.`SEX`
> and the query will be push down to query hive table, if I want to run the
> query to search hbase I must write sql like this:
> SELECT "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4"."SEX"
> FROM "GJST"."SH_FETCH_SALE_BASE_FACT_ALL_NEW"
> "SH_FETCH_SALE_BASE_FACT_ALL_NEW"
> INNER JOIN "TEST"."MEMBERSHIP_PRECISE_SELLING_EXTEND_V4"
> MEMBERSHIP_PRECISE_SELLING_EXTEND_V4 ON
> MEMBERSHIP_PRECISE_SELLING_EXTEND_V4.CARD_ID =
> SH_FETCH_SALE_BASE_FACT_ALL_NEW.CARD_ID
> GROUP BY "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4"."SEX"
>
> from the model's json string we can see that the lookup table's kind also
> is "fact" and there is no derived column in the cube,I want to konw
> why kylin will push down the query when directly query the dimension table.
>
> Thanks.
> 2019-06-11
> ------------------------------
> lk_hadoop
>