You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@kylin.apache.org by Roberto Tardío <ro...@stratebi.com> on 2017/11/01 13:39:29 UTC
Re: Issues with order for joins on Kylin 2.1
Thanks ShaoFeng,
I created a JIRA to disccus this improvement
https://issues.apache.org/jira/browse/KYLIN-2983
Best Regards,
El 23/10/2017 a las 17:07, ShaoFeng Shi escribió:
> It looks more like a limitation of the BI tool. Most of the tools I
> see allow the user to specify the fact/lookup table, and the generated
> queries are starting from the fact table.
>
> You can compose this into a JIRA; When more people comment on this,
> the team will investigate. Thanks!
>
> 2017-10-23 19:09 GMT+08:00 Roberto Tardío <roberto.tardio@stratebi.com
> <ma...@stratebi.com>>:
>
> Many thanks ShaoFeng Shi,
>
> I understand this could be necesary to support snowflake schema.
> However, some BI tools could generate queries putting first a
> dimension table and after the fact table, with correct ANSI-92 SQL
> sintax but incorrect for Kylin 2.1. Maybe could be useful and
> option to select between Star Schema and Snowflake schema when you
> define data model on Kylin. What do you think about?
>
> Best Regards,
>
>
> El 23/10/2017 a las 10:10, ShaoFeng Shi escribió:
>> Should be related to the snowflake support; Now all joined query
>> should start from the fact table. Add the second join doesn't
>> work I believe.
>>
>> 2017-10-22 0:36 GMT+08:00 Roberto Tardío
>> <roberto.tardio@stratebi.com <ma...@stratebi.com>>:
>>
>> Hi,
>>
>> I have replaced (not updated) Kylin 1.6 for Kylin 2.1. I
>> created a cube (and also underlying model) with the same
>> sources and metadata that I have used for the same I
>> previously implementend on Kylin 1.6. The cube construcction
>> was Ok. However, some strange occurs with join queries. The
>> following query goes
>>
>> /F_RENDIMIENTO is the fact table and //D_CURSO_ACADEMICO_VK
>> is a dimension table:
>> /
>>
>> /select D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO,
>> sum(CREDITOS)/
>> /from F_RENDIMIENTO JOIN D_CURSO_ACADEMICO_VK ON
>> F_RENDIMIENTO.ID_CURSO_ACADEMICO =
>> D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO/
>> /group by D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO/
>>
>> But susprisingly if I change the INNER JOIN order the
>> following query does not go
>>
>> /select D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO,
>> sum(CREDITOS)//
>> //from D_CURSO_ACADEMICO_VK JOIN F_RENDIMIENTO ON
>> F_RENDIMIENTO.ID_CURSO_ACADEMICO =
>> D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO//
>> //group by D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO/
>>
>>
>> /Error while executing SQL "select
>> D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO, sum(CREDITOS) from
>> D_CURSO_ACADEMICO_VK JOIN F_RENDIMIENTO ON
>> F_RENDIMIENTO.ID_CURSO_ACADEMICO =
>> D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO group by
>> D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO LIMIT 50000": No
>> realization found for
>> rel#7393:OLAPTableScan.OLAP.[](table=[DM_ACAD_KYLIN_ORC,
>> D_CURSO_ACADEMICO_VK],fields=[0, 1]), JoinDesc [type=INNER,
>> primary_key=[ID_CURSO_ACADEMICO],
>> foreign_key=[ID_CURSO_ACADEMICO]]/
>>
>> This does not happend with the same cube implemented using
>> Kylin 1.6.
>>
>> Why does this happen?
>>
>> Maybe is related to the new snowflake schema support. I used
>> I a star schema and I defined the INNER JOIN as I show in the
>> next picture
>>
>> Maybe I have to add a second explicit JOIN between
>> D_CURSO_ACADEMICO --> F_RENDIMIENTO, i.e, the inverted join.
>>
>> Regards,
>>
>> Roberto
>>
>> --
>>
>> *Roberto Tardío Olmos*
>>
>> /Senior Big Data & Business Intelligence Consultant/
>> Avenida de Brasil, 17
>> <https://maps.google.com/?q=Avenida+de+Brasil,+17&entry=gmail&source=g>,
>> Planta 16.28020 Madrid
>> Fijo: 91.788.34.10
>>
>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>
> --
>
> *Roberto Tardío Olmos*
>
> /Senior Big Data & Business Intelligence Consultant/
> Avenida de Brasil, 17
> <https://maps.google.com/?q=Avenida+de+Brasil,+17&entry=gmail&source=g>,
> Planta 16.28020 Madrid
> Fijo: 91.788.34.10
>
>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
--
*Roberto Tardío Olmos*
/Senior Big Data & Business Intelligence Consultant/
Avenida de Brasil, 17, Planta 16.28020 Madrid
Fijo: 91.788.34.10
Re: Issues with order for joins on Kylin 2.1
Posted by ShaoFeng Shi <sh...@apache.org>.
Got it; thanks for the inputs to Kylin.
2017-11-01 21:39 GMT+08:00 Roberto Tardío <ro...@stratebi.com>:
> Thanks ShaoFeng,
>
> I created a JIRA to disccus this improvement https://issues.apache.org/
> jira/browse/KYLIN-2983
>
> Best Regards,
>
> El 23/10/2017 a las 17:07, ShaoFeng Shi escribió:
>
> It looks more like a limitation of the BI tool. Most of the tools I see
> allow the user to specify the fact/lookup table, and the generated queries
> are starting from the fact table.
>
> You can compose this into a JIRA; When more people comment on this, the
> team will investigate. Thanks!
>
> 2017-10-23 19:09 GMT+08:00 Roberto Tardío <ro...@stratebi.com>:
>
>> Many thanks ShaoFeng Shi,
>>
>> I understand this could be necesary to support snowflake schema. However,
>> some BI tools could generate queries putting first a dimension table and
>> after the fact table, with correct ANSI-92 SQL sintax but incorrect for
>> Kylin 2.1. Maybe could be useful and option to select between Star Schema
>> and Snowflake schema when you define data model on Kylin. What do you think
>> about?
>>
>> Best Regards,
>>
>> El 23/10/2017 a las 10:10, ShaoFeng Shi escribió:
>>
>> Should be related to the snowflake support; Now all joined query should
>> start from the fact table. Add the second join doesn't work I believe.
>>
>> 2017-10-22 0:36 GMT+08:00 Roberto Tardío <ro...@stratebi.com>:
>>
>>> Hi,
>>>
>>> I have replaced (not updated) Kylin 1.6 for Kylin 2.1. I created a cube
>>> (and also underlying model) with the same sources and metadata that I have
>>> used for the same I previously implementend on Kylin 1.6. The cube
>>> construcction was Ok. However, some strange occurs with join queries. The
>>> following query goes
>>>
>>> *F_RENDIMIENTO is the fact table and *
>>> *D_CURSO_ACADEMICO_VK is a dimension table: *
>>>
>>> *select D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO, sum(CREDITOS)*
>>> * from F_RENDIMIENTO JOIN D_CURSO_ACADEMICO_VK ON
>>> F_RENDIMIENTO.ID_CURSO_ACADEMICO = D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO*
>>> * group by D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO*
>>>
>>> But susprisingly if I change the INNER JOIN order the following query
>>> does not go
>>>
>>> *select D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO, sum(CREDITOS)*
>>> * from D_CURSO_ACADEMICO_VK JOIN F_RENDIMIENTO ON
>>> F_RENDIMIENTO.ID_CURSO_ACADEMICO = D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO*
>>> * group by D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO*
>>>
>>>
>>> *Error while executing SQL "select
>>> D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO, sum(CREDITOS) from
>>> D_CURSO_ACADEMICO_VK JOIN F_RENDIMIENTO ON F_RENDIMIENTO.ID_CURSO_ACADEMICO
>>> = D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO group by
>>> D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO LIMIT 50000": No realization found
>>> for rel#7393:OLAPTableScan.OLAP.[](table=[DM_ACAD_KYLIN_ORC,
>>> D_CURSO_ACADEMICO_VK],fields=[0, 1]), JoinDesc [type=INNER,
>>> primary_key=[ID_CURSO_ACADEMICO], foreign_key=[ID_CURSO_ACADEMICO]]*
>>>
>>> This does not happend with the same cube implemented using Kylin 1.6.
>>>
>>> Why does this happen?
>>>
>>> Maybe is related to the new snowflake schema support. I used I a star
>>> schema and I defined the INNER JOIN as I show in the next picture
>>>
>>> Maybe I have to add a second explicit JOIN between D_CURSO_ACADEMICO -->
>>> F_RENDIMIENTO, i.e, the inverted join.
>>>
>>> Regards,
>>>
>>> Roberto
>>> --
>>>
>>> *Roberto Tardío Olmos*
>>> *Senior Big Data & Business Intelligence Consultant*
>>> Avenida de Brasil, 17
>>> <https://maps.google.com/?q=Avenida+de+Brasil,+17&entry=gmail&source=g>,
>>> Planta 16.28020 Madrid
>>> Fijo: 91.788.34.10
>>>
>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>>
>> --
>>
>> *Roberto Tardío Olmos*
>> *Senior Big Data & Business Intelligence Consultant*
>> Avenida de Brasil, 17
>> <https://maps.google.com/?q=Avenida+de+Brasil,+17&entry=gmail&source=g>,
>> Planta 16.28020 Madrid
>> Fijo: 91.788.34.10
>>
>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>
> --
>
> *Roberto Tardío Olmos*
> *Senior Big Data & Business Intelligence Consultant*
> Avenida de Brasil, 17
> <https://maps.google.com/?q=Avenida+de+Brasil,+17&entry=gmail&source=g>,
> Planta 16.28020 Madrid
> Fijo: 91.788.34.10
>
--
Best regards,
Shaofeng Shi 史少锋