Posted to user@kylin.apache.org by Roberto Tardío <ro...@stratebi.com> on 2017/11/01 13:39:29 UTC

Re: Issues with order for joins on Kylin 2.1

Thanks ShaoFeng,

I created a JIRA to discuss this improvement:
https://issues.apache.org/jira/browse/KYLIN-2983

Best Regards,


On 23/10/2017 at 17:07, ShaoFeng Shi wrote:
> It looks more like a limitation of the BI tool. Most of the tools I 
> see allow the user to specify the fact/lookup table, and the queries 
> they generate start from the fact table.
>
> You can file this as a JIRA; when more people comment on it, 
> the team will investigate. Thanks!
>
> 2017-10-23 19:09 GMT+08:00 Roberto Tardío <roberto.tardio@stratebi.com>:
>
>     Many thanks ShaoFeng Shi,
>
>     I understand this could be necessary to support snowflake schemas.
>     However, some BI tools may generate queries that put a dimension
>     table first and the fact table after it, which is correct ANSI-92
>     SQL syntax but fails on Kylin 2.1. Maybe an option to choose
>     between star schema and snowflake schema when you define the data
>     model in Kylin would be useful. What do you think?
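
To illustrate the point that both join orders are valid standard SQL, here is a minimal sketch that runs the two queries from this thread against Python's built-in sqlite3 module, not against Kylin. The table and column names are the ones quoted in the thread; the column types and the sample rows are made up for the illustration, and an ORDER BY is added only so the two result lists can be compared directly. It shows that a generic SQL engine treats the two INNER JOIN orders as equivalent; it says nothing about how Kylin matches a query to a cube.

```python
import sqlite3

# Query that works on Kylin 2.1 according to the thread: fact table first.
FACT_FIRST = """
SELECT D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO, SUM(CREDITOS)
FROM F_RENDIMIENTO JOIN D_CURSO_ACADEMICO_VK
  ON F_RENDIMIENTO.ID_CURSO_ACADEMICO = D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO
GROUP BY D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO
ORDER BY D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO
"""

# Query that fails on Kylin 2.1 according to the thread: dimension table first.
DIMENSION_FIRST = """
SELECT D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO, SUM(CREDITOS)
FROM D_CURSO_ACADEMICO_VK JOIN F_RENDIMIENTO
  ON F_RENDIMIENTO.ID_CURSO_ACADEMICO = D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO
GROUP BY D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO
ORDER BY D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO
"""


def run_both():
    """Run both join orders on an in-memory SQLite database with toy data."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and rows; only the names come from the thread.
    conn.executescript("""
        CREATE TABLE D_CURSO_ACADEMICO_VK (ID_CURSO_ACADEMICO INTEGER PRIMARY KEY);
        CREATE TABLE F_RENDIMIENTO (ID_CURSO_ACADEMICO INTEGER, CREDITOS REAL);
        INSERT INTO D_CURSO_ACADEMICO_VK VALUES (2015), (2016);
        INSERT INTO F_RENDIMIENTO VALUES (2015, 6.0), (2015, 4.5), (2016, 6.0);
    """)
    fact_first = conn.execute(FACT_FIRST).fetchall()
    dimension_first = conn.execute(DIMENSION_FIRST).fetchall()
    conn.close()
    return fact_first, dimension_first


if __name__ == "__main__":
    a, b = run_both()
    print(a == b)  # the two join orders give the same rows in SQLite
```

Because an INNER JOIN is symmetric in its two inputs, a general-purpose engine is free to pick either order, which is why BI tools may emit the dimension-first form.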
>
>     Best Regards,
>
>
>     On 23/10/2017 at 10:10, ShaoFeng Shi wrote:
>>     This should be related to the snowflake support; now every join
>>     query should start from the fact table. Adding the second join
>>     won't work, I believe.
>>
>>     2017-10-22 0:36 GMT+08:00 Roberto Tardío
>>     <roberto.tardio@stratebi.com>:
>>
>>         Hi,
>>
>>         I have replaced (not upgraded) Kylin 1.6 with Kylin 2.1. I
>>         created a cube (and its underlying model) with the same
>>         sources and metadata that I used for the one I previously
>>         implemented on Kylin 1.6. The cube build was OK. However,
>>         something strange occurs with join queries. The following
>>         query works (F_RENDIMIENTO is the fact table and
>>         D_CURSO_ACADEMICO_VK is a dimension table):
>>
>>             select D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO,
>>             sum(CREDITOS)
>>             from F_RENDIMIENTO JOIN D_CURSO_ACADEMICO_VK ON
>>             F_RENDIMIENTO.ID_CURSO_ACADEMICO =
>>             D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO
>>             group by D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO
>>
>>         But surprisingly, if I change the INNER JOIN order, the
>>         following query fails:
>>
>>             select D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO,
>>             sum(CREDITOS)
>>             from D_CURSO_ACADEMICO_VK JOIN F_RENDIMIENTO ON
>>             F_RENDIMIENTO.ID_CURSO_ACADEMICO =
>>             D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO
>>             group by D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO
>>
>>
>>         Error while executing SQL "select
>>         D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO, sum(CREDITOS) from
>>         D_CURSO_ACADEMICO_VK JOIN F_RENDIMIENTO ON
>>         F_RENDIMIENTO.ID_CURSO_ACADEMICO =
>>         D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO group by
>>         D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO LIMIT 50000": No
>>         realization found for
>>         rel#7393:OLAPTableScan.OLAP.[](table=[DM_ACAD_KYLIN_ORC,
>>         D_CURSO_ACADEMICO_VK],fields=[0, 1]), JoinDesc [type=INNER,
>>         primary_key=[ID_CURSO_ACADEMICO],
>>         foreign_key=[ID_CURSO_ACADEMICO]]
>>
>>         This does not happen with the same cube implemented using
>>         Kylin 1.6.
>>
>>         Why does this happen?
>>
>>         Maybe it is related to the new snowflake schema support. I
>>         used a star schema and defined the INNER JOIN as I show in
>>         the next picture
>>
>>         Maybe I have to add a second explicit JOIN between
>>         D_CURSO_ACADEMICO --> F_RENDIMIENTO, i.e., the inverted join.
>>
>>         Regards,
>>
>>         Roberto
>>
>>         -- 
>>
>>         *Roberto Tardío Olmos*
>>
>>         /Senior Big Data & Business Intelligence Consultant/
>>         Avenida de Brasil, 17, Planta 16.28020 Madrid
>>         Fijo: 91.788.34.10
>>
>>
>>
>>
>>     -- 
>>     Best regards,
>>
>>     Shaofeng Shi 史少锋
>>
>
>     -- 
>
>     *Roberto Tardío Olmos*
>
>     /Senior Big Data & Business Intelligence Consultant/
>     Avenida de Brasil, 17, Planta 16.28020 Madrid
>     Fijo: 91.788.34.10
>
>
>
>
> -- 
> Best regards,
>
> Shaofeng Shi 史少锋
>

-- 

*Roberto Tardío Olmos*

/Senior Big Data & Business Intelligence Consultant/
Avenida de Brasil, 17, Planta 16.28020 Madrid
Fijo: 91.788.34.10

Re: Issues with order for joins on Kylin 2.1

Posted by ShaoFeng Shi <sh...@apache.org>.

Got it; thanks for your input to Kylin.

-- 
Best regards,

Shaofeng Shi 史少锋