Posted to user@kylin.apache.org by ShaoFeng Shi <sh...@apache.org> on 2016/07/01 02:45:17 UTC

Re: Joint VS Derived

These slides introduce derived dimensions:
http://www.slideshare.net/YangLi43/design-cube-in-apache-kylin

"Joint" was introduced in 1.5.1. Use "joint" to combine multiple
dimensions into "one". Say you have dims A, B, C, and A and B always
appear together, e.g. "select ... group by A, B" or "select ...
where A = xx group by B". In this case you can declare AB as "joint"; to
Kylin it then looks like a 2-dim cube (AB, C), and the number of
combinations drops from 2^3 to 2^2.
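
To make the counting concrete, here is a minimal sketch in Python. It is not Kylin code; the function name `cuboids` and its arguments are made up for illustration. It simply treats each joint group as a single unit and enumerates every subset of units.

```python
from itertools import combinations

def cuboids(dims, joints=()):
    """Enumerate dimension combinations, treating each joint group as one unit.

    Illustrative only: this is an assumption about how the counting works,
    not Kylin's actual implementation.
    """
    joint_members = {d for group in joints for d in group}
    # Units: each joint group as a whole, plus every remaining dimension alone.
    units = [tuple(g) for g in joints]
    units += [(d,) for d in dims if d not in joint_members]
    result = []
    for r in range(len(units) + 1):
        for combo in combinations(units, r):
            result.append(frozenset(d for unit in combo for d in unit))
    return result

plain = cuboids(["A", "B", "C"])
with_joint = cuboids(["A", "B", "C"], joints=[("A", "B")])
print(len(plain), len(with_joint))  # 8 4
```

With AB declared joint, the four remaining combinations are: none, C, AB, and ABC.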

2016-06-30 6:56 GMT+08:00 Alberto Ramón <a....@gmail.com>:

> Hi
>
> I don't understand the difference between:
>
> - Joint Dim (from the Dimension step)
> - Derived Dim (from the Advanced Setting step)
>
> Any examples?  :)
>



-- 
Best regards,

Shaofeng Shi

RE: Joint VS Derived

Posted by "Richard Calaba (Fishbowl)" <rc...@fishbowl.com>.
Hello ShaoFeng Shi,

I tested this:

1) Joint dimensions AB and joint dimensions AC -> the UI gives an error while saving:

Error Message
Aggregation group 0 a dim exist in more than one joint

- It doesn't say which dimension, but okay.

I would question whether this is a valid restriction; maybe it is too strong a limitation? For example, I could have one joint dimension (date / customer) and a second joint dimension (date / store). This way I am saying that I would always analyze customers over time (or vice versa) and stores over time (or vice versa), BUT never store over customer (or customer over store). So to me it seems beneficial to allow this, as it would improve the dimension pruning even further. Or not?

2) I also tried to put dimension A into the mandatory dimensions and then add one joint dimension AC (I know this should mean that C is also mandatory). I got this error while saving:

Error Message
Failed to deal with the request: null

Some NPE occurred, I guess; this seems to be a bug for sure.
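
On the first error above: a hypothetical sketch (Python, not Kylin's validation code; the function name is invented) of a check that detects a dimension used in more than one joint and also reports which one it is, which the current message does not.

```python
from collections import Counter

def dims_in_multiple_joints(joints):
    """Return the dimensions that appear in more than one joint group.

    Hypothetical helper for illustration; Kylin's own check may differ.
    """
    # set(group) guards against a dimension listed twice in the same joint.
    counts = Counter(d for group in joints for d in set(group))
    return sorted(d for d, n in counts.items() if n > 1)

print(dims_in_multiple_joints([("A", "B"), ("A", "C")]))  # ['A']
print(dims_in_multiple_joints([("A", "B"), ("C", "D")]))  # []
```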

 

 

Thanx, Richard.

 

From: ShaoFeng Shi [mailto:shaofengshi@apache.org] 
Sent: Friday, July 01, 2016 2:01 AM
To: user@kylin.apache.org
Cc: Richard Calaba (Fishbowl) <rc...@fishbowl.com>
Subject: Re: Joint VS Derived

"joint" just tells Kylin to prune certain combinations; for example, "joint AB" will prune all combinations that have only A or only B. The order within a "joint" doesn't matter.

As for case 3), it should not be allowed in Kylin's GUI; can you verify?


Re: Joint VS Derived

Posted by Julian Hyde <jh...@apache.org>.
I think I understand. So if I wanted to group by “country” Kylin would group by “country, state, city” and then roll up? I suppose that’s OK if state and city are of medium cardinality, or if people never want to access by country.

It would be more compelling if you could find an example where no one ever wanted to group by X and Y separately, only X and Y together. 
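
A rough sketch of the roll-up described above, in Python. The rows and numbers are invented, the measure is assumed to be an additive SUM, and `roll_up` is a made-up helper rather than anything from Kylin. The point is only that a "group by country" answer can be derived from the (country, state, city) combination, at a cost that grows with the number of rows in that finer combination.

```python
from collections import defaultdict

# Hypothetical pre-aggregated rows of the (country, state, city) combination.
cuboid = {
    ("US", "CA", "San Francisco"): 120,
    ("US", "CA", "Los Angeles"): 200,
    ("US", "NY", "New York"): 300,
    ("CN", "Shanghai", "Shanghai"): 250,
}

def roll_up(rows, keep):
    """Aggregate to fewer dimensions by summing over the dropped positions."""
    out = defaultdict(int)
    for key, measure in rows.items():
        out[tuple(key[i] for i in keep)] += measure
    return dict(out)

print(roll_up(cuboid, keep=[0]))  # {('US',): 620, ('CN',): 250}
```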

> On Jul 4, 2016, at 2:46 AM, ShaoFeng Shi <sh...@apache.org> wrote:
> 
> Hi Julian,
> 
> Say I have a cube with many dimensions, and among them are 3 dimensions: "country", "state", "city". These three dimensions form a hierarchy, so in the past we would suggest that users declare them as a "hierarchy" relationship in Kylin; Kylin will then calculate the following combinations (for these 3 dimensions):
> 
> *
> country
> country, state
> country, state, city
> 
> Using "hierarchy" reduces the combinations from 2^3 to 4, which is good.
> 
> However, my report or dashboard will group by (or filter on) all 3 of these dimensions when doing location-based analysis, which means some of the "hierarchy" combinations are useless. In this case I would declare them as a "joint"; Kylin will then calculate these combinations (for these 3 dimensions):
> 
> *
> country, state, city
> 
> You can see the number of combinations is now 2. That's a further optimization over "hierarchy", quite simple and effective.
> 
> Hope this can help. 


Re: Joint VS Derived

Posted by ShaoFeng Shi <sh...@apache.org>.
Hi Julian,

Say I have a cube with many dimensions, and among them are 3
dimensions: "country", "state", "city". These three dimensions form
a hierarchy, so in the past we would suggest that users declare them as a
"hierarchy" relationship in Kylin; Kylin will then calculate the following
combinations (for these 3 dimensions):

*
country
country, state
country, state, city

Using "hierarchy" reduces the combinations from 2^3 to 4, which is
good.

However, my report or dashboard will group by (or filter on) all 3 of these
dimensions when doing location-based analysis, which means some of the
"hierarchy" combinations are useless. In this case I would declare them as a
"joint"; Kylin will then calculate these combinations (for these 3
dimensions):

*
country, state, city

You can see the number of combinations is now 2. That's a further optimization
over "hierarchy", quite simple and effective.
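
The two lists above can be reproduced with a small Python sketch. The helper names are invented for illustration and this is not Kylin code; the empty tuple stands for the "*" entry.

```python
def hierarchy_cuboids(levels):
    """A hierarchy keeps only prefixes: (), (l1,), (l1, l2), ..."""
    return [tuple(levels[:i]) for i in range(len(levels) + 1)]

def joint_cuboids(members):
    """A joint keeps either none of its members or all of them."""
    return [(), tuple(members)]

geo = ["country", "state", "city"]
print(2 ** len(geo))                # 8 combinations with no rule at all
print(len(hierarchy_cuboids(geo)))  # 4
print(len(joint_cuboids(geo)))      # 2
```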

Hope this helps.



2016-07-02 0:20 GMT+08:00 Julian Hyde <jh...@gmail.com>:

> A real(ish) world example would help me understand. Can you give an
> example of a joint dimension in terms of sales, customers, products, etc.?


-- 
Best regards,

Shaofeng Shi

Re: Joint VS Derived

Posted by Julian Hyde <jh...@gmail.com>.
A real(ish) world example would help me understand. Can you give an example of a joint dimension in terms of sales, customers, products, etc.?

> On Jul 1, 2016, at 2:01 AM, ShaoFeng Shi <sh...@apache.org> wrote:
> 
> "joint" just tells Kylin to prune certain combinations; for example, "joint AB" will prune all combinations that have only A or only B. The order within a "joint" doesn't matter.
> 
> As for case 3), it should not be allowed in Kylin's GUI; can you verify?


Re: Joint VS Derived

Posted by ShaoFeng Shi <sh...@apache.org>.
"joint" just tells Kylin to prune certain combinations; for example, "joint
AB" will prune all combinations that have only A or only B. The order within
a "joint" doesn't matter.

As for case 3), it should not be allowed in Kylin's GUI; can you verify?
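
A small Python sketch of the pruning rule, under two assumptions that are mine rather than stated above: the dimensions a query needs are the union of its WHERE and GROUP BY columns, and both helper names are invented. Using sets makes the order irrelevant.

```python
def query_columns(where=(), group_by=()):
    """Dimensions a query touches: filter columns plus grouping columns."""
    return set(where) | set(group_by)

def joint_ok(columns, joint):
    """True if the combination keeps the joint whole: all members or none."""
    used = set(columns) & set(joint)
    return not used or used == set(joint)

joint = ("A", "B")
print(joint_ok(query_columns(group_by=["B", "A"]), joint))          # True
print(joint_ok(query_columns(where=["B"], group_by=["A"]), joint))  # True
print(joint_ok(query_columns(group_by=["A", "C"]), joint))          # False
```

A False result only means that this exact combination is pruned, not that the query must fail; presumably it would then be answered from a larger combination such as ABC.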



2016-07-01 14:47 GMT+08:00 Richard Calaba (Fishbowl) <rc...@fishbowl.com>:

> Ok, so AB are joint if:
>
> 1) *Both* A and B have to be specified, either both in the WHERE clause or
> both in the GROUP BY clause, or one in WHERE and the other in GROUP BY.
>
> 2) Is the order of the joint dimension important or NOT? If AB
> is joint, then BA is also joint, right?
>
> Meaning "select ... group by B, A" or "select ... where B = xx group by
> A" is also valid for AB as a joint dimension?
>
> 3) If AB is joint and AC is joint:
>
> a. It does NOT mean that ABC is necessarily a joint group, right?
>
> b. Also, BC doesn't have to be joint, correct?
>
> Thanx, Richard.



-- 
Best regards,

Shaofeng Shi

RE: Joint VS Derived

Posted by "Richard Calaba (Fishbowl)" <rc...@fishbowl.com>.
Ok, so AB are joint if:

1) Both A and B have to be specified, either both in the WHERE clause or both in the GROUP BY clause, or one in WHERE and the other in GROUP BY.

2) Is the order of the joint dimension important or NOT? If AB is joint, then BA is also joint, right?

Meaning "select ... group by B, A" or "select ... where B = xx group by A" is also valid for AB as a joint dimension?

3) If AB is joint and AC is joint:

a. It does NOT mean that ABC is necessarily a joint group, right?

b. Also, BC doesn't have to be joint, correct?

Thanx, Richard.

 

From: ShaoFeng Shi [mailto:shaofengshi@apache.org] 
Sent: Thursday, June 30, 2016 7:45 PM
To: user@kylin.apache.org
Subject: Re: Joint VS Derived

These slides introduce derived dimensions: http://www.slideshare.net/YangLi43/design-cube-in-apache-kylin

"Joint" was introduced in 1.5.1. Use "joint" to combine multiple dimensions into "one". Say you have dims A, B, C, and A and B always appear together, e.g. "select ... group by A, B" or "select ... where A = xx group by B". In this case you can declare AB as "joint"; to Kylin it then looks like a 2-dim cube (AB, C), and the number of combinations drops from 2^3 to 2^2.
