Posted to dev@kylin.apache.org by Mario Copperfield <xw...@gmail.com> on 2017/01/11 10:24:02 UTC

Re: Consulting "EXTENDED_COLUMN"

I have a question: is EXTENDED_COLUMN the same as DERIVED_COLUMN?

On 1 Dec 2016, 07:35 +0800, Billy(Yiming) Liu <li...@gmail.com>, wrote:
> Thanks, Alberto. The explanation is accurate. EXTENDED_COLUMN is used only
> for display, not for filtering or grouping, which is done on the
> HOST_COLUMN. So EXTENDED_COLUMN is not a dimension; it works like a
> key/value map keyed by the HOST_COLUMN.
>
> If the values in EXTENDED_COLUMN are not long, you could instead define two
> dimensions with a joint dimension setting. That has almost the same
> performance impact as EXTENDED_COLUMN, which removes one dimension, but is
> easier to understand.
>
> 2016-11-30 19:00 GMT+08:00 Alberto Ramón <a....@gmail.com>:
>
> > This will help you
> > http://kylin.apache.org/docs/howto/howto_optimize_cubes.html
> >
> > The idea is always: how can I reduce the number of dimensions?
> > If you reduce the dimensions, the time and resources needed to build the
> > cube, and its final size, decrease, which is good.
> >
> > An example is DIM_Persons: Id_Person, Name, Surname, Address, ...
> > Id_Person can be the host column, and the other columns, which are
> > determined by the ID, are extended columns.
> >
> >
> >
> >
> > 2016-11-30 11:35 GMT+01:00 仇同心 <qi...@jd.com>:
> >
> > > Hi all,
> > > I don't understand the usage scenarios of EXTENDED_COLUMN, although I have
> > > read this issue: https://issues.apache.org/jira/browse/KYLIN-1313
> > > What do the "Host Column" and "Extended Column" parameters mean?
> > > Why use this expression, and what optimization problem does it
> > > solve?
> > > Could you explain it with a SQL statement?
> > >
> > >
> > > Thanks~
> > >
> >
>
>
>
> --
> With Warm regards
>
> Yiming Liu (刘一鸣)
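
To make the "key/value map against the HOST_COLUMN" idea from the quoted replies concrete, here is a minimal Python sketch. It is only a conceptual model, not Kylin's actual storage or API: the rows, the column names (id_person, name, amount) and the helper functions are all hypothetical. It assumes a full cube with no pruning, where the number of dimension combinations is 2^N, so dropping one dimension halves it.

```python
from collections import defaultdict

# Hypothetical fact rows: (id_person, name, amount). Illustrative data only.
ROWS = [
    (1, "Ana", 10.0),
    (2, "Bo", 5.0),
    (1, "Ana", 2.5),
]


def cuboid_count(num_dimensions):
    """Dimension combinations in a full cube with no pruning (assumption)."""
    return 2 ** num_dimensions


def build_aggregate(rows):
    """Aggregate by the host column only; keep the extended value in a side map."""
    totals = defaultdict(float)
    extended = {}
    for id_person, name, amount in rows:
        totals[id_person] += amount
        # Key/value map: host column value -> extended column value.
        extended[id_person] = name
    return dict(totals), extended


def query(totals, extended):
    """Group by the host column and attach the extended value for display."""
    return [(key, extended[key], totals[key]) for key in sorted(totals)]


totals, extended = build_aggregate(ROWS)
result = query(totals, extended)
print(result)
# With name as a dimension there would be 2 dimensions; as an extended column, 1.
print(cuboid_count(2), cuboid_count(1))
```

In this model name can be shown next to each id_person, but nothing is indexed by name, which mirrors the point that the extended column is not used for filtering or grouping.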

Re: Consulting "EXTENDED_COLUMN"

Posted by Mario Copperfield <xw...@gmail.com>.
Thanks, and now I understand!

On 11 Jan 2017, 20:27 +0800, ShaoFeng Shi <sh...@apache.org>, wrote:
> It is similar, but there are differences: 1) a "DERIVED" column must be a
> column on a lookup table, whereas an "EXTENDED" column need not be; it can
> be a column on the fact table; 2) a "DERIVED" value comes from the lookup
> table's snapshot, whereas an "EXTENDED" value comes from this measure.
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋

Re: Consulting "EXTENDED_COLUMN"

Posted by ShaoFeng Shi <sh...@apache.org>.
It is similar, but there are differences: 1) a "DERIVED" column must be a
column on a lookup table, whereas an "EXTENDED" column need not be; it can
be a column on the fact table; 2) a "DERIVED" value comes from the lookup
table's snapshot, whereas an "EXTENDED" value comes from this measure.
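
As a rough illustration of the two differences above, here is a small Python sketch. It is a conceptual model under stated assumptions, not Kylin's implementation: the snapshot dictionary, the fact rows, and all column and function names (seller_id, seller_name, country_code, build_cube, answer) are hypothetical. The derived value is resolved from a lookup-table snapshot, while the extended value is taken from a fact-table column and carried inside the pre-aggregated cell.

```python
# Hypothetical lookup-table snapshot: country_code -> country_name.
COUNTRY_SNAPSHOT = {"ES": "Spain", "CN": "China"}

# Hypothetical fact rows: (seller_id, seller_name, country_code, amount).
# seller_name lives on the fact table and is determined by seller_id.
FACT_ROWS = [
    (7, "Acme", "ES", 10.0),
    (8, "Bolt", "CN", 5.0),
    (7, "Acme", "ES", 2.5),
]


def build_cube(rows):
    """Pre-aggregate by (seller_id, country_code).

    The extended value (seller_name) is stored in the cell next to the sum,
    so it comes from the measure rather than from any lookup table.
    """
    cube = {}
    for seller_id, seller_name, country_code, amount in rows:
        cell = cube.setdefault(
            (seller_id, country_code),
            {"total": 0.0, "seller_name": seller_name},
        )
        cell["total"] += amount
    return cube


def answer(cube, snapshot):
    """Resolve the derived value from the snapshot, the extended one from the cell."""
    out = []
    for (seller_id, country_code), cell in sorted(cube.items()):
        country_name = snapshot[country_code]  # DERIVED: lookup snapshot
        seller_name = cell["seller_name"]      # EXTENDED: stored in the measure
        out.append((seller_id, seller_name, country_name, cell["total"]))
    return out


cube = build_cube(FACT_ROWS)
rows_out = answer(cube, COUNTRY_SNAPSHOT)
print(rows_out)
```

Neither seller_name nor country_name is part of the cube key here; only the way each value is obtained differs.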


-- 
Best regards,

Shaofeng Shi 史少锋