You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@kylin.apache.org by 宋海涛 <ht...@gaojihealth.com> on 2018/11/05 08:58:53 UTC

Extend Column Is Empty

Hi guys, 

 

         When I run a SQL which contains extended columns, I got empty values for extended columns . Anyone knows why?

For example , ENT_NAME and ORG_NAME are extended column based on DATA_FROM and ORG_NO: 



 

Thanks a lot. 

 

Leo


答复: Extend Column Is Empty

Posted by 宋海涛 <ht...@gaojihealth.com>.
Got it. Thanks a lot. 

 

发件人: ShaoFeng Shi [mailto:shaofengshi@apache.org] 
发送时间: 2018年11月7日 9:16
收件人: user
主题: Re: Extend Column Is Empty

 

One correction: from the code I see, when the extended column is not determined, it will use the first appeared value; In your case, the first value might be all blank, so it appeared like all empty.  

The related code is: https://github.com/apache/kylin/blob/master/core-metadata/src/main/java/org/apache/kylin/measure/extendedcolumn/ExtendedColumnMeasureType.java#L281

 

This measure has such limitations, please not use it if you are not sure about it. I'm thinking whether we need to disable it.

 

ShaoFeng Shi <sh...@apache.org> 于2018年11月7日周三 上午9:04写道:

Hi Leo,

 

"Extended column" is a special measure, which is to store a dimension value as a measure. It doesn't support aggregation, so if there are multiple values in this "measure", it will be empty.

 

Let me give an example.

 

Say you have a table, which has dimensions like "user_id", "user_name", "age"; Here the "user_id" is unique, which will be used to do filtering/searching; The "user_name" will not be used for filtering, but it is nice to show the value in the result. So, in this case, we can declare "user_id" and "age" as dimensions, but define "user_name" as an extended column. Then the cube will have 2 dimensions instead of 3, its size will be smaller.

 

When you run a query like "select user_id, user_name, count(*) from fact_table group by user_id, user_name", it will be expected, because the "user_name" value is decided (明确的) under the context of a "user_id";

 

But if you run a query without "user_id", for example, "select age, user_name, count(*) from fact_table group by age, user_name", this column will be empty, because it has no "user_id" in the row. The extended column can not work in the query independently

 

So, the conclusion is, the extended column(s) need to appear together with its "host" column. If its "host" column is absent, the value will be empty.

 

We will add this to Kylin document.

 

 

 

Francis Liang <so...@hotmail.com> 于2018年11月5日周一 下午5:21写道:

Actually you need at least one aggregation, such as sum(lost_day_cnt), or you may try select a, b from table group by a, b, which should not be the normal case however. Hope this makes sense. Best, Feng.

 

发件人:  <ma...@gaojihealth.com> 宋海涛
发送时间: 2018年11月5日 17:16
收件人: user@kylin.apache.org
主题: 答复: Extend Column Is Empty 

 

select zone, data_from,ent_name,org_no, org_name, goods_id,goods_spec, big_mc,mid_mc,sma_mc,mini_mc,lost_day_cnt,

lost_amt,avg_paid_in_amt,dt 

from gjst.st_goods_lost_monitor_detail_day  where dt='20181102' and 

zone='西南平台' and data_from='haitang' and org_no='16208'

 

发件人: Francis Liang [mailto:soulvagrant@hotmail.com] 
发送时间: 2018年11月5日 17:14
收件人: user@kylin.apache.org
主题: 答复: Extend Column Is Empty 

 

Could you share your sql statement? Thanks. Best, Feng.

 



发件人: 宋海涛 <ht...@gaojihealth.com>
发送时间: Monday, November 5, 2018 4:58:53 PM
收件人: user@kylin.apache.org
主题: Extend Column Is Empty 

 

Hi guys, 

 

         When I run a SQL which contains extended columns, I got empty values for extended columns . Anyone knows why?

For example , ENT_NAME and ORG_NAME are extended column based on DATA_FROM and ORG_NO: 



 

Thanks a lot. 

 

Leo

 




 

-- 

Best regards,

 

Shaofeng Shi 史少锋

 




 

-- 

Best regards,

 

Shaofeng Shi 史少锋

 


Re: Extend Column Is Empty

Posted by ShaoFeng Shi <sh...@apache.org>.
Kaisen introduced extended column in his blog:
https://blog.bcmeng.com/post/kylin-dimension.html#extended-column

ShaoFeng Shi <sh...@apache.org> 于2018年11月7日周三 上午9:15写道:

> One correction: from the code I see, when the extended column is not
> determined, it will use the first appeared value; In your case, the first
> value might be all blank, so it appeared like all empty.
> The related code is:
> https://github.com/apache/kylin/blob/master/core-metadata/src/main/java/org/apache/kylin/measure/extendedcolumn/ExtendedColumnMeasureType.java#L281
>
> This measure has such limitations, please not use it if you are not sure
> about it. I'm thinking whether we need to disable it.
>
> ShaoFeng Shi <sh...@apache.org> 于2018年11月7日周三 上午9:04写道:
>
>> Hi Leo,
>>
>> "Extended column" is a special measure, which is to store a dimension
>> value as a measure. It doesn't support aggregation, so if there are
>> multiple values in this "measure", it will be empty.
>>
>> Let me give an example.
>>
>> Say you have a table, which has dimensions like "*user_id*", "*user_name*",
>> "*age*"; Here the "*user_id*" is unique, which will be used to do
>> filtering/searching; The "user_name" will not be used for filtering, but it
>> is nice to show the value in the result. So, in this case, we can declare
>> "user_id" and "age" as dimensions, but define "user_name" as an extended
>> column. Then the cube will have 2 dimensions instead of 3, its size will be
>> smaller.
>>
>> When you run a query like "select user_id, user_name, count(*) from
>> fact_table group by user_id, user_name", it will be expected, because the
>> "user_name" value is decided (明确的) under the context of a "user_id";
>>
>> But if you run a query without "user_id", for example, "select age,
>> user_name, count(*) from fact_table group by age, user_name", this
>> column will be empty, because it has no "user_id" in the row. The extended
>> column can not work in the query independently
>>
>> So, the conclusion is, the extended column(s) need to appear together
>> with its "host" column. If its "host" column is absent, the value will be
>> empty.
>>
>> We will add this to Kylin document.
>>
>>
>>
>> Francis Liang <so...@hotmail.com> 于2018年11月5日周一 下午5:21写道:
>>
>>> Actually you need at least one aggregation, such as sum(lost_day_cnt),
>>> or you may try select a, b from table group by a, b, which should not be
>>> the normal case however. Hope this makes sense. Best, Feng.
>>>
>>>
>>>
>>> *发件人: *宋海涛 <ht...@gaojihealth.com>
>>> *发送时间: *2018年11月5日 17:16
>>> *收件人: *user@kylin.apache.org
>>> *主题: *答复: Extend Column Is Empty
>>>
>>>
>>>
>>> select zone, data_from,ent_name,org_no, org_name, goods_id,goods_spec,
>>> big_mc,mid_mc,sma_mc,mini_mc,lost_day_cnt,
>>>
>>> lost_amt,avg_paid_in_amt,dt
>>>
>>> from gjst.st_goods_lost_monitor_detail_day  where dt='20181102' and
>>>
>>> zone='西南平台' and data_from='haitang' and org_no='16208'
>>>
>>>
>>>
>>> *发件人:* Francis Liang [mailto:soulvagrant@hotmail.com]
>>> *发送时间:* 2018年11月5日 17:14
>>> *收件人:* user@kylin.apache.org
>>> *主题:* 答复: Extend Column Is Empty
>>>
>>>
>>>
>>> Could you share your sql statement? Thanks. Best, Feng.
>>>
>>>
>>>
>>> *发件人**:* 宋海涛 <ht...@gaojihealth.com>
>>> *发送时间**:* Monday, November 5, 2018 4:58:53 PM
>>> *收件人**:* user@kylin.apache.org
>>> *主题**:* Extend Column Is Empty
>>>
>>>
>>>
>>> Hi guys,
>>>
>>>
>>>
>>>          When I run a SQL which contains extended columns, I got empty
>>> values for extended columns . Anyone knows why?
>>>
>>> For example , ENT_NAME and ORG_NAME are extended column based on
>>> DATA_FROM and ORG_NO:
>>>
>>> [image: cid:image001.png@01D4752B.489B5550]
>>>
>>>
>>>
>>> Thanks a lot.
>>>
>>>
>>>
>>> Leo
>>>
>>>
>>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>

-- 
Best regards,

Shaofeng Shi 史少锋

Re: Extend Column Is Empty

Posted by ShaoFeng Shi <sh...@apache.org>.
One correction: from the code I see, when the extended column is not
determined, it will use the first appeared value; In your case, the first
value might be all blank, so it appeared like all empty.
The related code is:
https://github.com/apache/kylin/blob/master/core-metadata/src/main/java/org/apache/kylin/measure/extendedcolumn/ExtendedColumnMeasureType.java#L281

This measure has such limitations, please not use it if you are not sure
about it. I'm thinking whether we need to disable it.

ShaoFeng Shi <sh...@apache.org> 于2018年11月7日周三 上午9:04写道:

> Hi Leo,
>
> "Extended column" is a special measure, which is to store a dimension
> value as a measure. It doesn't support aggregation, so if there are
> multiple values in this "measure", it will be empty.
>
> Let me give an example.
>
> Say you have a table, which has dimensions like "*user_id*", "*user_name*",
> "*age*"; Here the "*user_id*" is unique, which will be used to do
> filtering/searching; The "user_name" will not be used for filtering, but it
> is nice to show the value in the result. So, in this case, we can declare
> "user_id" and "age" as dimensions, but define "user_name" as an extended
> column. Then the cube will have 2 dimensions instead of 3, its size will be
> smaller.
>
> When you run a query like "select user_id, user_name, count(*) from
> fact_table group by user_id, user_name", it will be expected, because the
> "user_name" value is decided (明确的) under the context of a "user_id";
>
> But if you run a query without "user_id", for example, "select age,
> user_name, count(*) from fact_table group by age, user_name", this column
> will be empty, because it has no "user_id" in the row. The extended column
> can not work in the query independently
>
> So, the conclusion is, the extended column(s) need to appear together with
> its "host" column. If its "host" column is absent, the value will be empty.
>
> We will add this to Kylin document.
>
>
>
> Francis Liang <so...@hotmail.com> 于2018年11月5日周一 下午5:21写道:
>
>> Actually you need at least one aggregation, such as sum(lost_day_cnt), or
>> you may try select a, b from table group by a, b, which should not be the
>> normal case however. Hope this makes sense. Best, Feng.
>>
>>
>>
>> *发件人: *宋海涛 <ht...@gaojihealth.com>
>> *发送时间: *2018年11月5日 17:16
>> *收件人: *user@kylin.apache.org
>> *主题: *答复: Extend Column Is Empty
>>
>>
>>
>> select zone, data_from,ent_name,org_no, org_name, goods_id,goods_spec,
>> big_mc,mid_mc,sma_mc,mini_mc,lost_day_cnt,
>>
>> lost_amt,avg_paid_in_amt,dt
>>
>> from gjst.st_goods_lost_monitor_detail_day  where dt='20181102' and
>>
>> zone='西南平台' and data_from='haitang' and org_no='16208'
>>
>>
>>
>> *发件人:* Francis Liang [mailto:soulvagrant@hotmail.com]
>> *发送时间:* 2018年11月5日 17:14
>> *收件人:* user@kylin.apache.org
>> *主题:* 答复: Extend Column Is Empty
>>
>>
>>
>> Could you share your sql statement? Thanks. Best, Feng.
>>
>>
>>
>> *发件人**:* 宋海涛 <ht...@gaojihealth.com>
>> *发送时间**:* Monday, November 5, 2018 4:58:53 PM
>> *收件人**:* user@kylin.apache.org
>> *主题**:* Extend Column Is Empty
>>
>>
>>
>> Hi guys,
>>
>>
>>
>>          When I run a SQL which contains extended columns, I got empty
>> values for extended columns . Anyone knows why?
>>
>> For example , ENT_NAME and ORG_NAME are extended column based on
>> DATA_FROM and ORG_NO:
>>
>> [image: cid:image001.png@01D4752B.489B5550]
>>
>>
>>
>> Thanks a lot.
>>
>>
>>
>> Leo
>>
>>
>>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>

-- 
Best regards,

Shaofeng Shi 史少锋

Re: Extend Column Is Empty

Posted by ShaoFeng Shi <sh...@apache.org>.
Hi Leo,

"Extended column" is a special measure, which is to store a dimension value
as a measure. It doesn't support aggregation, so if there are multiple
values in this "measure", it will be empty.

Let me give an example.

Say you have a table, which has dimensions like "*user_id*", "*user_name*",
"*age*"; Here the "*user_id*" is unique, which will be used to do
filtering/searching; The "user_name" will not be used for filtering, but it
is nice to show the value in the result. So, in this case, we can declare
"user_id" and "age" as dimensions, but define "user_name" as an extended
column. Then the cube will have 2 dimensions instead of 3, its size will be
smaller.

When you run a query like "select user_id, user_name, count(*) from
fact_table group by user_id, user_name", it will be expected, because the
"user_name" value is decided (明确的) under the context of a "user_id";

But if you run a query without "user_id", for example, "select age,
user_name, count(*) from fact_table group by age, user_name", this column
will be empty, because it has no "user_id" in the row. The extended column
can not work in the query independently

So, the conclusion is, the extended column(s) need to appear together with
its "host" column. If its "host" column is absent, the value will be empty.

We will add this to Kylin document.



Francis Liang <so...@hotmail.com> 于2018年11月5日周一 下午5:21写道:

> Actually you need at least one aggregation, such as sum(lost_day_cnt), or
> you may try select a, b from table group by a, b, which should not be the
> normal case however. Hope this makes sense. Best, Feng.
>
>
>
> *发件人: *宋海涛 <ht...@gaojihealth.com>
> *发送时间: *2018年11月5日 17:16
> *收件人: *user@kylin.apache.org
> *主题: *答复: Extend Column Is Empty
>
>
>
> select zone, data_from,ent_name,org_no, org_name, goods_id,goods_spec,
> big_mc,mid_mc,sma_mc,mini_mc,lost_day_cnt,
>
> lost_amt,avg_paid_in_amt,dt
>
> from gjst.st_goods_lost_monitor_detail_day  where dt='20181102' and
>
> zone='西南平台' and data_from='haitang' and org_no='16208'
>
>
>
> *发件人:* Francis Liang [mailto:soulvagrant@hotmail.com]
> *发送时间:* 2018年11月5日 17:14
> *收件人:* user@kylin.apache.org
> *主题:* 答复: Extend Column Is Empty
>
>
>
> Could you share your sql statement? Thanks. Best, Feng.
>
>
>
> *发件人**:* 宋海涛 <ht...@gaojihealth.com>
> *发送时间**:* Monday, November 5, 2018 4:58:53 PM
> *收件人**:* user@kylin.apache.org
> *主题**:* Extend Column Is Empty
>
>
>
> Hi guys,
>
>
>
>          When I run a SQL which contains extended columns, I got empty
> values for extended columns . Anyone knows why?
>
> For example , ENT_NAME and ORG_NAME are extended column based on DATA_FROM
> and ORG_NO:
>
> [image: cid:image001.png@01D4752B.489B5550]
>
>
>
> Thanks a lot.
>
>
>
> Leo
>
>
>


-- 
Best regards,

Shaofeng Shi 史少锋

答复: Extend Column Is Empty

Posted by Francis Liang <so...@hotmail.com>.
Actually you need at least one aggregation, such as sum(lost_day_cnt), or you may try select a, b from table group by a, b, which should not be the normal case however. Hope this makes sense. Best, Feng.

发件人: 宋海涛<ma...@gaojihealth.com>
发送时间: 2018年11月5日 17:16
收件人: user@kylin.apache.org<ma...@kylin.apache.org>
主题: 答复: Extend Column Is Empty

select zone, data_from,ent_name,org_no, org_name, goods_id,goods_spec, big_mc,mid_mc,sma_mc,mini_mc,lost_day_cnt,
lost_amt,avg_paid_in_amt,dt
from gjst.st_goods_lost_monitor_detail_day  where dt='20181102' and
zone='西南平台' and data_from='haitang' and org_no='16208'

发件人: Francis Liang [mailto:soulvagrant@hotmail.com]
发送时间: 2018年11月5日 17:14
收件人: user@kylin.apache.org
主题: 答复: Extend Column Is Empty

Could you share your sql statement? Thanks. Best, Feng.


发件人: 宋海涛 <ht...@gaojihealth.com>
发送时间: Monday, November 5, 2018 4:58:53 PM
收件人: user@kylin.apache.org
主题: Extend Column Is Empty

Hi guys,

         When I run a SQL which contains extended columns, I got empty values for extended columns . Anyone knows why?
For example , ENT_NAME and ORG_NAME are extended column based on DATA_FROM and ORG_NO:
[cid:image001.png@01D4752B.489B5550]

Thanks a lot.

Leo


答复: Extend Column Is Empty

Posted by 宋海涛 <ht...@gaojihealth.com>.
select zone, data_from,ent_name,org_no, org_name, goods_id,goods_spec,
big_mc,mid_mc,sma_mc,mini_mc,lost_day_cnt,

lost_amt,avg_paid_in_amt,dt 

from gjst.st_goods_lost_monitor_detail_day  where dt='20181102' and 

zone='西南平台' and data_from='haitang' and org_no='16208'

 

发件人: Francis Liang [mailto:soulvagrant@hotmail.com] 
发送时间: 2018年11月5日 17:14
收件人: user@kylin.apache.org
主题: 答复: Extend Column Is Empty 

 

Could you share your sql statement? Thanks. Best, Feng.

 

  _____  

发件人: 宋海涛 <ht...@gaojihealth.com>
发送时间: Monday, November 5, 2018 4:58:53 PM
收件人: user@kylin.apache.org
主题: Extend Column Is Empty 

 

Hi guys, 

 

         When I run a SQL which contains extended columns, I got empty
values for extended columns . Anyone knows why?

For example , ENT_NAME and ORG_NAME are extended column based on DATA_FROM
and ORG_NO: 



 

Thanks a lot. 

 

Leo


答复: Extend Column Is Empty

Posted by Francis Liang <so...@hotmail.com>.
Could you share your sql statement? Thanks. Best, Feng.

________________________________
发件人: 宋海涛 <ht...@gaojihealth.com>
发送时间: Monday, November 5, 2018 4:58:53 PM
收件人: user@kylin.apache.org
主题: Extend Column Is Empty

Hi guys,

         When I run a SQL which contains extended columns, I got empty values for extended columns . Anyone knows why?
For example , ENT_NAME and ORG_NAME are extended column based on DATA_FROM and ORG_NO:
[cid:image001.png@01D47528.D530E500]

Thanks a lot.

Leo