Posted to user@kylin.apache.org by Francis Liang <so...@hotmail.com> on 2018/09/25 03:14:34 UTC

Re: Segment columns and dimension columns

Hi Shaofeng, many thanks for your reply. Just to confirm: per your example, if “partition_date” is used only as the partition column and is not included in the dimension list, will queries that include it work exactly the same as if it were both the partition column and in the dimension list? If not, what’s the difference? Thanks again! Best, Feng.


________________________________
From: ShaoFeng Shi <sh...@apache.org>
Sent: Monday, September 24, 2018 2:41:55 PM
To: user
Subject: Re: Segment columns and dimension columns

Hi Feng,

Usually, the partition date column is a dimension.

But it is not required. For example, suppose your table has two date columns, "partition_date" and "order_date". Your data is partitioned by "partition_date" in Hive, so you should use the same column as the partition column in Kylin; but you can use "order_date" as a dimension and leave "partition_date" out of the dimension list.

On Wed, Sep 19, 2018 at 3:13 PM, Francis Liang <so...@hotmail.com> wrote:
Hi:

I have a question about the two types of columns mentioned above. If I use a column such as DAY as the segment column, and it is required as a filter in queries, should I also add it to the dimension columns, or is that redundant? If it is included in the dimension columns, does that mean records from different segments can’t be merged because the DAY field differs (DAY=20180918, DAY=20180919)?

Thanks for your help!

Best, Feng.




--
Best regards,

Shaofeng Shi 史少锋


Re: Segment columns and dimension columns

Posted by Francis Liang <so...@hotmail.com>.
Thanks a lot, Shaofeng, it really helps!


________________________________
From: ShaoFeng Shi <sh...@apache.org>
Sent: Tuesday, September 25, 2018 11:57:00 AM
To: user
Subject: Re: Segment columns and dimension columns

Right. The cuboids that do not contain "partition_date" can be merged.

On Tue, Sep 25, 2018 at 11:44 AM, Francis Liang <so...@hotmail.com> wrote:
So the cuboids in different segments that include “partition_date” in their dimensions cannot be merged, since “partition_date” differs between segments. Is that right?


Re: Segment columns and dimension columns

Posted by ShaoFeng Shi <sh...@apache.org>.
Right. The cuboids that do not contain "partition_date" can be merged.
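
To make the merge behaviour concrete, here is a minimal Python sketch. It is a toy model and not Kylin's implementation: the `merge_cuboid` helper, the `amount` measure, and the sample rows are all assumptions made up for illustration. It treats merging as summing a measure over rows that share the same dimension values.

```python
from collections import defaultdict

def merge_cuboid(segments, dims):
    """Toy merge: sum the hypothetical 'amount' measure over rows
    that share the same values for the given dimensions."""
    merged = defaultdict(int)
    for rows in segments:
        for row in rows:
            key = tuple(row[d] for d in dims)
            merged[key] += row["amount"]
    return dict(merged)

# Two made-up daily segments, loaded by partition_date.
seg_0918 = [{"partition_date": "20180918", "order_date": "20180917", "amount": 10}]
seg_0919 = [{"partition_date": "20180919", "order_date": "20180917", "amount": 5}]

# Cuboid that contains partition_date: keys differ per segment, rows stay apart.
with_partition = merge_cuboid([seg_0918, seg_0919], ["partition_date", "order_date"])

# Cuboid without partition_date: keys coincide, rows collapse into one.
without_partition = merge_cuboid([seg_0918, seg_0919], ["order_date"])

print(len(with_partition), len(without_partition))
```

In this toy model the cuboid keyed by both columns keeps one row per segment, while the cuboid keyed only by "order_date" collapses to a single aggregated row.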

On Tue, Sep 25, 2018 at 11:44 AM, Francis Liang <so...@hotmail.com> wrote:

> So the cuboids in different segments that include “partition_date” in their
> dimensions cannot be merged, since “partition_date” differs between
> segments. Is that right?

-- 
Best regards,

Shaofeng Shi 史少锋

Re: Segment columns and dimension columns

Posted by Francis Liang <so...@hotmail.com>.
So the cuboids in different segments that include “partition_date” in their dimensions cannot be merged, since “partition_date” differs between segments. Is that right?


________________________________
From: ShaoFeng Shi <sh...@apache.org>
Sent: Tuesday, September 25, 2018 11:37:40 AM
To: user
Subject: Re: Segment columns and dimension columns

If “partition_date” is not in the dimension list, you cannot query it in SQL. In that case it serves only as a cursor for loading data into the cube. For queries, please use "ORDER_DATE".


Re: Segment columns and dimension columns

Posted by ShaoFeng Shi <sh...@apache.org>.
If “partition_date” is not in the dimension list, you cannot query it in SQL.
In that case it serves only as a cursor for loading data into the cube. For
queries, please use "ORDER_DATE".
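
The distinction can be sketched with a small Python toy model. This is an illustration under assumptions, not Kylin's API: the cube is reduced to a set of dimension names, measures are ignored, and `columns_not_answerable` is a hypothetical helper.

```python
# Toy model only: the cube is represented as a set of dimension names.
CUBE_DIMENSIONS = {"ORDER_DATE"}      # PARTITION_DATE deliberately left out
PARTITION_COLUMN = "PARTITION_DATE"   # used only to load data by time range

def columns_not_answerable(query_columns, dimensions=CUBE_DIMENSIONS):
    """Return the query columns that are not dimensions of the toy cube."""
    return [c for c in query_columns if c.upper() not in dimensions]

# A filter on the partition column cannot be served by this toy cube.
print(columns_not_answerable(["partition_date"]))

# A filter on order_date can, because it is a dimension.
print(columns_not_answerable(["order_date"]))
```

Here the partition column only decides which rows are loaded into each segment; whether a column can appear in a query depends on the dimension list alone.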

On Tue, Sep 25, 2018 at 11:14 AM, Francis Liang <so...@hotmail.com> wrote:

> Hi Shaofeng, many thanks for your reply. Just to confirm: per your
> example, if “partition_date” is used only as the partition column and is
> not included in the dimension list, will queries that include it work
> exactly the same as if it were both the partition column and in the
> dimension list? If not, what’s the difference? Thanks again! Best, Feng.

-- 
Best regards,

Shaofeng Shi 史少锋