Posted to user@kylin.apache.org by Alberto Ramón <a....@gmail.com> on 2016/12/08 13:27:52 UTC

Cut Size

I'm reading this mailing list thread
<http://apache-kylin.74782.x6.nabble.com/Update-default-config-for-sandbox-environment-td6561.html>
and have some questions about these settings (example
<https://github.com/apache/kylin/blob/master/examples/test_case_data/sandbox/kylin.properties#L99>
):

region-cut-gb
max-region-count
hfile-size-gb

When hfile-size-gb is set, are the HFiles re-split using max-region-count and
region-cut-gb? Or is it for normal ingestion (KYLIN-1323)?

Are the capacity markers (small, medium, ...) deprecated (KYLIN-1669
<https://issues.apache.org/jira/browse/KYLIN-1669>)? The example still says:
"# E.g, for cube whose capacity be marked as "SMALL", split region per 10GB
by default".

Re: Cut Size

Posted by ShaoFeng Shi <sh...@apache.org>.
OK, let me explain. The default cut is 5 GB per region. If the cube's
estimated size is 3000 GB, that would give 3000/5 = 600 regions. However,
since "max-region-count" is set to 500, the number of regions in one HTable
is kept at <= 500, so in the end the regions are split at 3000/500 = 6 GB
each. That is the cap: it controls the maximum region count in one table.
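
The arithmetic above can be sketched in a few lines of Python. This is only
an illustration of the calculation described in this message, not Kylin's
actual code: the function name plan_regions and its parameter names are
hypothetical, and rounding the region count up is an assumption.

```python
import math


def plan_regions(estimated_size_gb, region_cut_gb=5.0, max_region_count=500):
    """Return (region_count, effective_cut_gb) for one HTable.

    Hypothetical helper mirroring the explanation in this thread:
    divide the estimated cube size by the cut size, then cap the
    result at max_region_count.
    """
    if estimated_size_gb <= 0 or region_cut_gb <= 0 or max_region_count < 1:
        raise ValueError("sizes must be positive and max_region_count >= 1")

    # Regions needed at the configured cut size (rounding up is an assumption).
    wanted = max(1, math.ceil(estimated_size_gb / region_cut_gb))

    # The cap: never create more than max_region_count regions.
    region_count = min(wanted, max_region_count)

    # When the cap applies, each region ends up larger than region_cut_gb.
    effective_cut_gb = estimated_size_gb / region_count
    return region_count, effective_cut_gb


print(plan_regions(3000))  # (500, 6.0): capped, so each region holds 6 GB
print(plan_regions(1000))  # (200, 5.0): under the cap, default 5 GB cut
```

With the numbers from this message, the 600 regions wanted are capped at 500,
which raises the effective cut from 5 GB to 6 GB.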

For "kylin.storage.hbase.hfile-size-gb", see
https://issues.apache.org/jira/browse/KYLIN-1323

2016-12-12 20:22 GMT+08:00 Alberto Ramón <a....@gmail.com>:

> "it will do a cap": I don't know what "cap" means here :)
>
> Then what is the function of "kylin.storage.hbase.hfile-size-gb=2"?


-- 
Best regards,

Shaofeng Shi 史少锋

Re: Cut Size

Posted by Alberto Ramón <a....@gmail.com>.
"it will do a cap": I don't know what "cap" means here :)

Then what is the function of "kylin.storage.hbase.hfile-size-gb=2"?

2016-12-12 2:58 GMT+01:00 ShaoFeng Shi <sh...@apache.org>:

> when you have hfile-size-gb, you re-split HFile using max-region-count and
> region-cut-gb ?
>
> --> Yes; Kylin will estimate the total size, then divide it by
> "region-cut-gb" to get the region number; if the region number exceeds
> "max-region-count", it is capped at "max-region-count".
>
> Medium, small, ... is deprecated (KYLIN-1669
> <https://issues.apache.org/jira/browse/KYLIN-1669>)?
> --> Yes, that marker has been removed; the same split configuration is
> used for all cubes. A user who wants to customize it can override the
> config values at cube level.

Re: Cut Size

Posted by ShaoFeng Shi <sh...@apache.org>.
when you have hfile-size-gb, you re-split HFile using max-region-count and
region-cut-gb ?

--> Yes; Kylin will estimate the total size, then divide it by
"region-cut-gb" to get the region number; if the region number exceeds
"max-region-count", it is capped at "max-region-count".

Medium, small, ... is deprecated (KYLIN-1669
<https://issues.apache.org/jira/browse/KYLIN-1669>)?
--> Yes, that marker has been removed; the same split configuration is used
for all cubes. A user who wants to customize it can override the config
values at cube level.




-- 
Best regards,

Shaofeng Shi 史少锋