Posted to user@kylin.apache.org by Roberto Tardío <ro...@stratebi.com> on 2018/02/06 17:02:44 UTC
Segment management and auto merging
Hi,
I have to build a big cube: about 400 M rows of historical data and
many dimensions, on a small-to-mid-size cluster. To avoid one very long
build, I split the historical load into monthly periods (about
30-40 M rows per month). When this load finishes, an hourly load
process will begin. We will then have several historical monthly
segments, followed by new incremental hourly segments. This scenario
raises the following questions:
* Do you recommend merging all the historical segments?
o Sometimes we will need to rebuild one of the last six
months. Given the cube size, we thought it would be faster to
rebuild just a single monthly segment.
* I'm going to define the following auto merge thresholds for the
hourly incremental load, once all historical data is loaded:
o 1 day
o 7 days
o 28 days
o If I understand correctly, this means that:
+ Every day, all hourly segments will be merged.
+ Every 7 days, all daily segments will be merged.
+ Every 28 days, all 7-day segments will be merged.
o This config raises two questions:
+ Will the 28-day segments ever be merged automatically?
+ Will our big historical segments be merged automatically?
* I thought that maybe I need to develop a script that merges segments
as needed (using the Kylin REST API), instead of using Kylin's cube auto
merge option.
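To make the last point concrete, here is a minimal sketch of what such a script might look like. It only prepares the HTTP request for a merge job and does not send it. The host, cube name and credentials are hypothetical placeholders, and the endpoint path and JSON fields (`startTime`, `endTime`, `buildType`) are my assumption of how the Kylin REST API expects a merge request, so they should be checked against the documentation for the Kylin version in use. The thresholds are the 1, 7 and 28 day values listed above, converted to milliseconds, which I assume is the unit used by `auto_merge_time_ranges` in the cube descriptor.

```python
import base64
import json
import urllib.request
from datetime import datetime, timezone

DAY_MS = 24 * 60 * 60 * 1000

# The 1 / 7 / 28 day thresholds from this message, in milliseconds
# (assumed unit of "auto_merge_time_ranges" in the cube descriptor).
AUTO_MERGE_TIME_RANGES = [1 * DAY_MS, 7 * DAY_MS, 28 * DAY_MS]


def to_epoch_ms(year, month, day):
    """Return midnight UTC of the given date as epoch milliseconds."""
    moment = datetime(year, month, day, tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)


def build_merge_request(base_url, cube, start_ms, end_ms, user, password):
    """Prepare (but do not send) a request asking Kylin to merge the
    segments covering [start_ms, end_ms).

    The endpoint path and body fields are assumptions to verify against
    the REST API docs of the installed Kylin version.
    """
    body = json.dumps(
        {"startTime": start_ms, "endTime": end_ms, "buildType": "MERGE"}
    ).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/kylin/api/cubes/{cube}/build",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json;charset=UTF-8",
        },
    )


if __name__ == "__main__":
    # Hypothetical example: merge the January and February 2017 segments.
    request = build_merge_request(
        "http://localhost:7070", "my_cube",
        to_epoch_ms(2017, 1, 1), to_epoch_ms(2017, 3, 1),
        "ADMIN", "KYLIN",
    )
    print(request.get_method(), request.full_url)
    # To actually submit the job: urllib.request.urlopen(request)
```

With a script like this, the monthly historical segments could be left unmerged (so that a single month can be rebuilt cheaply) while the script merges only the ranges we choose.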
Thanks in advance,
Roberto
--
*Roberto Tardío Olmos*
/Senior Big Data & Business Intelligence Consultant/
Avenida de Brasil, 17, Planta 16.28020 Madrid
Fijo: 91.788.34.10