Posted to issues@kylin.apache.org by "Shaofeng SHI (JIRA)" <ji...@apache.org> on 2018/01/02 02:37:00 UTC
[jira] [Closed] (KYLIN-3123) Improve Spark Cubing
[ https://issues.apache.org/jira/browse/KYLIN-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shaofeng SHI closed KYLIN-3123.
-------------------------------
Resolution: Incomplete
> Improve Spark Cubing
> --------------------
>
> Key: KYLIN-3123
> URL: https://issues.apache.org/jira/browse/KYLIN-3123
> Project: Kylin
> Issue Type: Improvement
> Components: Spark Engine
> Affects Versions: v2.2.0
> Environment: HDP , Hbase, Spark 2.6, Centos7
> Reporter: vu thanh dat
> Labels: beginner
> Fix For: v2.2.0
>
> Attachments: dimension.bmp, measures.bmp, rowkeys.bmp, spark_so_slow_2.bmp
>
>
> Hi all,
> I'm using Spark to build a Kylin cube.
> The data is about 13 million rows for one step. It is partitioned by date, with 10 dimensions and no measures.
> I set the following config:
> kylin.storage.hbase.compression-codec=snappy
> kylin.engine.spark.rdd-partition-cut-mb=1000
> kylin.engine.spark.max-partition=5000
> kylin.engine.spark-conf.spark.master=yarn
> kylin.engine.spark-conf.spark.submit.deployMode=cluster
> kylin.engine.spark-conf.spark.dynamicAllocation.enabled=true
> kylin.engine.spark-conf.spark.dynamicAllocation.minExecutors=100
> kylin.engine.spark-conf.spark.dynamicAllocation.maxExecutors=10240
> kylin.engine.spark-conf.spark.dynamicAllocation.executorIdleTimeout=300
> kylin.engine.spark-conf.spark.shuffle.service.enabled=true
> kylin.engine.spark-conf.spark.shuffle.service.port=7337
> kylin.engine.spark-conf.spark.yarn.queue=default
> kylin.engine.spark-conf.spark.executor.memory=4G
> kylin.engine.spark-conf.spark.executor.cores=4
> The step "Build Cube with Spark" is very slow, about 1 hour for this step. Can you show me how to customize the Kylin config to speed up this step? I have 30 CentOS servers, with 5.87 TB of storage and 448 cores.
> I have attached my config.
> Best regards and thanks!
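
As a rough sanity check on the settings quoted above, the arithmetic they imply can be sketched in Python. This is only an illustration, not Kylin's actual logic: the partition formula (estimated size divided by the cut, clamped to a min and max) is an assumption about how kylin.engine.spark.rdd-partition-cut-mb is applied, the helper names are hypothetical, and the 2000 MB data size is an invented figure, since the report does not state one.

```python
# Values copied from the reported configuration and environment.
CLUSTER_CORES = 448          # "448 cores" from the report
MIN_EXECUTORS = 100          # spark.dynamicAllocation.minExecutors
MAX_EXECUTORS = 10240        # spark.dynamicAllocation.maxExecutors
EXECUTOR_CORES = 4           # spark.executor.cores
EXECUTOR_MEMORY_GB = 4       # spark.executor.memory=4G
PARTITION_CUT_MB = 1000      # kylin.engine.spark.rdd-partition-cut-mb
MAX_PARTITION = 5000         # kylin.engine.spark.max-partition


def requested_cores(executors, cores_per_executor):
    """Total cores asked for by a given number of executors."""
    return executors * cores_per_executor


def estimated_partitions(size_mb, cut_mb, max_partition, min_partition=1):
    """ASSUMPTION: partitions ~= size / cut, clamped to [min, max].

    This is a simplified model for illustration, not Kylin's code.
    """
    return max(min_partition, min(max_partition, int(size_mb // cut_mb)))


min_cores = requested_cores(MIN_EXECUTORS, EXECUTOR_CORES)
max_cores = requested_cores(MAX_EXECUTORS, EXECUTOR_CORES)
min_memory_gb = MIN_EXECUTORS * EXECUTOR_MEMORY_GB

# HYPOTHETICAL data size; the report gives row counts, not megabytes.
ASSUMED_SIZE_MB = 2000
parts_reported_cut = estimated_partitions(ASSUMED_SIZE_MB, PARTITION_CUT_MB, MAX_PARTITION)
parts_small_cut = estimated_partitions(ASSUMED_SIZE_MB, 10, MAX_PARTITION)

print(f"minExecutors asks for {min_cores} of {CLUSTER_CORES} cores, {min_memory_gb} GB")
print(f"maxExecutors would ask for {max_cores} cores")
print(f"partitions with cut=1000 MB: {parts_reported_cut}; with cut=10 MB: {parts_small_cut}")
```

Under these assumptions, the minimum executor count alone claims most of the cluster's cores, the maximum is far beyond what 448 cores can serve, and a 1000 MB cut would leave only a handful of partitions for the executors to work on. Whether that explains the one-hour step would need to be confirmed against the Spark UI for the actual job.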
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)