Posted to dev@kylin.apache.org by "ZhouQianhao (JIRA)" <ji...@apache.org> on 2015/10/25 17:55:27 UTC
[jira] [Created] (KYLIN-1094) improve performance of spark cubing
ZhouQianhao created KYLIN-1094:
----------------------------------
Summary: improve performance of spark cubing
Key: KYLIN-1094
URL: https://issues.apache.org/jira/browse/KYLIN-1094
Project: Kylin
Issue Type: Improvement
Components: Spark Engine
Affects Versions: v2.0
Reporter: ZhouQianhao
Assignee: ZhouQianhao
A POC of Spark cubing on a dataset of 150 million records shows that MR is about 100% faster than Spark. However, we believe Spark could be at least as fast as MR, so optimization is needed here.
We are now asking the Spark community for help.
Cluster info:
VMs: 8 nodes * (128 GB memory + 64 cores)
Hadoop cluster: HDP 2.2.6
Spark running mode: yarn-client
Spark version: 1.5.1