Posted to issues@kylin.apache.org by "Luke Han (JIRA)" <ji...@apache.org> on 2015/11/05 02:22:27 UTC
[jira] [Updated] (KYLIN-1094) improve performance of spark cubing
[ https://issues.apache.org/jira/browse/KYLIN-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Luke Han updated KYLIN-1094:
----------------------------
Fix Version/s: v2.1
> improve performance of spark cubing
> -----------------------------------
>
> Key: KYLIN-1094
> URL: https://issues.apache.org/jira/browse/KYLIN-1094
> Project: Kylin
> Issue Type: Sub-task
> Components: Spark Engine
> Affects Versions: v2.0
> Reporter: ZhouQianhao
> Assignee: ZhouQianhao
> Fix For: v2.1
>
>
> The POC result for Spark cubing shows that, on a dataset of 150 million records, MR is about 100% faster than Spark. However, we believe that Spark could be at least as fast as MR, so optimization is needed here.
> We are now asking the Spark community for help.
> Cluster info:
> vm: 8 nodes * (128G mem + 64 cores)
> hadoop cluster: hdp 2.2.6
> spark running mode: yarn-client
> spark version: 1.5.1
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)