Posted to issues@kylin.apache.org by "Luke Han (JIRA)" <ji...@apache.org> on 2015/11/05 02:22:27 UTC

[jira] [Updated] (KYLIN-1094) improve performance of spark cubing

     [ https://issues.apache.org/jira/browse/KYLIN-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Han updated KYLIN-1094:
----------------------------
    Fix Version/s: v2.1

> improve performance of spark cubing
> -----------------------------------
>
>                 Key: KYLIN-1094
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1094
>             Project: Kylin
>          Issue Type: Sub-task
>          Components: Spark Engine
>    Affects Versions: v2.0
>            Reporter: ZhouQianhao
>            Assignee: ZhouQianhao
>             Fix For: v2.1
>
>
> POC results for Spark cubing show that, on a dataset of 150 million records, MR is about 100% faster than Spark. However, we believe Spark could be at least as fast as MR, so optimization is needed here.
> We are now asking the Spark community for help.
> Cluster info:
> VMs: 8 nodes * (128 GB memory + 64 cores)
> Hadoop cluster: HDP 2.2.6
> Spark running mode: yarn-client
> Spark version: 1.5.1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)