Posted to issues@kylin.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/04/06 08:29:00 UTC

[jira] [Commented] (KYLIN-4903) cache parent datasource to accelerate next layer's cuboid building

    [ https://issues.apache.org/jira/browse/KYLIN-4903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17315344#comment-17315344 ] 

ASF GitHub Bot commented on KYLIN-4903:
---------------------------------------

zzcclp commented on a change in pull request #1583:
URL: https://github.com/apache/kylin/pull/1583#discussion_r607641622



##########
File path: kylin-spark-project/kylin-spark-engine/src/main/scala/org/apache/kylin/engine/spark/job/BuildLayoutWithUpdate.java
##########
@@ -38,8 +46,45 @@
     private ExecutorService pool = Executors.newCachedThreadPool();
     private CompletionService<JobResult> completionService = new ExecutorCompletionService<>(pool);
     private int currentLayoutsNum = 0;
+    private Map<Long, AtomicLong> toBuildCuboidSize = new ConcurrentHashMap<>();
+    private Semaphore semaphore;
+    private Map<Long, Dataset<Row>> layout2DataSet = new ConcurrentHashMap<>();
+    private KylinConfig kylinConfig;
+    private boolean persistParentDataset;
+
+    public BuildLayoutWithUpdate(KylinConfig kylinConfig) {
+        this.kylinConfig = kylinConfig;
+        this.persistParentDataset = !kylinConfig.getParentDatasetStorageLevel().equals(StorageLevel.NONE());

Review comment:
       `kylinConfig.getParentDatasetStorageLevel().equals(StorageLevel.NONE())` will always return false, because a
    String can never be equal to a StorageLevel, so `persistParentDataset` ends up always true. It needs to use
    `StorageLevel.fromString(value).equals(StorageLevel.NONE())`.
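
To illustrate the comparison problem described above, here is a minimal, self-contained sketch. It does not use Spark: `Level` is a hypothetical stand-in for `StorageLevel`, and `persistBuggy` / `persistFixed` are illustrative names that are not part of the pull request. It assumes, as the comment states, that the config getter returns a String.

```java
public class StorageLevelCompareSketch {

    // Hypothetical stand-in for Spark's StorageLevel (assumption: not the real class).
    static final class Level {
        static final Level NONE = new Level("NONE");
        static final Level MEMORY_ONLY = new Level("MEMORY_ONLY");

        private final String name;

        private Level(String name) {
            this.name = name;
        }

        // Maps a configured name to the shared constant, loosely mirroring
        // the idea behind StorageLevel.fromString.
        static Level fromString(String s) {
            switch (s) {
                case "NONE":
                    return NONE;
                case "MEMORY_ONLY":
                    return MEMORY_ONLY;
                default:
                    throw new IllegalArgumentException("Invalid level: " + s);
            }
        }
    }

    // Mirrors the reviewed line: a String is compared with a non-String object.
    // String.equals(Object) is false for any non-String argument, so the
    // negation is always true.
    public static boolean persistBuggy(String configured) {
        Object none = Level.NONE;
        return !configured.equals(none);
    }

    // Mirrors the suggested fix: convert the String first, then compare like with like.
    public static boolean persistFixed(String configured) {
        return !Level.fromString(configured).equals(Level.NONE);
    }

    public static void main(String[] args) {
        System.out.println(persistBuggy("NONE"));        // true  (the bug: NONE is ignored)
        System.out.println(persistFixed("NONE"));        // false
        System.out.println(persistFixed("MEMORY_ONLY")); // true
    }
}
```

With the buggy form, configuring `NONE` still enables persisting; with the converted form, `NONE` disables it as intended.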





> cache parent datasource to accelerate next layer's cuboid building
> ------------------------------------------------------------------
>
>                 Key: KYLIN-4903
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4903
>             Project: Kylin
>          Issue Type: Improvement
>    Affects Versions: v4.0.0-beta
>            Reporter: ShengJun Zheng
>            Assignee: ShengJun Zheng
>            Priority: Major
>             Fix For: v4.0.0-GA
>
>
> In Kylin v4, the parent data source is not cached while the next layer's cuboids are built, which causes the same HDFS files to be read repeatedly. Caching the parent data source in memory improves build performance by 20~30% in our case.


