Posted to commits@hudi.apache.org by "pengzhiwei (Jira)" <ji...@apache.org> on 2020/11/30 13:50:00 UTC

[jira] [Created] (HUDI-1425) Performance loss with the additional hoodieRecords.isEmpty() in HoodieSparkSqlWriter#write

pengzhiwei created HUDI-1425:
--------------------------------

             Summary: Performance loss with the additional hoodieRecords.isEmpty() in HoodieSparkSqlWriter#write
                 Key: HUDI-1425
                 URL: https://issues.apache.org/jira/browse/HUDI-1425
             Project: Apache Hudi
          Issue Type: Improvement
          Components: Spark Integration
            Reporter: pengzhiwei
             Fix For: 0.6.1
         Attachments: 截屏2020-11-30 下午9.47.55.png

Currently in HoodieSparkSqlWriter#write, there is an _isEmpty()_ check on _hoodieRecords_. This can be an expensive operation when _hoodieRecords_ is built from a complex chain of RDD operations.

!截屏2020-11-30 下午9.47.55.png|width=1255,height=161!

IMO this check does nothing to improve performance; rather, it hurts performance.
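
To illustrate the concern, here is a minimal pure-Python simulation, not Spark or Hudi code. It assumes _hoodieRecords_ behaves like an uncached, lazily evaluated dataset whose pipeline contains a step that must read all input before emitting anything (the de-duplication below is a hypothetical stand-in for any shuffle-like operation). The names LazyRecords, write_with_check and write_without_check are invented for this sketch. Under those assumptions, the emptiness check runs the whole pipeline once, and the write then runs it again.

```python
class LazyRecords:
    """Hypothetical stand-in for an uncached, lazily evaluated RDD."""

    def __init__(self, rows):
        self._rows = rows
        self.rows_scanned = 0  # how many input rows the pipeline has read so far

    def _compute(self):
        # Assumed "wide" step: keep the last value per key. Every input row
        # must be read before the first output record can be produced.
        latest = {}
        for key, value in self._rows:
            self.rows_scanned += 1
            latest[key] = value
        return iter(latest.items())

    def is_empty(self):
        # Needs only one record, but still pays for the full upstream scan.
        return next(self._compute(), None) is None

    def collect(self):
        # Nothing is cached, so this recomputes the pipeline from scratch.
        return list(self._compute())


def write_with_check(records):
    # Mirrors the pattern in the report: test for emptiness, then write.
    if records.is_empty():
        return []
    return records.collect()


def write_without_check(records):
    # Writes directly; an empty input simply yields an empty result.
    return records.collect()


rows = [("a", 1), ("b", 2), ("a", 3)]

checked = LazyRecords(rows)
write_with_check(checked)
print(checked.rows_scanned)    # 6: the input was scanned twice

unchecked = LazyRecords(rows)
write_without_check(unchecked)
print(unchecked.rows_scanned)  # 3: the input was scanned once
```

In this sketch both variants return the same records; only the amount of upstream work differs. Whether the real job shows the same doubling depends on whether _hoodieRecords_ is cached and on the shape of its lineage.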

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)