Posted to commits@hudi.apache.org by "Raymond Xu (Jira)" <ji...@apache.org> on 2020/06/09 02:37:00 UTC
[jira] [Comment Edited] (HUDI-781) Re-design test utilities
[ https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128778#comment-17128778 ]
Raymond Xu edited comment on HUDI-781 at 6/9/20, 2:36 AM:
----------------------------------------------------------
[~yanghua] [~vinoth] [~nishith29] [~garyli1019]
Here is an execution plan for the subtasks:
* To begin with, I'm trying to finish subtask #1 as it can be a quick win. As shown in [https://github.com/apache/hudi/pull/1619#issuecomment-627610722], we can reduce CI time by 10+ min simply by splitting the test tasks.
* In parallel, we can start #3. The proposed `hudi-testutils` module is meant to encompass the `testutils` from each module, which makes the test dependencies clearer. It will also let us clean up some misplaced tests found during the package restructuring:
** org.apache.hudi.execution.TestBoundedInMemoryQueue in `hudi-client` should be moved to `hudi-common` (it sits in `hudi-client` because of its dependency on the client test harness)
** org.apache.hudi.utilities.inline.fs.TestParquetInLining in `hudi-utilities` should be moved to `hudi-common` (it sits in `hudi-utilities` because of its dependency on the data generator)
* Once a minimal setup of `hudi-testutils` is done, we can start #4
** Implement a shared Spark session provider there
** Use the shared Spark session provider for test suites, which group functional tests with similar setup/teardown logic (we may need to figure out the JUnit 5 equivalent of JUnit 4 test suites with Rule / ClassRule)
** By switching functional tests over to the new provider class one by one, we should start observing reduced test time in the hudi-client module and others
* #2 and #5 can be done in parallel
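To make the shared provider idea in #4 a bit more concrete, here is a rough sketch of what I have in mind. It is only an illustration: `SharedSessionProvider` is a hypothetical name, and a plain `Supplier` stands in for the real Spark session setup so the snippet runs without Spark on the classpath. The point is just that the expensive setup runs once and every test gets the same instance.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: lazily creates one expensive resource (standing in
 * for a Spark session) and hands the same instance to every caller.
 */
public final class SharedSessionProvider<T> {
  private final Supplier<T> factory; // assumed to wrap the costly setup
  private T session;                 // guarded by "this"

  public SharedSessionProvider(Supplier<T> factory) {
    this.factory = factory;
  }

  /** Runs the factory on first use, then returns the cached instance. */
  public synchronized T get() {
    if (session == null) {
      session = factory.get();
    }
    return session;
  }

  public static void main(String[] args) {
    // Counts how often the expensive setup actually runs.
    AtomicInteger setups = new AtomicInteger();
    SharedSessionProvider<String> provider =
        new SharedSessionProvider<>(() -> "session-" + setups.incrementAndGet());

    String first = provider.get();  // e.g. requested by one functional test
    String second = provider.get(); // e.g. requested by another test

    System.out.println(first == second); // true: same instance is reused
    System.out.println(setups.get());    // 1: setup ran only once
  }
}
```

In the actual module the factory would presumably build the Spark session, and the provider would be hooked into the test lifecycle (a JUnit 5 extension or similar); that wiring is still to be worked out.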
Each subtask has its own detailed points in its ticket. Please review this rough plan and give feedback accordingly. Thanks!
> Re-design test utilities
> ------------------------
>
> Key: HUDI-781
> URL: https://issues.apache.org/jira/browse/HUDI-781
> Project: Apache Hudi
> Issue Type: Test
> Components: Testing
> Reporter: Raymond Xu
> Priority: Major
>
> Test utility classes are to be re-designed with considerations like
> * Use more mocking
> * Reduce spark context setup
> * Improve/clean up data generator
> An RFC would be preferred for illustrating the design work.