Posted to issues@beam.apache.org by "Kyle Weaver (Jira)" <ji...@apache.org> on 2021/04/01 23:16:00 UTC

[jira] [Comment Edited] (BEAM-11483) Spark PostCommit Test Improvements

    [ https://issues.apache.org/jira/browse/BEAM-11483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17313471#comment-17313471 ] 

Kyle Weaver edited comment on BEAM-11483 at 4/1/21, 11:15 PM:
--------------------------------------------------------------

[~thiscensustaker] To clarify, these test failures are in the Spark portable streaming runner, which is tested by the Jenkins job beam_PostCommit_Java_PVR_Spark_Streaming [0]. This runner has been largely neglected for a while, which is why these regressions crept in. It is known to be missing important functionality, but ideally the tests should at least make clear which functionality is actually missing (as opposed to failing because of test setup issues, etc.). It looks like the test suite last passed on Sep 10, 2020 and has been failing ever since [1].

I tried running the tests locally. The commands are "./gradlew :runners:spark:2:job-server:validatesPortableRunnerStreaming" and "./gradlew :runners:spark:3:job-server:validatesPortableRunnerStreaming" for Spark versions 2 and 3, respectively. The only tests that failed for me were GroupByKeyTest$WindowTests. I wrote a PR to exclude those [2].
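As a rough sketch, the two commands above could be wrapped in a small script that runs each Spark version in its own Gradle invocation, which makes it easier to see which version a failure belongs to. The function names and the DRY_RUN variable are hypothetical and not part of the Beam build; only the task paths come from this comment.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are assumptions, not Beam tooling.

# Build the Gradle task path for one Spark major version.
# The task name is the one quoted in the comment above.
spark_streaming_task() {
  echo ":runners:spark:$1:job-server:validatesPortableRunnerStreaming"
}

# Run the streaming suite for Spark 2 and then Spark 3, one Gradle
# invocation per version. With DRY_RUN=1 (the default) the commands are
# only printed; set DRY_RUN=0 from the Beam repository root to run them.
run_spark_streaming_suites() {
  rc=0
  for version in 2 3; do
    task="$(spark_streaming_task "$version")"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "./gradlew $task"
    else
      # Keep going after a failure so both versions are always exercised.
      ./gradlew "$task" || rc=1
    fi
  done
  return "$rc"
}
```

Calling run_spark_streaming_suites with the default settings prints the two ./gradlew commands without executing them; with DRY_RUN=0 it returns non-zero if either version fails.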

So why are so many tests flaking on Jenkins then? I’m not sure, but we previously had a problem with running Spark 2 and 3 together in a different test suite [3], so there may be a similar problem here. The simplest workaround would be to split Spark 2 and 3 into separate test suites and see if they pass. The job is defined in [4].

[0] [https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Streaming]

[1] [https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Streaming/75/]

[2] [https://github.com/apache/beam/pull/14405]

[3] BEAM-11992

[4] [https://github.com/apache/beam/blob/44b7a87c5009315570864036baba27a303ca5eff/.test-infra/jenkins/job_PostCommit_Java_PortableValidatesRunner_Spark_Streaming.groovy#L39-L40]


> Spark PostCommit Test Improvements
> ----------------------------------
>
>                 Key: BEAM-11483
>                 URL: https://issues.apache.org/jira/browse/BEAM-11483
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-spark, test-failures
>            Reporter: Tyson Hamilton
>            Assignee: Fernando Morales
>            Priority: P1
>              Labels: flake, portability-spark
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Master bug for a group of the top failing Spark postcommit tests as of 12/17.


