Posted to reviews@spark.apache.org by mgaido91 <gi...@git.apache.org> on 2018/06/14 17:07:40 UTC

[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

GitHub user mgaido91 opened a pull request:

    https://github.com/apache/spark/pull/21568

    [SPARK-24562][TESTS] Support different configs for same test in SQLQueryTestSuite

    ## What changes were proposed in this pull request?
    
    The PR proposes to add support for running the same SQL test input files against different configs, leading either to the same result or to a different one.
    
    ## How was this patch tested?
    
    Involved UTs


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mgaido91/spark SPARK-24562

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21568.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21568
    
----
commit ed01ff0d40fbe65b3a239b196a90013119ad3580
Author: Marco Gaido <ma...@...>
Date:   2018-06-13T15:57:17Z

    Different config for same test in SQLQueryTestSuite

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    Do we need to strictly handle the second case, too? If we accepted the same output files for that case, we could have the simpler output file name rules I described [here](https://github.com/apache/spark/pull/21568#discussion_r195892983).


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    We can deal with the decimal test file specially if that's the only use case. For now I'd say the join test is more important, so let's finish it first.


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r195920341
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -104,11 +105,34 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
                           // We should ignore this file from processing.
       )
     
    +  private val configsAllJoinTypes = Seq(Seq(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key ->
    --- End diff --
    
    Sorry, I am not sure what you mean exactly. Every config has its own description, which explains what it is intended for. Did you mean something different?


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r201738385
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -138,18 +139,58 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
       private def runTest(testCase: TestCase): Unit = {
         val input = fileToString(new File(testCase.inputFile))
     
    +    val (comments, code) = input.split("\n").partition(_.startsWith("--"))
    +    val configSets = {
    +      val configLines = comments.filter(_.startsWith("--SET")).map(_.substring(5))
    +      val configs = configLines.map(_.split(",").map { confAndValue =>
    +        val (conf, value) = confAndValue.span(_ != '=')
    +        conf.trim -> value.substring(1).trim
    +      })
    +      // When we are regenerating the golden files we don't need to run all the configs as they
    +      // all need to return the same result
    +      if (regenerateGoldenFiles && configs.nonEmpty) {
    +        configs.take(1)
    +      } else {
    +        configs
    +      }
    +    }
         // List of SQL queries to run
    -    val queries: Seq[String] = {
    -      val cleaned = input.split("\n").filterNot(_.startsWith("--")).mkString("\n")
    -      // note: this is not a robust way to split queries using semicolon, but works for now.
    -      cleaned.split("(?<=[^\\\\]);").map(_.trim).filter(_ != "").toSeq
    +    // note: this is not a robust way to split queries using semicolon, but works for now.
    +    val queries = code.mkString("\n").split("(?<=[^\\\\]);").map(_.trim).filter(_ != "").toSeq
    +
    +    if (configSets.isEmpty) {
    +      runQueries(queries, testCase.resultFile, None)
    +    } else {
    +      configSets.foreach { configSet =>
    +        try {
    --- End diff --
    
    Hmm, but here too we know which configs cause a failure (and we are logging them). Am I missing something in your comment?
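The `--SET` handling in the diff above can be sketched as a standalone helper (a hypothetical simplification of the patch's parsing logic, with the same split/span mechanics): each `--SET k1=v1,k2=v2` comment line yields one config set.

```scala
object ConfigSetParser {
  // Hypothetical standalone version of the "--SET" parsing in the diff:
  // comment lines starting with "--SET" are split on ',' into key=value
  // pairs; each such line becomes one config set.
  def parse(input: String): Seq[Seq[(String, String)]] = {
    val (comments, _) = input.split("\n").partition(_.startsWith("--"))
    comments.filter(_.startsWith("--SET")).map { line =>
      line.substring(5).split(",").map { confAndValue =>
        // span splits "key=value" at the first '='
        val (conf, value) = confAndValue.span(_ != '=')
        conf.trim -> value.substring(1).trim
      }.toSeq
    }.toSeq
  }
}
```

Like the original, this is not robust (a `--SET` line without `=` would throw), but it mirrors the behavior under discussion.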


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r195893037
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -104,11 +105,34 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
                           // We should ignore this file from processing.
       )
     
    +  private val configsAllJoinTypes = Seq(Seq(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key ->
    +    SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.defaultValueString),
    +    Seq(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1",
    +      SQLConf.PREFER_SORTMERGEJOIN.key -> "true"),
    +    Seq(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1",
    +      SQLConf.PREFER_SORTMERGEJOIN.key -> "false")) -> true
    +
    +  /**
    +   * Maps a test with the set of configurations it has to run with and a flag indicating whether
    +   * the output must be the same with different configs or it has to be different.
    +   */
    +  private val testConfigs: Map[String, (Seq[Seq[(String, String)]], Boolean)] = Map(
    +    "typeCoercion/native/decimalArithmeticOperations.sql" ->
    +      (Seq(Seq(SQLConf.DECIMAL_OPERATIONS_ALLOW_PREC_LOSS.key -> "true"),
    +        Seq(SQLConf.DECIMAL_OPERATIONS_ALLOW_PREC_LOSS.key -> "false")) -> false),
    +    "subquery/in-subquery/in-joins.sql" -> configsAllJoinTypes,
    --- End diff --
    
    This entry and the others below don't change the existing test result output?


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    Sorry, what do you mean by a config matrix? And how would we distinguish whether the configs should produce the same result or different ones?


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91859/
    Test PASSed.


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    Sorry, @maropu, I didn't get what you mean by
    
    > If we accepted the same output files for that case
    
    could you please explain?
    
    Anyway, the problem with that proposal is not really about the filenames, but about adding the configs inside the files as comments: if we have the same output file, we cannot include them in the header...


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    It's good to cover both cases in one design, but I'd like to prioritize the join one.
    
    I feel it's common to try different optimization/runtime configs and make sure we get correct results. It's more important than the decimal one, which just saves some typing.
    
    It seems hard to reach a consensus on a good design covering both cases, so how about we just do the join one? I.e., a SQL test file can specify a config matrix (we need to design a syntax for it), and the test framework runs the test file with the specified configs and values to make sure the results all match the golden file.


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    The goal here was to address both cases. The need came up in previous PRs, where the same queries were copied and pasted in order to test them with different configs. So the idea was to build an infra that covers both cases and avoids this copy-and-paste (see decimalOperations.sql, for instance).


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    @rxin then we can do what @maropu suggested, i.e. add a numeric suffix and maybe log the configs used, so that even though we don't have them in the test name we can still know which of them is failing. What do you think? Do you have a better proposal?


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r195921297
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -104,11 +105,34 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
                           // We should ignore this file from processing.
       )
     
    +  private val configsAllJoinTypes = Seq(Seq(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key ->
    --- End diff --
    
    I see now, got it, thanks. I am not sure as we do not usually do so in other test suites. @cloud-fan @maropu what do you think?


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    The other main use case here was to also improve the coverage of the join operations, because with the default values we test only the broadcast join.


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    If they produce different results, why do you need any infrastructure for them? They are just part of the normal test flow.
    
    If they produce the same result and you don't want to define the same test queries twice, we can create an infra for that. I thought that's what this is about?


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r195895110
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -104,11 +105,34 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
                           // We should ignore this file from processing.
       )
     
    +  private val configsAllJoinTypes = Seq(Seq(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key ->
    +    SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.defaultValueString),
    +    Seq(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1",
    +      SQLConf.PREFER_SORTMERGEJOIN.key -> "true"),
    +    Seq(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1",
    +      SQLConf.PREFER_SORTMERGEJOIN.key -> "false")) -> true
    +
    +  /**
    +   * Maps a test with the set of configurations it has to run with and a flag indicating whether
    +   * the output must be the same with different configs or it has to be different.
    +   */
    +  private val testConfigs: Map[String, (Seq[Seq[(String, String)]], Boolean)] = Map(
    +    "typeCoercion/native/decimalArithmeticOperations.sql" ->
    +      (Seq(Seq(SQLConf.DECIMAL_OPERATIONS_ALLOW_PREC_LOSS.key -> "true"),
    +        Seq(SQLConf.DECIMAL_OPERATIONS_ALLOW_PREC_LOSS.key -> "false")) -> false),
    +    "subquery/in-subquery/in-joins.sql" -> configsAllJoinTypes,
    --- End diff --
    
    Exactly, changing these configs should not change the output.


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r195987686
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -250,11 +278,33 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
       }
     
       private def listTestCases(): Seq[TestCase] = {
    -    listFilesRecursively(new File(inputFilePath)).map { file =>
    -      val resultFile = file.getAbsolutePath.replace(inputFilePath, goldenFilePath) + ".out"
    -      val absPath = file.getAbsolutePath
    -      val testCaseName = absPath.stripPrefix(inputFilePath).stripPrefix(File.separator)
    -      TestCase(testCaseName, absPath, resultFile)
    +    listFilesRecursively(new File(inputFilePath)).flatMap { file =>
    +      testCases(file.getAbsolutePath)
    +    }
    +  }
    +
    +  private def testCases(inputPath: String): Seq[TestCase] = {
    +    val baseResultFileName = inputPath.replace(inputFilePath, goldenFilePath)
    +    val testCaseName = inputPath.stripPrefix(inputFilePath).stripPrefix(File.separator)
    +    testConfigs.get(testCaseName) match {
    --- End diff --
    
    If we need to take care of this issue, how about listing all the test files inside `SQLQueryTestSuite.scala`? Then, if developers add/delete/rename test files, they would need to update the list.


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r195920498
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -104,11 +105,34 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
                           // We should ignore this file from processing.
       )
     
    +  private val configsAllJoinTypes = Seq(Seq(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key ->
    --- End diff --
    
    Oh, I meant that since here you definitely want to test against different join types, it may be good to describe which join type each config set corresponds to.


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/149/
    Test PASSed.


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    **[Test build #91859 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91859/testReport)** for PR 21568 at commit [`ed01ff0`](https://github.com/apache/spark/commit/ed01ff0d40fbe65b3a239b196a90013119ad3580).


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    In the same-result case, I'm worried that we cannot easily understand which SQL configs cause failures. IMHO `withSQLConf` has the same issue, too:
    ```
    Seq(true, false).foreach { flag =>
      withSQLConf("spark.sql.anyFlag" -> flag.toString) {
        ...
      }
    }
    ```
    In this test case (a common pattern?), we cannot tell at first glance which case (true or false) causes a failure. For example, could we use `withClue` to solve this?
    https://github.com/apache/spark/compare/master...maropu:AddConfigInfoInException
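The idea can be illustrated without any ScalaTest dependency. Below is a minimal, hypothetical stand-in for `withClue` (ScalaTest's real version lives in `org.scalatest.Assertions`): it prefixes an assertion failure message with the config values under test, so the failing flag value is visible at first glance.

```scala
object ClueExample {
  // Hypothetical stand-in for ScalaTest's withClue: re-throw assertion
  // failures with the clue prepended to the message.
  def withClue[T](clue: String)(body: => T): T =
    try body
    catch {
      case e: AssertionError =>
        throw new AssertionError(clue + e.getMessage, e)
    }

  def main(args: Array[String]): Unit = {
    Seq(true, false).foreach { flag =>
      withClue(s"spark.sql.anyFlag=$flag: ") {
        // test body; on failure the message names the failing flag value
        assert(flag || !flag)
      }
    }
    println("all config cases passed")
  }
}
```

A failure inside the block then reports e.g. `spark.sql.anyFlag=false: assertion failed` instead of a bare `assertion failed`.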


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    IIUC this PR modifies the code to run a single test file multiple times in `SQLQueryTestSuite` with different configurations.


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    LGTM


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    To me it is actually confusing to have the decimal one in there at all, by defining a list of queries that are reused for different functional testing. It is very easy to just ignore the subtle differences.
    
    We also risk over-engineering this with only one use case.
    



---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    kindly ping @cloud-fan 


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    Can you just define a config matrix at the beginning of the file, so that each file is run with its config matrix?
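Under the `--SET` comment syntax discussed elsewhere in this thread, such a matrix could look like the following hypothetical test-file header (the config keys are real SQLConf keys, but the file itself is illustrative). Each `--SET` line defines one config set; the framework runs every query once per set and expects the output to match the single golden file:

```sql
-- Hypothetical header: three config sets exercising different join strategies.
--SET spark.sql.autoBroadcastJoinThreshold=10485760
--SET spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=true
--SET spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=false

SELECT t1.id FROM t1 JOIN t2 ON t1.id = t2.id;
```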


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    **[Test build #91859 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91859/testReport)** for PR 21568 at commit [`ed01ff0`](https://github.com/apache/spark/commit/ed01ff0d40fbe65b3a239b196a90013119ad3580).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    I am not sure it is a great idea to handle only one of the two scenarios if we plan to later include both of them, as we might have to do the same work twice. But if you all agree on this plan, I'll stick to it.


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    thanks, merging to master!


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    What are the use cases other than decimal? I am not sure if we need to build a lot of infrastructure just for one or two use cases.



---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    @cloud-fan @rxin I updated the PR to handle only the case in which different configs produce the same result, with the configs specified in the SQL files.
    To make it clear which config caused a job failure, I added an ERROR message to the logs.
    
    Do you all agree with this approach?


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r195991799
  
    --- Diff: sql/core/src/test/resources/sql-tests/results/typeCoercion/native/decimalArithmeticOperations.sql_spark.sql.decimalOperations.allowPrecisionLoss-false.out ---
    @@ -0,0 +1,193 @@
    +-- Automatically generated by SQLQueryTestSuite
    --- End diff --
    
    I was thinking the second case would also output multiple, **identical** result files, one per config.
    ```
    // these files have the same result
    subquery/in-subquery/in-joins.sql.out.1 <- sort-merge joins
    subquery/in-subquery/in-joins.sql.out.2 <- hash joins
    ```


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    **[Test build #92857 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92857/testReport)** for PR 21568 at commit [`6f90e63`](https://github.com/apache/spark/commit/6f90e63b0ca2b929c2e6e8be6b090c4fe0bb9583).


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r195921363
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -250,11 +278,33 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
       }
     
       private def listTestCases(): Seq[TestCase] = {
    -    listFilesRecursively(new File(inputFilePath)).map { file =>
    -      val resultFile = file.getAbsolutePath.replace(inputFilePath, goldenFilePath) + ".out"
    -      val absPath = file.getAbsolutePath
    -      val testCaseName = absPath.stripPrefix(inputFilePath).stripPrefix(File.separator)
    -      TestCase(testCaseName, absPath, resultFile)
    +    listFilesRecursively(new File(inputFilePath)).flatMap { file =>
    +      testCases(file.getAbsolutePath)
    +    }
    +  }
    +
    +  private def testCases(inputPath: String): Seq[TestCase] = {
    +    val baseResultFileName = inputPath.replace(inputFilePath, goldenFilePath)
    +    val testCaseName = inputPath.stripPrefix(inputFilePath).stripPrefix(File.separator)
    +    testConfigs.get(testCaseName) match {
    --- End diff --
    
    Mmmh... Well, actually it is not completely silent: you would see that the test executes only once when it runs, instead of many times with different suffixes. But of course it won't fail. @cloud-fan @maropu what do you think?



---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r195895102
  
    --- Diff: sql/core/src/test/resources/sql-tests/results/typeCoercion/native/decimalArithmeticOperations.sql_spark.sql.decimalOperations.allowPrecisionLoss-false.out ---
    @@ -0,0 +1,193 @@
    +-- Automatically generated by SQLQueryTestSuite
    --- End diff --
    
    Since we have two cases:
     1 - Different configs produce different results (so different files); in this case your suggestion is fine.
     2 - Different configs produce the same results (so we have one golden file for all of them); how would you address this case?


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    @rxin the PR does what was suggested in these 2 comments https://github.com/apache/spark/pull/20023#issuecomment-358306751 and https://github.com/apache/spark/pull/21529#issuecomment-396413144. Basically we want to run the same SQL test file with different configs.
    
    We have two cases:
     - Running with different configs produces different output (so different golden files);
     - Running with different configs produces the same output (so we have only one golden file) but the tests are run against different configs.
    
    The goals are to avoid copying and pasting the same queries with different configurations set (as was done in `decimalArithmeticOperations`) and to improve test coverage for joins (because with the default configs we basically always execute broadcast joins).
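
    As a sketch of what this enables, a test input file could declare its config sets with `--SET` comment lines, where each `--SET` line defines one configuration the whole file is run under (the table names below are hypothetical; `spark.sql.autoBroadcastJoinThreshold` is the real config key used in this PR's joins example):

```sql
-- Each --SET line is one config set: the file runs first with broadcast
-- joins disabled, then with the default 10 MB broadcast threshold.
--SET spark.sql.autoBroadcastJoinThreshold=-1
--SET spark.sql.autoBroadcastJoinThreshold=10485760

-- t1 and t2 are hypothetical test tables.
SELECT t1.a FROM t1 JOIN t2 ON t1.a = t2.a;
```

    Both runs are expected to produce the same golden file; only the physical join strategy changes.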


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r195892983
  
    --- Diff: sql/core/src/test/resources/sql-tests/results/typeCoercion/native/decimalArithmeticOperations.sql_spark.sql.decimalOperations.allowPrecisionLoss-false.out ---
    @@ -0,0 +1,193 @@
    +-- Automatically generated by SQLQueryTestSuite
    --- End diff --
    
    This new feature looks pretty good to me. Btw, if we set multiple configurations, won't the filenames get too long? How about adding the configuration info at the head of the files instead?


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    I think it's super confusing to have the config names encoded in file names. It makes the names super long and difficult to read, hard to verify what was set, and difficult to extend to multiple configs.



---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    cc @cloud-fan 


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4040/
    Test PASSed.


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r201740497
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -138,18 +139,58 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
       private def runTest(testCase: TestCase): Unit = {
         val input = fileToString(new File(testCase.inputFile))
     
    +    val (comments, code) = input.split("\n").partition(_.startsWith("--"))
    +    val configSets = {
    +      val configLines = comments.filter(_.startsWith("--SET")).map(_.substring(5))
    +      val configs = configLines.map(_.split(",").map { confAndValue =>
    +        val (conf, value) = confAndValue.span(_ != '=')
    +        conf.trim -> value.substring(1).trim
    +      })
    +      // When we are regenerating the golden files we don't need to run all the configs as they
    +      // all need to return the same result
    +      if (regenerateGoldenFiles && configs.nonEmpty) {
    +        configs.take(1)
    +      } else {
    +        configs
    +      }
    +    }
         // List of SQL queries to run
    -    val queries: Seq[String] = {
    -      val cleaned = input.split("\n").filterNot(_.startsWith("--")).mkString("\n")
    -      // note: this is not a robust way to split queries using semicolon, but works for now.
    -      cleaned.split("(?<=[^\\\\]);").map(_.trim).filter(_ != "").toSeq
    +    // note: this is not a robust way to split queries using semicolon, but works for now.
    +    val queries = code.mkString("\n").split("(?<=[^\\\\]);").map(_.trim).filter(_ != "").toSeq
    +
    +    if (configSets.isEmpty) {
    +      runQueries(queries, testCase.resultFile, None)
    +    } else {
    +      configSets.foreach { configSet =>
    +        try {
    --- End diff --
    
    ah sorry I misread the code.
    



---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    I'm confused by the description. What does this PR actually do?


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r196023232
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -250,11 +278,33 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
       }
     
       private def listTestCases(): Seq[TestCase] = {
    -    listFilesRecursively(new File(inputFilePath)).map { file =>
    -      val resultFile = file.getAbsolutePath.replace(inputFilePath, goldenFilePath) + ".out"
    -      val absPath = file.getAbsolutePath
    -      val testCaseName = absPath.stripPrefix(inputFilePath).stripPrefix(File.separator)
    -      TestCase(testCaseName, absPath, resultFile)
    +    listFilesRecursively(new File(inputFilePath)).flatMap { file =>
    +      testCases(file.getAbsolutePath)
    +    }
    +  }
    +
    +  private def testCases(inputPath: String): Seq[TestCase] = {
    +    val baseResultFileName = inputPath.replace(inputFilePath, goldenFilePath)
    +    val testCaseName = inputPath.stripPrefix(inputFilePath).stripPrefix(File.separator)
    +    testConfigs.get(testCaseName) match {
    --- End diff --
    
    Honestly I do not really like this idea @maropu; it adds extra effort which is not really needed...


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/847/
    Test PASSed.


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92857/
    Test PASSed.


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    @maropu I think we can, actually. Since the configs are appended as a suffix to the test case name (this also happens in the same-result case), you know which configs failed.


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r201737045
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -138,18 +139,58 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
       private def runTest(testCase: TestCase): Unit = {
         val input = fileToString(new File(testCase.inputFile))
     
    +    val (comments, code) = input.split("\n").partition(_.startsWith("--"))
    +    val configSets = {
    +      val configLines = comments.filter(_.startsWith("--SET")).map(_.substring(5))
    +      val configs = configLines.map(_.split(",").map { confAndValue =>
    +        val (conf, value) = confAndValue.span(_ != '=')
    +        conf.trim -> value.substring(1).trim
    +      })
    +      // When we are regenerating the golden files we don't need to run all the configs as they
    +      // all need to return the same result
    +      if (regenerateGoldenFiles && configs.nonEmpty) {
    +        configs.take(1)
    +      } else {
    +        configs
    +      }
    +    }
         // List of SQL queries to run
    -    val queries: Seq[String] = {
    -      val cleaned = input.split("\n").filterNot(_.startsWith("--")).mkString("\n")
    -      // note: this is not a robust way to split queries using semicolon, but works for now.
    -      cleaned.split("(?<=[^\\\\]);").map(_.trim).filter(_ != "").toSeq
    +    // note: this is not a robust way to split queries using semicolon, but works for now.
    +    val queries = code.mkString("\n").split("(?<=[^\\\\]);").map(_.trim).filter(_ != "").toSeq
    +
    +    if (configSets.isEmpty) {
    +      runQueries(queries, testCase.resultFile, None)
    +    } else {
    +      configSets.foreach { configSet =>
    +        try {
    --- End diff --
    
    I think it's better to do the try-catch inside `runQueries`, so that we can know which config causes a failure.
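
    For reference, the `--SET` parsing logic from the diff above can be sketched in Python — a rough, illustrative translation of the Scala code in `runTest`, not the actual implementation:

```python
# Sketch of the --SET parsing from SQLQueryTestSuite.runTest.
# Each "--SET" comment line yields one config set: a list of
# (key, value) pairs parsed from comma-separated "key=value" entries.

def parse_config_sets(lines):
    """Return one list of (key, value) pairs per --SET comment line."""
    # Comment lines start with "--"; only "--SET" lines carry configs.
    comments = [line for line in lines if line.startswith("--")]
    # Drop the "--SET" prefix (5 characters), as in Scala's _.substring(5).
    config_lines = [line[5:] for line in comments if line.startswith("--SET")]
    config_sets = []
    for config_line in config_lines:
        pairs = []
        for conf_and_value in config_line.split(","):
            # str.partition("=") plays the role of Scala's span(_ != '=').
            conf, _, value = conf_and_value.partition("=")
            pairs.append((conf.strip(), value.strip()))
        config_sets.append(pairs)
    return config_sets
```

    With one config set per `--SET` line, the suite then runs the whole file once per set; when regenerating golden files, only the first set is executed, since all sets must produce the same output.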


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r195919861
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -104,11 +105,34 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
                           // We should ignore this file from processing.
       )
     
    +  private val configsAllJoinTypes = Seq(Seq(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key ->
    --- End diff --
    
    Add a comment describing what each config set means for?


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21568#discussion_r195920603
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -250,11 +278,33 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
       }
     
       private def listTestCases(): Seq[TestCase] = {
    -    listFilesRecursively(new File(inputFilePath)).map { file =>
    -      val resultFile = file.getAbsolutePath.replace(inputFilePath, goldenFilePath) + ".out"
    -      val absPath = file.getAbsolutePath
    -      val testCaseName = absPath.stripPrefix(inputFilePath).stripPrefix(File.separator)
    -      TestCase(testCaseName, absPath, resultFile)
    +    listFilesRecursively(new File(inputFilePath)).flatMap { file =>
    +      testCases(file.getAbsolutePath)
    +    }
    +  }
    +
    +  private def testCases(inputPath: String): Seq[TestCase] = {
    +    val baseResultFileName = inputPath.replace(inputFilePath, goldenFilePath)
    +    val testCaseName = inputPath.stripPrefix(inputFilePath).stripPrefix(File.separator)
    +    testConfigs.get(testCaseName) match {
    --- End diff --
    
    Once a test is renamed, it might silently fall back to the default test config. To avoid that, maybe we should explicitly define which test cases need to run against custom configs.


---



[GitHub] spark issue #21568: [SPARK-24562][TESTS] Support different configs for same ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21568
  
    **[Test build #92857 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92857/testReport)** for PR 21568 at commit [`6f90e63`](https://github.com/apache/spark/commit/6f90e63b0ca2b929c2e6e8be6b090c4fe0bb9583).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #21568: [SPARK-24562][TESTS] Support different configs fo...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21568


---
