Posted to reviews@spark.apache.org by MaxGekk <gi...@git.apache.org> on 2018/10/07 20:05:43 UTC

[GitHub] spark pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema infer...

GitHub user MaxGekk opened a pull request:

    https://github.com/apache/spark/pull/22666

    [SPARK-25672][SQL] schema_of_csv() - schema inference from an example

    ## What changes were proposed in this pull request?
    
    This PR adds a new function, *schema_of_csv()*, which infers the schema of a CSV string literal. The result of the function is a string containing the schema in DDL format. For example:
    
    ```sql
    select schema_of_csv('1|abc', map('delimiter', '|'))
    ``` 
    ```
    struct<_c0:int,_c1:string>
    ```
    
    ## How was this patch tested?
    
    Added new tests to `CsvFunctionsSuite` and `CsvExpressionsSuite`, and SQL tests to `csv-functions.sql`.
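
To make the idea in the description concrete, here is a minimal, self-contained Python sketch of inferring a DDL schema string from a single CSV line. It is not Spark's implementation: the names `schema_of_csv_sketch` and `infer_field_type` are hypothetical, and the inference rules (only `int`, `double`, and `string`) are a deliberate simplification. Only the `_c0`, `_c1` column naming and the `delimiter` option are taken from the example above.

```python
import csv
import io
from typing import Dict, Optional


def infer_field_type(value: str) -> str:
    """Return a DDL type name for one CSV field (toy rules: int, double, string)."""
    try:
        int(value)
        return "int"
    except ValueError:
        pass
    try:
        float(value)
        return "double"
    except ValueError:
        return "string"


def schema_of_csv_sketch(line: str, options: Optional[Dict[str, str]] = None) -> str:
    """Infer a struct<...> DDL string from one CSV line (illustrative only)."""
    # Assumption: only the 'delimiter' option is honoured in this sketch.
    delimiter = (options or {}).get("delimiter", ",")
    rows = list(csv.reader(io.StringIO(line), delimiter=delimiter))
    row = rows[0] if rows else []
    # Columns are named _c0, _c1, ... as in the example output above.
    fields = [f"_c{i}:{infer_field_type(v)}" for i, v in enumerate(row)]
    return "struct<" + ",".join(fields) + ">"


if __name__ == "__main__":
    print(schema_of_csv_sketch("1|abc", {"delimiter": "|"}))
    # struct<_c0:int,_c1:string>
```

The sketch takes the options as a plain string-to-string dictionary, mirroring the `map('delimiter', '|')` argument in the SQL example.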


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MaxGekk/spark-1 schema_of_csv-function

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22666.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22666
    
----
commit 4c00900e8bfbe56d13576d6dc21fb2f2dbbb105d
Author: Maxim Gekk <ma...@...>
Date:   2018-10-07T14:17:05Z

    Dependency of uniVocity 2.7.3 is added for sql/catalyst

commit 25f330a617e41c1207efd880be766136ce9b0bca
Author: Maxim Gekk <ma...@...>
Date:   2018-10-07T14:37:50Z

    Moving CSVOptions to sql/catalyst

commit 0d7e7990799a307794f10fe52030eca850762927
Author: Maxim Gekk <ma...@...>
Date:   2018-10-07T17:42:02Z

    Moving CSVInferSchema to sql/catalyst

commit 7abbfcae8444e88391e1d456a9a249fa5fccf6f0
Author: Maxim Gekk <ma...@...>
Date:   2018-09-16T19:12:58Z

    Added an expression test

commit 6ca4fa3e2bf6b29b82f1ece33c5a75beaf934d87
Author: Maxim Gekk <ma...@...>
Date:   2018-09-21T15:03:39Z

    Support options

commit e76536bfc62911c4e2039d4fc63d771b1c3b5fe1
Author: Maxim Gekk <ma...@...>
Date:   2018-09-21T16:05:55Z

    Register schema_of_csv and adding SQL tests

commit ef03d3a38e3a7a31a04cda901821238b01ec8f37
Author: Maxim Gekk <ma...@...>
Date:   2018-09-21T17:27:33Z

    Adding schema_of_csv and tests

commit 8ed225f3d2c5fbe3df75f8518d539fcdd5f01a2e
Author: Maxim Gekk <ma...@...>
Date:   2018-09-21T17:54:43Z

    Support schema_of_csv in PySpark

commit 5fb17fbefd52198bcf735abc132b0ab9174cbe0f
Author: Maxim Gekk <ma...@...>
Date:   2018-10-07T18:49:00Z

    2.5 -> 3.0

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Argh, sorry, it was my mistake.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97091 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97091/testReport)** for PR 22666 at commit [`5fb17fb`](https://github.com/apache/spark/commit/5fb17fbefd52198bcf735abc132b0ab9174cbe0f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema infer...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22666#discussion_r226640860
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CsvExpressionsSuite.scala ---
    @@ -155,4 +155,15 @@ class CsvExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper with P
         }.getCause
         assert(exception.getMessage.contains("from_csv() doesn't support the DROPMALFORMED mode"))
       }
    +
    +  test("infer schema of CSV strings") {
    +    checkEvaluation(new SchemaOfCsv(Literal.create("1,abc")), "struct<_c0:int,_c1:string>")
    +  }
    +
    +  test("infer schema of CSV strings by using options") {
    +    checkEvaluation(
    +      new SchemaOfCsv(Literal.create("1|abc"),
    +        CreateMap(Seq(Literal.create("delimiter"), Literal.create("|")))),
    --- End diff --
    
    The main constructor of `SchemaOfCsv` accepts a `Map[String, String]` directly; shall we use that?


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97318 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97318/testReport)** for PR 22666 at commit [`0c5e955`](https://github.com/apache/spark/commit/0c5e955be2c1e47893c70d36e10f288a2fea2d8d).


---



[GitHub] spark pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema infer...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22666#discussion_r226641023
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
    @@ -3886,6 +3886,31 @@ object functions {
         withExpr(new CsvToStructs(e.expr, schema.expr, options.asScala.toMap))
       }
     
    +  /**
    +   * Parses a column containing a CSV string and infers its schema.
    +   *
    +   * @param e a string column containing CSV data.
    +   *
    +   * @group collection_funcs
    +   * @since 3.0.0
    +   */
    +  def schema_of_csv(e: Column): Column = withExpr(new SchemaOfCsv(e.expr))
    +
    +  /**
    +   * Parses a column containing a CSV string and infers its schema using options.
    +   *
    +   * @param e a string column containing CSV data.
    +   * @param options options to control how the CSV is parsed. Accepts the same options as the
    +   *                CSV data source. See [[DataFrameReader#csv]].
    +   * @return a column with string literal containing schema in DDL format.
    +   *
    +   * @group collection_funcs
    +   * @since 3.0.0
    +   */
    +  def schema_of_csv(e: Column, options: java.util.Map[String, String]): Column = {
    --- End diff --
    
    Shall we have an API with a Scala Map?


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #98125 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98125/testReport)** for PR 22666 at commit [`3aa79d4`](https://github.com/apache/spark/commit/3aa79d4e438a84ea7566f38afd3f2a18fd7cfbed).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    retest this please


---



[GitHub] spark pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema infer...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22666#discussion_r228949792
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql ---
    @@ -7,3 +7,11 @@ select from_csv('1', 'a InvalidType');
     select from_csv('1', 'a INT', named_struct('mode', 'PERMISSIVE'));
     select from_csv('1', 'a INT', map('mode', 1));
     select from_csv();
    +-- infer schema of csv literal
    +select from_csv('1,abc', schema_of_csv('1,abc'));
    +select schema_of_csv('1|abc', map('delimiter', '|'));
    +select schema_of_csv(null);
    +CREATE TEMPORARY VIEW csvTable(csvField, a) AS SELECT * FROM VALUES ('1,abc', 'a');
    +SELECT schema_of_csv(csvField) FROM csvTable;
    +-- Clean up
    +DROP VIEW IF EXISTS csvTable;
    --- End diff --
    
    I see, but isn't it still better to clean tables up explicitly?


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97586 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97586/testReport)** for PR 22666 at commit [`6cbc7fb`](https://github.com/apache/spark/commit/6cbc7fb45478882c15c6694fff964da043d2445c).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97585 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97585/testReport)** for PR 22666 at commit [`c9df3ab`](https://github.com/apache/spark/commit/c9df3ab40f5130cb1c3f7207e1371ddd5fb922fc).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97675 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97675/testReport)** for PR 22666 at commit [`49bac0e`](https://github.com/apache/spark/commit/49bac0eb9a9cec0c214b2516d560f301a64c475f).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Ahhhhh no, I am sorry @MaxGekk. I mistakenly set myself as the primary author. It showed my email first.
    
    ```
    === Pull Request #22666 ===
    title	[SPARK-25672][SQL] schema_of_csv() - schema inference from an example
    source	MaxGekk/schema_of_csv-function
    target	master
    url	https://api.github.com/repos/apache/spark/pulls/22666
    
    Proceed with merging pull request #22666? (y/n): y
    git fetch apache-github pull/22666/head:PR_TOOL_MERGE_PR_22666
    From https://github.com/apache/spark
     * [new ref]                 refs/pull/22666/head -> PR_TOOL_MERGE_PR_22666
    git fetch apache master:PR_TOOL_MERGE_PR_22666_MASTER
    remote: Counting objects: 303, done.
    remote: Compressing objects: 100% (153/153), done.
    remote: Total 209 (delta 91), reused 0 (delta 0)
    Receiving objects: 100% (209/209), 91.89 KiB | 445.00 KiB/s, done.
    Resolving deltas: 100% (91/91), completed with 65 local objects.
    From https://git-wip-us.apache.org/repos/asf/spark
     * [new branch]              master     -> PR_TOOL_MERGE_PR_22666_MASTER
       57eddc7182e..c5ef477d2f6  master     -> apache/master
    git checkout PR_TOOL_MERGE_PR_22666_MASTER
    Switched to branch 'PR_TOOL_MERGE_PR_22666_MASTER'
    ['git', 'merge', 'PR_TOOL_MERGE_PR_22666', '--squash']
    Automatic merge went well; stopped before committing as requested
    ['git', 'log', 'HEAD..PR_TOOL_MERGE_PR_22666', '--pretty=format:%an <%ae>']
    Enter primary author in the format of "name <email>" [hyukjinkwon <gu...@apache.org>]: hyukjinkwon <gu...@apache.org>
    ['git', 'log', 'HEAD..PR_TOOL_MERGE_PR_22666', '--pretty=format:%h [%an] %s']
    ```
    
    It looks like the commit order affects the name that appears at the `Enter primary author in the format of "name <email>"` prompt.
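
The observation above can be illustrated with a toy model. This is a sketch under assumptions, not the merge script's actual logic: it assumes the suggested default is the author with the most commits in the `git log` output, with ties going to whoever is listed first. The function name `default_primary_author` and the `example.org` addresses are hypothetical placeholders.

```python
from collections import Counter
from typing import List


def default_primary_author(log_authors: List[str]) -> str:
    """Pick a default author from 'name <email>' lines, in git log order.

    Assumption (not confirmed by the thread): the most frequent author wins,
    and ties go to the author who appears first in the log output.
    """
    counts = Counter(log_authors)
    # max() returns the first maximal element, so the earliest-listed
    # author wins when commit counts are tied.
    return max(log_authors, key=lambda author: counts[author])


if __name__ == "__main__":
    committer = "committer <committer@example.org>"
    contributor = "contributor <contributor@example.org>"
    # One commit each: whoever appears first in the log becomes the default.
    print(default_primary_author([committer, contributor]))
    # The contributor has more commits, so order no longer matters.
    print(default_primary_author([committer, contributor, contributor]))
```

Under these assumptions, a committer who pushes a fix-up commit on top of a short PR branch could end up as the suggested default, which is why the prompt deserves a second look before pressing Enter.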



---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    retest this please


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    > Let's add from_csv first.
    
    Sure, I just wanted to get it ready since the changes do not overlap much.


---



[GitHub] spark pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema infer...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22666#discussion_r229153592
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql ---
    @@ -7,3 +7,11 @@ select from_csv('1', 'a InvalidType');
     select from_csv('1', 'a INT', named_struct('mode', 'PERMISSIVE'));
     select from_csv('1', 'a INT', map('mode', 1));
     select from_csv();
    +-- infer schema of csv literal
    +select from_csv('1,abc', schema_of_csv('1,abc'));
    +select schema_of_csv('1|abc', map('delimiter', '|'));
    +select schema_of_csv(null);
    +CREATE TEMPORARY VIEW csvTable(csvField, a) AS SELECT * FROM VALUES ('1,abc', 'a');
    +SELECT schema_of_csv(csvField) FROM csvTable;
    +-- Clean up
    +DROP VIEW IF EXISTS csvTable;
    --- End diff --
    
    Yes, we need to clean up tables, as they are permanent.
    
    Actually, I'm fine with it, as we clean up temp views in a lot of golden files. We can have another PR to remove these temp view cleanups.
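
The distinction drawn here, that tables are permanent while temporary views are scoped to a session, can be demonstrated with a small stand-in. The sketch below uses SQLite from the Python standard library rather than Spark, so it only illustrates the general idea; the object names `csv_table` and `csv_view` and the helper `objects_after_reconnect` are hypothetical.

```python
import os
import sqlite3
import tempfile
from typing import Set


def objects_after_reconnect(db_path: str) -> Set[str]:
    """Create a table and a TEMP view, reconnect, and list what survived."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE csv_table (csv_field TEXT)")
    conn.execute("CREATE TEMP VIEW csv_view AS SELECT csv_field FROM csv_table")
    conn.commit()
    conn.close()  # the TEMP view lives only as long as this connection

    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "UNION ALL SELECT name FROM sqlite_temp_master"
    ).fetchall()
    conn.close()
    return {name for (name,) in rows}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        survivors = objects_after_reconnect(os.path.join(tmp, "demo.db"))
        # The table persists across connections; the TEMP view does not.
        print(survivors)
```

In this stand-in, skipping the explicit drop of the temporary view leaves nothing behind, whereas the table would need an explicit `DROP TABLE` to avoid leaking into later runs.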


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98068/
    Test FAILed.


---



[GitHub] spark pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema infer...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22666


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97662/
    Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97585 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97585/testReport)** for PR 22666 at commit [`c9df3ab`](https://github.com/apache/spark/commit/c9df3ab40f5130cb1c3f7207e1371ddd5fb922fc).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class UnivocityParserSuite extends SparkFunSuite `


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97636/
    Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97662 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97662/testReport)** for PR 22666 at commit [`49bac0e`](https://github.com/apache/spark/commit/49bac0eb9a9cec0c214b2516d560f301a64c475f).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    retest this please


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97907/
    Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #98118 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98118/testReport)** for PR 22666 at commit [`3aa79d4`](https://github.com/apache/spark/commit/3aa79d4e438a84ea7566f38afd3f2a18fd7cfbed).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97594 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97594/testReport)** for PR 22666 at commit [`1e90261`](https://github.com/apache/spark/commit/1e90261f964129efc605ed77433477715078745c).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98313/
    Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97588/
    Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97593 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97593/testReport)** for PR 22666 at commit [`aead783`](https://github.com/apache/spark/commit/aead783d895069b1b6781928eb0afda740085a21).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97585/
    Test FAILed.


---



[GitHub] spark pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema infer...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22666#discussion_r226814727
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
    @@ -3886,6 +3886,31 @@ object functions {
         withExpr(new CsvToStructs(e.expr, schema.expr, options.asScala.toMap))
       }
     
    +  /**
    +   * Parses a column containing a CSV string and infers its schema.
    +   *
    +   * @param e a string column containing CSV data.
    +   *
    +   * @group collection_funcs
    +   * @since 3.0.0
    +   */
    +  def schema_of_csv(e: Column): Column = withExpr(new SchemaOfCsv(e.expr))
    +
    +  /**
    +   * Parses a column containing a CSV string and infers its schema using options.
    +   *
    +   * @param e a string column containing CSV data.
    +   * @param options options to control how the CSV is parsed. Accepts the same options as the
    +   *                CSV data source. See [[DataFrameReader#csv]].
    +   * @return a column with string literal containing schema in DDL format.
    +   *
    +   * @group collection_funcs
    +   * @since 3.0.0
    +   */
    +  def schema_of_csv(e: Column, options: java.util.Map[String, String]): Column = {
    --- End diff --
    
    `schema_of_json` also has only the Java-specific overload (I actually suggested minimising the exposed functions), since the Java-specific one can be used from Scala but the Scala-specific one can't be used from Java.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Should be ready for a look now. Would you mind taking a look please @cloud-fan and @gatorsmile?


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #98079 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98079/testReport)** for PR 22666 at commit [`3ef2503`](https://github.com/apache/spark/commit/3ef25031e4e54a01c989a3fd7f6b6e1094801bc0).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    retest this please


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97911 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97911/testReport)** for PR 22666 at commit [`49bac0e`](https://github.com/apache/spark/commit/49bac0eb9a9cec0c214b2516d560f301a64c475f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97607 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97607/testReport)** for PR 22666 at commit [`41c39db`](https://github.com/apache/spark/commit/41c39db502db10b405d1b47d75d0ea961616fd37).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    retest this please


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    @gatorsmile @cloud-fan May I ask you to look at the PR, please?


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    This is a WIP.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97586 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97586/testReport)** for PR 22666 at commit [`6cbc7fb`](https://github.com/apache/spark/commit/6cbc7fb45478882c15c6694fff964da043d2445c).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97911/
    Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97654 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97654/testReport)** for PR 22666 at commit [`49bac0e`](https://github.com/apache/spark/commit/49bac0eb9a9cec0c214b2516d560f301a64c475f).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #98307 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98307/testReport)** for PR 22666 at commit [`3aa79d4`](https://github.com/apache/spark/commit/3aa79d4e438a84ea7566f38afd3f2a18fd7cfbed).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #98125 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98125/testReport)** for PR 22666 at commit [`3aa79d4`](https://github.com/apache/spark/commit/3aa79d4e438a84ea7566f38afd3f2a18fd7cfbed).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    @HyukjinKwon Never mind. Thank you for your work on the PR.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97583 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97583/testReport)** for PR 22666 at commit [`80d6759`](https://github.com/apache/spark/commit/80d67596e8a0d2c5040816d090c6ff912b76c02c).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97593 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97593/testReport)** for PR 22666 at commit [`aead783`](https://github.com/apache/spark/commit/aead783d895069b1b6781928eb0afda740085a21).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97318/
    Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97594 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97594/testReport)** for PR 22666 at commit [`1e90261`](https://github.com/apache/spark/commit/1e90261f964129efc605ed77433477715078745c).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97335 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97335/testReport)** for PR 22666 at commit [`28862a5`](https://github.com/apache/spark/commit/28862a5b70f0e77a0534c1db5400aa72f6018348).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98307/
    Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97335/
    Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97583 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97583/testReport)** for PR 22666 at commit [`80d6759`](https://github.com/apache/spark/commit/80d67596e8a0d2c5040816d090c6ff912b76c02c).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #98307 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98307/testReport)** for PR 22666 at commit [`3aa79d4`](https://github.com/apache/spark/commit/3aa79d4e438a84ea7566f38afd3f2a18fd7cfbed).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97314 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97314/testReport)** for PR 22666 at commit [`c038aaa`](https://github.com/apache/spark/commit/c038aaa2291b79c723af956bcf5e220ae8b776a3).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97335 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97335/testReport)** for PR 22666 at commit [`28862a5`](https://github.com/apache/spark/commit/28862a5b70f0e77a0534c1db5400aa72f6018348).


---



[GitHub] spark pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema infer...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22666#discussion_r228787018
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql ---
    @@ -7,3 +7,11 @@ select from_csv('1', 'a InvalidType');
     select from_csv('1', 'a INT', named_struct('mode', 'PERMISSIVE'));
     select from_csv('1', 'a INT', map('mode', 1));
     select from_csv();
    +-- infer schema of csv literal
    +select from_csv('1,abc', schema_of_csv('1,abc'));
    +select schema_of_csv('1|abc', map('delimiter', '|'));
    +select schema_of_csv(null);
    +CREATE TEMPORARY VIEW csvTable(csvField, a) AS SELECT * FROM VALUES ('1,abc', 'a');
    +SELECT schema_of_csv(csvField) FROM csvTable;
    +-- Clean up
    +DROP VIEW IF EXISTS csvTable;
    --- End diff --
    
    Actually, we don't need to clean up temp views. The golden file test is run with a fresh session.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97588 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97588/testReport)** for PR 22666 at commit [`1b86834`](https://github.com/apache/spark/commit/1b86834c1265992e3b46aaf079e1e17ea7c389c4).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97595/
    Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #98313 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98313/testReport)** for PR 22666 at commit [`3aa79d4`](https://github.com/apache/spark/commit/3aa79d4e438a84ea7566f38afd3f2a18fd7cfbed).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    retest this please


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97654 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97654/testReport)** for PR 22666 at commit [`49bac0e`](https://github.com/apache/spark/commit/49bac0eb9a9cec0c214b2516d560f301a64c475f).
     * This patch **fails PySpark pip packaging tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    retest this please


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #98118 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98118/testReport)** for PR 22666 at commit [`3aa79d4`](https://github.com/apache/spark/commit/3aa79d4e438a84ea7566f38afd3f2a18fd7cfbed).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97911 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97911/testReport)** for PR 22666 at commit [`49bac0e`](https://github.com/apache/spark/commit/49bac0eb9a9cec0c214b2516d560f301a64c475f).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97584 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97584/testReport)** for PR 22666 at commit [`4869b76`](https://github.com/apache/spark/commit/4869b76e4f35b094793ff1f69cce3edbeb922ef1).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #98112 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98112/testReport)** for PR 22666 at commit [`3aa79d4`](https://github.com/apache/spark/commit/3aa79d4e438a84ea7566f38afd3f2a18fd7cfbed).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97662 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97662/testReport)** for PR 22666 at commit [`49bac0e`](https://github.com/apache/spark/commit/49bac0eb9a9cec0c214b2516d560f301a64c475f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97318 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97318/testReport)** for PR 22666 at commit [`0c5e955`](https://github.com/apache/spark/commit/0c5e955be2c1e47893c70d36e10f288a2fea2d8d).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema infer...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22666#discussion_r226640362
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala ---
    @@ -60,7 +63,7 @@ case class CsvToStructs(
       // Used in `FunctionRegistry`
       def this(child: Expression, schema: Expression, options: Map[String, String]) =
         this(
    -      schema = ExprUtils.evalSchemaExpr(schema),
    +      schema = ExprUtils.evalSchemaExpr(schema).asInstanceOf[StructType],
    --- End diff --
    
    Why do we need `asInstanceOf` here?
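
The cast in question is applied at the call site to the result of a schema-evaluation helper. As a minimal sketch only, using toy stand-in types rather than Spark's real `DataType` and `StructType` classes, and a hypothetical helper named `requireStruct`, one way to avoid such a cast is to validate the parsed type once and return the narrower type:

```scala
// Toy model for illustration; these are NOT org.apache.spark.sql.types classes.
object SchemaSketch {
  sealed trait DataType { def sql: String }
  case object IntType extends DataType { val sql = "INT" }
  case object StringType extends DataType { val sql = "STRING" }
  final case class StructType(fields: List[(String, DataType)]) extends DataType {
    def sql: String =
      fields.map { case (name, tpe) => s"$name: ${tpe.sql}" }.mkString("STRUCT<", ", ", ">")
  }

  // Hypothetical helper: a typed pattern narrows DataType to StructType,
  // so neither this method nor its callers need asInstanceOf.
  def requireStruct(dataType: DataType): StructType = dataType match {
    case struct: StructType => struct
    case other =>
      throw new IllegalArgumentException(
        s"Schema should be struct type but got ${other.sql}.")
  }

  def main(args: Array[String]): Unit = {
    val parsed: DataType = StructType(List("_c0" -> IntType, "_c1" -> StringType))
    // The static type is already StructType; no cast at the call site.
    val schema: StructType = requireStruct(parsed)
    println(schema.sql)
  }
}
```

Whether this shape suits the PR depends on how the real helper is declared; the sketch only shows that a return type of `StructType` makes a caller-side cast redundant.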


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97654/
    Test FAILed.


---



[GitHub] spark pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema infer...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22666#discussion_r228786427
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala ---
    @@ -19,14 +19,39 @@ package org.apache.spark.sql.catalyst.expressions
     
     import org.apache.spark.sql.AnalysisException
     import org.apache.spark.sql.catalyst.util.ArrayBasedMapData
    -import org.apache.spark.sql.types.{MapType, StringType, StructType}
    +import org.apache.spark.sql.types.{DataType, MapType, StringType, StructType}
    +import org.apache.spark.unsafe.types.UTF8String
     
     object ExprUtils {
     
    -  def evalSchemaExpr(exp: Expression): StructType = exp match {
    -    case Literal(s, StringType) => StructType.fromDDL(s.toString)
    +  def evalSchemaExpr(exp: Expression): StructType = {
    +    // Use `DataType.fromDDL` since the type string can be struct<...>.
    +    val dataType = exp match {
    +      case Literal(s, StringType) =>
    +        DataType.fromDDL(s.toString)
    +      case e @ SchemaOfCsv(_: Literal, _) =>
    +        val ddlSchema = e.eval(EmptyRow).asInstanceOf[UTF8String]
    +        DataType.fromDDL(ddlSchema.toString)
    +      case e => throw new AnalysisException(
    +        "Schema should be specified in DDL format as a string literal or output of " +
    +          s"the schema_of_csv function instead of ${e.sql}")
    +    }
    +
    +    if (!dataType.isInstanceOf[StructType]) {
    +      throw new AnalysisException(
    +        s"Schema should be struct type but got ${dataType.sql}.")
    +    }
    +    dataType.asInstanceOf[StructType]
    +  }
    +
    +  def evalTypeExpr(exp: Expression): DataType = exp match {
    +    case Literal(s, StringType) => DataType.fromDDL(s.toString)
    --- End diff --
    
    We also need to update https://github.com/apache/spark/pull/22666/files#diff-5321c01e95bffc4413c5f3457696213eR157 in case the constant folding rule is disabled.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97583/
    Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97314/
    Test PASSed.


---



[GitHub] spark pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema infer...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22666#discussion_r228787126
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala ---
    @@ -19,14 +19,39 @@ package org.apache.spark.sql.catalyst.expressions
     
     import org.apache.spark.sql.AnalysisException
     import org.apache.spark.sql.catalyst.util.ArrayBasedMapData
    -import org.apache.spark.sql.types.{MapType, StringType, StructType}
    +import org.apache.spark.sql.types.{DataType, MapType, StringType, StructType}
    +import org.apache.spark.unsafe.types.UTF8String
     
     object ExprUtils {
     
    -  def evalSchemaExpr(exp: Expression): StructType = exp match {
    -    case Literal(s, StringType) => StructType.fromDDL(s.toString)
    +  def evalSchemaExpr(exp: Expression): StructType = {
    +    // Use `DataType.fromDDL` since the type string can be struct<...>.
    +    val dataType = exp match {
    +      case Literal(s, StringType) =>
    +        DataType.fromDDL(s.toString)
    +      case e @ SchemaOfCsv(_: Literal, _) =>
    +        val ddlSchema = e.eval(EmptyRow).asInstanceOf[UTF8String]
    +        DataType.fromDDL(ddlSchema.toString)
    +      case e => throw new AnalysisException(
    +        "Schema should be specified in DDL format as a string literal or output of " +
    +          s"the schema_of_csv function instead of ${e.sql}")
    +    }
    +
    +    if (!dataType.isInstanceOf[StructType]) {
    +      throw new AnalysisException(
    +        s"Schema should be struct type but got ${dataType.sql}.")
    +    }
    +    dataType.asInstanceOf[StructType]
    +  }
    +
    +  def evalTypeExpr(exp: Expression): DataType = exp match {
    +    case Literal(s, StringType) => DataType.fromDDL(s.toString)
    --- End diff --
    
    Yup, that's what I initially thought: that we should allow constant-foldable expressions as well. In the end I decided to follow the initial intent, which is literal-only support. I also wasn't sure when we would need constant folding to construct a JSON example, because I suspected the example is usually copied and pasted from, for instance, a file.
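    
    To make the trade-off in this exchange concrete, here is a toy Scala sketch. It is not Spark code: `Expr`, `Lit`, `Concat`, `ColumnRef` and `SchemaArg` are hypothetical stand-ins invented for illustration, and the sketch only assumes that a literal-only pattern match sees an unfolded constant expression when no folding rule has run.
    
    ```scala
    // Toy expression tree; illustrative stand-ins, not Spark's Catalyst classes.
    sealed trait Expr {
      def foldable: Boolean
      def eval(): String
    }
    final case class Lit(value: String) extends Expr {
      val foldable = true
      def eval(): String = value
    }
    final case class Concat(left: Expr, right: Expr) extends Expr {
      def foldable: Boolean = left.foldable && right.foldable
      def eval(): String = left.eval() + right.eval()
    }
    final case class ColumnRef(name: String) extends Expr {
      val foldable = false
      def eval(): String =
        throw new UnsupportedOperationException(s"column $name needs an input row")
    }
    
    object SchemaArg {
      // Literal-only matching: succeeds only if the argument is already a literal,
      // for example because an earlier rule folded the constant expression.
      def literalOnly(e: Expr): Either[String, String] = e match {
        case Lit(s) => Right(s)
        case other  => Left(s"expected a string literal, got $other")
      }
    
      // Foldable-aware matching: evaluates any constant expression itself,
      // so it does not depend on a folding rule having run beforehand.
      def foldableAware(e: Expr): Either[String, String] =
        if (e.foldable) Right(e.eval())
        else Left(s"expected a constant expression, got $e")
    }
    ```
    
    In this sketch, `SchemaArg.literalOnly(Concat(Lit("a INT, "), Lit("b STRING")))` is rejected, while `SchemaArg.foldableAware` accepts the same argument and still rejects anything that depends on a `ColumnRef`.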


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Whoa, let me resolve the conflicts tonight.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97607 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97607/testReport)** for PR 22666 at commit [`41c39db`](https://github.com/apache/spark/commit/41c39db502db10b405d1b47d75d0ea961616fd37).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Thanks, @cloud-fan. The change looks good from my side. Let me take another look at this and leave a sign-off (meaning a sign-off for @MaxGekk's code changes).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97314 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97314/testReport)** for PR 22666 at commit [`c038aaa`](https://github.com/apache/spark/commit/c038aaa2291b79c723af956bcf5e220ae8b776a3).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    @HyukjinKwon May I ask you to look at this PR?


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97607/
    Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97595 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97595/testReport)** for PR 22666 at commit [`8763494`](https://github.com/apache/spark/commit/876349476f2a36e66fa94bb3d4e19b7acd2882a7).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98112/
    Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97595 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97595/testReport)** for PR 22666 at commit [`8763494`](https://github.com/apache/spark/commit/876349476f2a36e66fa94bb3d4e19b7acd2882a7).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97586/
    Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98125/
    Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #98068 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98068/testReport)** for PR 22666 at commit [`d876b92`](https://github.com/apache/spark/commit/d876b9270afa9b30defea6d4621bcc63dc61f3e0).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97907 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97907/testReport)** for PR 22666 at commit [`49bac0e`](https://github.com/apache/spark/commit/49bac0eb9a9cec0c214b2516d560f301a64c475f).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97091 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97091/testReport)** for PR 22666 at commit [`5fb17fb`](https://github.com/apache/spark/commit/5fb17fbefd52198bcf735abc132b0ab9174cbe0f).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged to master.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #98079 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98079/testReport)** for PR 22666 at commit [`3ef2503`](https://github.com/apache/spark/commit/3ef25031e4e54a01c989a3fd7f6b6e1094801bc0).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #98068 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98068/testReport)** for PR 22666 at commit [`d876b92`](https://github.com/apache/spark/commit/d876b9270afa9b30defea6d4621bcc63dc61f3e0).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97584/
    Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97907 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97907/testReport)** for PR 22666 at commit [`49bac0e`](https://github.com/apache/spark/commit/49bac0eb9a9cec0c214b2516d560f301a64c475f).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97636 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97636/testReport)** for PR 22666 at commit [`bd79d87`](https://github.com/apache/spark/commit/bd79d87af764f3368cc7c8ad4048bd9d95a8da38).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97588 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97588/testReport)** for PR 22666 at commit [`1b86834`](https://github.com/apache/spark/commit/1b86834c1265992e3b46aaf079e1e17ea7c389c4).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97582 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97582/testReport)** for PR 22666 at commit [`cd7e2ab`](https://github.com/apache/spark/commit/cd7e2abf4cea8744f0316fcbc7dafac4918079c7).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97594/
    Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97584 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97584/testReport)** for PR 22666 at commit [`4869b76`](https://github.com/apache/spark/commit/4869b76e4f35b094793ff1f69cce3edbeb922ef1).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Let's add from_csv first.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97091/
    Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98118/
    Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    retest this please


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #98112 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98112/testReport)** for PR 22666 at commit [`3aa79d4`](https://github.com/apache/spark/commit/3aa79d4e438a84ea7566f38afd3f2a18fd7cfbed).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97593/
    Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #98313 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98313/testReport)** for PR 22666 at commit [`3aa79d4`](https://github.com/apache/spark/commit/3aa79d4e438a84ea7566f38afd3f2a18fd7cfbed).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98079/
    Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97582 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97582/testReport)** for PR 22666 at commit [`cd7e2ab`](https://github.com/apache/spark/commit/cd7e2abf4cea8744f0316fcbc7dafac4918079c7).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97582/
    Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97636 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97636/testReport)** for PR 22666 at commit [`bd79d87`](https://github.com/apache/spark/commit/bd79d87af764f3368cc7c8ad4048bd9d95a8da38).


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97675/
    Test PASSed.


---



[GitHub] spark pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema infer...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22666#discussion_r228785835
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala ---
    @@ -19,14 +19,39 @@ package org.apache.spark.sql.catalyst.expressions
     
     import org.apache.spark.sql.AnalysisException
     import org.apache.spark.sql.catalyst.util.ArrayBasedMapData
    -import org.apache.spark.sql.types.{MapType, StringType, StructType}
    +import org.apache.spark.sql.types.{DataType, MapType, StringType, StructType}
    +import org.apache.spark.unsafe.types.UTF8String
     
     object ExprUtils {
     
    -  def evalSchemaExpr(exp: Expression): StructType = exp match {
    -    case Literal(s, StringType) => StructType.fromDDL(s.toString)
    +  def evalSchemaExpr(exp: Expression): StructType = {
    +    // Use `DataType.fromDDL` since the type string can be struct<...>.
    +    val dataType = exp match {
    +      case Literal(s, StringType) =>
    +        DataType.fromDDL(s.toString)
    +      case e @ SchemaOfCsv(_: Literal, _) =>
    +        val ddlSchema = e.eval(EmptyRow).asInstanceOf[UTF8String]
    +        DataType.fromDDL(ddlSchema.toString)
    +      case e => throw new AnalysisException(
    +        "Schema should be specified in DDL format as a string literal or output of " +
    +          s"the schema_of_csv function instead of ${e.sql}")
    +    }
    +
    +    if (!dataType.isInstanceOf[StructType]) {
    +      throw new AnalysisException(
    +        s"Schema should be struct type but got ${dataType.sql}.")
    +    }
    +    dataType.asInstanceOf[StructType]
    +  }
    +
    +  def evalTypeExpr(exp: Expression): DataType = exp match {
    +    case Literal(s, StringType) => DataType.fromDDL(s.toString)
    --- End diff --
    
    How about:
    ```
    if (exp.foldable && exp.dataType == StringType) {
      DataType.fromDDL(exp.eval().asInstanceOf[UTF8String].toString)
    }
    ```
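
One reading of this suggestion is that a foldable check accepts any constant string expression, while matching on `Literal` accepts only a bare literal. The sketch below illustrates that difference with a toy expression model. It does not use Spark's classes: `Expr`, `Lit`, `Concat`, `Column` and `SchemaArg` are hypothetical names introduced only for this example, and the DDL string is returned as-is rather than parsed.

```scala
// Toy model, NOT Spark's catalyst classes: every name here is hypothetical.
sealed trait Expr {
  def foldable: Boolean // true when the value can be computed without an input row
  def isString: Boolean // stands in for `dataType == StringType`
  def eval(): Any
}

// A constant value; always foldable.
final case class Lit(value: Any) extends Expr {
  val foldable = true
  def isString: Boolean = value.isInstanceOf[String]
  def eval(): Any = value
}

// String concatenation; foldable only when both children are foldable.
final case class Concat(left: Expr, right: Expr) extends Expr {
  def foldable: Boolean = left.foldable && right.foldable
  val isString = true
  def eval(): Any = left.eval().toString + right.eval().toString
}

// A column reference needs an input row, so it is never foldable.
final case class Column(name: String) extends Expr {
  val foldable = false
  val isString = true
  def eval(): Any =
    throw new UnsupportedOperationException(s"column $name needs an input row")
}

object SchemaArg {
  // Literal-only matching, in the style of the diff: rejects Concat(Lit, Lit).
  def literalOnly(e: Expr): Option[String] = e match {
    case Lit(s: String) => Some(s)
    case _ => None
  }

  // Foldable-based check, in the style of the suggestion:
  // accepts any constant string expression and evaluates it once.
  def foldableString(e: Expr): Option[String] =
    if (e.foldable && e.isString) Some(e.eval().toString) else None
}
```

With this model, `SchemaArg.literalOnly(Concat(Lit("a INT, "), Lit("b STRING")))` yields `None`, whereas `SchemaArg.foldableString` evaluates the same expression to the full DDL string. Both reject `Column("c")`, since it cannot be evaluated without a row.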


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    **[Test build #97675 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97675/testReport)** for PR 22666 at commit [`49bac0e`](https://github.com/apache/spark/commit/49bac0eb9a9cec0c214b2516d560f301a64c475f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22666
  
    Merged build finished. Test FAILed.


---
