Posted to reviews@spark.apache.org by dongjoon-hyun <gi...@git.apache.org> on 2018/10/22 04:15:50 UTC

[GitHub] spark pull request #22791: [SPARK-25795][R][EXAMPLE] Fix CSV SparkR SQL Exam...

GitHub user dongjoon-hyun opened a pull request:

    https://github.com/apache/spark/pull/22791

    [SPARK-25795][R][EXAMPLE] Fix CSV SparkR SQL Example

    ## What changes were proposed in this pull request?
    
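    This PR fixes the following SparkR example, which fails in Spark 2.3.0 through 2.4.0 because `people.csv` is read with the default CSV options, so each line is parsed as a single column `_c0`.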
    
    ```r
    > df <- read.df("examples/src/main/resources/people.csv", "csv")
    > namesAndAges <- select(df, "name", "age")
    ...
    Caused by: org.apache.spark.sql.AnalysisException: cannot resolve '`name`' given input columns: [_c0];;
    'Project ['name, 'age]
    +- AnalysisBarrier
          +- Relation[_c0#97] csv
    ```
     
    - https://dist.apache.org/repos/dist/dev/spark/v2.4.0-rc3-docs/_site/sql-programming-guide.html#manually-specifying-options
    - http://spark.apache.org/docs/2.3.2/sql-programming-guide.html#manually-specifying-options
    - http://spark.apache.org/docs/2.3.1/sql-programming-guide.html#manually-specifying-options
    - http://spark.apache.org/docs/2.3.0/sql-programming-guide.html#manually-specifying-options
    
    ## How was this patch tested?
    
    Manually tested in SparkR. (Please note that `RSparkSQLExample.R` fails at the last JDBC example.)
    
    ```r
    > df <- read.df("examples/src/main/resources/people.csv", "csv", sep=";", inferSchema=T, header=T)
    > namesAndAges <- select(df, "name", "age")
    ```
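
The failure and the fix can be reproduced without a Spark session. The sketch below is only an illustration: it uses base R's `read.csv` rather than SparkR's `read.df`, and `csv_text` is an assumed stand-in for `people.csv` (semicolon-separated with a header row, as the options in the fix imply). The sample rows are hypothetical and are not taken from this thread.

```r
# Assumed shape of people.csv: semicolon-delimited with a header row.
# (Illustrative data; the thread does not show the file itself.)
csv_text <- "name;age;job\nJorge;30;Developer\nBob;32;Developer"

# Comma separator and no header, loosely mirroring a CSV read
# with no options passed.
default_read <- read.csv(text = csv_text, sep = ",", header = FALSE,
                         stringsAsFactors = FALSE)

# Options matching the fix: semicolon separator and a header row.
fixed_read <- read.csv(text = csv_text, sep = ";", header = TRUE,
                       stringsAsFactors = FALSE)

ncol(default_read)   # 1: each line lands in a single column, so "name" cannot be found
names(fixed_read)    # "name" "age" "job"

# With named columns available, the selection from the example works.
namesAndAges <- fixed_read[, c("name", "age")]
```

With the comma separator there is nothing to split on, which matches the single `_c0` column reported in the `AnalysisException` above.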

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongjoon-hyun/spark SPARK-25795

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22791.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22791
    
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22791: [SPARK-25795][R][EXAMPLE] Fix CSV SparkR SQL Exam...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22791#discussion_r227493674
  
    --- Diff: examples/src/main/r/RSparkSQLExample.R ---
    @@ -114,7 +114,7 @@ write.df(namesAndAges, "namesAndAges.parquet", "parquet")
     
     
     # $example on:manual_load_options_csv$
    -df <- read.df("examples/src/main/resources/people.csv", "csv")
    +df <- read.df("examples/src/main/resources/people.csv", "csv", sep=";", inferSchema=T, header=T)
    --- End diff --
    
    If you don't mind, I included that [here](https://github.com/apache/spark/pull/22801/files#diff-eeffb959b904ebb5c864bc3dafe6437dR117)


---



[GitHub] spark pull request #22791: [SPARK-25795][R][EXAMPLE] Fix CSV SparkR SQL Exam...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22791#discussion_r227014553
  
    --- Diff: examples/src/main/r/RSparkSQLExample.R ---
    @@ -114,7 +114,7 @@ write.df(namesAndAges, "namesAndAges.parquet", "parquet")
     
     
     # $example on:manual_load_options_csv$
    -df <- read.df("examples/src/main/resources/people.csv", "csv")
    +df <- read.df("examples/src/main/resources/people.csv", "csv", sep=";", inferSchema=T, header=T)
    --- End diff --
    
    Hi, @felixcheung .
    Could you review this?


---



[GitHub] spark pull request #22791: [SPARK-25795][R][EXAMPLE] Fix CSV SparkR SQL Exam...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22791


---



[GitHub] spark issue #22791: [SPARK-25795][R][EXAMPLE] Fix CSV SparkR SQL Example

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22791
  
    **[Test build #97793 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97793/testReport)** for PR 22791 at commit [`f160711`](https://github.com/apache/spark/commit/f160711e57871d5865e842dbec1d1cf70e688659).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #22791: [SPARK-25795][R][EXAMPLE] Fix CSV SparkR SQL Exam...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22791#discussion_r227465232
  
    --- Diff: examples/src/main/r/RSparkSQLExample.R ---
    @@ -114,7 +114,7 @@ write.df(namesAndAges, "namesAndAges.parquet", "parquet")
     
     
     # $example on:manual_load_options_csv$
    -df <- read.df("examples/src/main/resources/people.csv", "csv")
    +df <- read.df("examples/src/main/resources/people.csv", "csv", sep=";", inferSchema=T, header=T)
    --- End diff --
    
    In R style we typically put spaces around `=` in named arguments, e.g. https://github.com/apache/spark/pull/22791/files#diff-eeffb959b904ebb5c864bc3dafe6437dR168
    `, sep = ";", inferSchema = TRUE, header = TRUE`
    
    And please don't use `T`; spell out `TRUE` for readability.
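
Beyond readability, a commonly cited reason to prefer `TRUE` over `T` in R is that `T` is an ordinary variable that can be reassigned, whereas `TRUE` is a reserved word. This is a general R fact rather than something stated in the review. The sketch below uses only base R, and `header_flag`, `assign_result`, and `restored` are hypothetical names chosen for illustration.

```r
# T is just a variable that is bound to TRUE by default,
# so it can be silently overwritten.
T <- 0
header_flag <- isTRUE(T)    # FALSE: T no longer means TRUE

# TRUE is a reserved word; trying to assign to it is an error.
assign_result <- tryCatch(
  eval(parse(text = "TRUE <- 0")),
  error = function(e) "error"
)

# Remove the shadowing binding so the base definition of T is visible again.
rm(T)
restored <- isTRUE(T)       # TRUE
```

An option such as `header = T` would therefore change meaning if some earlier code had rebound `T`, while `header = TRUE` cannot be affected that way.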


---



[GitHub] spark pull request #22791: [SPARK-25795][R][EXAMPLE] Fix CSV SparkR SQL Exam...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22791#discussion_r227041945
  
    --- Diff: examples/src/main/r/RSparkSQLExample.R ---
    @@ -114,7 +114,7 @@ write.df(namesAndAges, "namesAndAges.parquet", "parquet")
     
     
     # $example on:manual_load_options_csv$
    -df <- read.df("examples/src/main/resources/people.csv", "csv")
    +df <- read.df("examples/src/main/resources/people.csv", "csv", sep=";", inferSchema=T, header=T)
    --- End diff --
    
    Also, pinging @jomach and @gatorsmile because this example was added by the following PR in Spark 2.3.
    - https://github.com/apache/spark/pull/19429/files#diff-eeffb959b904ebb5c864bc3dafe6437dR117
    
    BTW, [SPARK-20055](https://issues.apache.org/jira/browse/SPARK-20055) is still open.


---



[GitHub] spark issue #22791: [SPARK-25795][R][EXAMPLE] Fix CSV SparkR SQL Example

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22791
  
    **[Test build #97811 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97811/testReport)** for PR 22791 at commit [`f160711`](https://github.com/apache/spark/commit/f160711e57871d5865e842dbec1d1cf70e688659).
     * This patch passes all tests.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.


---



[GitHub] spark pull request #22791: [SPARK-25795][R][EXAMPLE] Fix CSV SparkR SQL Exam...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22791#discussion_r227493317
  
    --- Diff: examples/src/main/r/RSparkSQLExample.R ---
    @@ -114,7 +114,7 @@ write.df(namesAndAges, "namesAndAges.parquet", "parquet")
     
     
     # $example on:manual_load_options_csv$
    -df <- read.df("examples/src/main/resources/people.csv", "csv")
    +df <- read.df("examples/src/main/resources/people.csv", "csv", sep=";", inferSchema=T, header=T)
    --- End diff --
    
    Thank you, @felixcheung .


---



[GitHub] spark issue #22791: [SPARK-25795][R][EXAMPLE] Fix CSV SparkR SQL Example

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/22791
  
    Thank you for reviewing and merging, @srowen.
    Merged to `master/branch-2.4/branch-2.3`.


---
