Posted to reviews@spark.apache.org by yanboliang <gi...@git.apache.org> on 2015/12/04 10:12:54 UTC

[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

GitHub user yanboliang opened a pull request:

    https://github.com/apache/spark/pull/10145

    [SPARK-12146] [SparkR] SparkR jsonFile should support multiple input files

    SparkR jsonFile should support multiple input files

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yanboliang/spark spark-12146

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10145.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10145
    
----
commit 012e7cad9b584ba2e8073f97e5d22462614cdd2c
Author: Yanbo Liang <yb...@gmail.com>
Date:   2015-12-04T09:02:42Z

    SparkR jsonFile should support multiple input files

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10145#discussion_r46883335
  
    --- Diff: R/pkg/R/SQLContext.R ---
    @@ -208,24 +208,32 @@ setMethod("toDF", signature(x = "RDD"),
     #' @param sqlContext SQLContext to use
     #' @param path Path of file to read. A vector of multiple paths is allowed.
     #' @return DataFrame
    +#' @rdname read.json
    +#' @name read.json
     #' @export
     #' @examples
     #'\dontrun{
     #' sc <- sparkR.init()
     #' sqlContext <- sparkRSQL.init(sc)
     #' path <- "path/to/file.json"
    -#' df <- jsonFile(sqlContext, path)
    +#' df <- read.json(sqlContext, path)
     #' }
    -
    -jsonFile <- function(sqlContext, path) {
    +read.json <- function(sqlContext, path) {
       # Allow the user to have a more flexible definiton of the text file path
    -  path <- suppressWarnings(normalizePath(path))
    -  # Convert a string vector of paths to a string containing comma separated paths
    -  path <- paste(path, collapse = ",")
    -  sdf <- callJMethod(sqlContext, "jsonFile", path)
    +  paths <- as.list(suppressWarnings(normalizePath(splitString(path))))
    --- End diff --
    
    I thought @sun-rui noted that we should take a list or vector? In that case we should change this code to
    ```
    paths <- as.list(suppressWarnings(normalizePath(path)))
    ```
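
The suggestion above relies on `normalizePath` being vectorised over its argument, so a character vector of paths can be turned into a list with no string splitting. A minimal base R sketch, assuming two hypothetical paths that need not exist on disk:

```r
# Hypothetical input paths, used only for illustration.
path <- c("path/to/a.json", "path/to/b.json")

# normalizePath() accepts a character vector and returns one entry per input.
# suppressWarnings() hides the warning base R emits for paths that do not exist.
paths <- as.list(suppressWarnings(normalizePath(path)))

# One list element per input path, each a length-one character vector.
print(length(paths))
print(vapply(paths, is.character, logical(1)))
```

Because the vector is handled directly, a path that itself contains a comma is not broken apart, which a split-on-comma helper would do.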




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162793798
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47313/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162807690
  
    LGTM




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-163826340
  
    **[Test build #47563 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47563/consoleFull)** for PR 10145 at commit [`1d74b18`](https://github.com/apache/spark/commit/1d74b187f912bb1f84ed0107a280e1e30406c047).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10145#discussion_r46680386
  
    --- Diff: R/pkg/inst/tests/test_sparkSQL.R ---
    @@ -359,10 +359,21 @@ test_that("Collect DataFrame with complex types", {
       expect_equal(bob$height, 176.5)
     })
     
    -test_that("jsonFile() on a local file returns a DataFrame", {
    -  df <- jsonFile(sqlContext, jsonPath)
    +test_that("read.json() on a local file returns a DataFrame", {
    --- End diff --
    
    Although jsonFile() is deprecated, shouldn't we keep the test cases for it until it is removed?




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-163646313
  
    **[Test build #47516 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47516/consoleFull)** for PR 10145 at commit [`06ae53d`](https://github.com/apache/spark/commit/06ae53dfb7db7f1276d4ccf16160e85b285c3864).




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10145#discussion_r46909122
  
    --- Diff: R/pkg/R/SQLContext.R ---
    @@ -208,24 +208,32 @@ setMethod("toDF", signature(x = "RDD"),
     #' @param sqlContext SQLContext to use
     #' @param path Path of file to read. A vector of multiple paths is allowed.
     #' @return DataFrame
    +#' @rdname read.json
    +#' @name read.json
     #' @export
     #' @examples
     #'\dontrun{
     #' sc <- sparkR.init()
     #' sqlContext <- sparkRSQL.init(sc)
     #' path <- "path/to/file.json"
    -#' df <- jsonFile(sqlContext, path)
    --- End diff --
    
    Keep this example, even though jsonFile is deprecated.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10145#discussion_r46680300
  
    --- Diff: R/pkg/R/SQLContext.R ---
    @@ -214,18 +214,45 @@ setMethod("toDF", signature(x = "RDD"),
     #' sc <- sparkR.init()
     #' sqlContext <- sparkRSQL.init(sc)
     #' path <- "path/to/file.json"
    -#' df <- jsonFile(sqlContext, path)
    +#' df <- read.json(sqlContext, path)
     #' }
     
    -jsonFile <- function(sqlContext, path) {
    +read.json <- function(sqlContext, ...) {
       # Allow the user to have a more flexible definiton of the text file path
    -  path <- suppressWarnings(normalizePath(path))
    +  paths <- if (length(list(...)) > 1) {
    +    lapply(list(...), function(x) suppressWarnings(normalizePath(x)))
    +  } else {
    +    as.list(suppressWarnings(normalizePath(splitString(...))))
    +  }
       # Convert a string vector of paths to a string containing comma separated paths
    -  path <- paste(path, collapse = ",")
    -  sdf <- callJMethod(sqlContext, "jsonFile", path)
    +  read <- callJMethod(sqlContext, "read")
    +  sdf <- callJMethod(read, "json", paths)
       dataFrame(sdf)
     }
     
    +#' Create a DataFrame from a JSON file.
    --- End diff --
    
    @felixcheung, could read.json and jsonFile share the same function description, since they are just aliases?




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10145#discussion_r46726302
  
    --- Diff: R/pkg/R/SQLContext.R ---
    @@ -214,18 +214,45 @@ setMethod("toDF", signature(x = "RDD"),
     #' sc <- sparkR.init()
     #' sqlContext <- sparkRSQL.init(sc)
     #' path <- "path/to/file.json"
    -#' df <- jsonFile(sqlContext, path)
    +#' df <- read.json(sqlContext, path)
     #' }
     
    -jsonFile <- function(sqlContext, path) {
    +read.json <- function(sqlContext, ...) {
       # Allow the user to have a more flexible definiton of the text file path
    -  path <- suppressWarnings(normalizePath(path))
    +  paths <- if (length(list(...)) > 1) {
    +    lapply(list(...), function(x) suppressWarnings(normalizePath(x)))
    +  } else {
    +    as.list(suppressWarnings(normalizePath(splitString(...))))
    +  }
       # Convert a string vector of paths to a string containing comma separated paths
    -  path <- paste(path, collapse = ",")
    -  sdf <- callJMethod(sqlContext, "jsonFile", path)
    +  read <- callJMethod(sqlContext, "read")
    +  sdf <- callJMethod(read, "json", paths)
       dataFrame(sdf)
     }
     
    +#' Create a DataFrame from a JSON file.
    --- End diff --
    
    Yes, see createDataFrame and as.DataFrame.
    They should have the same `@rdname` but distinct `@name` tags.
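
A rough sketch of what that roxygen layout could look like for the two functions. The function bodies here are stand-in stubs that return normalized paths rather than calling Spark, and the `.Deprecated("read.json")` call is taken from the stack trace quoted elsewhere in this thread; none of this is the actual patch.

```r
#' Create a DataFrame from a JSON file.
#'
#' @param sqlContext SQLContext to use
#' @param path Path of file to read. A vector of multiple paths is allowed.
#' @rdname read.json
#' @name read.json
#' @export
read.json <- function(sqlContext, path) {
  # Stub body for illustration only: no Spark call is made here.
  as.list(suppressWarnings(normalizePath(path)))
}

#' @rdname read.json
#' @name jsonFile
#' @export
jsonFile <- function(sqlContext, path) {
  # Shares the help page above via @rdname, but keeps its own @name.
  .Deprecated("read.json")
  read.json(sqlContext, path)
}
```

With a shared `@rdname`, roxygen2 writes both functions into a single help file, so the description and parameters are documented once.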




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162746819
  
    I found that the tests report errors when we call functions marked with ```.Deprecated``` after rebasing onto master, so should we still keep the test cases for deprecated functions? @sun-rui 
    ```R
    2. Error: read.json()/jsonFile() on a local file returns a DataFrame -----------
    (converted from warning) 'jsonFile' is deprecated.
    Use 'read.json' instead.
    See help("Deprecated")
    1: withCallingHandlers(eval(code, new_test_environment), error = capture_calls, message = function(c) invokeRestart("muffleMessage"))
    2: eval(code, new_test_environment)
    3: eval(expr, envir, enclos)
    4: jsonFile(sqlContext, c(jsonPath, jsonPath2)) at test_sparkSQL.R:384
    5: .Deprecated("read.json")
    6: warning(paste(msg, collapse = ""), call. = FALSE, domain = NA)
    7: .signalSimpleWarning("'jsonFile' is deprecated.\nUse 'read.json' instead.\nSee help(\"Deprecated\")", 
           quote(NULL))
    8: withRestarts({
           .Internal(.signalCondition(simpleWarning(msg, call), msg, call))
           .Internal(.dfltWarn(msg, call))
       }, muffleWarning = function() NULL)
    9: withOneRestart(expr, restarts[[1L]])
    10: doWithOneRestart(return(expr), restart)
    ```
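
The "(converted from warning)" prefix in the trace above suggests warnings were being promoted to errors, as happens with `options(warn = 2)`. A small base R sketch of that behaviour, and of one possible way to keep exercising a deprecated function, follows. `old_fun` and `new_fun` are hypothetical names, and wrapping the call in `suppressWarnings` is an assumption about how a test might cope, not what the PR did.

```r
# Hypothetical replacement and deprecated functions.
new_fun <- function(x) x * 2
old_fun <- function(x) {
  .Deprecated("new_fun")
  new_fun(x)
}

# Promote warnings to errors, remembering the previous setting.
old_opts <- options(warn = 2)

# The deprecation warning now surfaces as an error.
failed <- inherits(try(old_fun(1), silent = TRUE), "try-error")

# suppressWarnings() muffles the warning before it is promoted,
# so the deprecated function can still be called and checked.
result <- suppressWarnings(old_fun(1))

# Restore the previous warning level.
options(old_opts)

print(failed)
print(result)
```

A test framework's own warning expectation could be used in place of `suppressWarnings` if the deprecation message itself should be asserted.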




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162621481
  
    @yanboliang We moved the test file locations in #10030, so you'll need to rebase onto the master branch.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162497217
  
    **[Test build #47264 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47264/consoleFull)** for PR 10145 at commit [`4decf22`](https://github.com/apache/spark/commit/4decf221013cb0f62a6c53cd797db7e790880e66).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-163826478
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47563/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10145#discussion_r46680185
  
    --- Diff: R/pkg/R/SQLContext.R ---
    @@ -214,18 +214,45 @@ setMethod("toDF", signature(x = "RDD"),
     #' sc <- sparkR.init()
     #' sqlContext <- sparkRSQL.init(sc)
     #' path <- "path/to/file.json"
    -#' df <- jsonFile(sqlContext, path)
    +#' df <- read.json(sqlContext, path)
     #' }
     
    -jsonFile <- function(sqlContext, path) {
    +read.json <- function(sqlContext, ...) {
    --- End diff --
    
    Keep the original signature.
    Do not use varargs for paths. Use a vector of strings for paths; this is more R-like style.
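
To make the difference between the two signatures concrete, here is a small base R comparison. `read_vec` and `read_var` are hypothetical stand-ins that only collect their path arguments; they are not SparkR functions.

```r
# Vector style: a single `path` argument holding any number of paths.
read_vec <- function(path) as.list(path)

# Varargs style: each path is passed as a separate argument.
read_var <- function(...) list(...)

# Hypothetical file names, e.g. as returned by list.files().
files <- c("a.json", "b.json")

# A computed vector can be passed to the vector-style function directly.
from_vec <- read_vec(files)

# The varargs version needs do.call() to spread the vector into arguments.
from_var <- do.call(read_var, as.list(files))

print(identical(from_vec, from_var))
```

Both calls collect the same paths, but the vector form composes more directly with functions that already return character vectors.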





[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-163695026
  
    **[Test build #47516 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47516/consoleFull)** for PR 10145 at commit [`06ae53d`](https://github.com/apache/spark/commit/06ae53dfb7db7f1276d4ccf16160e85b285c3864).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162793797
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162764102
  
    looks good, thanks for making these changes




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/10145




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-163826475
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-164032045
  
    LGTM. Merging this to master and branch-1.6




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162763663
  
    **[Test build #47308 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47308/consoleFull)** for PR 10145 at commit [`47c7ee1`](https://github.com/apache/spark/commit/47c7ee114454d9ac57e6aaa6b891b23eeb9bfaac).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10145#discussion_r46909162
  
    --- Diff: R/pkg/R/SQLContext.R ---
    @@ -208,24 +208,32 @@ setMethod("toDF", signature(x = "RDD"),
     #' @param sqlContext SQLContext to use
     #' @param path Path of file to read. A vector of multiple paths is allowed.
     #' @return DataFrame
    +#' @rdname read.json
    +#' @name read.json
     #' @export
     #' @examples
     #'\dontrun{
     #' sc <- sparkR.init()
     #' sqlContext <- sparkRSQL.init(sc)
     #' path <- "path/to/file.json"
    -#' df <- jsonFile(sqlContext, path)
    +#' df <- read.json(sqlContext, path)
     #' }
    -
    -jsonFile <- function(sqlContext, path) {
    +read.json <- function(sqlContext, path) {
       # Allow the user to have a more flexible definiton of the text file path
    -  path <- suppressWarnings(normalizePath(path))
    -  # Convert a string vector of paths to a string containing comma separated paths
    -  path <- paste(path, collapse = ",")
    -  sdf <- callJMethod(sqlContext, "jsonFile", path)
    +  paths <- as.list(suppressWarnings(normalizePath(splitString(path))))
    --- End diff --
    
    yes




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-161921950
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47193/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-161918155
  
    **[Test build #47193 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47193/consoleFull)** for PR 10145 at commit [`012e7ca`](https://github.com/apache/spark/commit/012e7cad9b584ba2e8073f97e5d22462614cdd2c).




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-163708297
  
    @yanboliang Could you bring this PR up to date with master?




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10145#discussion_r46809117
  
    --- Diff: R/pkg/R/SQLContext.R ---
    @@ -214,18 +214,45 @@ setMethod("toDF", signature(x = "RDD"),
     #' sc <- sparkR.init()
     #' sqlContext <- sparkRSQL.init(sc)
     #' path <- "path/to/file.json"
    -#' df <- jsonFile(sqlContext, path)
    +#' df <- read.json(sqlContext, path)
     #' }
     
    -jsonFile <- function(sqlContext, path) {
    +read.json <- function(sqlContext, ...) {
    --- End diff --
    
    I found that [```parquetFile```](https://github.com/apache/spark/blob/master/R/pkg/R/SQLContext.R#L270) already uses varargs, so ```jsonFile``` and ```parquetFile``` will end up with different signatures. Is that expected, or should we make a breaking change to ```parquetFile```? For now I have reverted the code to keep the original signature.
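
To make the signature question concrete, here is a minimal sketch in plain R of the two calling styles being compared. The helper names `paths_from_vector` and `paths_from_varargs` are hypothetical and are not part of SparkR; they only normalize their input to a character vector of paths and do not read any files.

```r
# Hypothetical helpers for illustration only; neither is SparkR code.

# Style A: a single `path` argument that may be a character vector.
paths_from_vector <- function(path) {
  stopifnot(is.character(path), length(path) >= 1)
  path
}

# Style B: variable arguments, flattened into one character vector.
paths_from_varargs <- function(...) {
  paths <- unlist(list(...))
  stopifnot(is.character(paths), length(paths) >= 1)
  paths
}

a <- paths_from_vector(c("a.json", "b.json"))
b <- paths_from_varargs("a.json", "b.json")
same <- identical(a, b)
```

Both styles end up with the same vector of paths; the difference is only in how callers spell the call, which is why mixing them across `jsonFile` and `parquetFile` would look inconsistent.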




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-161921946
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-163695247
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47516/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-163823254
  
    **[Test build #47563 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47563/consoleFull)** for PR 10145 at commit [`1d74b18`](https://github.com/apache/spark/commit/1d74b187f912bb1f84ed0107a280e1e30406c047).




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162497338
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10145#discussion_r46679607
  
    --- Diff: R/pkg/R/SQLContext.R ---
    @@ -206,7 +206,7 @@ setMethod("toDF", signature(x = "RDD"),
     #' It goes through the entire dataset once to determine the schema.
     #'
     #' @param sqlContext SQLContext to use
    -#' @param path Path of file to read. A vector of multiple paths is allowed.
    --- End diff --
    
    Could you move this change to the planned new JIRA issue about parquetFile? Let's focus this PR on jsonFile.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162750563
  
    I vote for adding suppressWarnings, with a comment explaining it in the test cases.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-163695246
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162755531
  
    Hmm, I guess deprecation is a warning which is now getting turned into an error.
    I think it's fine for the test of the deprecated function to use suppressWarnings.
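
The behaviour described above can be reproduced in plain R without Spark. The sketch below assumes the warning-to-error conversion comes from `options(warn = 2)`; the thread does not say which mechanism the test run actually uses. `old_reader` is a hypothetical deprecated function, not SparkR code.

```r
# Hypothetical deprecated function for illustration only.
old_reader <- function(path) {
  .Deprecated("new_reader")  # signals a deprecation warning
  path
}

# Assumption: warnings are promoted to errors, as with options(warn = 2).
old_opts <- options(warn = 2)

# Unsuppressed, the deprecation warning surfaces as an error.
failed <- tryCatch({
  old_reader("a.json")
  FALSE
}, error = function(e) TRUE)

# Wrapped in suppressWarnings, the call returns its value normally.
value <- suppressWarnings(old_reader("a.json"))

options(old_opts)  # restore the previous warning level
```

In a test for a deprecated function, wrapping only the deprecated call in `suppressWarnings` keeps other warnings in the same test visible.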




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162787774
  
    **[Test build #47313 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47313/consoleFull)** for PR 10145 at commit [`06ae53d`](https://github.com/apache/spark/commit/06ae53dfb7db7f1276d4ccf16160e85b285c3864).




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162761240
  
    **[Test build #47308 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47308/consoleFull)** for PR 10145 at commit [`47c7ee1`](https://github.com/apache/spark/commit/47c7ee114454d9ac57e6aaa6b891b23eeb9bfaac).




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by sun-rui <gi...@git.apache.org>.
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10145#discussion_r46817076
  
    --- Diff: R/pkg/R/SQLContext.R ---
    @@ -214,18 +214,45 @@ setMethod("toDF", signature(x = "RDD"),
     #' sc <- sparkR.init()
     #' sqlContext <- sparkRSQL.init(sc)
     #' path <- "path/to/file.json"
    -#' df <- jsonFile(sqlContext, path)
    +#' df <- read.json(sqlContext, path)
     #' }
     
    -jsonFile <- function(sqlContext, path) {
    +read.json <- function(sqlContext, ...) {
    --- End diff --
    
    We can keep parquetFile as is and mark it as deprecated, then add read.parquet, which will call DataFrame.read.parquet.
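
The pattern suggested here (keep the old function, mark it deprecated, and delegate to a new one) might look roughly like the following in plain R. Both function names are hypothetical stand-ins for `read.parquet` and `parquetFile`; the sketch only returns the paths it receives and does not touch Spark.

```r
# Hypothetical new-style reader, standing in for read.parquet.
# It simply returns the paths it would load.
read_parquet_sketch <- function(...) {
  unlist(list(...))
}

# Old entry point keeps its varargs signature but is marked deprecated
# and forwards its arguments to the new function.
parquet_file_sketch <- function(...) {
  .Deprecated("read_parquet_sketch")
  read_parquet_sketch(...)
}

# The old name still works, but signals a warning when called.
w <- tryCatch(parquet_file_sketch("a.parquet"),
              warning = function(w) w)
got_warning <- inherits(w, "warning")

# The new name returns the same result without any warning.
paths <- read_parquet_sketch("a.parquet", "b.parquet")
```

Existing callers keep working under this approach, and the warning points them at the replacement.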




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162763947
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162763952
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47308/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162793714
  
    **[Test build #47313 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47313/consoleFull)** for PR 10145 at commit [`06ae53d`](https://github.com/apache/spark/commit/06ae53dfb7db7f1276d4ccf16160e85b285c3864).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-161921823
  
    **[Test build #47193 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47193/consoleFull)** for PR 10145 at commit [`012e7ca`](https://github.com/apache/spark/commit/012e7cad9b584ba2e8073f97e5d22462614cdd2c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162488350
  
    **[Test build #47264 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47264/consoleFull)** for PR 10145 at commit [`4decf22`](https://github.com/apache/spark/commit/4decf221013cb0f62a6c53cd797db7e790880e66).




[GitHub] spark pull request: [SPARK-12146] [SparkR] SparkR jsonFile should ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10145#issuecomment-162497341
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47264/
    Test PASSed.

