Posted to reviews@spark.apache.org by titicaca <gi...@git.apache.org> on 2017/01/24 06:50:26 UTC

[GitHub] spark pull request #16689: SPARK-19342 bug fixed in collect method for colle...

GitHub user titicaca opened a pull request:

    https://github.com/apache/spark/pull/16689

    SPARK-19342 bug fixed in collect method for collecting timestamp column

    ## What changes were proposed in this pull request?
    
    Fix a bug in the `collect` method when collecting a timestamp column. The bug can be reproduced with the following code:
    
    ```
    library(SparkR)
    sparkR.session(master = "local")
    df <- data.frame(col1 = c(0, 1, 2), 
                     col2 = c(as.POSIXct("2017-01-01 00:00:01"), NA, as.POSIXct("2017-01-01 12:00:01")))
    
    sdf1 <- createDataFrame(df)
    print(dtypes(sdf1))
    df1 <- collect(sdf1)
    print(lapply(df1, class))
    
    sdf2 <- filter(sdf1, "col1 > 0")
    print(dtypes(sdf2))
    df2 <- collect(sdf2)
    print(lapply(df2, class))
    ```
    
    As the printed output shows, col2 in df2 is unexpectedly converted to numeric when an NA is at the top of the column. 
    
    This is caused by `do.call(c, list)`: when the list starts with NA, e.g. `do.call(c, list(NA, as.POSIXct("2017-01-01 12:00:01")))`, the class of the result is numeric instead of POSIXct. 
    
    Therefore, we need to cast the data type of the vector explicitly. 
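    
    A minimal sketch of the behaviour described above, using base R only (no Spark session is needed). The variable names are illustrative and are not taken from the patch:
    
    ```r
    ts <- as.POSIXct("2017-01-01 12:00:01")

    # c() dispatches on its first argument. With NA first, the default
    # method runs and the POSIXct class attribute is dropped.
    na_first <- do.call(c, list(NA, ts))
    print(class(na_first))

    # With the timestamp first, c.POSIXct runs and the class is kept.
    ts_first <- do.call(c, list(ts, NA))
    print(class(ts_first))

    # Setting the class explicitly turns the numeric vector back into
    # a timestamp vector with a leading NA.
    class(na_first) <- c("POSIXct", "POSIXt")
    print(na_first)
    ```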
    
    
    
    ## How was this patch tested?
    
    The patch can be tested manually with the same code above.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/titicaca/spark sparkr-dev

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16689.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16689
    
----
commit a51c2eb54ca672ad63495d0709bd3ae7b254bd14
Author: titicaca <fa...@hotmail.com>
Date:   2017-01-24T06:24:47Z

    SPARK-19342 bug fixed in collect method for collecting timestamp column

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #16689: [SPARK-19342][SPARKR] bug fixed in collect method...

Posted by titicaca <gi...@git.apache.org>.
Github user titicaca commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16689#discussion_r98611766
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1138,6 +1138,11 @@ setMethod("collect",
                       if (!is.null(PRIMITIVE_TYPES[[colType]]) && colType != "binary") {
                         vec <- do.call(c, col)
                         stopifnot(class(vec) != "list")
    +                    class(vec) <-
    +                      if (colType == "timestamp")
    +                        c("POSIXct", "POSIXt")
    --- End diff --
    
    `PRIMITIVE_TYPES[["timestamp"]]` is POSIXct, and POSIXct usually comes together with POSIXt. POSIXt is a virtual class that allows operations such as subtraction to mix the two classes POSIXct and POSIXlt.
    The previous conversion also converted timestamp to c("POSIXct", "POSIXt"). 
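    
    As a quick base-R illustration of the point about POSIXt (the values and the `tz = "UTC"` argument are chosen here only to make the result predictable):
    
    ```r
    ct <- as.POSIXct("2017-01-01 12:00:01", tz = "UTC")
    lt <- as.POSIXlt("2017-01-01 00:00:01", tz = "UTC")

    print(class(ct))  # "POSIXct" "POSIXt"
    print(class(lt))  # "POSIXlt" "POSIXt"

    # The shared POSIXt class lets the two representations be mixed
    # in arithmetic; the result is a difftime.
    gap <- ct - lt
    print(as.numeric(gap, units = "hours"))
    ```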


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    **[Test build #72378 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72378/testReport)** for PR 16689 at commit [`d6d454e`](https://github.com/apache/spark/commit/d6d454ec0a587c456d5e4a964784589643aa4730).


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request #16689: [SPARK-19342][SPARKR] bug fixed in collect method...

Posted by titicaca <gi...@git.apache.org>.
Github user titicaca commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16689#discussion_r98918704
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1138,6 +1138,11 @@ setMethod("collect",
                       if (!is.null(PRIMITIVE_TYPES[[colType]]) && colType != "binary") {
                         vec <- do.call(c, col)
                         stopifnot(class(vec) != "list")
    +                    class(vec) <-
    +                      if (colType == "timestamp")
    +                        c("POSIXct", "POSIXt")
    --- End diff --
    
    It looks better if it won't affect other methods. I will try it. Thanks for the advice.


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    great, this is a good catch and thank you for fixing this @titicaca 
    merging to master, branch-2.1



[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72186/
    Test PASSed.


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    **[Test build #72182 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72182/testReport)** for PR 16689 at commit [`7903bb3`](https://github.com/apache/spark/commit/7903bb3cc44477d9f7a25971ef4487af0627d333).


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Just to make sure you see this: https://github.com/apache/spark/pull/16689#issuecomment-275063425



[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Merged build finished. Test PASSed.


[GitHub] spark pull request #16689: [SPARK-19342][SPARKR] bug fixed in collect method...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16689#discussion_r97508643
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1136,9 +1136,17 @@ setMethod("collect",
     
                       # Note that "binary" columns behave like complex types.
                       if (!is.null(PRIMITIVE_TYPES[[colType]]) && colType != "binary") {
    -                    vec <- do.call(c, col)
    +                    valueIndex <- which(!is.na(col))
    +                    if (length(valueIndex) > 0 && valueIndex[1] > 1) {
    +                      colTail <- col[-(1 : (valueIndex[1] - 1))]
    +                      vec <- do.call(c, colTail)
    +                      classVal <- class(vec)
    +                      vec <- c(rep(NA, valueIndex[1] - 1), vec)
    +                      class(vec) <- classVal
    --- End diff --
    
    Hmm, what happened here?
    if you want to drop the NA and use the rest to infer the class you can do `col[!is.na(col)]`
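    
    A rough sketch of that suggestion in base R, assuming `col` is a plain list of length-one values as in the diff above. The name `present` is hypothetical and only used for illustration:
    
    ```r
    col <- list(NA, NA, as.POSIXct("2017-01-01 12:00:01"))

    # is.na() on a list flags the length-one NA elements.
    present <- !is.na(col)

    vec <- do.call(c, col)  # numeric here, because NA comes first
    if (any(present)) {
      # Infer the class from the non-NA values only, then apply it to
      # the full-length vector.
      class(vec) <- class(do.call(c, col[present]))
    }
    print(class(vec))
    ```
    
    The `any(present)` guard matters because `do.call(c, list())` returns NULL, so there is no class to infer from a column that contains only NAs.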


[GitHub] spark pull request #16689: [SPARK-19342][SPARKR] bug fixed in collect method...

Posted by titicaca <gi...@git.apache.org>.
Github user titicaca commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16689#discussion_r97714703
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1138,6 +1138,9 @@ setMethod("collect",
                       if (!is.null(PRIMITIVE_TYPES[[colType]]) && colType != "binary") {
                         vec <- do.call(c, col)
                         stopifnot(class(vec) != "list")
    +                    # If vec is an vector with only NAs, the type is logical
    --- End diff --
    
    In local R, if we try
    ```
    df <- data.frame(x = c(0,1,2), y = c(NA, NA, 1))
    class(head(df, 1)$y)
    ```
    The output is still numeric instead of logical. But the existing test expects a logical NA instead of a numeric NA.
    
    So is it necessary to correct the existing tests, for example @test_sparkSQL.R#1280,
    from `expect_equal(collect(select(df, first(df$age)))[[1]], NA)` to
    `expect_equal(collect(select(df, first(df$age)))[[1]], NA_real_)`?
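    
    The difference between the two NA values can be checked in local R without testthat. This is only an illustration of the type distinction, not a proposed test:
    
    ```r
    df <- data.frame(x = c(0, 1, 2), y = c(NA, NA, 1))
    first_y <- head(df, 1)$y

    print(class(first_y))                # numeric: the column type is kept
    print(identical(first_y, NA_real_))  # TRUE
    print(identical(first_y, NA))        # FALSE: a bare NA is logical
    ```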


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Jenkins, ok to test


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    **[Test build #72186 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72186/testReport)** for PR 16689 at commit [`8379c38`](https://github.com/apache/spark/commit/8379c3834fc27e3303501536181bd85372493982).


[GitHub] spark pull request #16689: [SPARK-19342][SPARKR] bug fixed in collect method...

Posted by titicaca <gi...@git.apache.org>.
Github user titicaca commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16689#discussion_r97712469
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1138,6 +1138,9 @@ setMethod("collect",
                       if (!is.null(PRIMITIVE_TYPES[[colType]]) && colType != "binary") {
                         vec <- do.call(c, col)
                         stopifnot(class(vec) != "list")
    +                    # If vec is an vector with only NAs, the type is logical
    --- End diff --
    
    Yes. My first commit tried to cast the column to its corresponding R data type explicitly, even if it is a vector with all NAs. However, some existing tests failed because they expect a logical NA. For example:
    ```
    3. Failure: column functions (@test_sparkSQL.R#1280) ---------------------------
    collect(select(df, first(df$age)))[[1]] not equal to NA.
    Types not compatible: double vs logical
    4. Failure: column functions (@test_sparkSQL.R#1282) ---------------------------
    collect(select(df, first("age")))[[1]] not equal to NA.
    Types not compatible: double vs logical
    ``` 
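    
    One way to see where "double vs logical" comes from, as a base-R sketch rather than the SparkR code itself: assigning a basic type name with `class<-` coerces the storage mode of an all-NA vector.
    
    ```r
    all_na <- do.call(c, list(NA, NA))
    print(typeof(all_na))  # logical

    # Assigning "numeric" as the class coerces the vector to double,
    # so the NAs are no longer logical.
    class(all_na) <- "numeric"
    print(typeof(all_na))  # double
    ```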



[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by titicaca <gi...@git.apache.org>.
Github user titicaca commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Yes, collect on timestamp was getting `c("POSIXct", "POSIXt")`. But when an NA is at the top of the timestamp column, it was getting `numeric`, as I described in the PR description.


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    **[Test build #71969 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71969/testReport)** for PR 16689 at commit [`6a0eb3f`](https://github.com/apache/spark/commit/6a0eb3f8789b6a66a4d1419c389fdda2edc0bc95).
     * This patch **fails R style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72250/
    Test FAILed.


[GitHub] spark pull request #16689: [SPARK-19342][SPARKR] bug fixed in collect method...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16689#discussion_r97707703
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1138,6 +1138,9 @@ setMethod("collect",
                       if (!is.null(PRIMITIVE_TYPES[[colType]]) && colType != "binary") {
                         vec <- do.call(c, col)
                         stopifnot(class(vec) != "list")
    +                    # If vec is an vector with only NAs, the type is logical
    --- End diff --
    
    if the DataFrame column is of type string, shouldn't it convert to character in R (which can be all NA), even when the column only has NULL (which maps to NA in R)?
    
    it seems with this change it would become logical in R instead of character.
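    
    To make the concern concrete, here is a small base-R sketch. The list `col` is a hypothetical stand-in for a collected string column that contains only NULLs:
    
    ```r
    col <- list(NA, NA)

    # Left to inference, an all-NA column comes back as logical.
    inferred <- do.call(c, col)
    print(class(inferred))  # logical

    # Casting to the declared column type keeps every value NA but
    # makes the vector character.
    casted <- as.character(inferred)
    print(class(casted))    # character
    print(is.na(casted))
    ```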


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    **[Test build #72250 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72250/testReport)** for PR 16689 at commit [`407c625`](https://github.com/apache/spark/commit/407c6254246be82e88098aa08838284e6861838a).


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Great, done!
    Looking forward to more contributions from you :)



[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Ok - I think this sounds good then! @felixcheung Let me know if you want me to take a look at the code as well; if not, feel free to merge when you think it's ready


[GitHub] spark pull request #16689: [SPARK-19342][SPARKR] bug fixed in collect method...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/16689


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    **[Test build #71969 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71969/testReport)** for PR 16689 at commit [`6a0eb3f`](https://github.com/apache/spark/commit/6a0eb3f8789b6a66a4d1419c389fdda2edc0bc95).


[GitHub] spark pull request #16689: [SPARK-19342][SPARKR] bug fixed in collect method...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16689#discussion_r98605236
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1138,6 +1138,11 @@ setMethod("collect",
                       if (!is.null(PRIMITIVE_TYPES[[colType]]) && colType != "binary") {
                         vec <- do.call(c, col)
                         stopifnot(class(vec) != "list")
    +                    class(vec) <-
    +                      if (colType == "timestamp")
    +                        c("POSIXct", "POSIXt")
    +                      else
    +                        PRIMITIVE_TYPES[[colType]]
    --- End diff --
    
    by setting these instead of having it inferred - does this break any existing behavior? does any type differ because of this line of change?


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    hmm, that's not a super big issue since vector and list are more or less the same in R.
    I think it might be better if we treat the type consistently, although it might be concerning if this changes behavior in a non-backward-compatible manner.
    
    let me try to find some time to test this out? thanks!


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by titicaca <gi...@git.apache.org>.
Github user titicaca commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Sure. Shall I add the tests in pkg/inst/tests/testthat/test_sparkSQL.R?


[GitHub] spark pull request #16689: [SPARK-19342][SPARKR] bug fixed in collect method...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16689#discussion_r98605157
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1138,6 +1138,11 @@ setMethod("collect",
                       if (!is.null(PRIMITIVE_TYPES[[colType]]) && colType != "binary") {
                         vec <- do.call(c, col)
                         stopifnot(class(vec) != "list")
    +                    class(vec) <-
    +                      if (colType == "timestamp")
    +                        c("POSIXct", "POSIXt")
    --- End diff --
    
    why should the class be `c("POSIXct", "POSIXt")` in this case?




[GitHub] spark pull request #16689: [SPARK-19342][SPARKR] bug fixed in collect method...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16689#discussion_r98615769
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1138,6 +1138,11 @@ setMethod("collect",
                       if (!is.null(PRIMITIVE_TYPES[[colType]]) && colType != "binary") {
                         vec <- do.call(c, col)
                         stopifnot(class(vec) != "list")
    +                    class(vec) <-
    +                      if (colType == "timestamp")
    +                        c("POSIXct", "POSIXt")
    --- End diff --
    
    Should `PRIMITIVE_TYPES[["timestamp"]]` be changed then
    https://github.com/apache/spark/blob/master/R/pkg/R/types.R#L32




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by titicaca <gi...@git.apache.org>.
Github user titicaca commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Yes. The JIRA id is SPARK-19342. Thank you for the help and advice :)




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by titicaca <gi...@git.apache.org>.
Github user titicaca commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Thanks for the reminder. I may have forgotten to mention that I am the reporter of this JIRA bug. My JIRA ID is also titicaca. Thank you! 




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    **[Test build #71982 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71982/testReport)** for PR 16689 at commit [`43e334a`](https://github.com/apache/spark/commit/43e334a1debc31ab4383589ffd4144cc9d1e7633).
     * This patch **fails R style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by titicaca <gi...@git.apache.org>.
Github user titicaca commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    I have modified the code and tests, including the existing tests at test_sparkSQL.R#1280 and test_sparkSQL.R#1282.
    
    As in local R, an NA column of the SparkDataFrame is now collected as its corresponding type instead of logical NA.
    





[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Jenkins, retest this please




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    hmm, this seems like a reasonable approach. With these changes:
    - collect on timestamp would get `c("POSIXct", "POSIXt")`
    - coltypes output will not change
    
    @shivaram what do you think?




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    yes. but please see my other comment




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    **[Test build #72250 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72250/testReport)** for PR 16689 at commit [`407c625`](https://github.com/apache/spark/commit/407c6254246be82e88098aa08838284e6861838a).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    @titicaca he means, what is your user ID on JIRA? so we can credit you. It's clear what the JIRA is.




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    great! @shivaram could you get Jenkins to test this fix please? I don't seem to have the power to command it :)





[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    **[Test build #71982 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71982/testReport)** for PR 16689 at commit [`43e334a`](https://github.com/apache/spark/commit/43e334a1debc31ab4383589ffd4144cc9d1e7633).




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71982/
    Test FAILed.




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Oh @felixcheung, I was writing a comment but I just saw yours. I was looking into this out of curiosity.
    
    Isn't this due to R's type coercion rule with POSIXct?
    
    ```r
    > str(data.frame(col1 = c(as.POSIXct("2017-01-01 12:00:01"), NA)))
    'data.frame':	2 obs. of  1 variable:
     $ col1: POSIXct, format: "2017-01-01 12:00:01" NA
    ```
    
    ```r
    > str(data.frame(col1 = c(NA, as.POSIXct("2017-01-01 12:00:01"))))
    'data.frame':	2 obs. of  1 variable:
     $ col1: num  NA 1.48e+09
    ```
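
The same order dependence shows up when the values arrive as a list and are combined with `do.call(c, ...)`, which is the call named in the PR description. The following is a minimal sketch in plain R, with no Spark session involved; the variable names are illustrative only and the timestamp is pinned to UTC just to keep the example reproducible.

```r
ts <- as.POSIXct("2017-01-01 12:00:01", tz = "UTC")

# Timestamp first: c() dispatches on its first argument and uses the
# POSIXct method, so the class is kept.
ts_first <- do.call(c, list(ts, NA))
print(class(ts_first))

# NA first: the first argument is a plain logical, so the default c()
# runs and drops the POSIXct class, leaving the underlying seconds.
na_first <- do.call(c, list(NA, ts))
print(class(na_first))
```

Both results hold the same underlying value for the non-missing element; only the class attribute differs.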
    





[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by titicaca <gi...@git.apache.org>.
Github user titicaca commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Sorry for the late reply. I figured out that the tests failed because a vector containing only NAs has type logical, so we cannot cast the type in that case. I have updated the code and added some tests for that. Thank you for the advice.
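
The all-NA case can be checked in plain R. This is only a sketch of the observation: `is_untyped_na` is a hypothetical helper introduced here for illustration and is not part of SparkR or of the patch.

```r
# A column in which every collected value is NA combines to a logical vector.
all_na <- do.call(c, list(NA, NA, NA))
print(class(all_na))

# Hypothetical guard: TRUE when the vector carries no typed value at all.
is_untyped_na <- function(vec) is.logical(vec) && all(is.na(vec))

print(is_untyped_na(all_na))
print(is_untyped_na(do.call(c, list(NA, 1.5))))
```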




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by titicaca <gi...@git.apache.org>.
Github user titicaca commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Thanks. I fixed the `coltypes` method to account for the timestamp change, and all the tests pass now.




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    @felixcheung @titicaca Just to make sure I understand, collect on timestamp was getting `c("POSIXct", "POSIXt")` even before this change ? 




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Thanks! I can verify this case and the fix. 
    Could you please add some tests for this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    **[Test build #72378 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72378/testReport)** for PR 16689 at commit [`d6d454e`](https://github.com/apache/spark/commit/d6d454ec0a587c456d5e4a964784589643aa4730).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by titicaca <gi...@git.apache.org>.
Github user titicaca commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    I tried to modify PRIMITIVE_TYPES for timestamp, but it had a side effect on the coltypes method.
    
    In test_sparkSQL.R#2262, `expect_equal(coltypes(DF), c("integer", "logical", "POSIXct"))`, coltypes returns a list instead of a vector because of the conversion from timestamp to `c("POSIXct", "POSIXt")`.
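
One way such a side effect can arise is through `sapply`-style simplification, which only produces a vector when every looked-up value has the same length. The sketch below assumes the lookup behaves that way; it does not reproduce the actual `coltypes` implementation, and `types_single`, `types_pair`, and `lookup` are hypothetical names.

```r
# Two hypothetical mappings: one class name per type, and one where
# timestamp maps to a two-element class vector.
types_single <- list(int = "integer", boolean = "logical", timestamp = "POSIXct")
types_pair <- list(int = "integer", boolean = "logical",
                   timestamp = c("POSIXct", "POSIXt"))

sql_types <- c("int", "boolean", "timestamp")

# Look up each SQL type and let sapply() simplify the result if it can.
lookup <- function(mapping) {
  sapply(sql_types, function(t) mapping[[t]], USE.NAMES = FALSE)
}

single <- lookup(types_single)  # every entry has length 1
pair <- lookup(types_pair)      # entry lengths differ

print(is.character(single))
print(is.list(pair))
```

With equal-length entries the result simplifies to a character vector, which is what the quoted `expect_equal` call compares against; with mixed lengths `sapply` leaves the result as a list.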




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    **[Test build #72182 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72182/testReport)** for PR 16689 at commit [`7903bb3`](https://github.com/apache/spark/commit/7903bb3cc44477d9f7a25971ef4487af0627d333).
     * This patch **fails R style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    **[Test build #72186 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72186/testReport)** for PR 16689 at commit [`8379c38`](https://github.com/apache/spark/commit/8379c3834fc27e3303501536181bd85372493982).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    (I might be wrong but was suspecting that it returns `NA` first as `logical` when we collect via `SerDe.scala` and then it ends up `numeric` due to the type coercion when `NA` is located first as above.)




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72182/
    Test FAILed.




[GitHub] spark pull request #16689: [SPARK-19342][SPARKR] bug fixed in collect method...

Posted by titicaca <gi...@git.apache.org>.
Github user titicaca commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16689#discussion_r98612545
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1138,6 +1138,11 @@ setMethod("collect",
                       if (!is.null(PRIMITIVE_TYPES[[colType]]) && colType != "binary") {
                         vec <- do.call(c, col)
                         stopifnot(class(vec) != "list")
    +                    class(vec) <-
    +                      if (colType == "timestamp")
    +                        c("POSIXct", "POSIXt")
    +                      else
    +                        PRIMITIVE_TYPES[[colType]]
    --- End diff --
    
    Currently all tests pass, except for the two modified tests with NA types as discussed before. The following are all the type conversions from SparkDataFrame to R data.frame, which are covered by the existing tests in test_sparkSQL.R.
    ```
    PRIMITIVE_TYPES <- as.environment(list(
      "tinyint" = "integer",
      "smallint" = "integer",
      "int" = "integer",
      "bigint" = "numeric",
      "float" = "numeric",
      "double" = "numeric",
      "decimal" = "numeric",
      "string" = "character",
      "binary" = "raw",
      "boolean" = "logical",
      "timestamp" = "POSIXct",
      "date" = "Date",
      # following types are not SQL types returned by dtypes(). They are listed here for usage
      # by checkType() in schema.R.
      # TODO: refactor checkType() in schema.R.
      "byte" = "integer",
      "integer" = "integer"
      ))
    ```
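
The cast in the quoted diff can be tried outside Spark. The following is a rough sketch, not the merged code: `primitive_types` is a hypothetical subset of the mapping above, and `combine_column` is a hypothetical stand-in for the per-column step inside `collect`.

```r
# Hypothetical subset of the SQL-type-to-R-class mapping.
primitive_types <- list(int = "integer", double = "numeric",
                        string = "character", timestamp = "POSIXct")

# Combine a list of collected values into a vector, then set the class
# explicitly so that a leading NA cannot change the resulting type.
combine_column <- function(col, col_type) {
  vec <- do.call(c, col)
  stopifnot(!is.list(vec))
  class(vec) <- if (col_type == "timestamp") {
    c("POSIXct", "POSIXt")
  } else {
    primitive_types[[col_type]]
  }
  vec
}

ts <- as.POSIXct("2017-01-01 12:00:01", tz = "UTC")
ts_col <- combine_column(list(NA, ts), "timestamp")
int_col <- combine_column(list(1L, NA, 3L), "int")

print(class(ts_col))
print(class(int_col))
```

The sketch deliberately leaves out the all-NA case discussed elsewhere in this thread.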




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72378/
    Test PASSed.




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    @titicaca do you have a JIRA id on https://issues.apache.org? We would like to assign the resolved issue to you.




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Oh, it was all written in the PR description... I removed my useless comments.




[GitHub] spark issue #16689: SPARK-19342 bug fixed in collect method for collecting t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Can one of the admins verify this patch?




[GitHub] spark issue #16689: [SPARK-19342][SPARKR] bug fixed in collect method for co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16689
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71969/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org