Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2020/12/17 11:41:00 UTC

[jira] [Updated] (SPARK-26199) Long expressions cause mutate to fail

     [ https://issues.apache.org/jira/browse/SPARK-26199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-26199:
---------------------------------
    Fix Version/s: 3.1.0

> Long expressions cause mutate to fail
> -------------------------------------
>
>                 Key: SPARK-26199
>                 URL: https://issues.apache.org/jira/browse/SPARK-26199
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 2.2.0
>            Reporter: João Rafael
>            Assignee: Michael Chirico
>            Priority: Minor
>             Fix For: 3.1.0, 3.2.0
>
>
> Calling {{mutate(df, field = expr)}} fails when {{expr}} is a very long expression.
> Example:
> {code:R}
> df <- mutate(df, field = ifelse(
>     lit(TRUE),
>     lit("A"),
>     ifelse(
>         lit(T),
>         lit("BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"),
>         lit("C")
>     )
> ))
> {code}
> Stack trace:
> {code:R}
> FATAL subscript out of bounds
>   at .handleSimpleError(function (obj) 
> {
>     level = sapply(class(obj), sw
>   at FUN(X[[i]], ...)
>   at lapply(seq_along(args), function(i) {
>     if (ns[[i]] != "") {
> at lapply(seq_along(args), function(i) {
>     if (ns[[i]] != "") {
> at mutate(df, field = ifelse(lit(TRUE), lit("A"), ifelse(lit(T), lit("BBB
>   at #78: mutate(df, field = ifelse(lit(TRUE), lit("A"), ifelse(lit(T
> {code}
> The root cause is in: [DataFrame.R#L2182|https://github.com/apache/spark/blob/master/R/pkg/R/DataFrame.R#L2182]
> When the expression is long, {{deparse}} returns a character vector with more than one element (one per output line), so {{args}} ends up with more elements than {{ns}} and the lookup {{ns[[i]]}} fails with "subscript out of bounds". Possible fixes are to set {{nlines = 1}} or to collapse the lines into a single string.
> A simple workaround is to assign the expression to a variable first and pass that variable to {{mutate}}:
> {code:R}
> tmp <- ifelse(
>     lit(TRUE),
>     lit("A"),
>     ifelse(
>         lit(T),
>         lit("BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"),
>         lit("C")
>     )
> )
> df <- mutate(df, field = tmp)
> {code}
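
The length mismatch described above can be reproduced in base R without a Spark session. The sketch below is only an illustration: {{capture_args}} is a hypothetical helper that mimics capturing unevaluated arguments and deparsing them, and it is not the SparkR source. Because the arguments are never evaluated, {{lit}} and {{ifelse}} do not need to come from SparkR here.

```r
# Hypothetical helper (not the SparkR implementation): capture the
# unevaluated arguments and deparse them in two different ways.
capture_args <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]  # drop the leading `list` symbol
  list(
    ns = names(exprs),  # one name per argument
    # Deparsing each argument and flattening: a long expression yields
    # several strings, so this vector can be longer than `ns`.
    per_line = unlist(lapply(exprs, deparse)),
    # Collapsing the deparsed lines keeps exactly one string per argument.
    collapsed = vapply(
      exprs,
      function(e) paste(deparse(e), collapse = ""),
      character(1)
    )
  )
}

# Same expression as in the report; it is captured, never evaluated.
res <- capture_args(field = ifelse(
    lit(TRUE),
    lit("A"),
    ifelse(
        lit(T),
        lit("BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"),
        lit("C")
    )
))

length(res$ns)         # number of named arguments
length(res$per_line)   # more entries than arguments for this long expression
length(res$collapsed)  # one entry per argument again
```

Collapsing keeps the whole expression text, whereas {{deparse(e, nlines = 1)}} returns only the first line of output, so the two suggested fixes differ in what they preserve.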


