Posted to issues@spark.apache.org by "John Grimes (Jira)" <ji...@apache.org> on 2020/09/06 08:16:00 UTC

[jira] [Created] (SPARK-32805) Literal integer seems to get confused as column reference

John Grimes created SPARK-32805:
-----------------------------------

             Summary: Literal integer seems to get confused as column reference
                 Key: SPARK-32805
                 URL: https://issues.apache.org/jira/browse/SPARK-32805
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: John Grimes


When a literal integer is used in the group-by expression, an error occurs which seems to indicate that the integer was interpreted as a column reference.

I would expect this to succeed, resulting in a single group with a value of "2" and a count of "1".

Here is an example of a minimal program which reproduces the problem:
{code:java}
import java.util.ArrayList;
import java.util.List;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.functions;

class Scratch {

  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .master("local[*]")
        .getOrCreate();

    List<Something> somethings = new ArrayList<>();
    Something something = new Something();
    something.setFieldA("foo");
    somethings.add(something);

    Dataset<Row> dataframe = spark.createDataFrame(somethings, Something.class);
    // Grouping by a literal integer fails; the error suggests Spark treats the
    // literal as a column reference rather than a constant grouping value.
    Dataset<Row> result = dataframe.groupBy(functions.lit(2))
        .agg(functions.count("*"));
    result.collectAsList().forEach(row -> System.out.println(row.toString()));

    spark.stop();
  }

  public static class Something {
    private String fieldA;

    public Something() {
    }

    public String getFieldA() {
      return fieldA;
    }

    public void setFieldA(String fieldA) {
      this.fieldA = fieldA;
    }
  }
}
{code}
Adding .cast("int") to the end of the literal column works around the problem, but the cast seems unnecessary: the Spark API has already been told that the literal is an integer.
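For reference, the workaround looks like this (a fragment of the main method above; the SparkSession setup and the Something bean are unchanged):
{code:java}
// Workaround: explicitly casting the literal back to an integer type makes
// the groupBy succeed. The cast should be redundant, since lit(2) already
// produces an integer literal column.
Dataset<Row> result = dataframe.groupBy(functions.lit(2).cast("int"))
    .agg(functions.count("*"));
result.show();
// Expected: a single group with value 2 and count 1.
{code}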



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
