Posted to issues@spark.apache.org by "Jo Desmet (JIRA)" <ji...@apache.org> on 2015/10/01 04:46:04 UTC

[jira] [Created] (SPARK-10893) Lag Analytic function broken

Jo Desmet created SPARK-10893:
---------------------------------

             Summary: Lag Analytic function broken
                 Key: SPARK-10893
                 URL: https://issues.apache.org/jira/browse/SPARK-10893
             Project: Spark
          Issue Type: Bug
          Components: Spark Core, SQL
    Affects Versions: 1.5.0
         Environment: Spark Standalone Cluster on Linux
            Reporter: Jo Desmet


Trying to aggregate with the LAG analytic function gives the wrong result. In my test case it always returned the fixed value '103079215105' when run over an integer column.

Input JSON:
{"VAA":"A", "VBB":1}
{"VAA":"B", "VBB":-1}
{"VAA":"C", "VBB":2}
{"VAA":"d", "VBB":3}
{"VAA":null, "VBB":null}

Java:
    import static org.apache.spark.sql.functions.lag;
    import org.apache.spark.SparkConf;
    import org.apache.spark.SparkContext;
    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.expressions.Window;
    import org.apache.spark.sql.hive.HiveContext;

    SparkConf conf = new SparkConf().setAppName("lag-repro");
    SparkContext sc = new SparkContext(conf);
    HiveContext sqlContext = new HiveContext(sc);
    DataFrame df = sqlContext.read().json(getInputPath("input.json"));

    // Add a "previous" column holding the prior row's VBB, ordered by VAA.
    df = df.withColumn(
      "previous",
      lag(df.col("VBB"), 1)
        .over(Window.orderBy(df.col("VAA")))
      );
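
For comparison, the same window can also be expressed through HiveQL instead of the DataFrame API. This is only a sketch; the temporary table name "input" is made up for illustration:

    // Register the DataFrame under an arbitrary temporary table name.
    df.registerTempTable("input");
    // Same LAG ... OVER (ORDER BY ...) window, expressed in SQL.
    DataFrame viaSql = sqlContext.sql(
        "SELECT VAA, VBB, LAG(VBB, 1) OVER (ORDER BY VAA) AS previous FROM input");
    viaSql.show();

If the SQL form returns the correct lagged values while the withColumn/lag call above does not, that would point at the DataFrame/Column code path rather than the window function itself.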




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org