Posted to issues@spark.apache.org by "Jo Desmet (JIRA)" <ji...@apache.org> on 2015/10/01 04:46:04 UTC
[jira] [Created] (SPARK-10893) Lag Analytic function broken
Jo Desmet created SPARK-10893:
---------------------------------
Summary: Lag Analytic function broken
Key: SPARK-10893
URL: https://issues.apache.org/jira/browse/SPARK-10893
Project: Spark
Issue Type: Bug
Components: Spark Core, SQL
Affects Versions: 1.5.0
Environment: Spark Standalone Cluster on Linux
Reporter: Jo Desmet
Aggregating with the LAG analytic function gives the wrong result. In my test case it always returned the fixed value '103079215105' when run on an integer column.
Input JSON:
{"VAA":"A", "VBB":1}
{"VAA":"B", "VBB":-1}
{"VAA":"C", "VBB":2}
{"VAA":"d", "VBB":3}
{"VAA":null, "VBB":null}
Java:
import org.apache.spark.SparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.expressions.Window;
import org.apache.spark.sql.hive.HiveContext;
import static org.apache.spark.sql.functions.lag;

SparkContext sc = new SparkContext(conf);
HiveContext sqlContext = new HiveContext(sc);
DataFrame df = sqlContext.read().json(getInputPath("input.json"));
// lag(VBB, 1) over rows ordered by VAA; the column references must use
// df (the original snippet mixed in an undefined variable dataFrame).
df = df.withColumn(
    "previous",
    lag(df.col("VBB"), 1).over(Window.orderBy(df.col("VAA")))
);
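For reference, here is a minimal plain-Java sketch (independent of Spark, with a hypothetical helper lag1) of what LAG(VBB, 1) should produce: each row receives the previous row's value, and the first row gets null.

```java
import java.util.Arrays;

public class LagSketch {
    // lag1 is a hypothetical helper illustrating LAG(value, 1) semantics:
    // shift every element down by one position; the first slot has no
    // predecessor and therefore stays null.
    static Integer[] lag1(Integer[] values) {
        Integer[] out = new Integer[values.length];
        for (int i = 1; i < values.length; i++) {
            out[i] = values[i - 1]; // out[0] remains null
        }
        return out;
    }

    public static void main(String[] args) {
        // VBB values of the sample rows taken in VAA order: A=1, B=-1, C=2, d=3
        Integer[] vbb = {1, -1, 2, 3};
        System.out.println(Arrays.toString(lag1(vbb)));
        // expected: [null, 1, -1, 2] -- not the constant 103079215105
    }
}
```

This is only an illustration of the expected per-row shift; the bug report is about Spark 1.5.0 returning a constant value instead.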
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)