Posted to issues@spark.apache.org by "Venkata Ramana G (JIRA)" <ji...@apache.org> on 2014/11/04 20:38:34 UTC
[jira] [Commented] (SPARK-4217) Result of SparkSQL is incorrect after a table join and group by operation
[ https://issues.apache.org/jira/browse/SPARK-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196674#comment-14196674 ]
Venkata Ramana G commented on SPARK-4217:
-----------------------------------------
I have executed this on both Hive and SparkSQL.
It looks like your "Hive" result is wrong;
I got the same result from SparkSQL and Hive:
SparkSQL
[2010,210924]
[2004,3265696]
[2005,13247234]
[2006,13670416]
[2007,16711974]
[2008,14670698]
[2009,6322137]
Hive
2004 3265696
2005 13247234
2006 13670416
2007 16711974
2008 14670698
2009 6322137
2010 210924
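For anyone trying to narrow down mismatches like the one reported below, the join-and-group-by can be reproduced on toy data. This is a minimal sketch using Python's sqlite3 with made-up table contents (the table names match the script below, but all rows and values here are hypothetical). It also shows one common cause of inflated sums in such queries: a duplicated join key multiplies the matched detail rows before aggregation.

```python
import sqlite3

# In-memory database with toy versions of the three tables from the report.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE tblstock       (ordernumber TEXT, dateid INTEGER);
CREATE TABLE tblStockDetail (ordernumber TEXT, amount INTEGER);
CREATE TABLE tbldate        (dateid INTEGER, theyear INTEGER);

-- Hypothetical sample rows, purely for illustration.
INSERT INTO tblstock       VALUES ('A1', 1), ('A2', 2);
INSERT INTO tblStockDetail VALUES ('A1', 100), ('A1', 50), ('A2', 200);
INSERT INTO tbldate        VALUES (1, 2004), (2, 2005);
""")

QUERY = """
SELECT c.theyear, SUM(b.amount)
FROM tblstock a
JOIN tblStockDetail b ON a.ordernumber = b.ordernumber
JOIN tbldate c ON a.dateid = c.dateid
GROUP BY c.theyear
ORDER BY c.theyear
"""

rows = cur.execute(QUERY).fetchall()
print(rows)  # [(2004, 150), (2005, 200)]

# If tblstock accidentally contains a duplicate ordernumber, each matching
# detail row is joined twice and the 2004 sum doubles to 300.
cur.execute("INSERT INTO tblstock VALUES ('A1', 1)")
rows_dup = cur.execute(QUERY).fetchall()
print(rows_dup)  # [(2004, 300), (2005, 200)]
```

Comparing the ratio between the two engines' sums against the duplication factor of the join keys in the loaded data is a quick way to tell whether one side double-loaded or double-joined rows.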
> Result of SparkSQL is incorrect after a table join and group by operation
> -------------------------------------------------------------------------
>
> Key: SPARK-4217
> URL: https://issues.apache.org/jira/browse/SPARK-4217
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.1.0
> Environment: Hadoop 2.2.0
> Spark1.1
> Reporter: peter.zhang
> Priority: Critical
> Attachments: TestScript.sql, saledata.zip
>
>
> I ran a test using the same SQL script in the SparkSQL, Shark, and Hive environments, as below:
> ---------------------------------------------------------------
> select c.theyear, sum(b.amount)
> from tblstock a
> join tblStockDetail b on a.ordernumber = b.ordernumber
> join tbldate c on a.dateid = c.dateid
> group by c.theyear;
> result of hive/shark:
> theyear _c1
> 2004 1403018
> 2005 5557850
> 2006 7203061
> 2007 11300432
> 2008 12109328
> 2009 5365447
> 2010 188944
> result of SparkSQL:
> 2010 210924
> 2004 3265696
> 2005 13247234
> 2006 13670416
> 2007 16711974
> 2008 14670698
> 2009 6322137
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org