You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@spark.apache.org by "Antonio Piccolboni (JIRA)" <ji...@apache.org> on 2015/08/11 20:23:46 UTC

[jira] [Commented] (SPARK-8989) Support aggregations in HAVING clause

    [ https://issues.apache.org/jira/browse/SPARK-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682231#comment-14682231 ] 

Antonio Piccolboni commented on SPARK-8989:
-------------------------------------------

Test schema and data

CREATE TABLE Table1
    (key varchar(2), id int, value float)
;
    
INSERT INTO Table1
    (key, id, value)
VALUES
    ('ab', 1, 1.1),
    ('bc', 2, 2.2),
    ('df', 3, 3.3),
    ('ab', 1, 1.2),
    ('bc', 2, 2.3),
    ('df', 3, 3.4)
;

Test query

SELECT key, id, value FROM Table1 GROUP BY key HAVING max(value) = value;

Support:
mysql 5.6: needs `` quoting, runs but results incorrect
oracle 11g: can't create table, missing keyword
postgreSQL 9.3: no, id must appear in group by
SQLite (WebSQL): yes
MS SQL server: can't create table -- incorrect syntax


Tested via http://sqlfiddle.com/

I supposes we can chalk it up to some SQLite specific extension. 


> Support aggregations in HAVING clause
> -------------------------------------
>
>                 Key: SPARK-8989
>                 URL: https://issues.apache.org/jira/browse/SPARK-8989
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.4.0
>            Reporter: Antonio Piccolboni
>              Labels: aggregate, sql
>
> It seems like query of the type
> SELECT keys, id, value FROM table GROUP BY keys HAVING max(value) = value
> Are not working or supported (I selected issue type enhancement under the assumption it's the latter). The use case is very natural, for each group find the record attaining the maximum for a certain row-dependent expression, and return any selection of fields from that row together with the grouping keys. I know how to work around this with a join, but it's a considerably more involved query and therefore I thought worth asking if this is my misunderstanding, a shortcoming in the implementation of HAVING, a nice-to-have but not really on the road map or "that's crazy what kind of rotten database would accept this nonsense?" (by the way, I think PostgresSQL does; not sure how many other DBs do)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org