Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/07/31 10:46:52 UTC

[GitHub] [spark] beliefer opened a new pull request #25313: [WIP][SPARK-28580][SQL] Support ANSI SQL Unique-Predicate syntax

beliefer opened a new pull request #25313: [WIP][SPARK-28580][SQL] Support ANSI SQL Unique-Predicate syntax
URL: https://github.com/apache/spark/pull/25313
 
 
   ## What changes were proposed in this pull request?
   
   The aim of this PR is to support ANSI SQL `Unique-Predicate` syntax.
   The `Unique-Predicate` specifies a test for the absence of duplicate rows.
   The definition in the ANSI standard is below:
   ```
   <unique predicate> ::=
   UNIQUE <table subquery>
   ```
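As a rough illustration of what the predicate tests, here is a minimal Python sketch that checks a list of rows for duplicates. It assumes the usual reading of the ANSI definition, under which a row containing NULL in any column is never counted as a duplicate and an empty subquery result satisfies the predicate. The function name `unique_predicate` and the use of `None` for NULL are assumptions made for this example and are not part of Spark.

```python
from typing import Iterable, Optional, Tuple

Row = Tuple[Optional[object], ...]


def unique_predicate(rows: Iterable[Row]) -> bool:
    """Return True when no two fully non-null rows are equal.

    Assumption: a row with NULL (None) in any column never counts as a
    duplicate, and an empty input is trivially unique.
    """
    seen = set()
    for row in rows:
        if any(value is None for value in row):
            continue  # a row containing NULL cannot duplicate another row
        if row in seen:
            return False  # found two equal, fully non-null rows
        seen.add(row)
    return True


if __name__ == "__main__":
    print(unique_predicate([]))                          # empty result
    print(unique_predicate([("CS-101",), ("CS-102",)]))  # distinct rows
    print(unique_predicate([("CS-101",), ("CS-101",)]))  # duplicate rows
    print(unique_predicate([(None,), (None,)]))          # NULL-only rows
```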
   As far as I can tell, no other database supports this syntax, so I have no reference implementation to compare against.
   The usage might look like this:
   ```
   SELECT t.*
   FROM course AS t
   WHERE UNIQUE(
       SELECT r.course_id
       FROM section AS r
       WHERE t.course_id=r.course_id AND r.year = '2018'
   );
   ```
   I referred to the implementation of `Exists` in Spark SQL. The rule `RewritePredicateSubquery` replaces `Exists` with a semi join between the inner table and the outer table.
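For readers unfamiliar with the term, a left semi join keeps each outer row that has at least one match in the inner table, without repeating it once per match. The Python sketch below shows only that behaviour; the helper `left_semi_join` and the sample rows are hypothetical, and this is not how Spark implements the rule.

```python
from typing import Dict, List

Row = Dict[str, object]


def left_semi_join(outer: List[Row], inner: List[Row], key: str) -> List[Row]:
    """Keep outer rows whose key appears in the inner rows, once each."""
    inner_keys = {row[key] for row in inner}
    return [row for row in outer if row[key] in inner_keys]


if __name__ == "__main__":
    course = [{"course_id": "CS-101"}, {"course_id": "CS-102"}]
    # Two matching sections for CS-101, none for CS-102.
    section = [{"course_id": "CS-101"}, {"course_id": "CS-101"}]
    print(left_semi_join(course, section, "course_id"))
```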
   My basic idea is to add a rule that replaces `Unique-Predicate` with equivalent SQL.
   Taking the SQL above as an example, the rewritten SQL looks like this:
   ```
   SELECT T.*
   FROM course AS T
   WHERE 1 = (
     SELECT count(R.course_id)
     FROM section AS R
     WHERE T.course_id=R.course_id AND R.year = '2018'
   );
   ```
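To see what the count-based form returns, the following sketch runs it against an in-memory SQLite database from Python. The table layout and sample rows are assumptions made for this example (the PR does not define a schema), and SQLite stands in for Spark SQL only because the query uses nothing Spark-specific.

```python
import sqlite3

# The rewritten query from above, selecting the single course_id column.
REWRITTEN_QUERY = """
SELECT T.course_id
FROM course AS T
WHERE 1 = (
  SELECT count(R.course_id)
  FROM section AS R
  WHERE T.course_id = R.course_id AND R.year = '2018'
)
ORDER BY T.course_id
"""


def run_rewritten_query() -> list:
    """Build hypothetical sample tables and run the rewritten query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE course (course_id TEXT);
        CREATE TABLE section (course_id TEXT, year TEXT);
        INSERT INTO course VALUES ('CS-101'), ('CS-102'), ('CS-103');
        -- CS-101: one 2018 section; CS-102: two; CS-103: none in 2018.
        INSERT INTO section VALUES
          ('CS-101', '2018'),
          ('CS-102', '2018'), ('CS-102', '2018'),
          ('CS-103', '2017');
        """
    )
    rows = [row[0] for row in conn.execute(REWRITTEN_QUERY)]
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_rewritten_query())  # prints ['CS-101']
```

With this sample data, `CS-102` is dropped because it has two matching sections. `CS-103` is dropped as well, because its count is 0. Whether a course with no matching rows should satisfy `UNIQUE` is one of the points the rewrite would still need to settle.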
   I am not sure whether this approach will be welcomed, and it needs more thought.
   ## How was this patch tested?
   
   New unit tests.
   

