You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@asterixdb.apache.org by "Yingyi Bu (Code Review)" <do...@asterixdb.incubator.apache.org> on 2016/11/01 17:19:41 UTC

Change in asterixdb[master]: Address Don's comments for the query section doc.

Yingyi Bu has uploaded a new change for review.

  https://asterix-gerrit.ics.uci.edu/1329

Change subject: Address Don's comments for the query section doc.
......................................................................

Address Don's comments for the query section doc.

Change-Id: I30ab472d87221d6b1c709f24c4dba1bdfaa2c0b4
---
M asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
1 file changed, 142 insertions(+), 48 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/29/1329/1

diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md b/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
index 85787e8..dc760ba 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
@@ -33,13 +33,13 @@
     Subquery           ::= "(" SelectStatement ")"
 
     SelectBlock        ::= SelectClause
-                           ( FromClause ( WithClause )?)?
+                           ( FromClause ( LetClause )?)?
                            ( WhereClause )?
                            ( GroupbyClause ( LetClause )? ( HavingClause )? )?
                            |
-                           FromClause ( WithClause )?
+                           FromClause ( LetClause )?
                            ( WhereClause )?
-                           ( GroupbyClause ( WithClause )? ( HavingClause )? )?
+                           ( GroupbyClause ( LetClause )? ( HavingClause )? )?
                            SelectClause
 
     SelectClause       ::= <SELECT> ( <ALL> | <DISTINCT> )? ( SelectRegular | SelectValue )
@@ -48,10 +48,10 @@
     Projection         ::= ( Expression ( <AS> )? Identifier | "*" )
 
     FromClause         ::= <FROM> FromTerm ( "," FromTerm )*
-    FromTerm           ::= Expression (( <AS> )? Variable)? ( <AT> Variable )?
+    FromTerm           ::= Expression (( <AS> )? Variable)?
                            ( ( JoinType )? ( JoinClause | UnnestClause ) )*
 
-    JoinClause         ::= <JOIN> Expression (( <AS> )? Variable)? (<AT> Variable)? <ON> Expression
+    JoinClause         ::= <JOIN> Expression (( <AS> )? Variable)? <ON> Expression
     UnnestClause       ::= ( <UNNEST> | <CORRELATE> | <FLATTEN> ) Expression
                            ( <AS> )? Variable ( <AT> Variable )?
     JoinType           ::= ( <INNER> | <LEFT> ( <OUTER> )? )
@@ -74,13 +74,13 @@
 
 In this section, we will make use of two stored collections of objects (datasets), `GleambookUsers` and `GleambookMessages`, in a series of running examples to explain `SELECT` queries. The contents of the example collections are as follows:
 
-`GleambookUsers` collection:
+`GleambookUsers` collection (or, dataset):
 
     {"id":1,"alias":"Margarita","name":"MargaritaStoddard","nickname":"Mags","userSince":"2012-08-20T10:10:00","friendIds":[2,3,6,10],"employment":[{"organizationName":"Codetechno","start-date":"2006-08-06"},{"organizationName":"geomedia","start-date":"2010-06-17","end-date":"2010-01-26"}],"gender":"F"}
     {"id":2,"alias":"Isbel","name":"IsbelDull","nickname":"Izzy","userSince":"2011-01-22T10:10:00","friendIds":[1,4],"employment":[{"organizationName":"Hexviafind","startDate":"2010-04-27"}]}
     {"id":3,"alias":"Emory","name":"EmoryUnk","userSince":"2012-07-10T10:10:00","friendIds":[1,5,8,9],"employment":[{"organizationName":"geomedia","startDate":"2010-06-17","endDate":"2010-01-26"}]}
 
-`GleambookMessages` collection:
+`GleambookMessages` collection (or, dataset):
 
     {"messageId":2,"authorId":1,"inResponseTo":4,"senderLocation":[41.66,80.87],"message":" dislike iphone its touch-screen is horrible"}
     {"messageId":3,"authorId":2,"inResponseTo":4,"senderLocation":[48.09,81.01],"message":" like samsung the plan is amazing"}
@@ -136,6 +136,7 @@
 In SQL++, the traditional SQL-style `SELECT` syntax is also supported.
 This syntax can also be reformulated in a `SELECT VALUE` based manner in SQL++.
 (E.g., `SELECT expA AS fldA, expB AS fldB` is syntactic sugar for `SELECT VALUE { 'fldA': expA, 'fldB': expB }`.)
+Unlike in SQL, the result of an SQL++ query does not preserve the order of expressions in the `SELECT` clause.
 
 ##### Example
     SELECT user.alias user_alias, user.name user_name
@@ -150,7 +151,9 @@
     } ]
 
 ### <a id="Select_star">SELECT *</a>
-In SQL++, `SELECT *` returns a object with a nested field for each input tuple. Each field has as its field name the name of a binding variable generated by either the `FROM` clause or `GROUP BY` clause in the current enclosing `SELECT` statement, and its field is the value of that binding variable.
+In SQL++, `SELECT *` returns a object with a nested field for each input tuple.
+Each field has as its field name the name of a binding variable generated by either the `FROM` clause or `GROUP BY`
+clause in the current enclosing `SELECT` statement, and its field value is the value of that binding variable.
 
 ##### Example
 
@@ -224,6 +227,75 @@
             ]
         }
     } ]
+
+
+##### Example
+
+    SELECT *
+    FROM GleambookUsers u, GleambookMessages m
+    WHERE m.authorId = u.id and u.id = 2;
+
+This query does an inner join that we will discuss in [multiple from terms](#Multiple_from_terms).
+Since both `u` and `m` are binding variable generated in the `FROM` clause, this query returns:
+
+    [ {
+        "u": {
+            "userSince": "2011-01-22T10:10:00",
+            "friendIds": [
+                1,
+                4
+            ],
+            "name": "IsbelDull",
+            "nickname": "Izzy",
+            "alias": "Isbel",
+            "id": 2,
+            "employment": [
+                {
+                    "organizationName": "Hexviafind",
+                    "startDate": "2010-04-27"
+                }
+            ]
+        },
+        "m": {
+            "senderLocation": [
+                31.5,
+                75.56
+            ],
+            "inResponseTo": 1,
+            "messageId": 6,
+            "authorId": 2,
+            "message": " like t-mobile its platform is mind-blowing"
+        }
+    }, {
+        "u": {
+            "userSince": "2011-01-22T10:10:00",
+            "friendIds": [
+                1,
+                4
+            ],
+            "name": "IsbelDull",
+            "nickname": "Izzy",
+            "alias": "Isbel",
+            "id": 2,
+            "employment": [
+                {
+                    "organizationName": "Hexviafind",
+                    "startDate": "2010-04-27"
+                }
+            ]
+        },
+        "m": {
+            "senderLocation": [
+                48.09,
+                81.01
+            ],
+            "inResponseTo": 4,
+            "messageId": 3,
+            "authorId": 2,
+            "message": " like samsung the plan is amazing"
+        }
+    } ]
+
 
 ### <a id="Select_distinct">SELECT DISTINCT</a>
 SQL++'s `DISTINCT` keyword is used to eliminate duplicate items in results. The following example shows how it works.
@@ -388,6 +460,7 @@
 
 ### <a id="Binding_expressions">Binding expressions</a>
 In SQL++, in addition to stored collections, a `FROM` clause can iterate over any intermediate collection returned by a valid SQL++ expression.
+In the tuple stream generated by a `FROM` clause, the ordering of the input tuples are not guaranteed to be preserved.
 
 ##### Example
 
@@ -437,7 +510,8 @@
 Similar to standard SQL, SQL++ supports implicit `FROM` binding variables (i.e., aliases), for which a binding variable is generated. SQL++ variable generation falls into three cases:
 
   * If the binding expression is a variable reference expression, the generated variable's name will be the name of the referenced variable itself.
-  * If the binding expression is a field access expression, the generated variable's name will be the last identifier in the expression.
+  * If the binding expression is a field access expression (or a fully qualified name for a dataset), the generated
+    variable's name will be the last identifier (or the dataset name) in the expression.
   * For all other cases, a compilation error will be raised.
 
 The next two examples show queries that do not provide binding variables in their `FROM` clauses.
@@ -551,7 +625,7 @@
 
 ### <a id="Group_variables">Group variables</a>
 In a `GROUP BY` clause, in addition to the binding variable(s) defined for the grouping key(s), SQL++ allows a user to define a *group variable* by using the clause's `GROUP AS` extension to denote the resulting group.
-After grouping, then, the query's in-scope variables include the grouping key's binding variables as well as this group variable which will be bound to one collection value for each group. This per-group collection value will be a set of nested objects in which each field of the object is the result of a renamed variable defined in parentheses following the group variable's name. The `GROUP AS` syntax is as follows:
+After grouping, then, the query's in-scope variables include the grouping key's binding variables as well as this group variable which will be bound to one collection value for each group. This per-group collection (i.e., multiset) value will be a set of nested objects in which each field of the object is the result of a renamed variable defined in parentheses following the group variable's name. The `GROUP AS` syntax is as follows:
 
     <GROUP> <AS> Variable ("(" Variable <AS> VariableReference ("," Variable <AS> VariableReference )* ")")?
 
@@ -839,13 +913,13 @@
 ##### Example
 
     SELECT authorId,
-           (SELECT VALUE m.msg
-            FROM msgs m
-            WHERE m.msg.message LIKE '% like%'
-            ORDER BY m.msg.messageId
-            LIMIT 2) AS msgs
-    FROM GleambookMessages message
-    GROUP BY message.authorId GROUP AS msgs(message AS msg);
+       (SELECT VALUE g.msg
+        FROM g
+        WHERE g.msg.message LIKE '% like%'
+        ORDER BY g.msg.messageId
+        LIMIT 2) AS msgs
+    FROM GleambookMessages gbm
+    GROUP BY gbm.authorId GROUP AS g(gbm as msg);
 
 This query returns:
 
@@ -936,13 +1010,13 @@
 legal, executable, and returns the same result:
 
     SELECT uid,
-           (SELECT m.message
-            FROM (SELECT VALUE grp.message FROM `$1` AS grp) AS m
-            WHERE m.message LIKE '% like%'
-            ORDER BY m.messageId
-            LIMIT 2) AS msgs
-    FROM GleambookMessages message
-    GROUP BY message.authorId AS uid GROUP AS `$1` (message AS message);
+       (SELECT g.msg.message
+        FROM g
+        WHERE g.msg.message LIKE '% like%'
+        ORDER BY g.msg.messageId
+        LIMIT 2) AS msgs
+    FROM GleambookMessages gbm
+    GROUP BY gbm.authorId AS uid GROUP AS g(gbm as msg);
 
 ### <a id="Aggregation_functions">Aggregation functions</a>
 In traditional SQL, which doesn't support nested data, grouping always also involves the use of aggregation
@@ -977,7 +1051,7 @@
 
     ARRAY_AVG(
         (
-          SELECT VALUE len(friendIds) FROM GleambookUsers
+          SELECT VALUE ARRAY_COUNT(friendIds) FROM GleambookUsers
         )
     );
 
@@ -1015,7 +1089,7 @@
 
 ##### Example
 
-    SELECT uid, COUNT(msg) AS msgCnt
+    SELECT uid, COUNT(*) AS msgCnt
     FROM GleambookMessages msg
     GROUP BY msg.authorId AS uid;
 
@@ -1023,7 +1097,7 @@
 Rather, the `COUNT` query above is using a special "sugared" function symbol that the SQL++ compiler
 will rewrite as follows:
 
-    SELECT uid AS uid, ARRAY_COUNT( (SELECT g.msg FROM `$1` as g) ) AS msgCnt
+    SELECT uid AS uid, ARRAY_COUNT( (SELECT VALUE 1 FROM `$1` as g) ) AS msgCnt
     FROM GleambookMessages msg
     GROUP BY msg.authorId AS uid GROUP AS `$1`(msg AS msg);
 
@@ -1090,13 +1164,13 @@
 The `ORDER BY` clause is used to globally sort data in either ascending order (i.e., `ASC`) or descending order (i.e., `DESC`).
 During ordering, `MISSING` and `NULL` are treated as being smaller than any other value if they are encountered
 in the ordering key(s). `MISSING` is treated as smaller than `NULL` if both occur in the data being sorted.
-The following example returns all `GleambookUsers` ordered by their friend numbers.
+The following example returns all `GleambookUsers` in descending order by their number of friends.
 
 ##### Example
 
       SELECT VALUE user
       FROM GleambookUsers AS user
-      ORDER BY len(user.friendIds) DESC;
+      ORDER BY ARRAY_COUNT(user.friendIds) DESC;
 
 This query returns:
 
@@ -1206,12 +1280,12 @@
 ##### Example
 
     WITH avgFriendCount AS (
-      SELECT VALUE AVG(LEN(user.friendIds))
+      SELECT VALUE AVG(ARRAY_COUNT(user.friendIds))
       FROM GleambookUsers AS user
     )[0]
     SELECT VALUE user
     FROM GleambookUsers user
-    WHERE LEN(user.friendIds) > avgFriendCount;
+    WHERE ARRAY_COUNT(user.friendIds) > avgFriendCount;
 
 This query returns:
 
@@ -1263,8 +1337,8 @@
 
     SELECT *
     FROM GleambookUsers user
-    WHERE LEN(user.friendIds) >
-        ( SELECT VALUE AVG(LEN(user.friendIds))
+    WHERE ARRAY_COUNT(user.friendIds) >
+        ( SELECT VALUE AVG(ARRAY_COUNT(user.friendIds))
           FROM GleambookUsers AS user
         ) [0];
 
@@ -1389,7 +1463,11 @@
                  );
 
 ## <a id="Union_all">UNION ALL</a>
-UNION ALL can be used to combine two input streams into one. As in SQL, there is no ordering guarantee on the contents of the output stream. However, unlike SQL, SQL++ does not constrain what the data looks like on the input streams; in particular, it allows heterogenity on the input and output streams. The following odd but legal query is an example:
+UNION ALL can be used to combine two input streams into one. As in SQL, there is no ordering guarantee on the contents of the output stream.
+However, unlike SQL, SQL++ does not constrain what the data looks like on the input streams; in particular,
+it allows heterogenity on the input and output streams.
+A type error will be raised if one of the inputs is not a collection.
+The following odd but legal query is an example:
 
 ##### Example
 
@@ -1456,20 +1534,7 @@
 within a query the subquery occurs -- and again, its result is never automatically cast into a scalar.
 
 ## <a id="Vs_SQL-92">SQL++ vs. SQL-92</a>
-The following matrix is a quick "SQL-92 compatibility cheat sheet" for SQL++.
-
-| Feature |  SQL++ | SQL-92 |
-|----------|--------|--------|
-| SELECT * | Returns nested objects | Returns flattened concatenated objects |
-| Subquery | Returns a collection  | The returned collection is cast into a scalar value if the subquery appears in a SELECT list or on one side of a comparison or as input to a function |
-| LEFT OUTER JOIN |  Fills in `MISSING`(s) for non-matches  |   Fills in `NULL`(s) for non-matches    |
-| UNION ALL       | Allows heterogeneous inputs and output | Input streams must be UNION-compatible and output field names are drawn from the first input stream
-| IN constant_expr | The constant expression has to be an array or multiset, i.e., [..,..,...] | The constant collection can be represented as comma-separated items in a paren pair |
-| String literal | Double quotes or single quotes | Single quotes only |
-| Delimited identifiers | Backticks | Double quotes |
-
-For things beyond this cheat sheet, SQL++ is SQL-92 compliant.
-Morever, SQL++ offers the following additional features beyond SQL-92 (hence the "++" in its name):
+SQL++ offers the following additional features beyond SQL-92 (hence the "++" in its name):
 
   * Fully composable and functional: A subquery can iterate over any intermediate collection and can appear anywhere in a query.
   * Schema-free: The query language does not assume the existence of a static schema for any data that it processes.
@@ -1477,3 +1542,32 @@
   * Powerful GROUP BY: In addition to a set of aggregate functions as in standard SQL, the groups created by the `GROUP BY` clause are directly usable in nested queries and/or to obtain nested results.
   * Generalized SELECT clause: A SELECT clause can return any type of collection, while in SQL-92, a `SELECT` clause has to return a (homogeneous) collection of objects.
 
+
+The following matrix is a quick "SQL-92 compatibility cheat sheet" for SQL++.
+
+| Feature |  SQL++ | SQL-92 |  Why different?  |
+|----------|--------|-------|------------------|
+| SELECT * | Returns nested objects | Returns flattened concatenated objects | Nested collections are 1st class citizens |
+| Subquery | Returns a collection  | The returned collection is cast into a scalar value if the subquery appears in a SELECT list or on one side of a comparison or as input to a function | Nested collections are 1st class citizens |
+| LEFT OUTER JOIN |  Fills in `MISSING`(s) for non-matches  |   Fills in `NULL`(s) for non-matches    | "Absence" is more appropriate than "unknown" here.  |
+| UNION ALL       | Allows heterogeneous inputs and output | Input streams must be UNION-compatible and output field names are drawn from the first input stream | Heterogenity and nested collections are common |
+| IN constant_expr | The constant expression has to be an array or multiset, i.e., [..,..,...] | The constant collection can be represented as comma-separated items in a paren pair | Nested collections are 1st class citizens |
+| String literal | Double quotes or single quotes | Single quotes only | Double quoted strings are pervasive |
+| Delimited identifiers | Backticks | Double quotes | Double quoted strings are pervasive |
+
+The following SQL-92 features are not implemented yet. However, SQL++ does not conflict those features:
+
+  * CROSS JOIN, NATURAL JOIN, UNION JOIN
+  * RIGHT and FULL OUTER JOIN
+  * INTERSECT, EXCEPT, UNION with set semantics
+  * CAST expression
+  * NULLIF expression
+  * COALESCE expression
+  * ALL and SOME predicates for linking to subqueries
+  * UNIQUE predicate (tests a collection for duplicates)
+  * MATCH predicate (tests for referential integrity)
+  * Row and Table constructors
+  * DISTINCT aggregates
+  * Preserved order for expressions in a SELECT list
+
+

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1329
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I30ab472d87221d6b1c709f24c4dba1bdfaa2c0b4
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Yingyi Bu <bu...@gmail.com>

Change in asterixdb[master]: Address Don's comments for the query section doc.

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Address Don's comments for the query section doc.
......................................................................


Patch Set 1:

Integration Tests Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/1070/

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1329
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I30ab472d87221d6b1c709f24c4dba1bdfaa2c0b4
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: Address Don's comments for the query section doc.

Posted by "Till Westmann (Code Review)" <do...@asterixdb.incubator.apache.org>.
Till Westmann has posted comments on this change.

Change subject: Address Don's comments for the query section doc.
......................................................................


Patch Set 1: Code-Review+2

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1329
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I30ab472d87221d6b1c709f24c4dba1bdfaa2c0b4
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-HasComments: No

Change in asterixdb[master]: Address Don's comments for the query section doc.

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Address Don's comments for the query section doc.
......................................................................


Patch Set 1: Integration-Tests+1

Integration Tests Successful

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/1070/ : SUCCESS

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1329
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I30ab472d87221d6b1c709f24c4dba1bdfaa2c0b4
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: Address Don's comments for the query section doc.

Posted by "Jenkins (Code Review)" <do...@asterixdb.incubator.apache.org>.
Jenkins has posted comments on this change.

Change subject: Address Don's comments for the query section doc.
......................................................................


Patch Set 1:

Build Started https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/3219/

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1329
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I30ab472d87221d6b1c709f24c4dba1bdfaa2c0b4
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-HasComments: No

Change in asterixdb[master]: Address Don's comments for the query section doc.

Posted by "Yingyi Bu (Code Review)" <do...@asterixdb.incubator.apache.org>.
Yingyi Bu has submitted this change and it was merged.

Change subject: Address Don's comments for the query section doc.
......................................................................


Address Don's comments for the query section doc.

Change-Id: I30ab472d87221d6b1c709f24c4dba1bdfaa2c0b4
Reviewed-on: https://asterix-gerrit.ics.uci.edu/1329
Sonar-Qube: Jenkins <je...@fulliautomatix.ics.uci.edu>
Tested-by: Jenkins <je...@fulliautomatix.ics.uci.edu>
Integration-Tests: Jenkins <je...@fulliautomatix.ics.uci.edu>
Reviewed-by: Till Westmann <ti...@apache.org>
---
M asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
1 file changed, 142 insertions(+), 48 deletions(-)

Approvals:
  Till Westmann: Looks good to me, approved
  Jenkins: Verified; No violations found; Verified



diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md b/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
index 85787e8..dc760ba 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
@@ -33,13 +33,13 @@
     Subquery           ::= "(" SelectStatement ")"
 
     SelectBlock        ::= SelectClause
-                           ( FromClause ( WithClause )?)?
+                           ( FromClause ( LetClause )?)?
                            ( WhereClause )?
                            ( GroupbyClause ( LetClause )? ( HavingClause )? )?
                            |
-                           FromClause ( WithClause )?
+                           FromClause ( LetClause )?
                            ( WhereClause )?
-                           ( GroupbyClause ( WithClause )? ( HavingClause )? )?
+                           ( GroupbyClause ( LetClause )? ( HavingClause )? )?
                            SelectClause
 
     SelectClause       ::= <SELECT> ( <ALL> | <DISTINCT> )? ( SelectRegular | SelectValue )
@@ -48,10 +48,10 @@
     Projection         ::= ( Expression ( <AS> )? Identifier | "*" )
 
     FromClause         ::= <FROM> FromTerm ( "," FromTerm )*
-    FromTerm           ::= Expression (( <AS> )? Variable)? ( <AT> Variable )?
+    FromTerm           ::= Expression (( <AS> )? Variable)?
                            ( ( JoinType )? ( JoinClause | UnnestClause ) )*
 
-    JoinClause         ::= <JOIN> Expression (( <AS> )? Variable)? (<AT> Variable)? <ON> Expression
+    JoinClause         ::= <JOIN> Expression (( <AS> )? Variable)? <ON> Expression
     UnnestClause       ::= ( <UNNEST> | <CORRELATE> | <FLATTEN> ) Expression
                            ( <AS> )? Variable ( <AT> Variable )?
     JoinType           ::= ( <INNER> | <LEFT> ( <OUTER> )? )
@@ -74,13 +74,13 @@
 
 In this section, we will make use of two stored collections of objects (datasets), `GleambookUsers` and `GleambookMessages`, in a series of running examples to explain `SELECT` queries. The contents of the example collections are as follows:
 
-`GleambookUsers` collection:
+`GleambookUsers` collection (or, dataset):
 
     {"id":1,"alias":"Margarita","name":"MargaritaStoddard","nickname":"Mags","userSince":"2012-08-20T10:10:00","friendIds":[2,3,6,10],"employment":[{"organizationName":"Codetechno","start-date":"2006-08-06"},{"organizationName":"geomedia","start-date":"2010-06-17","end-date":"2010-01-26"}],"gender":"F"}
     {"id":2,"alias":"Isbel","name":"IsbelDull","nickname":"Izzy","userSince":"2011-01-22T10:10:00","friendIds":[1,4],"employment":[{"organizationName":"Hexviafind","startDate":"2010-04-27"}]}
     {"id":3,"alias":"Emory","name":"EmoryUnk","userSince":"2012-07-10T10:10:00","friendIds":[1,5,8,9],"employment":[{"organizationName":"geomedia","startDate":"2010-06-17","endDate":"2010-01-26"}]}
 
-`GleambookMessages` collection:
+`GleambookMessages` collection (or, dataset):
 
     {"messageId":2,"authorId":1,"inResponseTo":4,"senderLocation":[41.66,80.87],"message":" dislike iphone its touch-screen is horrible"}
     {"messageId":3,"authorId":2,"inResponseTo":4,"senderLocation":[48.09,81.01],"message":" like samsung the plan is amazing"}
@@ -136,6 +136,7 @@
 In SQL++, the traditional SQL-style `SELECT` syntax is also supported.
 This syntax can also be reformulated in a `SELECT VALUE` based manner in SQL++.
 (E.g., `SELECT expA AS fldA, expB AS fldB` is syntactic sugar for `SELECT VALUE { 'fldA': expA, 'fldB': expB }`.)
+Unlike in SQL, the result of an SQL++ query does not preserve the order of expressions in the `SELECT` clause.
 
 ##### Example
     SELECT user.alias user_alias, user.name user_name
@@ -150,7 +151,9 @@
     } ]
 
 ### <a id="Select_star">SELECT *</a>
-In SQL++, `SELECT *` returns a object with a nested field for each input tuple. Each field has as its field name the name of a binding variable generated by either the `FROM` clause or `GROUP BY` clause in the current enclosing `SELECT` statement, and its field is the value of that binding variable.
+In SQL++, `SELECT *` returns a object with a nested field for each input tuple.
+Each field has as its field name the name of a binding variable generated by either the `FROM` clause or `GROUP BY`
+clause in the current enclosing `SELECT` statement, and its field value is the value of that binding variable.
 
 ##### Example
 
@@ -224,6 +227,75 @@
             ]
         }
     } ]
+
+
+##### Example
+
+    SELECT *
+    FROM GleambookUsers u, GleambookMessages m
+    WHERE m.authorId = u.id and u.id = 2;
+
+This query does an inner join that we will discuss in [multiple from terms](#Multiple_from_terms).
+Since both `u` and `m` are binding variable generated in the `FROM` clause, this query returns:
+
+    [ {
+        "u": {
+            "userSince": "2011-01-22T10:10:00",
+            "friendIds": [
+                1,
+                4
+            ],
+            "name": "IsbelDull",
+            "nickname": "Izzy",
+            "alias": "Isbel",
+            "id": 2,
+            "employment": [
+                {
+                    "organizationName": "Hexviafind",
+                    "startDate": "2010-04-27"
+                }
+            ]
+        },
+        "m": {
+            "senderLocation": [
+                31.5,
+                75.56
+            ],
+            "inResponseTo": 1,
+            "messageId": 6,
+            "authorId": 2,
+            "message": " like t-mobile its platform is mind-blowing"
+        }
+    }, {
+        "u": {
+            "userSince": "2011-01-22T10:10:00",
+            "friendIds": [
+                1,
+                4
+            ],
+            "name": "IsbelDull",
+            "nickname": "Izzy",
+            "alias": "Isbel",
+            "id": 2,
+            "employment": [
+                {
+                    "organizationName": "Hexviafind",
+                    "startDate": "2010-04-27"
+                }
+            ]
+        },
+        "m": {
+            "senderLocation": [
+                48.09,
+                81.01
+            ],
+            "inResponseTo": 4,
+            "messageId": 3,
+            "authorId": 2,
+            "message": " like samsung the plan is amazing"
+        }
+    } ]
+
 
 ### <a id="Select_distinct">SELECT DISTINCT</a>
 SQL++'s `DISTINCT` keyword is used to eliminate duplicate items in results. The following example shows how it works.
@@ -388,6 +460,7 @@
 
 ### <a id="Binding_expressions">Binding expressions</a>
 In SQL++, in addition to stored collections, a `FROM` clause can iterate over any intermediate collection returned by a valid SQL++ expression.
+In the tuple stream generated by a `FROM` clause, the ordering of the input tuples are not guaranteed to be preserved.
 
 ##### Example
 
@@ -437,7 +510,8 @@
 Similar to standard SQL, SQL++ supports implicit `FROM` binding variables (i.e., aliases), for which a binding variable is generated. SQL++ variable generation falls into three cases:
 
   * If the binding expression is a variable reference expression, the generated variable's name will be the name of the referenced variable itself.
-  * If the binding expression is a field access expression, the generated variable's name will be the last identifier in the expression.
+  * If the binding expression is a field access expression (or a fully qualified name for a dataset), the generated
+    variable's name will be the last identifier (or the dataset name) in the expression.
   * For all other cases, a compilation error will be raised.
 
 The next two examples show queries that do not provide binding variables in their `FROM` clauses.
@@ -551,7 +625,7 @@
 
 ### <a id="Group_variables">Group variables</a>
 In a `GROUP BY` clause, in addition to the binding variable(s) defined for the grouping key(s), SQL++ allows a user to define a *group variable* by using the clause's `GROUP AS` extension to denote the resulting group.
-After grouping, then, the query's in-scope variables include the grouping key's binding variables as well as this group variable which will be bound to one collection value for each group. This per-group collection value will be a set of nested objects in which each field of the object is the result of a renamed variable defined in parentheses following the group variable's name. The `GROUP AS` syntax is as follows:
+After grouping, then, the query's in-scope variables include the grouping key's binding variables as well as this group variable which will be bound to one collection value for each group. This per-group collection (i.e., multiset) value will be a set of nested objects in which each field of the object is the result of a renamed variable defined in parentheses following the group variable's name. The `GROUP AS` syntax is as follows:
 
     <GROUP> <AS> Variable ("(" Variable <AS> VariableReference ("," Variable <AS> VariableReference )* ")")?
 
@@ -839,13 +913,13 @@
 ##### Example
 
     SELECT authorId,
-           (SELECT VALUE m.msg
-            FROM msgs m
-            WHERE m.msg.message LIKE '% like%'
-            ORDER BY m.msg.messageId
-            LIMIT 2) AS msgs
-    FROM GleambookMessages message
-    GROUP BY message.authorId GROUP AS msgs(message AS msg);
+       (SELECT VALUE g.msg
+        FROM g
+        WHERE g.msg.message LIKE '% like%'
+        ORDER BY g.msg.messageId
+        LIMIT 2) AS msgs
+    FROM GleambookMessages gbm
+    GROUP BY gbm.authorId GROUP AS g(gbm as msg);
 
 This query returns:
 
@@ -936,13 +1010,13 @@
 legal, executable, and returns the same result:
 
     SELECT uid,
-           (SELECT m.message
-            FROM (SELECT VALUE grp.message FROM `$1` AS grp) AS m
-            WHERE m.message LIKE '% like%'
-            ORDER BY m.messageId
-            LIMIT 2) AS msgs
-    FROM GleambookMessages message
-    GROUP BY message.authorId AS uid GROUP AS `$1` (message AS message);
+       (SELECT g.msg.message
+        FROM g
+        WHERE g.msg.message LIKE '% like%'
+        ORDER BY g.msg.messageId
+        LIMIT 2) AS msgs
+    FROM GleambookMessages gbm
+    GROUP BY gbm.authorId AS uid GROUP AS g(gbm as msg);
 
 ### <a id="Aggregation_functions">Aggregation functions</a>
 In traditional SQL, which doesn't support nested data, grouping always also involves the use of aggregation
@@ -977,7 +1051,7 @@
 
     ARRAY_AVG(
         (
-          SELECT VALUE len(friendIds) FROM GleambookUsers
+          SELECT VALUE ARRAY_COUNT(friendIds) FROM GleambookUsers
         )
     );
 
@@ -1015,7 +1089,7 @@
 
 ##### Example
 
-    SELECT uid, COUNT(msg) AS msgCnt
+    SELECT uid, COUNT(*) AS msgCnt
     FROM GleambookMessages msg
     GROUP BY msg.authorId AS uid;
 
@@ -1023,7 +1097,7 @@
 Rather, the `COUNT` query above is using a special "sugared" function symbol that the SQL++ compiler
 will rewrite as follows:
 
-    SELECT uid AS uid, ARRAY_COUNT( (SELECT g.msg FROM `$1` as g) ) AS msgCnt
+    SELECT uid AS uid, ARRAY_COUNT( (SELECT VALUE 1 FROM `$1` as g) ) AS msgCnt
     FROM GleambookMessages msg
     GROUP BY msg.authorId AS uid GROUP AS `$1`(msg AS msg);
 
@@ -1090,13 +1164,13 @@
 The `ORDER BY` clause is used to globally sort data in either ascending order (i.e., `ASC`) or descending order (i.e., `DESC`).
 During ordering, `MISSING` and `NULL` are treated as being smaller than any other value if they are encountered
 in the ordering key(s). `MISSING` is treated as smaller than `NULL` if both occur in the data being sorted.
-The following example returns all `GleambookUsers` ordered by their friend numbers.
+The following example returns all `GleambookUsers` in descending order by their number of friends.
 
 ##### Example
 
       SELECT VALUE user
       FROM GleambookUsers AS user
-      ORDER BY len(user.friendIds) DESC;
+      ORDER BY ARRAY_COUNT(user.friendIds) DESC;
 
 This query returns:
 
@@ -1206,12 +1280,12 @@
 ##### Example
 
     WITH avgFriendCount AS (
-      SELECT VALUE AVG(LEN(user.friendIds))
+      SELECT VALUE AVG(ARRAY_COUNT(user.friendIds))
       FROM GleambookUsers AS user
     )[0]
     SELECT VALUE user
     FROM GleambookUsers user
-    WHERE LEN(user.friendIds) > avgFriendCount;
+    WHERE ARRAY_COUNT(user.friendIds) > avgFriendCount;
 
 This query returns:
 
@@ -1263,8 +1337,8 @@
 
     SELECT *
     FROM GleambookUsers user
-    WHERE LEN(user.friendIds) >
-        ( SELECT VALUE AVG(LEN(user.friendIds))
+    WHERE ARRAY_COUNT(user.friendIds) >
+        ( SELECT VALUE AVG(ARRAY_COUNT(user.friendIds))
           FROM GleambookUsers AS user
         ) [0];
 
@@ -1389,7 +1463,11 @@
                  );
 
 ## <a id="Union_all">UNION ALL</a>
-UNION ALL can be used to combine two input streams into one. As in SQL, there is no ordering guarantee on the contents of the output stream. However, unlike SQL, SQL++ does not constrain what the data looks like on the input streams; in particular, it allows heterogenity on the input and output streams. The following odd but legal query is an example:
+UNION ALL can be used to combine two input streams into one. As in SQL, there is no ordering guarantee on the contents of the output stream.
+However, unlike SQL, SQL++ does not constrain what the data looks like on the input streams; in particular,
+it allows heterogenity on the input and output streams.
+A type error will be raised if one of the inputs is not a collection.
+The following odd but legal query is an example:
 
 ##### Example
 
@@ -1456,20 +1534,7 @@
 within a query the subquery occurs -- and again, its result is never automatically cast into a scalar.
 
 ## <a id="Vs_SQL-92">SQL++ vs. SQL-92</a>
-The following matrix is a quick "SQL-92 compatibility cheat sheet" for SQL++.
-
-| Feature |  SQL++ | SQL-92 |
-|----------|--------|--------|
-| SELECT * | Returns nested objects | Returns flattened concatenated objects |
-| Subquery | Returns a collection  | The returned collection is cast into a scalar value if the subquery appears in a SELECT list or on one side of a comparison or as input to a function |
-| LEFT OUTER JOIN |  Fills in `MISSING`(s) for non-matches  |   Fills in `NULL`(s) for non-matches    |
-| UNION ALL       | Allows heterogeneous inputs and output | Input streams must be UNION-compatible and output field names are drawn from the first input stream
-| IN constant_expr | The constant expression has to be an array or multiset, i.e., [..,..,...] | The constant collection can be represented as comma-separated items in a paren pair |
-| String literal | Double quotes or single quotes | Single quotes only |
-| Delimited identifiers | Backticks | Double quotes |
-
-For things beyond this cheat sheet, SQL++ is SQL-92 compliant.
-Morever, SQL++ offers the following additional features beyond SQL-92 (hence the "++" in its name):
+SQL++ offers the following additional features beyond SQL-92 (hence the "++" in its name):
 
   * Fully composable and functional: A subquery can iterate over any intermediate collection and can appear anywhere in a query.
   * Schema-free: The query language does not assume the existence of a static schema for any data that it processes.
@@ -1477,3 +1542,32 @@
   * Powerful GROUP BY: In addition to a set of aggregate functions as in standard SQL, the groups created by the `GROUP BY` clause are directly usable in nested queries and/or to obtain nested results.
   * Generalized SELECT clause: A SELECT clause can return any type of collection, while in SQL-92, a `SELECT` clause has to return a (homogeneous) collection of objects.
 
+
+The following matrix is a quick "SQL-92 compatibility cheat sheet" for SQL++.
+
+| Feature |  SQL++ | SQL-92 |  Why different?  |
+|----------|--------|-------|------------------|
+| SELECT * | Returns nested objects | Returns flattened concatenated objects | Nested collections are 1st class citizens |
+| Subquery | Returns a collection  | The returned collection is cast into a scalar value if the subquery appears in a SELECT list or on one side of a comparison or as input to a function | Nested collections are 1st class citizens |
+| LEFT OUTER JOIN |  Fills in `MISSING`(s) for non-matches  |   Fills in `NULL`(s) for non-matches    | "Absence" is more appropriate than "unknown" here.  |
+| UNION ALL       | Allows heterogeneous inputs and output | Input streams must be UNION-compatible and output field names are drawn from the first input stream | Heterogenity and nested collections are common |
+| IN constant_expr | The constant expression has to be an array or multiset, i.e., [..,..,...] | The constant collection can be represented as comma-separated items in a paren pair | Nested collections are 1st class citizens |
+| String literal | Double quotes or single quotes | Single quotes only | Double quoted strings are pervasive |
+| Delimited identifiers | Backticks | Double quotes | Double quoted strings are pervasive |
+
+The following SQL-92 features are not implemented yet. However, SQL++ does not conflict those features:
+
+  * CROSS JOIN, NATURAL JOIN, UNION JOIN
+  * RIGHT and FULL OUTER JOIN
+  * INTERSECT, EXCEPT, UNION with set semantics
+  * CAST expression
+  * NULLIF expression
+  * COALESCE expression
+  * ALL and SOME predicates for linking to subqueries
+  * UNIQUE predicate (tests a collection for duplicates)
+  * MATCH predicate (tests for referential integrity)
+  * Row and Table constructors
+  * DISTINCT aggregates
+  * Preserved order for expressions in a SELECT list
+
+

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1329
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I30ab472d87221d6b1c709f24c4dba1bdfaa2c0b4
Gerrit-PatchSet: 2
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Yingyi Bu <bu...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-Reviewer: Yingyi Bu <bu...@gmail.com>