Posted to commits@asterixdb.apache.org by al...@apache.org on 2018/11/21 18:06:16 UTC

asterixdb git commit: [ASTERIXDB-2463][DOC] add docs for primary index & parallel sort parameters

Repository: asterixdb
Updated Branches:
  refs/heads/master fa4dc5c28 -> e39b386ff


[ASTERIXDB-2463][DOC] add docs for primary index & parallel sort parameters

- user model changes: no
- storage format changes: no
- interface changes: no

Details:
add docs for primary index & parallel sort parameters.

Change-Id: Iaad725f7104da7f70b1064ffda90e5397b3094ec
Reviewed-on: https://asterix-gerrit.ics.uci.edu/3015
Sonar-Qube: Jenkins <je...@fulliautomatix.ics.uci.edu>
Tested-by: Jenkins <je...@fulliautomatix.ics.uci.edu>
Integration-Tests: Jenkins <je...@fulliautomatix.ics.uci.edu>
Contrib: Jenkins <je...@fulliautomatix.ics.uci.edu>
Reviewed-by: Till Westmann <ti...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/asterixdb/repo
Commit: http://git-wip-us.apache.org/repos/asf/asterixdb/commit/e39b386f
Tree: http://git-wip-us.apache.org/repos/asf/asterixdb/tree/e39b386f
Diff: http://git-wip-us.apache.org/repos/asf/asterixdb/diff/e39b386f

Branch: refs/heads/master
Commit: e39b386ffadba288234c0f146a67087a97f8145a
Parents: fa4dc5c
Author: Ali Alsuliman <al...@gmail.com>
Authored: Tue Nov 20 15:46:47 2018 -0800
Committer: Ali Alsuliman <al...@gmail.com>
Committed: Wed Nov 21 10:04:49 2018 -0800

----------------------------------------------------------------------
 asterixdb/asterix-doc/pom.xml                   |  2 +-
 .../main/markdown/sqlpp/5_ddl_dataset_index.md  | 53 +++++++++++++++++---
 .../markdown/sqlpp/5_ddl_nonenforced_index.md   | 31 ------------
 .../asterix-doc/src/site/markdown/ncservice.md  |  4 +-
 4 files changed, 51 insertions(+), 39 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/asterixdb/blob/e39b386f/asterixdb/asterix-doc/pom.xml
----------------------------------------------------------------------
diff --git a/asterixdb/asterix-doc/pom.xml b/asterixdb/asterix-doc/pom.xml
index 8ccf08c..b6f77a5 100644
--- a/asterixdb/asterix-doc/pom.xml
+++ b/asterixdb/asterix-doc/pom.xml
@@ -52,7 +52,7 @@
             <configuration>
               <target>
                 <concat destfile="${project.build.directory}/generated-site/markdown/sqlpp/manual.md">
-                  <filelist dir="${project.basedir}/src/main/markdown/sqlpp" files="0_toc.md,1_intro.md,2_expr_title.md,2_expr.md,3_query_title.md,3_declare_dataverse.md,3_declare_function.md,3_query.md,4_error_title.md,4_error.md,5_ddl_head.md,5_ddl_dataset_index.md,5_ddl_nonenforced_index.md,5_ddl_function_removal.md,5_ddl_dml.md,appendix_1_title.md,appendix_1_keywords.md,appendix_2_title.md,appendix_2_parameters.md,appendix_2_index_only.md,appendix_3_title.md,appendix_3_resolution.md" />
+                  <filelist dir="${project.basedir}/src/main/markdown/sqlpp" files="0_toc.md,1_intro.md,2_expr_title.md,2_expr.md,3_query_title.md,3_declare_dataverse.md,3_declare_function.md,3_query.md,4_error_title.md,4_error.md,5_ddl_head.md,5_ddl_dataset_index.md,5_ddl_function_removal.md,5_ddl_dml.md,appendix_1_title.md,appendix_1_keywords.md,appendix_2_title.md,appendix_2_parameters.md,appendix_2_index_only.md,appendix_3_title.md,appendix_3_resolution.md" />
                 </concat>
                 <concat destfile="${project.build.directory}/generated-site/markdown/sqlpp/builtins.md">
                   <filelist dir="${project.basedir}/src/main/markdown/builtins" files="0_toc.md,1_numeric_common.md,1_numeric_delta.md,2_string_common.md,2_string_delta.md,3_binary.md,4_spatial.md,5_similarity.md,6_tokenizing.md,7_temporal.md,7_allens.md,8_record.md,9_aggregate_sql.md,10_comparison.md,11_type.md,13_conditional.md,12_misc.md" />

http://git-wip-us.apache.org/repos/asf/asterixdb/blob/e39b386f/asterixdb/asterix-doc/src/main/markdown/sqlpp/5_ddl_dataset_index.md
----------------------------------------------------------------------
diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/5_ddl_dataset_index.md b/asterixdb/asterix-doc/src/main/markdown/sqlpp/5_ddl_dataset_index.md
index 589b038..79e04ae 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/5_ddl_dataset_index.md
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/5_ddl_dataset_index.md
@@ -192,9 +192,10 @@ the URL and path needed to locate the data in HDFS and a description of the data
 
 ### <a id="Indices">Indices</a>
 
-    IndexSpecification ::= <INDEX> Identifier IfNotExists <ON> QualifiedName
-                           "(" ( IndexField ) ( "," IndexField )* ")" ( "type" IndexType "?")?
-                           ( (<NOT>)? <ENFORCED> )?
+    IndexSpecification ::= (<INDEX> Identifier IfNotExists <ON> QualifiedName
+                           "(" ( IndexField ) ( "," IndexField )* ")" (<TYPE> IndexType)? (<ENFORCED>)?)
+                           |
+                           <PRIMARY> <INDEX> Identifier? IfNotExists <ON> QualifiedName (<TYPE> <BTREE>)?
     IndexType          ::= <BTREE> | <RTREE> | <KEYWORD> | <NGRAM> "(" IntegerLiteral ")"
 
 The CREATE INDEX statement creates a secondary index on one or more fields of a specified dataset.
@@ -216,15 +217,26 @@ field.
 
     CREATE INDEX gbAuthorIdx ON GleambookMessages(authorId) TYPE BTREE;
 
-The following example creates an open btree index called gbSendTimeIdx on the (non-predeclared) sendTime field of the GleambookMessages dataset having datetime type.
-This index can be useful for accelerating exact-match queries, range search queries, and joins involving the sendTime field.
-The index is enforced so that records that do not have the "sendTime" field or have a mismatched type on the field
+The following example creates an open btree index called gbSendTimeIdx on the (non-declared) `sendTime` field of the GleambookMessages dataset having datetime type.
+This index can be useful for accelerating exact-match queries, range search queries, and joins involving the `sendTime` field.
+The index is enforced so that records that do not have the `sendTime` field or have a mismatched type on the field
 cannot be inserted into the dataset.
 
 #### Example
 
     CREATE INDEX gbSendTimeIdx ON GleambookMessages(sendTime: datetime?) TYPE BTREE ENFORCED;
 
+The following example creates an open btree index called gbReadTimeIdx on the (non-declared) `readTime`
+field of the GleambookMessages dataset having datetime type.
+This index can be useful for accelerating exact-match queries, range search queries,
+and joins involving the `readTime` field.
+The index is not enforced so that records that do not have the `readTime` field or have a mismatched type on the field
+can still be inserted into the dataset.
+
+#### Example
+
+    CREATE INDEX gbReadTimeIdx ON GleambookMessages(readTime: datetime?);
+
 The following example creates a btree index called crpUserScrNameIdx on screenName,
 a nested field residing within a object-valued user field in the ChirpMessages dataset.
 This index can be useful for accelerating exact-match queries, range search queries,
@@ -252,3 +264,32 @@ The following example creates a keyword index called fbMessageIdx on the message
 #### Example
 
     CREATE INDEX fbMessageIdx ON GleambookMessages(message) TYPE KEYWORD;
+
+The following example creates a special secondary index which holds only the primary keys.
+This index is useful for speeding up aggregation queries which involve only primary keys.
+The name of the index is optional. If the name is not specified, the system will generate
+one. When the user would like to drop this index, the metadata can be queried to find the system-generated name.
+
+#### Example
+
+    CREATE PRIMARY INDEX gb_pk_idx ON GleambookMessages;
+
+An example query that can be accelerated using the primary-key index:
+
+    SELECT COUNT(*) FROM GleambookMessages;
+
+To look up the above primary-key index, issue the following query:
+
+    SELECT VALUE i
+    FROM Metadata.`Index` i
+    WHERE i.DataverseName = "TinySocial" AND i.DatasetName = "GleambookMessages";
+
+The query returns:
+
+    [ { "DataverseName": "TinySocial", "DatasetName": "GleambookMessages", "IndexName": "GleambookMessages", "IndexStructure": "BTREE", "SearchKey": [ [ "messageId" ] ], "IsPrimary": true, "Timestamp": "Wed Nov 07 17:25:11 PST 2018", "PendingOp": 0 }
+    , { "DataverseName": "TinySocial", "DatasetName": "GleambookMessages", "IndexName": "gb_pk_idx", "IndexStructure": "BTREE", "SearchKey": [  ], "IsPrimary": false, "Timestamp": "Wed Nov 07 17:25:11 PST 2018", "PendingOp": 0 }
+     ]
+
+Remember that `CREATE PRIMARY INDEX` creates a secondary index.
+That is the reason the `IsPrimary` field is false.
+The primary-key index can be identified by the fact that the `SearchKey` field is empty since it only contains primary key fields.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/asterixdb/blob/e39b386f/asterixdb/asterix-doc/src/main/markdown/sqlpp/5_ddl_nonenforced_index.md
----------------------------------------------------------------------
diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/5_ddl_nonenforced_index.md b/asterixdb/asterix-doc/src/main/markdown/sqlpp/5_ddl_nonenforced_index.md
deleted file mode 100644
index 518941d..0000000
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/5_ddl_nonenforced_index.md
+++ /dev/null
@@ -1,31 +0,0 @@
-<!--
- ! Licensed to the Apache Software Foundation (ASF) under one
- ! or more contributor license agreements.  See the NOTICE file
- ! distributed with this work for additional information
- ! regarding copyright ownership.  The ASF licenses this file
- ! to you under the Apache License, Version 2.0 (the
- ! "License"); you may not use this file except in compliance
- ! with the License.  You may obtain a copy of the License at
- !
- !   http://www.apache.org/licenses/LICENSE-2.0
- !
- ! Unless required by applicable law or agreed to in writing,
- ! software distributed under the License is distributed on an
- ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- ! KIND, either express or implied.  See the License for the
- ! specific language governing permissions and limitations
- ! under the License.
- !-->
-
-
-The following example creates an open btree index called gbReadTimeIdx on the (non-predeclared) readTime
-field of the GleambookMessages dataset having datetime type.
-This index can be useful for accelerating exact-match queries, range search queries,
-and joins involving the `readTime` field.
-The index is not enforced so that records that do not have the `readTime` field or have a mismatched type on the field
-can still be inserted into the dataset.
-
-#### Example
-
-    CREATE INDEX gbReadTimeIdx ON GleambookMessages(readTime: datetime?);
-

http://git-wip-us.apache.org/repos/asf/asterixdb/blob/e39b386f/asterixdb/asterix-doc/src/site/markdown/ncservice.md
----------------------------------------------------------------------
diff --git a/asterixdb/asterix-doc/src/site/markdown/ncservice.md b/asterixdb/asterix-doc/src/site/markdown/ncservice.md
index ef2ac9b..66e99a8 100644
--- a/asterixdb/asterix-doc/src/site/markdown/ncservice.md
+++ b/asterixdb/asterix-doc/src/site/markdown/ncservice.md
@@ -345,7 +345,9 @@ The following parameters are configured under the "[common]" section.
 | common  | compiler.joinmemory                       | The memory budget (in bytes) for a join operator instance in a partition | 33554432 (32 MB) |
 | common  | compiler.parallelism                      | The degree of parallelism for query execution. Zero means to use the storage parallelism as the query execution parallelism, while other integer values dictate the number of query execution parallel partitions. The system will fall back to use the number of all available CPU cores in the cluster as the degree of parallelism if the number set by a user is too large or too small | 0 |
 | common  | compiler.sortmemory                       | The memory budget (in bytes) for a sort operator instance in a partition | 33554432 (32 MB) |
-| common  | compiler.textsearchmemory                       | The memory budget (in bytes) for an inverted-index-search operator instance in a partition | 33554432 (32 MB) |
+| common  | compiler.sort.parallel                    | Enable full parallel sort for queries | true |
+| common  | compiler.sort.samples                     | The number of samples taken from each partition to guide the sort operation when full parallel sort is enabled | 100 |
+| common  | compiler.textsearchmemory                 | The memory budget (in bytes) for an inverted-index-search operator instance in a partition | 33554432 (32 MB) |
 | common  | log.level                                 | The logging level for master and slave processes | WARNING |
 | common  | max.wait.active.cluster                   | The max pending time (in seconds) for cluster startup. After the threshold, if the cluster still is not up and running, it is considered unavailable | 60 |
 | common  | messaging.frame.count                     | Number of reusable frames for NC to NC messaging | 512 |
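
As an illustration of the two parameters added to the table above, the following is a minimal sketch of reading them from an INI-style configuration. Only the `[common]` section name, the parameter names `compiler.sort.parallel` and `compiler.sort.samples`, and their defaults (`true`, `100`) come from the diff; the configuration text, the helper name `read_sort_settings`, and the use of Python's `configparser` are assumptions made for illustration and are not part of AsterixDB.

```python
import configparser

# Hypothetical configuration text. Only the [common] section name and the two
# parameter names and defaults are taken from the documentation table above.
CONF_TEXT = """
[common]
compiler.sort.parallel = true
compiler.sort.samples = 100
"""


def read_sort_settings(text):
    """Return the parallel-sort settings found in the [common] section.

    Falls back to the documented defaults (true, 100) when a key is absent.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    common = parser["common"]
    return {
        "parallel": common.getboolean("compiler.sort.parallel", fallback=True),
        "samples": common.getint("compiler.sort.samples", fallback=100),
    }


if __name__ == "__main__":
    print(read_sort_settings(CONF_TEXT))
```

A section that omits both keys yields the documented defaults, which mirrors how the table presents them: full parallel sort enabled, with 100 samples per partition.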