Posted to commits@druid.apache.org by "vtlim (via GitHub)" <gi...@apache.org> on 2023/02/24 01:59:24 UTC

[GitHub] [druid] vtlim commented on a diff in pull request #12549: document arrays in sql

vtlim commented on code in PR #12549:
URL: https://github.com/apache/druid/pull/12549#discussion_r1116411489


##########
docs/querying/sql-data-types.md:
##########
@@ -33,7 +33,7 @@ Columns in Druid are associated with a specific data type. This topic describes
 
 Druid natively supports five basic column types: "long" (64 bit signed int), "float" (32 bit float), "double" (64 bit
 float) "string" (UTF-8 encoded strings and string arrays), and "complex" (catch-all for more exotic data types like

Review Comment:
   ```suggestion
   Druid natively supports the following basic column types: "long" (64-bit signed int), "float" (32-bit float), "double" (64-bit
   float), "string" (UTF-8 encoded strings and string arrays), "complex" (catch-all for more exotic data types like
   ```



##########
docs/querying/sql-data-types.md:
##########
@@ -65,6 +65,7 @@ The following table describes how Druid maps SQL types onto native types when ru
 |BIGINT|LONG|`0`|Druid LONG columns (except `__time`) are reported as BIGINT|
 |TIMESTAMP|LONG|`0`, meaning 1970-01-01 00:00:00 UTC|Druid's `__time` column is reported as TIMESTAMP. Casts between string and timestamp types assume standard SQL formatting, e.g. `2000-01-02 03:04:05`, _not_ ISO8601 formatting. For handling other formats, use one of the [time functions](sql-scalar.md#date-and-time-functions).|
 |DATE|LONG|`0`, meaning 1970-01-01|Casting TIMESTAMP to DATE rounds down the timestamp to the nearest day. Casts between string and date types assume standard SQL formatting, e.g. `2000-01-02`. For handling other formats, use one of the [time functions](sql-scalar.md#date-and-time-functions).|
+|ARRAY|ARRAY|Druid native array types work as SQL arrays, and multi-value strings can be converted to arrays. See the [`ARRAY` details](#arrays).|

Review Comment:
   I think you're missing a column value. The row has three cells, but the table has four columns. The third column should be the default value, and the fourth column is notes.
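
One way to confirm the mismatch is to count the cells in an existing row and in the new `ARRAY` row. The following is a minimal Python sketch: the helper name `count_cells` is hypothetical, and the naive split assumes each row is wrapped in pipes and contains no escaped pipes inside a cell.

```python
def count_cells(row: str) -> int:
    """Count cells in a pipe-delimited Markdown table row (naive split)."""
    trimmed = row.strip()
    # Assumption: the row starts and ends with a pipe and has no escaped pipes.
    if not (trimmed.startswith("|") and trimmed.endswith("|")):
        raise ValueError("expected a row wrapped in pipes")
    return len(trimmed[1:-1].split("|"))


# Rows copied from the diff hunk above.
existing_row = "|BIGINT|LONG|`0`|Druid LONG columns (except `__time`) are reported as BIGINT|"
new_row = (
    "|ARRAY|ARRAY|Druid native array types work as SQL arrays, and multi-value "
    "strings can be converted to arrays. See the [`ARRAY` details](#arrays).|"
)

print(count_cells(existing_row), count_cells(new_row))
```

The existing row yields four cells and the new row yields three, so the new row needs a default-value cell before the notes.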



##########
docs/querying/sql-data-types.md:
##########
@@ -85,8 +87,45 @@ the `UNNEST` functionality available in some other SQL dialects. Refer to the do
 > they are handled in Druid SQL and in native queries. For example, expressions involving multi-value dimensions may be
 > incorrectly optimized by the Druid SQL planner: `multi_val_dim = 'a' AND multi_val_dim = 'b'` is optimized to
 > `false`, even though it is possible for a single row to have both "a" and "b" as values for `multi_val_dim`. The
-> SQL behavior of multi-value dimensions will change in a future release to more closely align with their behavior
-> in native queries.
+> SQL behavior of multi-value dimensions may change in a future release to more closely align with their behavior
+> in native queries, but the [multi-value string functions](./sql-multivalue-string-functions.md) should be able to provide
+> nearly all possible native functionality.
+
+## Arrays
+Druid supports `ARRAY` types constructed at query time, though it currently lacks the ability to store them in
+segments. `ARRAY` types behave as standard SQL arrays, where results are grouped by matching entire arrays. This is in
+contrast to the implicit `UNNEST` that occurs when grouping on multi-value dimensions directly or when used with the
+multi-value functions. You can convert multi-value dimensions to standard SQL arrays either by explicitly converting
+them with `MV_TO_ARRAY`, or implicitly when used within the [array functions](./sql-array-functions.md). Arrays may
+also be constructed from multiple columns using the array functions.
+
+## Multi-value strings behavior
+The behavior of Druid [multi-value string dimensions](multi-value-dimensions.md) varies depending on the context of their usage.
+
+When used with standard `VARCHAR` functions, which expect a single input value per row, Druid maps the function across all values in the row.
+the function across all values in the row. If the row is null or empty, the function will recieve `NULL` as its input,
+otherwise it will be applied to every row value and continue its life as a multi-value `VARCHAR`.
+
+When used with the explicit [multi-value string functions](./sql-multivalue-string-functions.md), Druid processes the
+row values as if they were `ARRAY` typed, so any operations which produce null and empty rows are distinguished as
+separate values (unlike implicit mapping behavior). These multi-value string functions, typically denoted with an `MV_`
+prefix, retain their `VARCHAR` type after the computation is complete. Note that Druid multi-value columns do _not_
+distinguish between empty and null rows, so an empty row will never appear natively as input to a multi-valued function,
+but any multi-value function which manipulates the array form of the value may produce an empty array, which will be
+handled separately while processing.
+
+> Do not mix the usage of multi-value functions and normal scalar functions within the same expression, as the planner will be unable
+> to determine how to properly process the value given its ambiguous usage. A multi-value string must be treated consistently within
+> an expression.
+
+When converted to `ARRAY` or used with [array functions](./sql-array-functions.md), multi-value strings behave as standard SQL arrays and can no longer
+be manipulated with non-array functions.
+
+Druid serializes multi-value `VARCHAR` results as a JSON string of the array, if the value was not grouped on. If the

Review Comment:
   ```suggestion
   Druid serializes multi-value `VARCHAR` results as a JSON string of the array, if grouping was not applied on the value. If the
   ```



##########
docs/querying/sql-data-types.md:
##########
@@ -85,8 +87,45 @@ the `UNNEST` functionality available in some other SQL dialects. Refer to the do
 > they are handled in Druid SQL and in native queries. For example, expressions involving multi-value dimensions may be
 > incorrectly optimized by the Druid SQL planner: `multi_val_dim = 'a' AND multi_val_dim = 'b'` is optimized to
 > `false`, even though it is possible for a single row to have both "a" and "b" as values for `multi_val_dim`. The
-> SQL behavior of multi-value dimensions will change in a future release to more closely align with their behavior
-> in native queries.
+> SQL behavior of multi-value dimensions may change in a future release to more closely align with their behavior
+> in native queries, but the [multi-value string functions](./sql-multivalue-string-functions.md) should be able to provide
+> nearly all possible native functionality.
+
+## Arrays
+Druid supports `ARRAY` types constructed at query time, though it currently lacks the ability to store them in
+segments. `ARRAY` types behave as standard SQL arrays, where results are grouped by matching entire arrays. This is in
+contrast to the implicit `UNNEST` that occurs when grouping on multi-value dimensions directly or when used with the
+multi-value functions. You can convert multi-value dimensions to standard SQL arrays either by explicitly converting
+them with `MV_TO_ARRAY`, or implicitly when used within the [array functions](./sql-array-functions.md). Arrays may
+also be constructed from multiple columns using the array functions.
+
+## Multi-value strings behavior
+The behavior of Druid [multi-value string dimensions](multi-value-dimensions.md) varies depending on the context of their usage.
+
+When used with standard `VARCHAR` functions, which expect a single input value per row, Druid maps the function across all values in the row.
+the function across all values in the row. If the row is null or empty, the function will recieve `NULL` as its input,
+otherwise it will be applied to every row value and continue its life as a multi-value `VARCHAR`.
+
+When used with the explicit [multi-value string functions](./sql-multivalue-string-functions.md), Druid processes the
+row values as if they were `ARRAY` typed, so any operations which produce null and empty rows are distinguished as
+separate values (unlike implicit mapping behavior). These multi-value string functions, typically denoted with an `MV_`
+prefix, retain their `VARCHAR` type after the computation is complete. Note that Druid multi-value columns do _not_
+distinguish between empty and null rows, so an empty row will never appear natively as input to a multi-valued function,
+but any multi-value function which manipulates the array form of the value may produce an empty array, which will be

Review Comment:
   ```suggestion
   but any multi-value function which manipulates the array form of the value may produce an empty array, which is
   ```



##########
docs/querying/sql-multivalue-string-functions.md:
##########
@@ -36,14 +36,15 @@ sidebar_label: "Multi-value string functions"
 
 Druid supports string dimensions containing multiple values.
 This page describes the operations you can perform on multi-value string dimensions using [Druid SQL](./sql.md).
-See [Multi-value dimensions](multi-value-dimensions.md) for more information.
+See [SQL multi-value strings](./sql-data-types.md#multi-value-strings) and native [Multi-value dimensions](multi-value-dimensions.md) for more information.
 
 All "array" references in the multi-value string function documentation can refer to multi-value string columns or
-`ARRAY` literals.
+`ARRAY` types. Multi-value strings can also be converted to `ARRAY` types using `MV_TO_ARRAY`, see
+[Multi-value string functions](sql-multivalue-string-functions.md), [`ARRAY` data type documentation](./sql-data-types.md#arrays),
+and [array functions](./sql-array-functions.md) for additional details.

Review Comment:
   Broke apart the run-on sentence and moved the prepositional phrase to the front for clarity.
   ```suggestion
   `ARRAY` types. Multi-value strings can also be converted to `ARRAY` types using `MV_TO_ARRAY`.
   For additional details, see [Multi-value string functions](sql-multivalue-string-functions.md),
   [`ARRAY` data type documentation](./sql-data-types.md#arrays),
   and [array functions](./sql-array-functions.md).
   ```



##########
docs/querying/sql-array-functions.md:
##########
@@ -0,0 +1,57 @@
+---
+id: sql-array-functions
+title: "SQL ARRAY functions"
+sidebar_label: "Array functions"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+<!--
+  The format of the tables that describe the functions and operators
+  should not be changed without updating the script create-sql-docs
+  in web-console/script/create-sql-docs, because the script detects
+  patterns in this markdown file and parse it to TypeScript file for web console
+-->
+
+
+> Apache Druid supports two query languages: Druid SQL and [native queries](querying.md).
+> This document describes the SQL language.
+
+This page describes the operations you can perform on arrays using [Druid SQL](./sql.md). See [`ARRAY` data type documentation](./sql-data-types.md#arrays) for additional details.
+
+All 'array' references in the array function documentation can refer to multi-value string columns or `ARRAY` literals. These functions are largely
+identical to the [multi-value string functions](./sql-multivalue-string-functions.md), but use `ARRAY` types and behavior.
+
+|Function|Notes|
+|--------|-----|
+|`ARRAY[expr1, expr2, ...]`|Constructs a SQL `ARRAY` literal from the expression arguments, using the type of the first argument as the output array type.|
+|`ARRAY_LENGTH(arr)`|Returns length of array expression.|

Review Comment:
   ```suggestion
   |`ARRAY_LENGTH(arr)`|Returns length of the array expression.|
   ```



##########
docs/querying/sql-array-functions.md:
##########
@@ -0,0 +1,57 @@
+---
+id: sql-array-functions
+title: "SQL ARRAY functions"
+sidebar_label: "Array functions"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+<!--
+  The format of the tables that describe the functions and operators
+  should not be changed without updating the script create-sql-docs
+  in web-console/script/create-sql-docs, because the script detects
+  patterns in this markdown file and parse it to TypeScript file for web console
+-->
+
+
+> Apache Druid supports two query languages: Druid SQL and [native queries](querying.md).
+> This document describes the SQL language.
+
+This page describes the operations you can perform on arrays using [Druid SQL](./sql.md). See [`ARRAY` data type documentation](./sql-data-types.md#arrays) for additional details.
+
+All 'array' references in the array function documentation can refer to multi-value string columns or `ARRAY` literals. These functions are largely
+identical to the [multi-value string functions](./sql-multivalue-string-functions.md), but use `ARRAY` types and behavior.
+
+|Function|Notes|

Review Comment:
   ```suggestion
   |Function|Description|
   ```



##########
docs/querying/sql-array-functions.md:
##########
@@ -0,0 +1,57 @@
+---
+id: sql-array-functions
+title: "SQL ARRAY functions"
+sidebar_label: "Array functions"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+<!--
+  The format of the tables that describe the functions and operators
+  should not be changed without updating the script create-sql-docs
+  in web-console/script/create-sql-docs, because the script detects
+  patterns in this markdown file and parse it to TypeScript file for web console
+-->
+
+
+> Apache Druid supports two query languages: Druid SQL and [native queries](querying.md).
+> This document describes the SQL language.
+
+This page describes the operations you can perform on arrays using [Druid SQL](./sql.md). See [`ARRAY` data type documentation](./sql-data-types.md#arrays) for additional details.
+
+All 'array' references in the array function documentation can refer to multi-value string columns or `ARRAY` literals. These functions are largely
+identical to the [multi-value string functions](./sql-multivalue-string-functions.md), but use `ARRAY` types and behavior.
+
+|Function|Notes|
+|--------|-----|
+|`ARRAY[expr1, expr2, ...]`|Constructs a SQL `ARRAY` literal from the expression arguments, using the type of the first argument as the output array type.|
+|`ARRAY_LENGTH(arr)`|Returns length of array expression.|
+|`ARRAY_OFFSET(arr, long)`|Returns the array element at the 0 based index supplied, or null for an out of range index.|
+|`ARRAY_ORDINAL(arr, long)`|Returns the array element at the 1 based index supplied, or null for an out of range index.|
+|`ARRAY_CONTAINS(arr, expr)`|Returns 1 if the array contains the element specified by `expr`, or contains all elements specified by `expr` if `expr` is an array, else 0.|
+|`ARRAY_OVERLAP(arr1, arr2)`|Returns 1 if `arr1` and `arr2` have any elements in common, else 0.|
+|`ARRAY_OFFSET_OF(arr, expr)`|Returns the 0 based index of the first occurrence of `expr` in the array, or `-1` or `null` if `druid.generic.useDefaultValueForNull=false` if no matching elements exist in the array.|
+|`ARRAY_ORDINAL_OF(arr, expr)`|Returns the 1 based index of the first occurrence of `expr` in the array, or `-1` or `null` if `druid.generic.useDefaultValueForNull=false` if no matching elements exist in the array.|
+|`ARRAY_PREPEND(expr, arr)`|Prepends `expr` to `arr`.  The type of `arr` determines the resulting array type.|
+|`ARRAY_APPEND(arr, expr)`|Appends `expr` to `arr`. The type of `arr` determines the resulting array type.|
+|`ARRAY_CONCAT(arr1, arr2)`|Concatenates two arrays. The type of `arr1` determines the resulting array type.|
+|`ARRAY_SLICE(arr, start, end)`|Returns the subarray of `arr` from the 0 based index `start` (inclusive) to `end` (exclusive). Returns `null`, if `start` is less than 0, greater than length of `arr`, or less than `end`.|

Review Comment:
   ```suggestion
   |`ARRAY_SLICE(arr, start, end)`|Returns the subarray of `arr` from the 0-based index `start` (inclusive) to `end` (exclusive). Returns `null` if `start` is less than 0, greater than length of `arr`, or less than `end`.|
   ```
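
To make the bounds in this row concrete, here is a small Python model of the described behavior. It is a sketch of the documentation text, not Druid's implementation. The function name `array_slice` is only illustrative, and the sketch takes the last null condition to be `start` greater than `end`, since a literal "less than `end`" would reject every ordinary slice; treat that reading as an assumption. The row does not say what happens when `end` exceeds the array length, so the sketch simply lets Python clamp it.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def array_slice(arr: List[T], start: int, end: int) -> Optional[List[T]]:
    """Model of the documented ARRAY_SLICE(arr, start, end) semantics."""
    # Null cases: negative start, start past the end of the array,
    # or (assumed reading) start greater than end.
    if start < 0 or start > len(arr) or start > end:
        return None
    # 0-based, start inclusive, end exclusive.
    # Assumption: an end beyond len(arr) is clamped by Python slicing.
    return arr[start:end]


values = ["a", "b", "c", "d"]
print(array_slice(values, 1, 3))
print(array_slice(values, -1, 2))
print(array_slice(values, 5, 6))
```

Under these assumptions the first call returns the two middle elements, and the other two calls return `None`, which stands in for SQL `null`.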



##########
docs/querying/sql-array-functions.md:
##########
@@ -0,0 +1,57 @@
+---
+id: sql-array-functions
+title: "SQL ARRAY functions"
+sidebar_label: "Array functions"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+<!--
+  The format of the tables that describe the functions and operators
+  should not be changed without updating the script create-sql-docs
+  in web-console/script/create-sql-docs, because the script detects
+  patterns in this markdown file and parse it to TypeScript file for web console
+-->
+
+
+> Apache Druid supports two query languages: Druid SQL and [native queries](querying.md).
+> This document describes the SQL language.
+
+This page describes the operations you can perform on arrays using [Druid SQL](./sql.md). See [`ARRAY` data type documentation](./sql-data-types.md#arrays) for additional details.
+
+All 'array' references in the array function documentation can refer to multi-value string columns or `ARRAY` literals. These functions are largely
+identical to the [multi-value string functions](./sql-multivalue-string-functions.md), but use `ARRAY` types and behavior.
+
+|Function|Notes|
+|--------|-----|
+|`ARRAY[expr1, expr2, ...]`|Constructs a SQL `ARRAY` literal from the expression arguments, using the type of the first argument as the output array type.|
+|`ARRAY_LENGTH(arr)`|Returns length of array expression.|
+|`ARRAY_OFFSET(arr, long)`|Returns the array element at the 0 based index supplied, or null for an out of range index.|
+|`ARRAY_ORDINAL(arr, long)`|Returns the array element at the 1 based index supplied, or null for an out of range index.|
+|`ARRAY_CONTAINS(arr, expr)`|Returns 1 if the array contains the element specified by `expr`, or contains all elements specified by `expr` if `expr` is an array, else 0.|
+|`ARRAY_OVERLAP(arr1, arr2)`|Returns 1 if `arr1` and `arr2` have any elements in common, else 0.|
+|`ARRAY_OFFSET_OF(arr, expr)`|Returns the 0 based index of the first occurrence of `expr` in the array, or `-1` or `null` if `druid.generic.useDefaultValueForNull=false` if no matching elements exist in the array.|
+|`ARRAY_ORDINAL_OF(arr, expr)`|Returns the 1 based index of the first occurrence of `expr` in the array, or `-1` or `null` if `druid.generic.useDefaultValueForNull=false` if no matching elements exist in the array.|
+|`ARRAY_PREPEND(expr, arr)`|Prepends `expr` to `arr`.  The type of `arr` determines the resulting array type.|
+|`ARRAY_APPEND(arr, expr)`|Appends `expr` to `arr`. The type of `arr` determines the resulting array type.|
+|`ARRAY_CONCAT(arr1, arr2)`|Concatenates two arrays. The type of `arr1` determines the resulting array type.|

Review Comment:
   The suggested wording implies that `arr1` goes before `arr2`; this could also be made more explicit.
   ```suggestion
   |`ARRAY_CONCAT(arr1, arr2)`|Concatenates `arr2` to `arr1`. The type of `arr1` determines the resulting array type.|
   ```



##########
docs/querying/sql-data-types.md:
##########
@@ -85,8 +87,45 @@ the `UNNEST` functionality available in some other SQL dialects. Refer to the do
 > they are handled in Druid SQL and in native queries. For example, expressions involving multi-value dimensions may be
 > incorrectly optimized by the Druid SQL planner: `multi_val_dim = 'a' AND multi_val_dim = 'b'` is optimized to
 > `false`, even though it is possible for a single row to have both "a" and "b" as values for `multi_val_dim`. The
-> SQL behavior of multi-value dimensions will change in a future release to more closely align with their behavior
-> in native queries.
+> SQL behavior of multi-value dimensions may change in a future release to more closely align with their behavior
+> in native queries, but the [multi-value string functions](./sql-multivalue-string-functions.md) should be able to provide
+> nearly all possible native functionality.
+
+## Arrays
+Druid supports `ARRAY` types constructed at query time, though it currently lacks the ability to store them in
+segments. `ARRAY` types behave as standard SQL arrays, where results are grouped by matching entire arrays. This is in
+contrast to the implicit `UNNEST` that occurs when grouping on multi-value dimensions directly or when used with the
+multi-value functions. You can convert multi-value dimensions to standard SQL arrays either by explicitly converting
+them with `MV_TO_ARRAY`, or implicitly when used within the [array functions](./sql-array-functions.md). Arrays may
+also be constructed from multiple columns using the array functions.
+
+## Multi-value strings behavior
+The behavior of Druid [multi-value string dimensions](multi-value-dimensions.md) varies depending on the context of their usage.
+
+When used with standard `VARCHAR` functions, which expect a single input value per row, Druid maps the function across all values in the row.
+the function across all values in the row. If the row is null or empty, the function will recieve `NULL` as its input,

Review Comment:
   ```suggestion
   If the row is null or empty, the function will receive `NULL` as its input,
   ```
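
The mapping behavior described in this paragraph can be modeled in a few lines of Python. This is only an illustration of the documented semantics, with `None` standing in for SQL `NULL`; the names `map_scalar` and `upper` are hypothetical and do not correspond to Druid internals.

```python
from typing import Callable, List, Optional, Union

Scalar = Optional[str]


def map_scalar(
    fn: Callable[[Scalar], Scalar], row: Optional[List[str]]
) -> Union[Scalar, List[Scalar]]:
    """Apply a single-value function to a multi-value string row."""
    # A null or empty row is passed to the function as one NULL input.
    if not row:
        return fn(None)
    # Otherwise the function is applied to every value in the row,
    # and the result stays multi-valued.
    return [fn(value) for value in row]


def upper(value: Scalar) -> Scalar:
    # Stand-in for a standard VARCHAR function; NULL in, NULL out.
    return None if value is None else value.upper()


print(map_scalar(upper, ["a", "b"]))
print(map_scalar(upper, []))
print(map_scalar(upper, None))
```

The empty row and the null row take the same path here, which mirrors the statement that the function receives `NULL` in both cases.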



##########
docs/querying/sql-data-types.md:
##########
@@ -85,8 +87,45 @@ the `UNNEST` functionality available in some other SQL dialects. Refer to the do
 > they are handled in Druid SQL and in native queries. For example, expressions involving multi-value dimensions may be
 > incorrectly optimized by the Druid SQL planner: `multi_val_dim = 'a' AND multi_val_dim = 'b'` is optimized to
 > `false`, even though it is possible for a single row to have both "a" and "b" as values for `multi_val_dim`. The
-> SQL behavior of multi-value dimensions will change in a future release to more closely align with their behavior
-> in native queries.
+> SQL behavior of multi-value dimensions may change in a future release to more closely align with their behavior
+> in native queries, but the [multi-value string functions](./sql-multivalue-string-functions.md) should be able to provide
+> nearly all possible native functionality.
+
+## Arrays
+Druid supports `ARRAY` types constructed at query time, though it currently lacks the ability to store them in
+segments. `ARRAY` types behave as standard SQL arrays, where results are grouped by matching entire arrays. This is in
+contrast to the implicit `UNNEST` that occurs when grouping on multi-value dimensions directly or when used with the
+multi-value functions. You can convert multi-value dimensions to standard SQL arrays either by explicitly converting
+them with `MV_TO_ARRAY`, or implicitly when used within the [array functions](./sql-array-functions.md). Arrays may

Review Comment:
   ```suggestion
   multi-value functions. You can convert multi-value dimensions to standard SQL arrays either explicitly by converting
   them with `MV_TO_ARRAY` or implicitly when used within the [array functions](./sql-array-functions.md). Arrays may
   ```



##########
docs/querying/sql-data-types.md:
##########
@@ -85,8 +87,45 @@ the `UNNEST` functionality available in some other SQL dialects. Refer to the do
 > they are handled in Druid SQL and in native queries. For example, expressions involving multi-value dimensions may be
 > incorrectly optimized by the Druid SQL planner: `multi_val_dim = 'a' AND multi_val_dim = 'b'` is optimized to
 > `false`, even though it is possible for a single row to have both "a" and "b" as values for `multi_val_dim`. The
-> SQL behavior of multi-value dimensions will change in a future release to more closely align with their behavior
-> in native queries.
+> SQL behavior of multi-value dimensions may change in a future release to more closely align with their behavior
+> in native queries, but the [multi-value string functions](./sql-multivalue-string-functions.md) should be able to provide
+> nearly all possible native functionality.
+
+## Arrays
+Druid supports `ARRAY` types constructed at query time, though it currently lacks the ability to store them in
+segments. `ARRAY` types behave as standard SQL arrays, where results are grouped by matching entire arrays. This is in
+contrast to the implicit `UNNEST` that occurs when grouping on multi-value dimensions directly or when used with the
+multi-value functions. You can convert multi-value dimensions to standard SQL arrays either by explicitly converting
+them with `MV_TO_ARRAY`, or implicitly when used within the [array functions](./sql-array-functions.md). Arrays may
+also be constructed from multiple columns using the array functions.
+
+## Multi-value strings behavior
+The behavior of Druid [multi-value string dimensions](multi-value-dimensions.md) varies depending on the context of their usage.
+
+When used with standard `VARCHAR` functions, which expect a single input value per row, Druid maps the function across all values in the row.
+the function across all values in the row. If the row is null or empty, the function will recieve `NULL` as its input,
+otherwise it will be applied to every row value and continue its life as a multi-value `VARCHAR`.
+
+When used with the explicit [multi-value string functions](./sql-multivalue-string-functions.md), Druid processes the
+row values as if they were `ARRAY` typed, so any operations which produce null and empty rows are distinguished as
+separate values (unlike implicit mapping behavior). These multi-value string functions, typically denoted with an `MV_`
+prefix, retain their `VARCHAR` type after the computation is complete. Note that Druid multi-value columns do _not_
+distinguish between empty and null rows, so an empty row will never appear natively as input to a multi-valued function,

Review Comment:
   Separated the sentences for readability
   ```suggestion
   distinguish between empty and null rows. An empty row will never appear natively as input to a multi-valued function,
   ```



##########
docs/querying/sql-multivalue-string-functions.md:
##########
@@ -59,3 +60,4 @@ All "array" references in the multi-value string function documentation can refe
 |`MV_SLICE(arr, start, end)`|Returns the subarray of `arr` from the 0 based index start(inclusive) to end(exclusive), or `null`, if start is less than 0, greater than length of arr or less than end.|
 |`MV_TO_STRING(arr, str)`|Joins all elements of `arr` by the delimiter specified by `str`.|
 |`STRING_TO_MV(str1, str2)`|Splits `str1` into an array on the delimiter specified by `str2`.|
+|`MV_TO_ARRAY(str)`|Converts a multi-value string from a `VARCHAR` to an `ARRAY` instead.|

Review Comment:
   ```suggestion
   |`MV_TO_ARRAY(str)`|Converts a multi-value string from a `VARCHAR` to an `ARRAY`.|
   ```



##########
docs/querying/sql-multivalue-string-functions.md:
##########
@@ -36,14 +36,15 @@ sidebar_label: "Multi-value string functions"
 
 Druid supports string dimensions containing multiple values.
 This page describes the operations you can perform on multi-value string dimensions using [Druid SQL](./sql.md).
-See [Multi-value dimensions](multi-value-dimensions.md) for more information.
+See [SQL multi-value strings](./sql-data-types.md#multi-value-strings) and native [Multi-value dimensions](multi-value-dimensions.md) for more information.
 
 All "array" references in the multi-value string function documentation can refer to multi-value string columns or
-`ARRAY` literals.
+`ARRAY` types. Multi-value strings can also be converted to `ARRAY` types using `MV_TO_ARRAY`, see
+[Multi-value string functions](sql-multivalue-string-functions.md), [`ARRAY` data type documentation](./sql-data-types.md#arrays),
+and [array functions](./sql-array-functions.md) for additional details.
 
 |Function|Notes|

Review Comment:
   ```suggestion
   |Function|Description|
   ```



##########
docs/querying/sql-array-functions.md:
##########
@@ -0,0 +1,57 @@
+---
+id: sql-array-functions
+title: "SQL ARRAY functions"
+sidebar_label: "Array functions"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+<!--
+  The format of the tables that describe the functions and operators
+  should not be changed without updating the script create-sql-docs
+  in web-console/script/create-sql-docs, because the script detects
+  patterns in this markdown file and parse it to TypeScript file for web console
+-->
+
+
+> Apache Druid supports two query languages: Druid SQL and [native queries](querying.md).
+> This document describes the SQL language.
+
+This page describes the operations you can perform on arrays using [Druid SQL](./sql.md). See [`ARRAY` data type documentation](./sql-data-types.md#arrays) for additional details.
+
+All 'array' references in the array function documentation can refer to multi-value string columns or `ARRAY` literals. These functions are largely
+identical to the [multi-value string functions](./sql-multivalue-string-functions.md), but use `ARRAY` types and behavior.
+
+|Function|Notes|
+|--------|-----|
+|`ARRAY[expr1, expr2, ...]`|Constructs a SQL `ARRAY` literal from the expression arguments, using the type of the first argument as the output array type.|
+|`ARRAY_LENGTH(arr)`|Returns length of array expression.|
+|`ARRAY_OFFSET(arr, long)`|Returns the array element at the 0 based index supplied, or null for an out of range index.|
+|`ARRAY_ORDINAL(arr, long)`|Returns the array element at the 1 based index supplied, or null for an out of range index.|
+|`ARRAY_CONTAINS(arr, expr)`|Returns 1 if the array contains the element specified by `expr`, or contains all elements specified by `expr` if `expr` is an array, else 0.|
+|`ARRAY_OVERLAP(arr1, arr2)`|Returns 1 if `arr1` and `arr2` have any elements in common, else 0.|
+|`ARRAY_OFFSET_OF(arr, expr)`|Returns the 0 based index of the first occurrence of `expr` in the array, or `-1` or `null` if `druid.generic.useDefaultValueForNull=false` if no matching elements exist in the array.|
+|`ARRAY_ORDINAL_OF(arr, expr)`|Returns the 1 based index of the first occurrence of `expr` in the array, or `-1` or `null` if `druid.generic.useDefaultValueForNull=false` if no matching elements exist in the array.|

Review Comment:
   ```suggestion
   |`ARRAY_OFFSET_OF(arr, expr)`|Returns the 0-based index of the first occurrence of `expr` in the array. If no matching elements exist in the array, returns `-1`, or `null` if `druid.generic.useDefaultValueForNull=false`.|
   |`ARRAY_ORDINAL_OF(arr, expr)`|Returns the 1-based index of the first occurrence of `expr` in the array. If no matching elements exist in the array, returns `-1`, or `null` if `druid.generic.useDefaultValueForNull=false`.|
   ```
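
To make the no-match behavior in the two rows above concrete, here is a minimal Python model of the wording. It is only a sketch of the documented description, not Druid's implementation: the function names and the `use_default_value_for_null` parameter are hypothetical stand-ins for the SQL functions and the `druid.generic.useDefaultValueForNull` setting.

```python
from typing import Any, Optional, Sequence


def array_offset_of(arr: Sequence[Any], expr: Any,
                    use_default_value_for_null: bool = True) -> Optional[int]:
    """Model of ARRAY_OFFSET_OF: 0-based index of the first match."""
    for index, item in enumerate(arr):
        if item == expr:
            return index
    # No match: -1 in default-value mode, otherwise None (standing in for null).
    return -1 if use_default_value_for_null else None


def array_ordinal_of(arr: Sequence[Any], expr: Any,
                     use_default_value_for_null: bool = True) -> Optional[int]:
    """Model of ARRAY_ORDINAL_OF: 1-based index of the first match."""
    for ordinal, item in enumerate(arr, start=1):
        if item == expr:
            return ordinal
    return -1 if use_default_value_for_null else None
```

In this model, the only difference between the two functions is the starting index; the no-match result depends solely on the flag.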



##########
docs/querying/sql-array-functions.md:
##########
@@ -0,0 +1,57 @@
+---
+id: sql-array-functions
+title: "SQL ARRAY functions"
+sidebar_label: "Array functions"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+<!--
+  The format of the tables that describe the functions and operators
+  should not be changed without updating the script create-sql-docs
+  in web-console/script/create-sql-docs, because the script detects
+  patterns in this markdown file and parse it to TypeScript file for web console
+-->
+
+
+> Apache Druid supports two query languages: Druid SQL and [native queries](querying.md).
+> This document describes the SQL language.
+
+This page describes the operations you can perform on arrays using [Druid SQL](./sql.md). See [`ARRAY` data type documentation](./sql-data-types.md#arrays) for additional details.
+
+All 'array' references in the array function documentation can refer to multi-value string columns or `ARRAY` literals. These functions are largely
+identical to the [multi-value string functions](./sql-multivalue-string-functions.md), but use `ARRAY` types and behavior.
+
+|Function|Notes|
+|--------|-----|
+|`ARRAY[expr1, expr2, ...]`|Constructs a SQL `ARRAY` literal from the expression arguments, using the type of the first argument as the output array type.|
+|`ARRAY_LENGTH(arr)`|Returns length of array expression.|
+|`ARRAY_OFFSET(arr, long)`|Returns the array element at the 0 based index supplied, or null for an out of range index.|
+|`ARRAY_ORDINAL(arr, long)`|Returns the array element at the 1 based index supplied, or null for an out of range index.|
+|`ARRAY_CONTAINS(arr, expr)`|Returns 1 if the array contains the element specified by `expr`, or contains all elements specified by `expr` if `expr` is an array, else 0.|
+|`ARRAY_OVERLAP(arr1, arr2)`|Returns 1 if `arr1` and `arr2` have any elements in common, else 0.|
+|`ARRAY_OFFSET_OF(arr, expr)`|Returns the 0 based index of the first occurrence of `expr` in the array, or `-1` or `null` if `druid.generic.useDefaultValueForNull=false` if no matching elements exist in the array.|
+|`ARRAY_ORDINAL_OF(arr, expr)`|Returns the 1 based index of the first occurrence of `expr` in the array, or `-1` or `null` if `druid.generic.useDefaultValueForNull=false` if no matching elements exist in the array.|
+|`ARRAY_PREPEND(expr, arr)`|Prepends `expr` to `arr`.  The type of `arr` determines the resulting array type.|
+|`ARRAY_APPEND(arr, expr)`|Appends `expr` to `arr`. The type of `arr` determines the resulting array type.|
+|`ARRAY_CONCAT(arr1, arr2)`|Concatenates two arrays. The type of `arr1` determines the resulting array type.|
+|`ARRAY_SLICE(arr, start, end)`|Returns the subarray of `arr` from the 0 based index `start` (inclusive) to `end` (exclusive). Returns `null`, if `start` is less than 0, greater than length of `arr`, or less than `end`.|

Review Comment:
   Should that say if `start` is greater than `end`?
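
A small Python model shows why the wording matters. Read literally, "less than `end`" would make every ordinary slice (where `start < end`) return null. The sketch below assumes the "greater than `end`" reading instead; it is an illustration of that reading, not Druid's implementation, and the function name is a hypothetical stand-in for the SQL function. Clamping of an `end` beyond the array length comes from Python slicing and is an assumption of this sketch.

```python
from typing import Any, List, Optional, Sequence


def array_slice(arr: Sequence[Any], start: int, end: int) -> Optional[List[Any]]:
    """Model of ARRAY_SLICE with 0-based start (inclusive) and end (exclusive)."""
    # Assumed null conditions: start below 0, start beyond the array length,
    # or start greater than end. None stands in for SQL null.
    if start < 0 or start > len(arr) or start > end:
        return None
    return list(arr[start:end])
```

Under this reading, `array_slice([1, 2, 3, 4], 1, 3)` yields the two middle elements, while the literal "less than `end`" wording would reject the same call because `1 < 3`.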



##########
docs/querying/sql-array-functions.md:
##########
@@ -0,0 +1,57 @@
+---
+id: sql-array-functions
+title: "SQL ARRAY functions"
+sidebar_label: "Array functions"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+<!--
+  The format of the tables that describe the functions and operators
+  should not be changed without updating the script create-sql-docs
+  in web-console/script/create-sql-docs, because the script detects
+  patterns in this markdown file and parse it to TypeScript file for web console
+-->
+
+
+> Apache Druid supports two query languages: Druid SQL and [native queries](querying.md).
+> This document describes the SQL language.
+
+This page describes the operations you can perform on arrays using [Druid SQL](./sql.md). See [`ARRAY` data type documentation](./sql-data-types.md#arrays) for additional details.
+
+All 'array' references in the array function documentation can refer to multi-value string columns or `ARRAY` literals. These functions are largely
+identical to the [multi-value string functions](./sql-multivalue-string-functions.md), but use `ARRAY` types and behavior.
+
+|Function|Notes|
+|--------|-----|
+|`ARRAY[expr1, expr2, ...]`|Constructs a SQL `ARRAY` literal from the expression arguments, using the type of the first argument as the output array type.|
+|`ARRAY_LENGTH(arr)`|Returns length of array expression.|
+|`ARRAY_OFFSET(arr, long)`|Returns the array element at the 0 based index supplied, or null for an out of range index.|
+|`ARRAY_ORDINAL(arr, long)`|Returns the array element at the 1 based index supplied, or null for an out of range index.|

Review Comment:
   ```suggestion
   |`ARRAY_OFFSET(arr, long)`|Returns the array element at the 0-based index supplied, or null for an out-of-range index.|
   |`ARRAY_ORDINAL(arr, long)`|Returns the array element at the 1-based index supplied, or null for an out-of-range index.|
   ```
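
The 0-based versus 1-based distinction in these two rows can be sketched in Python. This is a model of the description only, not Druid's implementation; the function names are hypothetical stand-ins for the SQL functions, and `None` stands in for null.

```python
from typing import Any, Optional, Sequence


def array_offset(arr: Sequence[Any], index: int) -> Optional[Any]:
    """Model of ARRAY_OFFSET: element at a 0-based index, or None if out of range."""
    if 0 <= index < len(arr):
        return arr[index]
    return None


def array_ordinal(arr: Sequence[Any], index: int) -> Optional[Any]:
    """Model of ARRAY_ORDINAL: element at a 1-based index, or None if out of range."""
    # Shifting by one reuses the same bounds check, so ordinal 0 is out of range.
    return array_offset(arr, index - 1)
```

In this model, offset `0` and ordinal `1` both address the first element.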



##########
docs/querying/sql-array-functions.md:
##########
@@ -0,0 +1,57 @@
+---
+id: sql-array-functions
+title: "SQL ARRAY functions"
+sidebar_label: "Array functions"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+<!--
+  The format of the tables that describe the functions and operators
+  should not be changed without updating the script create-sql-docs
+  in web-console/script/create-sql-docs, because the script detects
+  patterns in this markdown file and parse it to TypeScript file for web console
+-->
+
+
+> Apache Druid supports two query languages: Druid SQL and [native queries](querying.md).
+> This document describes the SQL language.
+
+This page describes the operations you can perform on arrays using [Druid SQL](./sql.md). See [`ARRAY` data type documentation](./sql-data-types.md#arrays) for additional details.
+
+All 'array' references in the array function documentation can refer to multi-value string columns or `ARRAY` literals. These functions are largely
+identical to the [multi-value string functions](./sql-multivalue-string-functions.md), but use `ARRAY` types and behavior.
+
+|Function|Notes|
+|--------|-----|
+|`ARRAY[expr1, expr2, ...]`|Constructs a SQL `ARRAY` literal from the expression arguments, using the type of the first argument as the output array type.|
+|`ARRAY_LENGTH(arr)`|Returns length of array expression.|
+|`ARRAY_OFFSET(arr, long)`|Returns the array element at the 0 based index supplied, or null for an out of range index.|
+|`ARRAY_ORDINAL(arr, long)`|Returns the array element at the 1 based index supplied, or null for an out of range index.|
+|`ARRAY_CONTAINS(arr, expr)`|Returns 1 if the array contains the element specified by `expr`, or contains all elements specified by `expr` if `expr` is an array, else 0.|

Review Comment:
   This description is pretty dense. Trying to unpack it a bit:
   ```suggestion
   |`ARRAY_CONTAINS(arr, expr)`|If `expr` is a single element, returns 1 if `arr` contains the element. If `expr` is an array, returns 1 if `arr` contains all elements specified by `expr`. Otherwise returns 0.|
   ```
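
The two cases in the suggested wording can be modeled in a few lines of Python. This is a sketch of the description, not Druid's implementation; the function name is a hypothetical stand-in for the SQL function, and treating Python lists and tuples as "arrays" is an assumption of the sketch.

```python
from typing import Any, Sequence


def array_contains(arr: Sequence[Any], expr: Any) -> int:
    """Model of ARRAY_CONTAINS returning 1 or 0, as in the table row."""
    if isinstance(expr, (list, tuple)):
        # Array argument: every element of expr must be present in arr.
        return 1 if all(element in arr for element in expr) else 0
    # Single element: plain membership check.
    return 1 if expr in arr else 0
```

Note that in this sketch an empty `expr` array returns 1, because `all()` over no elements is true. Whether Druid behaves the same way is not stated in the text under review.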



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org