Posted to commits@arrow.apache.org by al...@apache.org on 2023/06/23 19:24:43 UTC

[arrow-datafusion] branch main updated: Docs: Update SQL status page (#6736)

This is an automated email from the ASF dual-hosted git repository.

alamb pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion.git


The following commit(s) were added to refs/heads/main by this push:
     new 80d1750fda Docs: Update SQL status page (#6736)
80d1750fda is described below

commit 80d1750fda0f8d1d09d7bbf5b05662b1f6618954
Author: Andrew Lamb <an...@nerdnetworks.org>
AuthorDate: Fri Jun 23 15:24:37 2023 -0400

    Docs: Update SQL status page (#6736)
    
    * Docs: Update SQL status page
    
    * Add catalog and data inserts
    
    * Try to clarify that support for nested types is in progress, but not done
    
    * prettier
---
 docs/source/user-guide/sql/sql_status.md | 143 ++++++++++++-------------------
 1 file changed, 53 insertions(+), 90 deletions(-)

diff --git a/docs/source/user-guide/sql/sql_status.md b/docs/source/user-guide/sql/sql_status.md
index 6075a23330..709534adf4 100644
--- a/docs/source/user-guide/sql/sql_status.md
+++ b/docs/source/user-guide/sql/sql_status.md
@@ -34,97 +34,61 @@
 
 ## SQL Support
 
-- [x] Projection
-- [x] Filter (WHERE)
-- [x] Filter post-aggregate (HAVING)
-- [x] Limit
-- [x] Aggregate
-- [x] Common math functions
-- [x] cast
-- [x] try_cast
+- [x] Projection (`SELECT`)
+- [x] Filter (`WHERE`)
+- [x] Filter post-aggregate (`HAVING`)
+- [x] Sorting (`ORDER BY`)
+- [x] Limit (`LIMIT`)
+- [x] Aggregate (`GROUP BY`)
+- [x] cast / try_cast
 - [x] [`VALUES` lists](https://www.postgresql.org/docs/current/queries-values.html)
-- Postgres compatible String functions
-  - [x] ascii
-  - [x] bit_length
-  - [x] btrim
-  - [x] char_length
-  - [x] character_length
-  - [x] chr
-  - [x] concat
-  - [x] concat_ws
-  - [x] initcap
-  - [x] left
-  - [x] length
-  - [x] lpad
-  - [x] ltrim
-  - [x] octet_length
-  - [x] regexp_replace
-  - [x] repeat
-  - [x] replace
-  - [x] reverse
-  - [x] right
-  - [x] rpad
-  - [x] rtrim
-  - [x] split_part
-  - [x] starts_with
-  - [x] strpos
-  - [x] substr
-  - [x] to_hex
-  - [x] translate
-  - [x] trim
-- Conditional functions
-  - [x] nullif
-  - [x] case
-  - [x] coalesce
-- Approximation functions
-  - [x] approx_distinct
-  - [x] approx_median
-  - [x] approx_percentile_cont
-  - [x] approx_percentile_cont_with_weight
-- Common date/time functions
-  - [ ] Basic date functions
-  - [ ] Basic time functions
-  - [x] Basic timestamp functions
-    - [x] [to_timestamp](./scalar_functions.md#to_timestamp)
-    - [x] [to_timestamp_millis](./scalar_functions.md#to_timestamp_millis)
-    - [x] [to_timestamp_micros](./scalar_functions.md#to_timestamp_micros)
-    - [x] [to_timestamp_seconds](./scalar_functions.md#to_timestamp_seconds)
-    - [x] [extract](./scalar_functions.md#extract)
-    - [x] [date_part](./scalar_functions.md#date_part)
-- nested functions
-  - [x] Array of columns
+- [x] [String Functions](./scalar_functions.md#string-functions)
+- [x] [Conditional Functions](./scalar_functions.md#conditional-functions)
+- [x] [Time and Date Functions](./scalar_functions.md#time-and-date-functions)
+- [x] [Math Functions](./scalar_functions.md#math-functions)
+- [x] [Aggregate Functions](./aggregate_functions.md) (`SUM`, `MEDIAN`, and many more)
 - [x] Schema Queries
-  - [x] SHOW TABLES
-  - [x] SHOW COLUMNS FROM <table/view>
-  - [x] SHOW CREATE TABLE <view>
-  - [x] information_schema.{tables, columns, views}
-  - [ ] information_schema other views
-- [x] Sorting
-- [ ] Nested types
-- [ ] Lists
+  - [x] `SHOW TABLES`
+  - [x] `SHOW COLUMNS FROM <table/view>`
+  - [x] `SHOW CREATE TABLE <view>`
+  - [x] Basic SQL [Information Schema](./information_schema.md) (`TABLES`, `VIEWS`, `COLUMNS`)
+  - [ ] Full SQL [Information Schema](./information_schema.md) support
+- [ ] Support for nested types (`ARRAY`/`LIST` and `STRUCT`. See [#2326](https://github.com/apache/arrow-datafusion/issues/2326) for details)
+  - [x] Read support
+  - [x] Write support
+  - [x] Field access (`col['field']` and `col[1]`)
+  - [x] [Array Functions](./scalar_functions.md#array-functions)
+  - [ ] [Struct Functions](./scalar_functions.md#struct-functions)
+    - [x] `struct`
+    - [ ] [Postgres JSON operators](https://github.com/apache/arrow-datafusion/issues/6631) (`->`, `->>`, etc.)
 - [x] Subqueries
-- [x] Common table expressions
-- [x] Set Operations
-  - [x] UNION ALL
-  - [x] UNION
-  - [x] INTERSECT
-  - [x] INTERSECT ALL
-  - [x] EXCEPT
-  - [x] EXCEPT ALL
-- [x] Joins
-  - [x] INNER JOIN
-  - [x] LEFT JOIN
-  - [x] RIGHT JOIN
-  - [x] FULL JOIN
-  - [x] CROSS JOIN
-- [ ] Window
-  - [x] Empty window
-  - [x] Common window functions
-  - [x] Window with PARTITION BY clause
-  - [x] Window with ORDER BY clause
-  - [ ] Window with FILTER clause
-  - [ ] [Window with custom WINDOW FRAME](https://github.com/apache/arrow-datafusion/issues/361)
-  - [ ] UDF and UDAF for window functions
+- [x] Common Table Expressions (CTE)
+- [x] Set Operations (`UNION [ALL]`, `INTERSECT [ALL]`, `EXCEPT [ALL]`)
+- [x] Joins (`INNER`, `LEFT`, `RIGHT`, `FULL`, `CROSS`)
+- [x] Window Functions
+  - [x] Empty (`OVER()`)
+  - [x] Partitioning and ordering: (`OVER(PARTITION BY <..> ORDER BY <..>)`)
+  - [x] Custom Window (`OVER(ORDER BY time ROWS BETWEEN 2 PRECEDING AND 0 FOLLOWING)`)
+  - [x] User Defined Window and Aggregate Functions
+- [x] Catalogs
+  - [x] Schemas (`CREATE / DROP SCHEMA`)
+  - [x] Tables (`CREATE / DROP TABLE`, `CREATE TABLE AS SELECT`)
+- [ ] Data Insert
+  - [x] `INSERT INTO`
+  - [ ] `COPY .. INTO ..`
+  - [x] CSV
+  - [ ] JSON
+  - [ ] Parquet
+  - [ ] Avro
+
+## Runtime
+
+- [x] Streaming Grouping
+- [x] Streaming Window Evaluation
+- [x] Memory limits enforced
+- [x] Spilling (to disk) Sort
+- [ ] Spilling (to disk) Grouping
+- [ ] Spilling (to disk) Joins
 
 ## Data Sources
 
@@ -132,8 +96,7 @@ In addition to allowing arbitrary datasources via the `TableProvider`
 trait, DataFusion includes built in support for the following formats:
 
 - [x] CSV
-- [x] Parquet primitive types
-- [x] Parquet nested types
+- [x] Parquet (for all primitive and nested types)
 - [x] JSON
 - [x] Avro
 - [x] Arrow
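
The "Custom Window" entry in the status list above names a frame of the form `ROWS BETWEEN 2 PRECEDING AND 0 FOLLOWING`. As a rough illustration of what such a frame computes, the sketch below runs that clause through Python's built-in `sqlite3` module. It uses SQLite, not DataFusion, purely as a stand-in SQL engine. The `readings` table, its columns, and the `rolling_sums` helper are hypothetical names invented for this example. It also assumes the bundled SQLite is version 3.25 or newer, since older versions lack window functions.

```python
import sqlite3


def rolling_sums(values):
    """Sum each value with up to two preceding rows, ordered by time.

    Hypothetical helper: uses SQLite (not DataFusion) to show the frame
    semantics of ROWS BETWEEN 2 PRECEDING AND 0 FOLLOWING.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table; time is just the 1-based position of each value.
        conn.execute("CREATE TABLE readings (time INTEGER, value INTEGER)")
        conn.executemany(
            "INSERT INTO readings VALUES (?, ?)",
            list(enumerate(values, start=1)),
        )
        # The frame covers the current row and the two rows before it.
        return conn.execute(
            """
            SELECT time,
                   SUM(value) OVER (
                       ORDER BY time
                       ROWS BETWEEN 2 PRECEDING AND 0 FOLLOWING
                   ) AS rolling_sum
            FROM readings
            ORDER BY time
            """
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for time, rolling_sum in rolling_sums([10, 20, 30, 40]):
        print(time, rolling_sum)
```

The first two rows have fewer than two predecessors, so their frames are shorter. `0 FOLLOWING` ends the frame at the current row.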