You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@drill.apache.org by dz...@apache.org on 2023/02/21 13:25:46 UTC

[drill-site] branch master updated (3f5ac6dbc -> e134c7f74)

This is an automated email from the ASF dual-hosted git repository.

dzamo pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/drill-site.git


    from 3f5ac6dbc Update docs for Drill 1.21.0.
     new 3aaa82d10 Document the MS Access storage plugin.
     new e134c7f74 Document filtered aggregates.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../plugins/180-ms-access-plugin.md                | 47 ++++++++++++++++++++++
 .../050-aggregate-and-aggregate-statistical.md     | 29 +++++++++++++
 2 files changed, 76 insertions(+)
 create mode 100644 _docs/en/connect-a-data-source/plugins/180-ms-access-plugin.md


[drill-site] 02/02: Document filtered aggregates.

Posted by dz...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

dzamo pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/drill-site.git

commit e134c7f7472cedd65e3107a5cf3d370c138c68ed
Author: James Turton <ja...@somecomputer.xyz>
AuthorDate: Tue Feb 21 15:25:28 2023 +0200

    Document filtered aggregates.
---
 .../050-aggregate-and-aggregate-statistical.md     | 29 ++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/_docs/en/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md b/_docs/en/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md
index 8a65264e4..49bcde22b 100644
--- a/_docs/en/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md
+++ b/_docs/en/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md
@@ -98,6 +98,35 @@ SELECT ANY_VALUE(employee_id) as anyemp FROM cp.`employee.json` GROUP BY salary
 ...
 ```
 
+## Filtered Aggregates
+**Introduced in release: 1.21**
+
+Starting in Drill 1.21 it is possible to follow an aggregate function invocation with a boolean expression that will filter the values procesed by the aggregate using the following syntax.
+```
+agg_func( column ) FILTER(WHERE boolean_expression)
+```
+
+For example
+``` sql
+SELECT
+  count(n_name) FILTER(WHERE n_regionkey = 1) AS nations_count_in_1_region,
+  count(n_name) FILTER(WHERE n_regionkey = 2) AS nations_count_in_2_region,
+  count(n_name) FILTER(WHERE n_regionkey = 3) AS nations_count_in_3_region,
+  count(n_name) FILTER(WHERE n_regionkey = 4) AS nations_count_in_4_region,
+  count(n_name) FILTER(WHERE n_regionkey = 0) AS nations_count_in_0_region
+FROM cp.`tpch/nation.parquet`
+```
+will return
+```
++---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
+| nations_count_in_1_region | nations_count_in_2_region | nations_count_in_3_region | nations_count_in_4_region | nations_count_in_0_region |
++---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
+| 5                         | 5                         | 5                         | 5                         | 5                         |
++---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
+```
+
+**N.B.** Some versions of Drill prior to 1.21 do not fail if FILTER expressions are included with aggregate function calls, but silently do no filtering yielding incorrect results. Filtered aggregates are only supported from version 1.21 onward.
+
 ## AVG
 
 Returns the average of a numerical expression.


[drill-site] 01/02: Document the MS Access storage plugin.

Posted by dz...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

dzamo pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/drill-site.git

commit 3aaa82d102ca9d9bd6022b6d4a63d30957be4900
Author: James Turton <ja...@somecomputer.xyz>
AuthorDate: Tue Feb 21 15:10:12 2023 +0200

    Document the MS Access storage plugin.
---
 .../plugins/180-ms-access-plugin.md                | 47 ++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/_docs/en/connect-a-data-source/plugins/180-ms-access-plugin.md b/_docs/en/connect-a-data-source/plugins/180-ms-access-plugin.md
new file mode 100644
index 000000000..748420b3c
--- /dev/null
+++ b/_docs/en/connect-a-data-source/plugins/180-ms-access-plugin.md
@@ -0,0 +1,47 @@
+---
+title: "MS Access Storage Plugin"
+slug: "MS Access Storage Plugin"
+parent: "Connect a Data Source"
+---
+
+**Introduced in release:** 1.21
+
+This plugin enables Drill to read Microsoft Access database files. This plugin can read Access Files from all versions later than Access 1997.
+
+## Configuration
+Simply add the following to any Drill file system configuration.  Typically, MS Access files will use the extension `accdb` or `mdb`.  Drill comes pre-configured to recognize these extensions as MS Access.
+
+```json
+"msaccess": {
+  "type": "msaccess",
+  "extensions": ["mdb", "accdb"]
+}
+```
+
+## Schemas
+Drill will discover the schema automatically from the Access file.  The plugin does support schema provisioning for consistency, but is not recommended.
+
+## Querying a Table
+Access files will contain multiple tables.  To access a specific table, use the `table()` function in the `FROM` clause, and specify the table name using the `tableName` parameter, as shown below.
+
+```sql
+SELECT * 
+FROM table(dfs.`file_name.accdb` (type=> 'msaccess', tableName => 'Table1'))
+```
+
+## Metadata Queries
+Since an Access file may contain multiple tables, there needs to be a way to determine what tables are present in the Access file.  In Drill, simply querying a file, without specifying a `tableName` will result in a metadata query, rather than getting the actual data back.  
+
+For example:
+
+```sql
+SELECT * FROM dfs.test.`access/data/V2019/extDateTestV2019.accdb`;
++--------+-------------------------+-------------------------+-----------+-----------+----------------------------------------------------------------------+
+| table  |      created_date       |      updated_date       | row_count | col_count |                               columns                                |
++--------+-------------------------+-------------------------+-----------+-----------+----------------------------------------------------------------------+
+| Table1 | 2021-06-03 20:09:56.993 | 2021-06-03 20:09:56.993 | 9         | 6         | ["ID","Field1","DateExt","DateNormal","DateExtStr","DateNormalCalc"] |
++--------+-------------------------+-------------------------+-----------+-----------+----------------------------------------------------------------------+
+
+```
+## Password-Protected Files
+The password protection in Access is just a software level protection and really does not offer any security.  Drill can query password protected files without any password.