You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@arrow.apache.org by li...@apache.org on 2022/12/13 17:45:32 UTC
[arrow-adbc] branch main updated: docs(c/driver/flight_sql): document new driver options (#228)
This is an automated email from the ASF dual-hosted git repository.
lidavidm pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-adbc.git
The following commit(s) were added to refs/heads/main by this push:
new ccfdbbc docs(c/driver/flight_sql): document new driver options (#228)
ccfdbbc is described below
commit ccfdbbcb4a95ec697fcbe8cb1712fe02bcf71744
Author: David Li <li...@gmail.com>
AuthorDate: Tue Dec 13 12:45:26 2022 -0500
docs(c/driver/flight_sql): document new driver options (#228)
---
docs/source/_static/css/custom.css | 20 +++++
docs/source/conf.py | 3 +-
docs/source/driver/cpp/flight_sql.rst | 139 ++++++++++++++++++++++++++++------
3 files changed, 136 insertions(+), 26 deletions(-)
diff --git a/docs/source/_static/css/custom.css b/docs/source/_static/css/custom.css
new file mode 100644
index 0000000..543a956
--- /dev/null
+++ b/docs/source/_static/css/custom.css
@@ -0,0 +1,20 @@
+/* Licensed to the Apache Software Foundation (ASF) under one */
+/* or more contributor license agreements. See the NOTICE file */
+/* distributed with this work for additional information */
+/* regarding copyright ownership. The ASF licenses this file */
+/* to you under the Apache License, Version 2.0 (the */
+/* "License"); you may not use this file except in compliance */
+/* with the License. You may obtain a copy of the License at */
+
+/* http://www.apache.org/licenses/LICENSE-2.0 */
+
+/* Unless required by applicable law or agreed to in writing, */
+/* software distributed under the License is distributed on an */
+/* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY */
+/* KIND, either express or implied. See the License for the */
+/* specific language governing permissions and limitations */
+/* under the License. */
+
+p.admonition-title {
+ font-weight: bold;
+}
diff --git a/docs/source/conf.py b/docs/source/conf.py
index bce8f7e..a43a3c2 100644
--- a/docs/source/conf.py
+++ b/docs/source/conf.py
@@ -71,6 +71,8 @@ breathe_projects = {
# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
+html_css_files = ["css/custom.css"]
+html_static_path = ["_static"]
html_theme = "furo"
html_theme_options = {
"dark_logo": "logo-dark.png",
@@ -79,7 +81,6 @@ html_theme_options = {
"source_branch": "main",
"source_directory": "docs/source/",
}
-html_static_path = ["_static"]
# -- Options for Intersphinx -------------------------------------------------
diff --git a/docs/source/driver/cpp/flight_sql.rst b/docs/source/driver/cpp/flight_sql.rst
index a67c617..d8ae7a6 100644
--- a/docs/source/driver/cpp/flight_sql.rst
+++ b/docs/source/driver/cpp/flight_sql.rst
@@ -60,17 +60,76 @@ the :cpp:class:`AdbcDatabase`.
with pyarrow.flight_sql.connect("grpc://localhost:8080") as conn:
pass
-Additional Configuration Options
---------------------------------
+Supported Features
+==================
-The Flight SQL driver supports some additional configuration options
-in addition to the "standard" ADBC options.
+The Flight SQL driver generally supports features defined in the ADBC
+API specification 1.0.0, as well as some additional, custom options.
+
+Authentication
+--------------
+
+The driver does no authentication by default.
+
+The driver implements one optional authentication scheme that mimics
+the Arrow Flight SQL JDBC driver. This can be enabled by setting the
+option ``arrow.flight.sql.authorization_header`` on the
+:cpp:class:`AdbcDatabase`. The client provides credentials by setting
+the option value to the value of the ``authorization`` header sent
+from client to server. The server then responds with an
+``authorization`` header on the first request. The value of this
+header will then be sent back as the ``authorization`` header on all
+future requests.
+
+Bulk Ingestion
+--------------
+
+Flight SQL does not have a dedicated API for bulk ingestion of Arrow
+data into a given table. The driver instead constructs SQL statements
+to create and insert into the table.
+
+.. warning:: The driver does not escape or validate the names of
+ tables or columns. As a precaution, it instead limits
+ identifier names to letters, numbers, and underscores.
+ Bulk ingestion should not be used with untrusted user
+ input.
+
+The driver binds a batch of data at a time for efficiency. Also, the
+generated SQL statements hardcode ``?`` as the parameter identifier.
+
+Client Options
+--------------
+
+The options used for creating the Flight RPC client can be customized.
+These options map 1:1 with the options in FlightClientOptions:
+
+``arrow.flight.sql.client_option.tls_root_certs``
+ Override the root certificates used to validate the server's TLS
+ certificate.
+
+``arrow.flight.sql.client_option.override_hostname``
+ Override the hostname used to verify the server's TLS certificate.
+
+``arrow.flight.sql.client_option.cert_chain``
+ The certificate chain to use for mTLS.
+
+``arrow.flight.sql.client_option.private_key``
+ The private key to use for mTLS.
+
+``arrow.flight.sql.client_option.generic_int_option.``
+``arrow.flight.sql.client_option.generic_string_option.``
+ Option prefixes used to specify generic transport-layer options.
+
+``arrow.flight.sql.client_option.disable_server_verification``
+ Disable verification of the server's TLS certificate. Value
+ should be ``true`` or ``false``.
Custom Call Headers
-~~~~~~~~~~~~~~~~~~~
+-------------------
Custom HTTP headers can be attached to requests via options that apply
-to both :cpp:class:`AdbcConnection` and :cpp:class:`AdbcStatement`.
+to :cpp:class:`AdbcDatabase`, :cpp:class:`AdbcConnection`, and
+:cpp:class:`AdbcStatement`.
``arrow.flight.sql.rpc.call_header.<HEADER NAME>``
Add the header ``<HEADER NAME>`` to outgoing requests with the given
@@ -78,8 +137,46 @@ to both :cpp:class:`AdbcConnection` and :cpp:class:`AdbcStatement`.
.. warning:: Header names must be in all lowercase.
+Distributed Result Sets
+-----------------------
+
+The driver will fetch all partitions (FlightEndpoints) returned by the
+server, in an unspecified order (note that Flight SQL itself does not
+define an ordering on these partitions). If an endpoint has no
+locations, the data will be fetched using the original server
+connection. Else, the driver will try each location given, in order,
+until a request succeeds. If the connection or request fails, it will
+try the next location.
+
+The driver does not currently cache or pool these secondary
+connections. It also does not retry connections or requests.
+Requests are made sequentially, one at a timeāthe driver does not
+parallelize requests or perform readahead.
+
+Metadata
+--------
+
+The driver currently will not populate column constraint info (foreign
+keys, primary keys, etc.) in :cpp:func:`AdbcConnectionGetObjects`.
+Also, catalog filters are evaluated as simple string matches, not
+``LIKE``-style patterns.
+
+Partitioned Result Sets
+-----------------------
+
+The Flight SQL driver supports ADBC's partitioned result sets. When
+requested, each partition of a result set contains a serialized
+FlightInfo, containing one of the FlightEndpoints of the original
+response. Clients who may wish to introspect the partition can do so
+by deserializing the contained FlightInfo from the ADBC partitions.
+(For example, a client that wishes to distribute work across multiple
+workers or machines may want to try to take advantage of locality
+information that ADBC does not have.)
+
+.. TODO: code samples
+
Timeouts
-~~~~~~~~
+--------
By default, timeouts are not used for RPC calls. They can be set via
special options on :cpp:class:`AdbcConnection`. In general, it is
@@ -108,10 +205,14 @@ The options are as follows:
For example, this controls the timeout of the underlying Flight
calls that implement bulk ingestion, or transaction support.
-.. TODO: code samples
+Transactions
+------------
+
+The driver will issue transaction RPCs, but the driver will not check
+the server's SqlInfo to determine whether this is supported first.
Type Mapping
-~~~~~~~~~~~~
+------------
When executing a bulk ingestion operation, the driver needs to be able
to construct appropriate SQL queries for the database. (The driver
@@ -125,6 +226,10 @@ Flight SQL metadata to construct this mapping.)
All such options begin with ``arrow.flight.sql.quirks.ingest_type.``
and are followed by a type name below.
+.. warning:: The driver does **not** escape or validate the values
+ here. They should not come from untrusted user input, or
+ a SQL injection vulnerability may result.
+
.. csv-table:: Type Names
:header: "Arrow Type Name", "Default SQL Type Name"
@@ -146,20 +251,4 @@ and are followed by a type name below.
time64,TIME
timestamp,TIMESTAMP
-.. TODO: code samples
-
-Partitioned Result Set Support
-------------------------------
-
-The Flight SQL driver supports ADBC's partitioned result sets, mapping
-them onto FlightEndpoints. Each partition of a result set contains a
-serialized FlightInfo, containing one of the FlightEndpoints of the
-original response. Clients who may wish to introspect the partition
-can do so by deserializing the contained FlightInfo from the ADBC
-partitions. (For example, a client that wishes to distribute work
-across multiple workers or machines may want to try to take advantage
-of locality information that ADBC does not have.)
-
-.. TODO: code samples
-
.. _DBAPI 2.0: https://peps.python.org/pep-0249/