Posted to commits@arrow.apache.org by li...@apache.org on 2022/04/20 13:27:56 UTC

[arrow] branch master updated: ARROW-16065: [FlightRPC][Docs] Improve Flight documentation

This is an automated email from the ASF dual-hosted git repository.

lidavidm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
     new 7dd8a4bd62 ARROW-16065: [FlightRPC][Docs] Improve Flight documentation
7dd8a4bd62 is described below

commit 7dd8a4bd62879416eca189fe6d9e0023e4936d87
Author: David Li <li...@gmail.com>
AuthorDate: Wed Apr 20 09:27:39 2022 -0400

    ARROW-16065: [FlightRPC][Docs] Improve Flight documentation
    
    This reworks the Flight protocol documentation and the C++ Flight documentation, and adds a brief Python documentation page. It also fixes some warnings and errors I found.
    
    Closes #12815 from lidavidm/arrow-16065
    
    Lead-authored-by: David Li <li...@gmail.com>
    Co-authored-by: Will Jones <wi...@gmail.com>
    Signed-off-by: David Li <li...@gmail.com>
---
 docs/source/cpp/api/utilities.rst            |  54 +++----
 docs/source/cpp/build_system.rst             |   2 +-
 docs/source/cpp/flight.rst                   | 112 ++++++++++----
 docs/source/cpp/orc.rst                      |   6 +-
 docs/source/format/Flight.rst                | 218 ++++++++++++++++++++++-----
 docs/source/format/Flight/DoExchange.mmd     |  32 ++++
 docs/source/format/Flight/DoExchange.mmd.svg |   1 +
 docs/source/format/Flight/DoGet.mmd          |  33 ++++
 docs/source/format/Flight/DoGet.mmd.svg      |   1 +
 docs/source/format/Flight/DoPut.mmd          |  29 ++++
 docs/source/format/Flight/DoPut.mmd.svg      |   1 +
 docs/source/python/api/flight.rst            |  17 +++
 docs/source/python/flight.rst                | 130 ++++++++++++++++
 docs/source/python/index.rst                 |   5 +-
 python/pyarrow/_flight.pyx                   |  48 +++++-
 15 files changed, 582 insertions(+), 107 deletions(-)

diff --git a/docs/source/cpp/api/utilities.rst b/docs/source/cpp/api/utilities.rst
index a0043ff9ca..5ce3d7d1a4 100644
--- a/docs/source/cpp/api/utilities.rst
+++ b/docs/source/cpp/api/utilities.rst
@@ -78,9 +78,9 @@ Visitors
 Type Traits
 ===========
 
-These types provide relationships between Arrow types at compile time. :cpp:type:`TypeTraits`
-maps Arrow DataTypes to other types, and :cpp:type:`CTypeTraits ` maps C types to
-Arrow types.
+These types provide relationships between Arrow types at compile
+time. :cpp:type:`TypeTraits` maps Arrow DataTypes to other types, and
+:cpp:type:`CTypeTraits` maps C types to Arrow types.
 
 TypeTraits
 ----------
@@ -89,40 +89,40 @@ Each specialized type defines the following associated types:
 
 .. cpp:type:: TypeTraits::ArrayType
 
-  Corresponding :doc:`Arrow array type </cpp/api/array.rst>`
+   Corresponding :doc:`Arrow array type <./array>`
 
 .. cpp:type:: TypeTraits::BuilderType
 
-  Corresponding :doc:`array builder type </cpp/api/builders.rst>`
+   Corresponding :doc:`array builder type <./builder>`
 
 .. cpp:type:: TypeTraits::ScalarType
 
-  Corresponding :doc:`Arrow scalar type </cpp/api/scalar.rst>`
+   Corresponding :doc:`Arrow scalar type <./scalar>`
 
-.. cpp:var:: TypeTraits::is_parameter_free
+.. cpp:var:: bool TypeTraits::is_parameter_free
 
-  Whether the type has any type parameters, such as field types in nested types
-  or scale and precision in decimal types.
+   Whether the type has any type parameters, such as field types in nested types
+   or scale and precision in decimal types.
 
 
 In addition, the following are defined for many but not all of the types:
 
 .. cpp:type:: TypeTraits::CType
 
-  Corresponding C type. For example, ``int64_t`` for ``Int64Array``.
+   Corresponding C type. For example, ``int64_t`` for ``Int64Array``.
 
 .. cpp:type:: TypeTraits::TensorType
 
-  Corresponding :doc:`Arrow tensor type </cpp/api/tensor.rst>`
+   Corresponding :doc:`Arrow tensor type <./tensor>`
 
 .. cpp:function:: static inline constexpr int64_t bytes_required(int64_t elements)
 
-  Return the number of bytes required for given number of elements. Defined for 
-  types with a fixed size.
+   Return the number of bytes required for given number of elements. Defined for
+   types with a fixed size.
 
 .. cpp:function:: static inline std::shared_ptr<DataType> TypeTraits::type_singleton()
 
-  For types where is_parameter_free is true, returns an instance of the data type.
+   For types where is_parameter_free is true, returns an instance of the data type.
 
 
 .. doxygengroup:: type-traits
@@ -137,7 +137,7 @@ Each specialized type defines the following associated types:
 
 .. cpp:type:: CTypeTraits::ArrowType
 
-  Corresponding :doc:`Arrow type </cpp/api/datatype.rst>`
+   Corresponding :doc:`Arrow type <./datatype>`
 
 .. doxygengroup:: c-type-traits
    :content-only:
@@ -150,24 +150,24 @@ Each specialized type defines the following associated types:
 Type Predicates
 ---------------
 
-Type predicates that can be used with templates. Predicates of the form ``is_XXX`` 
-resolve to constant boolean values, while predicates of the form ``enable_if_XXX`` 
-resolve to the second type parameter ``R`` if the first parameter ``T`` passes 
+Type predicates that can be used with templates. Predicates of the form ``is_XXX``
+resolve to constant boolean values, while predicates of the form ``enable_if_XXX``
+resolve to the second type parameter ``R`` if the first parameter ``T`` passes
 the test.
 
 Example usage:
 
 .. code-block:: cpp
 
-  template<typename TypeClass>
-  arrow::enable_if_number<TypeClass, RETURN_TYPE> MyFunction(const TypeClass& type) {
-    ..
-  }
+   template<typename TypeClass>
+   arrow::enable_if_number<TypeClass, RETURN_TYPE> MyFunction(const TypeClass& type) {
+     ..
+   }
 
-  template<typename ArrayType, typename TypeClass=ArrayType::TypeClass>
-  arrow::enable_if_number<TypeClass, RETURN_TYPE> MyFunction(const ArrayType& array) {
-    ..
-  }
+   template<typename ArrayType, typename TypeClass=ArrayType::TypeClass>
+   arrow::enable_if_number<TypeClass, RETURN_TYPE> MyFunction(const ArrayType& array) {
+     ..
+   }
 
 
 .. doxygengroup:: type-predicates
@@ -184,4 +184,4 @@ Type predicates that can be applied at runtime.
 .. doxygengroup:: runtime-type-predicates
    :content-only:
    :members:
-   :undoc-members:
\ No newline at end of file
+   :undoc-members:
diff --git a/docs/source/cpp/build_system.rst b/docs/source/cpp/build_system.rst
index d91d070e09..ff353f3832 100644
--- a/docs/source/cpp/build_system.rst
+++ b/docs/source/cpp/build_system.rst
@@ -176,7 +176,7 @@ text version of the IANA timezone database and add the Windows timezone mapping
 XML. To download, you can use the following batch script:
 
 .. literalinclude:: ../../../ci/appveyor-cpp-setup.bat
-   :language: cmd
+   :language: batch
    :start-after: @rem (Doc section: Download timezone database)
    :end-before: @rem (Doc section: Download timezone database)
 
diff --git a/docs/source/cpp/flight.rst b/docs/source/cpp/flight.rst
index 75aea3c47c..a941ead904 100644
--- a/docs/source/cpp/flight.rst
+++ b/docs/source/cpp/flight.rst
@@ -23,8 +23,20 @@ Arrow Flight RPC
 ================
 
 Arrow Flight is an RPC framework for efficient transfer of Flight data
-over the network. See :doc:`../format/Flight` for full details on
-the protocol, or :doc:`./api/flight` for API docs.
+over the network.
+
+.. seealso::
+
+   :doc:`Flight protocol documentation <../format/Flight>`
+        Documentation of the Flight protocol, including how to use
+        Flight conceptually.
+
+   :doc:`Flight API documentation <./api/flight>`
+        C++ API documentation listing all of the various client and
+        server types.
+
+   `C++ Cookbook <https://arrow.apache.org/cookbook/cpp/flight.html>`_
+        Recipes for using Arrow Flight in C++.
 
 Writing a Flight Service
 ========================
@@ -82,41 +94,83 @@ server stops.
    std::cout << "Server listening on localhost:" << server->port() << std::endl;
    ARROW_CHECK_OK(server->Serve());
 
-
-Enabling TLS and Authentication
--------------------------------
-
-TLS can be enabled by providing a certificate and key pair to
-:func:`FlightServerBase::Init
-<arrow::flight::FlightServerBase::Init>`. Additionally, use
-:func:`Location::ForGrpcTls <arrow::flight::Location::ForGrpcTls>` to
-construct the :class:`arrow::flight::Location` to listen on.
-
-Similarly, authentication can be enabled by providing an
-implementation of :class:`ServerAuthHandler
-<arrow::flight::ServerAuthHandler>`. Authentication consists of two
-parts: on initial client connection, the server and client
-authentication implementations can perform any negotiation needed;
-then, on each RPC thereafter, the client provides a token. The server
-authentication handler validates the token and provides the identity
-of the client. This identity can be obtained from the
-:class:`arrow::flight::ServerCallContext`.
-
 Using the Flight Client
 =======================
 
 To connect to a Flight service, create an instance of
 :class:`arrow::flight::FlightClient` by calling :func:`Connect
-<arrow::flight::FlightClient::Connect>`. This takes a Location and
-returns the client through an out parameter. To authenticate, call
+<arrow::flight::FlightClient::Connect>`.
+
+Each RPC method returns :class:`arrow::Result` to indicate the
+success/failure of the request, and the result object if the request
+succeeded. Some calls are streaming calls, so they will return a
+reader and/or a writer object; the final call status isn't known until
+the stream is completed.
+
+Cancellation and Timeouts
+=========================
+
+When making a call, clients can optionally provide
+:class:`FlightCallOptions <arrow::flight::FlightCallOptions>`. This
+allows clients to set a timeout on calls or provide custom HTTP
+headers, among other features. Also, some objects returned by client
+RPC calls expose a ``Cancel`` method which allows terminating a call
+early.
+
+On the server side, no additional code is needed to implement
+timeouts. For cancellation, the server needs to manually poll
+:func:`ServerCallContext::is_cancelled
+<arrow::flight::ServerCallContext::is_cancelled>` to check if the
+client has cancelled the call, and if so, break out of any processing
+the server is currently doing.
+
+Enabling TLS
+============
+
+TLS can be enabled when setting up a server by providing a certificate
+and key pair to :func:`FlightServerBase::Init
+<arrow::flight::FlightServerBase::Init>`.
+
+On the client side, use :func:`Location::ForGrpcTls
+<arrow::flight::Location::ForGrpcTls>` to construct the
+:class:`arrow::flight::Location` to connect to.
+
+Enabling Authentication
+=======================
+
+.. warning:: Authentication is insecure without enabling TLS.
+
+Handshake-based authentication can be enabled by implementing
+:class:`ServerAuthHandler <arrow::flight::ServerAuthHandler>` and
+providing this to the server during construction.
+
+Authentication consists of two parts: on initial client connection,
+the server and client authentication implementations can perform any
+negotiation needed. The client authentication handler then provides a
+token that will be attached to future calls. This is done by calling
 :func:`Authenticate <arrow::flight::FlightClient::Authenticate>` with
 the desired client authentication implementation.
 
-Each RPC method returns :class:`arrow::Status` to indicate the
-success/failure of the request. Any other return values are specified
-through out parameters. They also take an optional :class:`options
-<arrow::flight::FlightCallOptions>` parameter that allows specifying a
-timeout for the call.
+On each RPC thereafter, the client handler's token is automatically
+added to the call in the request headers. The server authentication
+handler validates the token and provides the identity of the
+client. On the server, this identity can be obtained from the
+:class:`arrow::flight::ServerCallContext`.
+
+Custom Middleware
+=================
+
+Servers and clients support custom middleware (or interceptors) that
+are called on every request and can modify the request in a limited
+fashion.  These can be implemented by subclassing :class:`ServerMiddleware
+<arrow::flight::ServerMiddleware>` and :class:`ClientMiddleware
+<arrow::flight::ClientMiddleware>`, then providing them when creating
+the client or server.
+
+Middleware are fairly limited, but they can add headers to a
+request/response. On the server, they can inspect incoming headers and
+fail the request; hence, they can be used to implement custom
+authentication methods.
 
 Alternative Transports
 ======================
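
The "Cancellation and Timeouts" text added to docs/source/cpp/flight.rst above says the server must poll for cancellation and break out of its work. A minimal sketch of that polling pattern follows. It is a toy model in plain Python, not the Arrow API: ``CallContext`` and ``stream_batches`` are hypothetical names invented for illustration, and only the idea of checking an ``is_cancelled`` flag between units of work comes from the documentation.

```python
import threading
from typing import Iterable, Iterator, List


class CallContext:
    """Hypothetical stand-in for a server call context with a cancel flag."""

    def __init__(self) -> None:
        self._cancelled = threading.Event()

    def cancel(self) -> None:
        # In a real service the client triggers this; here we call it directly.
        self._cancelled.set()

    def is_cancelled(self) -> bool:
        return self._cancelled.is_set()


def stream_batches(context: CallContext, batches: Iterable[str]) -> Iterator[str]:
    """Yield batches, polling for cancellation before producing each one."""
    for batch in batches:
        if context.is_cancelled():
            return  # the client gave up, so stop doing work
        yield batch


context = CallContext()
sent: List[str] = []
for batch in stream_batches(context, ["b0", "b1", "b2", "b3"]):
    sent.append(batch)
    if batch == "b1":
        context.cancel()  # simulate the client cancelling mid-stream
```

The check sits at the top of the loop so that at most one further unit of work is started after the client cancels.
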
diff --git a/docs/source/cpp/orc.rst b/docs/source/cpp/orc.rst
index 2cd54018c3..863a842e8a 100644
--- a/docs/source/cpp/orc.rst
+++ b/docs/source/cpp/orc.rst
@@ -91,10 +91,10 @@ Here are a list of ORC types and mapped Arrow types.
 * \(1) We do not support writing UNION types.
 
 * \(2) On the read side the ORC type is read as the first corresponding Arrow type in the table.
-  
+
 * \(3) On the read side the ORC TIMESTAMP type is read as the Arrow Timestamp type with
-  :type:`arrow::TimeUnit::NANO`. Also we currently don't support timezones.
-       
+  :cpp:enumerator:`arrow::TimeUnit::NANO`. Also we currently don't support timezones.
+
 
 Compression
 -----------
diff --git a/docs/source/format/Flight.rst b/docs/source/format/Flight.rst
index c79c563864..972bcaeebc 100644
--- a/docs/source/format/Flight.rst
+++ b/docs/source/format/Flight.rst
@@ -17,6 +17,7 @@
 
 .. _flight-rpc:
 
+================
 Arrow Flight RPC
 ================
 
@@ -29,73 +30,210 @@ either downloaded from or uploaded to another service. A set of
 metadata methods offers discovery and introspection of streams, as
 well as the ability to implement application-specific methods.
 
-Methods and message wire formats are defined by Protobuf, enabling
+Methods and message wire formats are defined by Protobuf_, enabling
 interoperability with clients that may support gRPC and Arrow
 separately, but not Flight. However, Flight implementations include
 further optimizations to avoid overhead in usage of Protobuf (mostly
 around avoiding excessive memory copies).
 
 .. _gRPC: https://grpc.io/
+.. _Protobuf: https://developers.google.com/protocol-buffers/
 
-RPC Methods
------------
+RPC Methods and Request Patterns
+================================
 
 Flight defines a set of RPC methods for uploading/downloading data,
 retrieving metadata about a data stream, listing available data
 streams, and for implementing application-specific RPC methods. A
 Flight service implements some subset of these methods, while a Flight
-client can call any of these methods. Thus, one Flight client can
-connect to any Flight service and perform basic operations.
+client can call any of these methods.
+
+Data streams are identified by descriptors (the ``FlightDescriptor``
+message), which are either a path or an arbitrary binary command. For
+instance, the descriptor may encode a SQL query, a path to a file on a
+distributed file system, or even a pickled Python object; the
+application can use this message as it sees fit.
+
+Thus, one Flight client can connect to any service and perform basic
+operations. To facilitate this, Flight services are *expected* to
+support some common request patterns, described next. Of course,
+applications may ignore compatibility and simply treat the Flight RPC
+methods as low-level building blocks for their own purposes.
+
+See `Protocol Buffer Definitions`_ for full details on the methods and
+messages involved.
 
-Data streams are identified by descriptors, which are either a path or
-an arbitrary binary command. A client that wishes to download the data
-would:
+Downloading Data
+----------------
+
+A client that wishes to download the data would:
+
+.. figure:: ./Flight/DoGet.mmd.svg
+
+   Retrieving data via ``DoGet``.
 
 #. Construct or acquire a ``FlightDescriptor`` for the data set they
-   are interested in. A client may know what descriptor they want
-   already, or they may use methods like ``ListFlights`` to discover
-   them.
+   are interested in.
+
+   A client may know what descriptor they want already, or they may
+   use methods like ``ListFlights`` to discover them.
 #. Call ``GetFlightInfo(FlightDescriptor)`` to get a ``FlightInfo``
-   message containing details on where the data is located (as well as
-   other metadata, like the schema and possibly an estimate of the
-   dataset size).
+   message.
 
    Flight does not require that data live on the same server as
-   metadata: this call may list other servers to connect to. The
-   ``FlightInfo`` message includes a ``Ticket``, an opaque binary
-   token that the server uses to identify the exact data set being
-   requested.
-#. Connect to other servers (if needed).
-#. Call ``DoGet(Ticket)`` to get back a stream of Arrow record
-   batches.
+   metadata. Hence, ``FlightInfo`` contains details on where the data
+   is located, so the client can go fetch the data from an appropriate
+   server. This is encoded as a series of ``FlightEndpoint`` messages
+   inside ``FlightInfo``. Each endpoint represents some location that
+   contains a subset of the response data.
+
+   An endpoint contains a list of locations (server addresses) where
+   this data can be retrieved from, and a ``Ticket``, an opaque binary
+   token that the server will use to identify the data being
+   requested. There is no ordering defined on endpoints or the data
+   within, so if the dataset is sorted, applications should return
+   data in a single endpoint.
+
+   The response also contains other metadata, like the schema, and
+   optionally an estimate of the dataset size.
+#. Consume each endpoint returned by the server.
+
+   To consume an endpoint, the client should connect to one of the
+   locations in the endpoint, then call ``DoGet(Ticket)`` with the
+   ticket in the endpoint. This will give the client a stream of Arrow
+   record batches.
+
+   If the server wishes to indicate that the data is on the local
+   server and not a different location, then it can return an empty
+   list of locations. The client can then reuse the existing
+   connection to the original server to fetch data. Otherwise, the
+   client must connect to one of the indicated locations.
+
+   In this way, the locations inside an endpoint can also be thought
+   of as performing look-aside load balancing or service discovery
+   functions. And the endpoints can represent data that is partitioned
+   or otherwise distributed.
+
+   The client must consume all endpoints to retrieve the complete data
+   set. The client can consume endpoints in any order, or even in
+   parallel, or distribute the endpoints among multiple machines for
+   consumption; this is up to the application to implement.
+
+Uploading Data
+--------------
 
 To upload data, a client would:
 
+.. figure:: ./Flight/DoPut.mmd.svg
+
+   Uploading data via ``DoPut``.
+
 #. Construct or acquire a ``FlightDescriptor``, as before.
 #. Call ``DoPut(FlightData)`` and upload a stream of Arrow record
-   batches. They would also include the ``FlightDescriptor`` with the
-   first message.
+   batches.
 
-See `Protocol Buffer Definitions`_ for full details on the methods and
-messages involved.
+   The ``FlightDescriptor`` is included with the first message so the
+   server can identify the dataset.
+
+``DoPut`` allows the server to send response messages back to the
+client with custom metadata. This can be used to implement things like
+resumable writes (e.g. the server can periodically send a message
+indicating how many rows have been committed so far).
+
+Exchanging Data
+---------------
+
+Some use cases may require uploading and downloading data within a
+single call. While this can be emulated with multiple calls, this may
+be difficult if the application is stateful. For instance, the
+application may wish to implement a call where the client uploads data
+and the server responds with a transformation of that data; this would
+require being stateful if implemented using ``DoGet`` and
+``DoPut``. Instead, ``DoExchange`` allows this to be implemented as a
+single call. A client would:
+
+.. figure:: ./Flight/DoExchange.mmd.svg
+
+   Complex data flow with ``DoExchange``.
+
+#. Construct or acquire a ``FlightDescriptor``, as before.
+#. Call ``DoExchange(FlightData)``.
+
+   The ``FlightDescriptor`` is included with the first message, as
+   with ``DoPut``. At this point, both the client and the server may
+   simultaneously stream data to the other side.
 
 Authentication
---------------
+==============
+
+Flight supports a variety of authentication methods that applications
+can customize for their needs.
+
+"Handshake" authentication
+  This is implemented in two parts. At connection time, the client
+  calls the ``Handshake`` RPC method, and the application-defined
+  authentication handler can exchange any number of messages with its
+  counterpart on the server. The handler then provides a binary
+  token. The Flight client will then include this token in the headers
+  of all future calls, which is validated by the server authentication
+  handler.
 
-Flight supports application-implemented authentication
-methods. Authentication, if enabled, has two phases: at connection
-time, the client and server can exchange any number of messages. Then,
-the client can provide a token alongside each call, and the server can
-validate that token.
+  Applications may use any part of this; for instance, they may ignore
+  the initial handshake and send an externally acquired token (e.g. a
+  bearer token) on each call, or they may establish trust during the
+  handshake and not validate a token for each call, treating the
+  connection as stateful (a "login" pattern).
 
-Applications may use any part of this; for instance, they may ignore
-the initial handshake and send an externally acquired token on each
-call, or they may establish trust during the handshake and not
-validate a token for each call. (Note that the latter is not secure if
-you choose to deploy a layer 7 load balancer, as is common with gRPC.)
+  .. warning:: Unless a token is validated on every call, this pattern
+               is not secure, especially in the presence of a layer
+               7 load balancer, as is common with gRPC, or if gRPC
+               transparently reconnects the client.
+
+Header-based/middleware-based authentication
+  Clients may include custom headers with calls. Custom middleware can
+  then be implemented to validate and accept/reject calls on the
+  server side.
+
+`Mutual TLS (mTLS)`_
+  The client provides a certificate during connection establishment
+  which is verified by the server. The application does not need to
+  implement any authentication code, but must provision and distribute
+  certificates.
+
+  This may only be available in certain implementations, and is only
+  available when TLS is also enabled.
+
+Some Flight implementations may expose the underlying gRPC API as
+well, in which case any `authentication method supported by gRPC
+<https://grpc.io/docs/guides/auth/>`_ is available.
+
+.. _Mutual TLS (mTLS): https://grpc.io/docs/guides/auth/#supported-auth-mechanisms
+
+Transport Implementations
+=========================
+
+Flight is primarily defined in terms of its Protobuf and gRPC
+specification below, but Arrow implementations may also support
+alternative transports (see :ref:`status-flight-rpc`). In that case,
+implementations should use the following URI schemes for the given
+transport implementations:
+
++----------------------------+----------------------------+
+| Transport                  | URI Scheme                 |
++============================+============================+
+| gRPC (plaintext)           | grpc: or grpc+tcp:         |
++----------------------------+----------------------------+
+| gRPC (TLS)                 | grpc+tls:                  |
++----------------------------+----------------------------+
+| gRPC (Unix domain socket)  | grpc+unix:                 |
++----------------------------+----------------------------+
+| UCX_ (plaintext)           | ucx:                       |
++----------------------------+----------------------------+
+
+.. _UCX: https://openucx.org/
 
 Error Handling
---------------
+==============
 
 Arrow Flight defines its own set of error codes. The implementation
 differs between languages (e.g. in C++, Unimplemented is a general
@@ -137,15 +275,15 @@ but the following set is exposed:
 |                |by the client for connectivity reasons.    |
 +----------------+-------------------------------------------+
 
-
 External Resources
-------------------
+==================
 
+- https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/
 - https://arrow.apache.org/blog/2018/10/09/0.11.0-release/
 - https://www.slideshare.net/JacquesNadeau5/apache-arrow-flight-overview
 
 Protocol Buffer Definitions
----------------------------
+===========================
 
 .. literalinclude:: ../../../format/Flight.proto
    :language: protobuf
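
The "Downloading Data" steps added to docs/source/format/Flight.rst above can be sketched as a small toy model. This is plain Python with in-memory stand-ins, not the Flight API: ``Endpoint``, ``Info``, ``fetch_all`` and the server URIs are hypothetical names chosen for illustration. The rules it models are the ones stated in the text: every endpoint must be consumed, and an empty location list means the client reuses its connection to the server that returned the ``FlightInfo``.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Endpoint:
    """Toy stand-in for FlightEndpoint: a ticket plus candidate locations."""
    ticket: bytes
    locations: List[str] = field(default_factory=list)


@dataclass
class Info:
    """Toy stand-in for FlightInfo; only the endpoint list is modeled."""
    endpoints: List[Endpoint]


# Hypothetical in-memory "servers": location URI -> DoGet-like function.
DoGet = Callable[[bytes], List[str]]


def fetch_all(info: Info, origin: str, servers: Dict[str, DoGet]) -> List[str]:
    """Consume every endpoint and collect the batches it returns."""
    batches: List[str] = []
    for endpoint in info.endpoints:
        # Empty locations: the data is on the origin server, so reuse it.
        # Otherwise connect to one of the listed locations (here, the first).
        location = endpoint.locations[0] if endpoint.locations else origin
        batches.extend(servers[location](endpoint.ticket))
    return batches


servers: Dict[str, DoGet] = {
    "grpc://meta:8815": lambda ticket: [f"meta:{ticket.decode()}"],
    "grpc://data-1:8815": lambda ticket: [f"data-1:{ticket.decode()}"],
}
info = Info(endpoints=[
    Endpoint(ticket=b"part-0", locations=["grpc://data-1:8815"]),
    Endpoint(ticket=b"part-1"),  # no locations: data lives on the origin
])
result = fetch_all(info, origin="grpc://meta:8815", servers=servers)
```

The sketch visits endpoints sequentially for simplicity. Since the protocol defines no ordering across endpoints, a real client could equally fetch them in parallel or hand them to different machines.
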
diff --git a/docs/source/format/Flight/DoExchange.mmd b/docs/source/format/Flight/DoExchange.mmd
new file mode 100644
index 0000000000..14f1789aea
--- /dev/null
+++ b/docs/source/format/Flight/DoExchange.mmd
@@ -0,0 +1,32 @@
+%% Licensed to the Apache Software Foundation (ASF) under one
+%% or more contributor license agreements.  See the NOTICE file
+%% distributed with this work for additional information
+%% regarding copyright ownership.  The ASF licenses this file
+%% to you under the Apache License, Version 2.0 (the
+%% "License"); you may not use this file except in compliance
+%% with the License.  You may obtain a copy of the License at
+%%
+%%   http://www.apache.org/licenses/LICENSE-2.0
+%%
+%% Unless required by applicable law or agreed to in writing,
+%% software distributed under the License is distributed on an
+%% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+%% KIND, either express or implied.  See the License for the
+%% specific language governing permissions and limitations
+%% under the License.
+
+%% To generate the diagram, use mermaid-cli
+%% Example: docker run --rm -v $(pwd)/FlightSql:/data minlag/mermaid-cli -i /data/CommandGetTables.mmd
+
+sequenceDiagram
+autonumber
+
+participant Client
+participant Server
+Note right of Client: The first FlightData includes a FlightDescriptor
+Client->>Server: DoExchange(FlightData)
+par [Client sends data]
+    Client->>Server: stream of FlightData
+and [Server sends data]
+    Server->>Client: stream of FlightData
+end
diff --git a/docs/source/format/Flight/DoExchange.mmd.svg b/docs/source/format/Flight/DoExchange.mmd.svg
new file mode 100644
index 0000000000..204d63d772
--- /dev/null
+++ b/docs/source/format/Flight/DoExchange.mmd.svg
@@ -0,0 +1 @@
+<svg id="mermaid-1649258037754" width="100%" xmlns="http://www.w3.org/2000/svg" height="449" style="max-width: 607px; background-color: white;" viewBox="-50 -10 607 449"><style>#mermaid-1649258037754 {font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}#mermaid-1649258037754 .error-icon{fill:#552222;}#mermaid-1649258037754 .error-text{fill:#552222;stroke:#552222;}#mermaid-1649258037754 .edge-thickness-normal{stroke-width:2px;}#mermaid-1649258037754 .edge-thickne [...]
\ No newline at end of file
diff --git a/docs/source/format/Flight/DoGet.mmd b/docs/source/format/Flight/DoGet.mmd
new file mode 100644
index 0000000000..c2e3cd0344
--- /dev/null
+++ b/docs/source/format/Flight/DoGet.mmd
@@ -0,0 +1,33 @@
+%% Licensed to the Apache Software Foundation (ASF) under one
+%% or more contributor license agreements.  See the NOTICE file
+%% distributed with this work for additional information
+%% regarding copyright ownership.  The ASF licenses this file
+%% to you under the Apache License, Version 2.0 (the
+%% "License"); you may not use this file except in compliance
+%% with the License.  You may obtain a copy of the License at
+%%
+%%   http://www.apache.org/licenses/LICENSE-2.0
+%%
+%% Unless required by applicable law or agreed to in writing,
+%% software distributed under the License is distributed on an
+%% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+%% KIND, either express or implied.  See the License for the
+%% specific language governing permissions and limitations
+%% under the License.
+
+%% To generate the diagram, use mermaid-cli
+%% Example: docker run --rm -v $(pwd)/FlightSql:/data minlag/mermaid-cli -i /data/CommandGetTables.mmd
+
+sequenceDiagram
+autonumber
+
+participant Client
+participant Metadata Server
+participant Data Server
+Client->>Metadata Server: GetFlightInfo(FlightDescriptor)
+Metadata Server->>Client: FlightInfo{endpoints: [FlightEndpoint{ticket: Ticket}, …]}
+Note over Client, Data Server: This may be parallelized
+loop for each endpoint in FlightInfo.endpoints
+    Client->>Data Server: DoGet(Ticket)
+    Data Server->>Client: stream of FlightData
+end
diff --git a/docs/source/format/Flight/DoGet.mmd.svg b/docs/source/format/Flight/DoGet.mmd.svg
new file mode 100644
index 0000000000..48a50d77ed
--- /dev/null
+++ b/docs/source/format/Flight/DoGet.mmd.svg
@@ -0,0 +1 @@
+<svg id="mermaid-1649258038801" width="100%" xmlns="http://www.w3.org/2000/svg" height="448" style="max-width: 910px; background-color: white;" viewBox="-50 -10 910 448"><style>#mermaid-1649258038801 {font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}#mermaid-1649258038801 .error-icon{fill:#552222;}#mermaid-1649258038801 .error-text{fill:#552222;stroke:#552222;}#mermaid-1649258038801 .edge-thickness-normal{stroke-width:2px;}#mermaid-1649258038801 .edge-thickne [...]
\ No newline at end of file
diff --git a/docs/source/format/Flight/DoPut.mmd b/docs/source/format/Flight/DoPut.mmd
new file mode 100644
index 0000000000..5845edef1f
--- /dev/null
+++ b/docs/source/format/Flight/DoPut.mmd
@@ -0,0 +1,29 @@
+%% Licensed to the Apache Software Foundation (ASF) under one
+%% or more contributor license agreements.  See the NOTICE file
+%% distributed with this work for additional information
+%% regarding copyright ownership.  The ASF licenses this file
+%% to you under the Apache License, Version 2.0 (the
+%% "License"); you may not use this file except in compliance
+%% with the License.  You may obtain a copy of the License at
+%%
+%%   http://www.apache.org/licenses/LICENSE-2.0
+%%
+%% Unless required by applicable law or agreed to in writing,
+%% software distributed under the License is distributed on an
+%% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+%% KIND, either express or implied.  See the License for the
+%% specific language governing permissions and limitations
+%% under the License.
+
+%% To generate the diagram, use mermaid-cli
+%% Example: docker run --rm -v $(pwd)/FlightSql:/data minlag/mermaid-cli -i /data/CommandGetTables.mmd
+
+sequenceDiagram
+autonumber
+
+participant Client
+participant Server
+Note right of Client: The first FlightData includes a FlightDescriptor
+Client->>Server: DoPut(FlightData)
+Client->>Server: stream of FlightData
+Server->>Client: PutResult{app_metadata}
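
The ``DoPut`` flow diagrammed above ends with the server returning ``PutResult{app_metadata}``, which the accompanying documentation suggests can support resumable writes. A minimal sketch of that idea follows, as a toy model in plain Python rather than the Flight API. The function ``do_put``, the ``ack_every`` parameter and the JSON payload layout are all assumptions made for illustration; Flight itself treats the metadata as opaque bytes.

```python
import json
from typing import Iterable, Iterator, List


def do_put(batches: Iterable[List[int]], ack_every: int = 2) -> Iterator[bytes]:
    """Toy server side: 'commit' each batch and periodically report progress."""
    committed_rows = 0
    for index, batch in enumerate(batches, start=1):
        committed_rows += len(batch)  # pretend the batch is now durable
        if index % ack_every == 0:
            # Application-defined metadata; JSON is just one possible encoding.
            yield json.dumps({"committed_rows": committed_rows}).encode()


# Client side: upload four batches and decode the acknowledgements.
uploaded = [[1, 2], [3], [4, 5, 6], [7]]
acks = [json.loads(message) for message in do_put(uploaded)]
resume_from = acks[-1]["committed_rows"] if acks else 0
```

If the connection dropped after the last acknowledgement, the client in this model would restart the upload from row ``resume_from`` instead of from the beginning.
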
diff --git a/docs/source/format/Flight/DoPut.mmd.svg b/docs/source/format/Flight/DoPut.mmd.svg
new file mode 100644
index 0000000000..9e490e152b
--- /dev/null
+++ b/docs/source/format/Flight/DoPut.mmd.svg
@@ -0,0 +1 @@
+<svg id="mermaid-1649258039834" width="100%" xmlns="http://www.w3.org/2000/svg" height="351" style="max-width: 607px; background-color: white;" viewBox="-50 -10 607 351"><style>#mermaid-1649258039834 {font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}#mermaid-1649258039834 .error-icon{fill:#552222;}#mermaid-1649258039834 .error-text{fill:#552222;stroke:#552222;}#mermaid-1649258039834 .edge-thickness-normal{stroke-width:2px;}#mermaid-1649258039834 .edge-thickne [...]
\ No newline at end of file
diff --git a/docs/source/python/api/flight.rst b/docs/source/python/api/flight.rst
index ea1e7d9018..605d171a1d 100644
--- a/docs/source/python/api/flight.rst
+++ b/docs/source/python/api/flight.rst
@@ -57,6 +57,7 @@ Flight Client
 .. autosummary::
    :toctree: ../generated/
 
+    connect
     FlightCallOptions
     FlightClient
     FlightStreamReader
@@ -88,6 +89,22 @@ Authentication
     ClientAuthHandler
     ServerAuthHandler
 
+Errors
+------
+
+.. autosummary::
+   :toctree: ../generated/
+
+    FlightError
+    FlightCancelledError
+    FlightInternalError
+    FlightServerError
+    FlightTimedOutError
+    FlightUnauthenticatedError
+    FlightUnauthorizedError
+    FlightUnavailableError
+    FlightWriteSizeExceededError
+
 Middleware
 ----------
 
diff --git a/docs/source/python/flight.rst b/docs/source/python/flight.rst
new file mode 100644
index 0000000000..d038bcce57
--- /dev/null
+++ b/docs/source/python/flight.rst
@@ -0,0 +1,130 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+.. currentmodule:: pyarrow.flight
+.. highlight:: python
+
+================
+Arrow Flight RPC
+================
+
+Arrow Flight is an RPC framework for efficient transfer of Arrow data
+over the network.
+
+.. seealso::
+
+   :doc:`Flight protocol documentation <../format/Flight>`
+        Documentation of the Flight protocol, including how to use
+        Flight conceptually.
+
+   :doc:`Flight API documentation <./api/flight>`
+        Python API documentation listing all of the various client and
+        server classes.
+
+   `Python Cookbook <https://arrow.apache.org/cookbook/py/flight.html>`_
+        Recipes for using Arrow Flight in Python.
+
+Writing a Flight Service
+========================
+
+Servers are subclasses of :class:`FlightServerBase`. To implement
+individual RPCs, override the RPC methods on this class.
+
+.. code-block:: python
+
+   import pyarrow.flight as flight
+
+   class MyFlightServer(flight.FlightServerBase):
+       def list_flights(self, context, criteria):
+           info = flight.FlightInfo(...)
+           yield info
+
+Each RPC method always takes a :class:`ServerCallContext` for common
+parameters. To indicate failure, raise an exception; Flight-specific
+errors can be indicated by raising one of the subclasses of
+:class:`FlightError`.
+
+To start a server, create a :class:`Location` to specify where to
+listen, and create an instance of the server. (A string will be
+converted into a location.) This will start the server, but won't
+block the rest of the program. Call :meth:`FlightServerBase.serve` to
+block until the server stops.
+
+.. code-block:: python
+
+   # Listen on all interfaces on a free port
+   server = MyFlightServer("grpc://0.0.0.0:0")
+
+   print("Server listening on port", server.port)
+   server.serve()
+
+Using the Flight Client
+=======================
+
+To connect to a Flight service, call :func:`pyarrow.flight.connect`
+with a location.
+
+Cancellation and Timeouts
+=========================
+
+When making a call, clients can optionally provide
+:class:`FlightCallOptions`. This allows clients to set a timeout on
+calls or provide custom HTTP headers, among other features. Also, some
+objects returned by client RPC calls expose a ``cancel`` method which
+allows terminating a call early.
+
+On the server side, timeouts are transparent. For cancellation, the
+server needs to manually poll :meth:`ServerCallContext.is_cancelled`
+to check if the client has cancelled the call, and if so, break out of
+any processing the server is currently doing.
+
+Enabling TLS
+============
+
+TLS can be enabled when setting up a server by providing a certificate
+and key pair to :class:`FlightServerBase`.
+
+On the client side, use :meth:`Location.for_grpc_tls` to construct the
+:class:`Location` to connect to.
+
+Enabling Authentication
+=======================
+
+.. warning:: Authentication is insecure without enabling TLS.
+
+Handshake-based authentication can be enabled by implementing
+:class:`ServerAuthHandler`. Authentication consists of two parts: on
+initial client connection, the server and client authentication
+implementations can perform any negotiation needed; then, on each RPC
+thereafter, the client provides a token. The server authentication
+handler validates the token and provides the identity of the
+client. This identity can be obtained from the
+:class:`ServerCallContext`.
+
+Custom Middleware
+=================
+
+Servers and clients support custom middleware (or interceptors) that
+are called on every request and can modify the request in a limited
+fashion.  These can be implemented by subclassing
+:class:`ServerMiddleware` and :class:`ClientMiddleware`, then
+providing them when creating the client or server.
+
+Middleware are fairly limited, but they can add headers to a
+request/response. On the server, they can inspect incoming headers and
+fail the request; hence, they can be used to implement custom
+authentication methods.
diff --git a/docs/source/python/index.rst b/docs/source/python/index.rst
index e120340e7e..b38041db83 100644
--- a/docs/source/python/index.rst
+++ b/docs/source/python/index.rst
@@ -20,8 +20,8 @@ PyArrow - Apache Arrow Python bindings
 
 This is the documentation of the Python API of Apache Arrow.
 
-Apache Arrow is a development platform for in-memory analytics. 
-It contains a set of technologies that enable big data systems to store, process and move data fast. 
+Apache Arrow is a development platform for in-memory analytics.
+It contains a set of technologies that enable big data systems to store, process and move data fast.
 
 See the :doc:`parent documentation <../index>` for additional details on
 the Arrow Project itself, on the Arrow format and the other language bindings.
@@ -55,6 +55,7 @@ files into Arrow structures.
    json
    parquet
    dataset
+   flight
    extending_types
    integration
    env_vars
diff --git a/python/pyarrow/_flight.pyx b/python/pyarrow/_flight.pyx
index 5821956b29..c7cd19f7de 100644
--- a/python/pyarrow/_flight.pyx
+++ b/python/pyarrow/_flight.pyx
@@ -149,6 +149,27 @@ class CertKeyPair(_CertKeyPair):
 
 
 cdef class FlightError(Exception):
+    """
+    The base class for Flight-specific errors.
+
+    A server may raise this class or one of its subclasses to provide
+    a more detailed error to clients.
+
+    Parameters
+    ----------
+    message : str, optional
+        The error message.
+    extra_info : bytes, optional
+        Extra binary error details that were provided by the
+        server/will be sent to the client.
+
+    Attributes
+    ----------
+    extra_info : bytes
+        Extra binary error details that were provided by the
+        server/will be sent to the client.
+    """
+
     cdef dict __dict__
 
     def __init__(self, message='', extra_info=b''):
@@ -159,43 +180,58 @@ cdef class FlightError(Exception):
         message = tobytes("Flight error: {}".format(str(self)))
         return CStatus_UnknownError(message)
 
+
 cdef class FlightInternalError(FlightError, ArrowException):
+    """An error internal to the Flight server occurred."""
+
     cdef CStatus to_status(self):
         return MakeFlightError(CFlightStatusInternal,
                                tobytes(str(self)), self.extra_info)
 
 
 cdef class FlightTimedOutError(FlightError, ArrowException):
+    """The Flight RPC call timed out."""
+
     cdef CStatus to_status(self):
         return MakeFlightError(CFlightStatusTimedOut,
                                tobytes(str(self)), self.extra_info)
 
 
 cdef class FlightCancelledError(FlightError, ArrowCancelled):
+    """The operation was cancelled."""
+
     cdef CStatus to_status(self):
         return MakeFlightError(CFlightStatusCancelled, tobytes(str(self)),
                                self.extra_info)
 
 
 cdef class FlightServerError(FlightError, ArrowException):
+    """A server error occurred."""
+
     cdef CStatus to_status(self):
         return MakeFlightError(CFlightStatusFailed, tobytes(str(self)),
                                self.extra_info)
 
 
 cdef class FlightUnauthenticatedError(FlightError, ArrowException):
+    """The client is not authenticated."""
+
     cdef CStatus to_status(self):
         return MakeFlightError(
             CFlightStatusUnauthenticated, tobytes(str(self)), self.extra_info)
 
 
 cdef class FlightUnauthorizedError(FlightError, ArrowException):
+    """The client is not authorized to perform the given operation."""
+
     cdef CStatus to_status(self):
         return MakeFlightError(CFlightStatusUnauthorized, tobytes(str(self)),
                                self.extra_info)
 
 
 cdef class FlightUnavailableError(FlightError, ArrowException):
+    """The server is not reachable or available."""
+
     cdef CStatus to_status(self):
         return MakeFlightError(CFlightStatusUnavailable, tobytes(str(self)),
                                self.extra_info)
@@ -2804,14 +2840,15 @@ cdef class FlightServerBase(_Weakrefable):
 
 def connect(location, **kwargs):
     """
-    Connect to the Flight server
+    Connect to a Flight server.
+
     Parameters
     ----------
-    location : str, tuple or Location
-        Location to connect to. Either a gRPC URI like `grpc://localhost:port`,
-        a tuple of (host, port) pair, or a Location instance.
+    location : str, tuple, or Location
+        Location to connect to. Either a URI like "grpc://localhost:port",
+        a tuple of (host, port), or a Location instance.
     tls_root_certs : bytes or None
-        PEM-encoded
+        PEM-encoded.
     cert_chain: str or None
         If provided, enables TLS mutual authentication.
     private_key: str or None
@@ -2832,6 +2869,7 @@ def connect(location, **kwargs):
     generic_options : list or None
         A list of generic (string, int or string) options to pass to
         the underlying transport.
+
     Returns
     -------
     client : FlightClient