Posted to commits@arrow.apache.org by li...@apache.org on 2022/04/05 12:28:54 UTC
[arrow] branch master updated: ARROW-16046: [Docs][FlightRPC][Python] Ensure Flight Python API is documented
This is an automated email from the ASF dual-hosted git repository.
lidavidm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/master by this push:
new c5a8129756 ARROW-16046: [Docs][FlightRPC][Python] Ensure Flight Python API is documented
c5a8129756 is described below
commit c5a812975686a094088a95ab35b6215d52bc2b80
Author: David Li <li...@gmail.com>
AuthorDate: Tue Apr 5 08:28:45 2022 -0400
ARROW-16046: [Docs][FlightRPC][Python] Ensure Flight Python API is documented
Add some missing classes to the docs, also fix a couple build warnings.
Closes #12737 from lidavidm/arrow-16046
Lead-authored-by: David Li <li...@gmail.com>
Co-authored-by: Joris Van den Bossche <jo...@gmail.com>
Signed-off-by: David Li <li...@gmail.com>
---
.../developers/guide/tutorials/r_tutorial.rst | 22 +--
docs/source/java/vector_schema_root.rst | 41 +++--
docs/source/python/api/flight.rst | 7 +
python/pyarrow/_flight.pyx | 182 ++++++++++++++++++++-
4 files changed, 219 insertions(+), 33 deletions(-)
diff --git a/docs/source/developers/guide/tutorials/r_tutorial.rst b/docs/source/developers/guide/tutorials/r_tutorial.rst
index d536f0de7e..3b8acaab65 100644
--- a/docs/source/developers/guide/tutorials/r_tutorial.rst
+++ b/docs/source/developers/guide/tutorials/r_tutorial.rst
@@ -52,12 +52,12 @@ to Arrow R package following the steps specified by the
:ref:`step_by_step` section. Navigate there whenever there is
some information you may find is missing here.
-The binding will be added to the ``expression.R`` file in the
+The binding will be added to the ``expression.R`` file in the
R package. But you can also follow these steps in case you are
adding a binding that will live somewhere else.
.. seealso::
-
+
To read more about the philosophy behind R bindings, refer to the
`Writing Bindings article <https://arrow.apache.org/docs/r/articles/developers/bindings.html>`_.
@@ -219,7 +219,7 @@ tests we have is in ``test-dplyr-funcs-datetime.R``:
)
})
-And
+And
.. code-block:: R
@@ -245,7 +245,7 @@ more research and code corrections.
ℹ Testing arrow
See arrow_info() for available features
✔ | F W S OK | Context
- ✖ | 1 230 | dplyr-funcs-datetime [1.4s]
+ ✖ | 1 230 | dplyr-funcs-datetime [1.4s]
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Failure (test-dplyr-funcs-datetime.R:187:3): strftime
``%>%`(...)` did not throw the expected error.
@@ -328,7 +328,7 @@ And ``git diff`` to see the changes in the files in order to spot any error we m
@@ -444,6 +444,15 @@ test_that("extract wday from timestamp", {
)
})
-
+
+test_that("extract mday from timestamp", {
+ compare_dplyr_binding(
+ .input %>%
@@ -383,11 +383,11 @@ We can use ``git log`` to check the history of commits:
Date: Thu Jan 20 09:45:59 2022 +0900
ARROW-15372: [C++][Gandiva] Gandiva now depends on boost/crc.hpp which is missing from the trimmed boost archive
-
+
See build error https://github.com/ursacomputing/crossbow/runs/4871392838?check_suite_focus=true#step:5:11762
-
+
Closes #12190 from kszucs/ARROW-15372
-
+
Authored-by: Krisztián Szűcs <sz...@gmail.com>
Signed-off-by: Sutou Kouhei <ko...@clear-code.com>
@@ -411,10 +411,10 @@ on GitHub called origin.
Writing objects: 100% (151/151), 35.78 KiB | 8.95 MiB/s, done.
Total 151 (delta 129), reused 33 (delta 20), pack-reused 0
remote: Resolving deltas: 100% (129/129), completed with 80 local objects.
- remote:
+ remote:
remote: Create a pull request for 'ARROW-14816' on GitHub by visiting:
remote: https://github.com/AlenkaF/arrow/pull/new/ARROW-14816
- remote:
+ remote:
To https://github.com/AlenkaF/arrow.git
* [new branch] ARROW-14816 -> ARROW-14816
@@ -423,7 +423,7 @@ to create a Pull Request. On the GitHub Arrow
page (main or forked) we will see a yellow notice
bar with a note that we made recent pushes to the branch
ARROW-14816. That’s great, now we can make the Pull Request
-by clicking on **Compare & pull request**.
+by clicking on **Compare & pull request**.
.. figure:: /developers/images/R_tutorial_create_pr_notice.jpeg
:scale: 60 %
diff --git a/docs/source/java/vector_schema_root.rst b/docs/source/java/vector_schema_root.rst
index 53c8c579dc..34392a46af 100644
--- a/docs/source/java/vector_schema_root.rst
+++ b/docs/source/java/vector_schema_root.rst
@@ -73,22 +73,23 @@ with some optional schema-wide metadata (in addition to per-field metadata).
VectorSchemaRoot
================
-.. note::
+A `VectorSchemaRoot`_ is a container for batches of data. Batches flow through
+VectorSchemaRoot as part of a pipeline.
- VectorSchemaRoot is somewhat analogous to tables and record batches in the other Arrow implementations
- in that they all are 2D datasets, but the usage is different.
+.. note::
-A :class:`VectorSchemaRoot` is a container that can hold batches, batches flow through :class:`VectorSchemaRoot`
-as part of a pipeline. Note this is different from other implementations (i.e. in C++ and Python,
-a :class:`RecordBatch` is a collection of equal-length vector instances and was created each time for a new batch).
+ VectorSchemaRoot is somewhat analogous to tables or record batches in the
+ other Arrow implementations in that they all are 2D datasets, but their
+ usage is different.
-The recommended usage for :class:`VectorSchemaRoot` is creating a single :class:`VectorSchemaRoot`
-based on the known schema and populated data over and over into the same VectorSchemaRoot in a stream
-of batches rather than creating a new :class:`VectorSchemaRoot` instance each time
-(see `Flight`_ or ``ArrowFileWriter`` for better understanding). Thus at any one point a VectorSchemaRoot may have data or
-may have no data (say it was transferred downstream or not yet populated).
+The recommended usage is to create a single VectorSchemaRoot based on a known
+schema and populate data over and over into that root in a stream of batches,
+rather than creating a new instance each time (see `Flight`_ or
+``ArrowFileWriter`` as examples). Thus at any one point, a VectorSchemaRoot may
+have data or may have no data (say it was transferred downstream or not yet
+populated).
-Here is the example of building a :class:`VectorSchemaRoot`
+Here is an example of creating a VectorSchemaRoot:
.. code-block:: Java
@@ -107,9 +108,10 @@ Here is the example of building a :class:`VectorSchemaRoot`
List<FieldVector> vectors = Arrays.asList(bitVector, varCharVector);
VectorSchemaRoot vectorSchemaRoot = new VectorSchemaRoot(fields, vectors);
-The vectors within a :class:`VectorSchemaRoot` could be loaded/unloaded via :class:`VectorLoader` and :class:`VectorUnloader`.
-:class:`VectorLoader` and :class:`VectorUnloader` handles converting between :class:`VectorSchemaRoot` and :class:`ArrowRecordBatch` (
-representation of a RecordBatch :doc:`IPC <../format/IPC.rst>` message). Examples as below
+Data can be loaded into/unloaded from a VectorSchemaRoot via `VectorLoader`_
+and `VectorUnloader`_. They handle converting between VectorSchemaRoot and
+`ArrowRecordBatch`_ (a representation of a RecordBatch :ref:`IPC <format-ipc>`
+message). For example:
.. code-block:: Java
@@ -123,13 +125,18 @@ representation of a RecordBatch :doc:`IPC <../format/IPC.rst>` message). Example
VectorLoader loader = new VectorLoader(root2);
loader.load(recordBatch);
-A new :class:`VectorSchemaRoot` could be sliced from an existing instance with zero-copy
+A new VectorSchemaRoot can be sliced from an existing root without copying
+data:
.. code-block:: Java
// 0 indicates start index (inclusive) and 5 indicates length (exclusive).
VectorSchemaRoot newRoot = vectorSchemaRoot.slice(0, 5);
+.. _`ArrowRecordBatch`: https://arrow.apache.org/docs/java/reference/org/apache/arrow/vector/ipc/message/ArrowRecordBatch.html
.. _`Field`: https://arrow.apache.org/docs/java/reference/org/apache/arrow/vector/types/pojo/Field.html
-.. _`Schema`: https://arrow.apache.org/docs/java/reference/org/apache/arrow/vector/types/pojo/Schema.html
.. _`Flight`: https://arrow.apache.org/docs/java/reference/org/apache/arrow/flight/package-summary.html
+.. _`Schema`: https://arrow.apache.org/docs/java/reference/org/apache/arrow/vector/types/pojo/Schema.html
+.. _`VectorLoader`: https://arrow.apache.org/docs/java/reference/org/apache/arrow/vector/VectorLoader.html
+.. _`VectorSchemaRoot`: https://arrow.apache.org/docs/java/reference/org/apache/arrow/vector/VectorSchemaRoot.html
+.. _`VectorUnloader`: https://arrow.apache.org/docs/java/reference/org/apache/arrow/vector/VectorUnloader.html
diff --git a/docs/source/python/api/flight.rst b/docs/source/python/api/flight.rst
index 0cfbb6b4bd..ea1e7d9018 100644
--- a/docs/source/python/api/flight.rst
+++ b/docs/source/python/api/flight.rst
@@ -46,6 +46,8 @@ Common Types
FlightEndpoint
FlightInfo
Location
+ MetadataRecordBatchReader
+ MetadataRecordBatchWriter
Ticket
Result
@@ -57,6 +59,8 @@ Flight Client
FlightCallOptions
FlightClient
+ FlightStreamReader
+ FlightStreamWriter
ClientMiddlewareFactory
ClientMiddleware
@@ -66,9 +70,12 @@ Flight Server
.. autosummary::
:toctree: ../generated/
+ FlightDataStream
+ FlightMetadataWriter
FlightServerBase
GeneratorStream
RecordBatchStream
+ ServerCallContext
ServerMiddlewareFactory
ServerMiddleware
diff --git a/python/pyarrow/_flight.pyx b/python/pyarrow/_flight.pyx
index 81a5c921fd..59e2e30f27 100644
--- a/python/pyarrow/_flight.pyx
+++ b/python/pyarrow/_flight.pyx
@@ -877,7 +877,12 @@ cdef class _MetadataRecordBatchReader(_Weakrefable, _ReadPandasMixin):
cdef class MetadataRecordBatchReader(_MetadataRecordBatchReader):
- """The virtual base class for readers for Flight streams."""
+ """The base class for readers for Flight streams.
+
+ See Also
+ --------
+ FlightStreamReader
+ """
cdef class FlightStreamReader(MetadataRecordBatchReader):
@@ -1172,6 +1177,11 @@ cdef class FlightClient(_Weakrefable):
def connect(cls, location, tls_root_certs=None, cert_chain=None,
private_key=None, override_hostname=None,
disable_server_verification=None):
+ """Connect to a Flight server.
+
+ .. deprecated:: 0.15.0
+ Use the ``FlightClient`` constructor or ``pyarrow.flight.connect`` function instead.
+ """
warnings.warn("The 'FlightClient.connect' method is deprecated, use "
"FlightClient constructor or pyarrow.flight.connect "
"function instead")
@@ -1447,6 +1457,7 @@ cdef class FlightClient(_Weakrefable):
return py_writer, py_reader
def close(self):
+ """Close the client and disconnect."""
check_flight_status(self.client.get().Close())
def __del__(self):
@@ -1462,7 +1473,14 @@ cdef class FlightClient(_Weakrefable):
cdef class FlightDataStream(_Weakrefable):
- """Abstract base class for Flight data streams."""
+ """
+ Abstract base class for Flight data streams.
+
+ See Also
+ --------
+ RecordBatchStream
+ GeneratorStream
+ """
cdef CFlightDataStream* to_stream(self) except *:
"""Create the C++ data stream for the backing Python object.
@@ -1474,7 +1492,12 @@ cdef class FlightDataStream(_Weakrefable):
cdef class RecordBatchStream(FlightDataStream):
- """A Flight data stream backed by RecordBatches."""
+ """A Flight data stream backed by RecordBatches.
+
+ The remainder of this DoGet request will be handled in C++,
+ without having to acquire the GIL.
+
+ """
cdef:
object data_source
CIpcWriteOptions write_options
@@ -1485,7 +1508,9 @@ cdef class RecordBatchStream(FlightDataStream):
Parameters
----------
data_source : RecordBatchReader or Table
+ The data to stream to the client.
options : pyarrow.ipc.IpcWriteOptions, optional
+ Optional IPC options to control how to write the data.
"""
if (not isinstance(data_source, RecordBatchReader) and
not isinstance(data_source, lib.Table)):
@@ -1561,6 +1586,7 @@ cdef class ServerCallContext(_Weakrefable):
return frombytes(self.context.peer(), safe=True)
def is_cancelled(self):
+ """Check if the current RPC call has been canceled by the client."""
return self.context.is_cancelled()
def get_middleware(self, key):
@@ -2452,6 +2478,10 @@ cdef class _ServerMiddlewareWrapper(ServerMiddleware):
cdef class FlightServerBase(_Weakrefable):
"""A Flight service definition.
+ To start the server, create an instance of this class with an
+ appropriate location. The server will be running as soon as the
+ instance is created; it is not required to call :meth:`serve`.
+
Override methods to define your Flight service.
Parameters
@@ -2564,32 +2594,169 @@ cdef class FlightServerBase(_Weakrefable):
return self.server.get().port()
def list_flights(self, context, criteria):
+ """List flights available on this service.
+
+ Applications should override this method to implement their
+ own behavior. The default method raises a NotImplementedError.
+
+ Parameters
+ ----------
+ context : ServerCallContext
+ Common contextual information.
+ criteria : bytes
+ Filter criteria provided by the client.
+
+ Returns
+ -------
+ iterator of FlightInfo
+
+ """
raise NotImplementedError
def get_flight_info(self, context, descriptor):
+ """Get information about a flight.
+
+ Applications should override this method to implement their
+ own behavior. The default method raises a NotImplementedError.
+
+ Parameters
+ ----------
+ context : ServerCallContext
+ Common contextual information.
+ descriptor : FlightDescriptor
+ The descriptor for the flight provided by the client.
+
+ Returns
+ -------
+ FlightInfo
+
+ """
raise NotImplementedError
def get_schema(self, context, descriptor):
+ """Get the schema of a flight.
+
+ Applications should override this method to implement their
+ own behavior. The default method raises a NotImplementedError.
+
+ Parameters
+ ----------
+ context : ServerCallContext
+ Common contextual information.
+ descriptor : FlightDescriptor
+ The descriptor for the flight provided by the client.
+
+ Returns
+ -------
+ Schema
+
+ """
raise NotImplementedError
- def do_put(self, context, descriptor, reader,
+ def do_put(self, context, descriptor, reader: MetadataRecordBatchReader,
writer: FlightMetadataWriter):
+ """Write data to a flight.
+
+ Applications should override this method to implement their
+ own behavior. The default method raises a NotImplementedError.
+
+ Parameters
+ ----------
+ context : ServerCallContext
+ Common contextual information.
+ descriptor : FlightDescriptor
+ The descriptor for the flight provided by the client.
+ reader : MetadataRecordBatchReader
+ A reader for data uploaded by the client.
+ writer : FlightMetadataWriter
+ A writer to send responses to the client.
+
+ """
raise NotImplementedError
def do_get(self, context, ticket):
+ """Read data from a flight.
+
+ Applications should override this method to implement their
+ own behavior. The default method raises a NotImplementedError.
+
+ Parameters
+ ----------
+ context : ServerCallContext
+ Common contextual information.
+ ticket : Ticket
+ The ticket for the flight.
+
+ Returns
+ -------
+ FlightDataStream
+ A stream of data to send back to the client.
+
+ """
raise NotImplementedError
def do_exchange(self, context, descriptor, reader, writer):
+ """Exchange data with the client.
+
+ Applications should override this method to implement their
+ own behavior. The default method raises a NotImplementedError.
+
+ Parameters
+ ----------
+ context : ServerCallContext
+ Common contextual information.
+ descriptor : FlightDescriptor
+ The descriptor for the flight provided by the client.
+ reader : MetadataRecordBatchReader
+ A reader for data uploaded by the client.
+ writer : MetadataRecordBatchWriter
+ A writer to send responses to the client.
+
+ """
raise NotImplementedError
def list_actions(self, context):
+ """List custom actions available on this server.
+
+ Applications should override this method to implement their
+ own behavior. The default method raises a NotImplementedError.
+
+ Parameters
+ ----------
+ context : ServerCallContext
+ Common contextual information.
+
+ Returns
+ -------
+ iterator of ActionType or tuple
+
+ """
raise NotImplementedError
def do_action(self, context, action):
+ """Execute a custom action.
+
+ This method should return an iterator, or it should be a
+ generator. Applications should override this method to
+ implement their own behavior. The default method raises a
+ NotImplementedError.
+
+ Parameters
+ ----------
+ context : ServerCallContext
+ Common contextual information.
+ action : Action
+ The action to execute.
+
+ Returns
+ -------
+ iterator of bytes
+
+ """
raise NotImplementedError
def serve(self):
- """Start serving.
+ """Block until the server shuts down.
This method only returns if shutdown() is called or a signal is
received.
@@ -2600,6 +2767,11 @@ cdef class FlightServerBase(_Weakrefable):
check_flight_status(self.server.get().ServeWithSignals())
def run(self):
+ """Block until the server shuts down.
+
+ .. deprecated:: 0.15.0
+ Use the ``FlightServer.serve`` method instead
+ """
warnings.warn("The 'FlightServer.run' method is deprecated, use "
"FlightServer.serve method instead")
self.serve()
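
The lifecycle described in the docstrings above (a server that is already running once constructed, a ``serve()`` that only blocks until shutdown, and a deprecated ``run()`` that warns and delegates) can be sketched with the standard library alone. This is a minimal illustration, not pyarrow code: ``ToyServer`` and its ``running`` attribute are hypothetical stand-ins, and only the warn-then-delegate shape of ``run()`` mirrors the diff.

```python
import threading
import warnings


class ToyServer:
    """Hypothetical stand-in for the lifecycle described above (not pyarrow)."""

    def __init__(self):
        # Treated as "running" as soon as the instance exists; calling
        # serve() is not needed to start it.
        self._stopped = threading.Event()
        self.running = True

    def serve(self):
        """Block until shutdown() is called."""
        self._stopped.wait()

    def run(self):
        """Deprecated alias for serve(): warn, then delegate."""
        # Like the diff, no category is passed, so this is a UserWarning.
        warnings.warn("The 'ToyServer.run' method is deprecated, use "
                      "ToyServer.serve method instead")
        self.serve()

    def shutdown(self):
        """Mark the server as stopped and release any blocked serve() call."""
        self.running = False
        self._stopped.set()


server = ToyServer()

# Request shutdown from another thread shortly after run() starts blocking.
threading.Timer(0.1, server.shutdown).start()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    server.run()  # emits the warning, then blocks until shutdown()

print(caught[0].message)
```

Because ``warnings.warn`` is called without a category here, the warning is a ``UserWarning``. Passing ``DeprecationWarning`` explicitly would be a different design choice, and such warnings are hidden by default in many contexts.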