You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/07 15:19:25 UTC

[GitHub] [arrow] raulcd commented on a diff in pull request #12800: ARROW-15776: [Python] Expose IpcReadOptions

raulcd commented on code in PR #12800:
URL: https://github.com/apache/arrow/pull/12800#discussion_r845262179


##########
python/pyarrow/ipc.pxi:
##########
@@ -100,6 +100,74 @@ cdef _wrap_read_stats(CIpcReadStats c):
                      c.num_replaced_dictionaries)
 
 
+cdef class IpcReadOptions(_Weakrefable):
+    """
+    Serialization options for reading IPC format.
+
+    Parameters
+    ----------
+    use_threads : bool
+        Whether to use the global CPU thread pool to parallelize any
+        computational tasks like decompression.
+    ensure_native_endian : bool
+        Whether to convert incoming data to platform-native endianness.
+        Default is true.
+    included_fields : list
+        If empty (the default), return all deserialized fields.
+        If non-empty, the values are the indices of fields to read on
+        the top-level schema.
+    memory_pool : MemoryPool, default None
+        Uses default memory pool if not specified
+    """
+    __slots__ = ()
+
+    # cdef block is in lib.pxd
+
+    def __init__(self, *, bint ensure_native_endian=True,
+                 bint use_threads=True, list included_fields=None,
+                 MemoryPool memory_pool=None):
+        self.c_options = CIpcReadOptions.Defaults()
+        self.ensure_native_endian = ensure_native_endian
+        self.use_threads = use_threads
+        self.included_fields = included_fields
+        self.memory_pool = memory_pool
+
+    @property
+    def ensure_native_endian(self):
+        return self.c_options.ensure_native_endian
+
+    @ensure_native_endian.setter
+    def ensure_native_endian(self, bint value):
+        self.c_options.ensure_native_endian = value
+
+    @property
+    def use_threads(self):
+        return self.c_options.use_threads
+
+    @use_threads.setter
+    def use_threads(self, bint value):
+        self.c_options.use_threads = value
+
+    @property
+    def included_fields(self):
+        return self.c_options.included_fields
+
+    @included_fields.setter
+    def included_fields(self, list value):
+        self.c_options.included_fields = value or list()

Review Comment:
   I don't have a bunch of experience with cython but from my tests vector doesn't seem to allow nullable:
   ```
   pyarrow/ipc.pxi:157: in pyarrow.lib.IpcReadOptions.included_fields.__set__
       self.c_options.included_fields = None
   stringsource:47: in vector.from_py.__pyx_convert_vector_from_py_int
       ???
   E   TypeError: 'NoneType' object is not iterable
   ```
   I am not sure I understand what would be the proposed change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org