You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/12/01 08:41:53 UTC

[GitHub] [arrow] maartenbreddels commented on a change in pull request #8755: ARROW-10709: [Python] Allow PythonFile.read() to always return a buffer

maartenbreddels commented on a change in pull request #8755:
URL: https://github.com/apache/arrow/pull/8755#discussion_r533159508



##########
File path: python/pyarrow/tests/test_io.py
##########
@@ -163,6 +163,34 @@ def test_python_file_readinto():
         assert len(dst_buf) == length
 
 
+def test_python_file_read_buffer():
+    length = 10
+    data = b'0123456798'
+    dst_buf = bytearray(data)
+
+    class DuckReader:
+        def close(self):
+            pass
+
+        @property
+        def closed(self):
+            return False
+
+        def read_buffer(self, nbytes):
+            assert nbytes == length
+            return memoryview(dst_buf)[:nbytes]
+
+    duck_reader = DuckReader()
+    with pa.PythonFile(duck_reader, mode='r') as f:
+        buf = f.read_buffer(length)

Review comment:
       I think both `read` and `read_buffer` should be able to return bytes (which exposes the buffer interface), and a buffer/memoryview, for backward compatibility.

##########
File path: cpp/src/arrow/python/io.cc
##########
@@ -199,25 +219,32 @@ Result<int64_t> PyReadableFile::Read(int64_t nbytes, void* out) {
     PyObject* bytes_obj = bytes.obj();
     DCHECK(bytes_obj != NULL);
 
-    if (!PyBytes_Check(bytes_obj)) {
+    Py_buffer py_buf_;
+    if (!PyObject_GetBuffer(bytes_obj, &py_buf_, PyBUF_ANY_CONTIGUOUS)) {
+      const uint8_t* data = reinterpret_cast<const uint8_t*>(py_buf_.buf);
+      ARROW_CHECK_NE(data, nullptr) << "Null pointer in Py_buffer";

Review comment:
       According to https://docs.python.org/3/c-api/buffer.html#c.PyObject_GetBuffer it seems that I cannot, I was following what you've done in https://github.com/apache/arrow/blob/070c4810d4d19b44672ed215a5f4580396d5b3dc/cpp/src/arrow/python/common.cc#L178 so maybe in practice you've noticed issues.
   I think it was originally added in https://github.com/apache/arrow/commit/5dce01f76f0569d820089682aef32d87c806b61d 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org