Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/13 03:33:46 UTC

[GitHub] [arrow] alexdesiqueira opened a new pull request, #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

alexdesiqueira opened a new pull request, #13144:
URL: https://github.com/apache/arrow/pull/13144

   Work in progress. This exposes `RandomAccessFile::GetStream` to `pyarrow`, as requested in [JIRA#ARROW-16356](https://issues.apache.org/jira/browse/ARROW-16356).
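
To make the intended behaviour concrete, here is a minimal pure-Python sketch of a stream that reads one segment of a file without depending on, or disturbing, the file's own position. It is only an illustration: `SegmentStream` is a hypothetical stand-in written for this note, not the PyArrow or Arrow C++ implementation, and the sample data is borrowed from the tests discussed later in this thread.

```python
from io import BytesIO


class SegmentStream:
    """Hypothetical stand-in for a stream over one segment of a file."""

    def __init__(self, raw, file_offset, nbytes):
        self._raw = raw              # the underlying seekable file
        self._offset = file_offset   # where the segment starts in the file
        self._nbytes = nbytes        # length of the segment
        self._pos = 0                # position relative to the segment start

    def tell(self):
        return self._pos

    def read(self, nbytes):
        # Never read beyond the end of the segment.
        n = max(0, min(nbytes, self._nbytes - self._pos))
        saved = self._raw.tell()
        self._raw.seek(self._offset + self._pos)
        chunk = self._raw.read(n)
        self._raw.seek(saved)        # leave the parent file position untouched
        self._pos += len(chunk)
        return chunk


raw = BytesIO(b'data1data2data3data4data5')
stream1 = SegmentStream(raw, file_offset=0, nbytes=10)
stream2 = SegmentStream(raw, file_offset=9, nbytes=16)

first = stream2.read(4)    # bytes 9..12 of the file
second = stream1.read(6)   # bytes 0..5 of the file
```

Each stream keeps its own position, so reading from `stream2` first does not change what `stream1` returns, and the position of `raw` itself is left where it was.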


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] alexdesiqueira commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
alexdesiqueira commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1131378987

   @pitrou there are some things I don't understand yet, and I'll check them later to continue working on this. Thanks!




[GitHub] [arrow] pitrou commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1148748244

   Yes, it's okay :-)




[GitHub] [arrow] pitrou closed pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream
URL: https://github.com/apache/arrow/pull/13144




[GitHub] [arrow] alexdesiqueira commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
alexdesiqueira commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1182706778

   @pitrou sorry about the delay also; I'll try to fix it now.




[GitHub] [arrow] pitrou commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1130183224

   @alexdesiqueira Did you manage to get a working PyArrow development setup? This would be easier than waiting for CI to run :-)




[GitHub] [arrow] pitrou commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1130310477

   @alexdesiqueira You should probably not enable ASAN when developing PyArrow.




[GitHub] [arrow] pitrou commented on a diff in pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #13144:
URL: https://github.com/apache/arrow/pull/13144#discussion_r908597511


##########
python/pyarrow/tests/test_io.py:
##########
@@ -125,6 +125,40 @@ def test_python_file_read():
         pa.PythonFile(StringIO(), mode='r')
 
 
+def test_python_file_get_stream():
+    data = b'data1data2data3data4data5'
+
+    buf = BytesIO(data)
+    f = pa.PythonFile(buf, mode='r')
+
+    stream1 = f.get_stream(file_offset=0, nbytes=10)
+    stream2 = f.get_stream(file_offset=9, nbytes=16)
+
+    buf_stream2_4 = stream2.read(nbytes=4)
+    assert len(buf_stream2_4) == 4
+    assert buf_stream2_4 == b'2dat'
+    assert stream2.tell() == 4
+
+    buf_stream1_6 = stream1.read(nbytes=6)
+    assert len(buf_stream1_6) == 6
+    assert buf_stream1_6 == b'data1d'
+    assert stream1.tell() == 6
+
+    # Read to end of each stream
+    buf_stream1_4 = stream1.read(nbytes=4)
+    assert len(buf_stream1_4) == 2
+    assert buf_stream1_4 == b'a2'

Review Comment:
   `buf_stream1_4` should be `b'ata2'`, no?
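
The expected value can be checked with plain bytes slicing. This is a small sketch that only models the segment arithmetic and does not call the PyArrow API; the variable names are illustrative.

```python
data = b'data1data2data3data4data5'

segment1 = data[0:0 + 10]      # what stream1 covers: file_offset=0, nbytes=10
already_read = segment1[:6]    # consumed by the earlier read(nbytes=6)
remaining = segment1[6:]       # what is left in the segment
next_read = remaining[:4]      # what read(nbytes=4) has available

# b'a2' would only be left over if 8 bytes had already been consumed.
after_eight = segment1[8:]
```

With 6 of the 10 bytes consumed, 4 bytes remain, so a 4-byte read would be expected to return all of them.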





[GitHub] [arrow] pitrou commented on a diff in pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #13144:
URL: https://github.com/apache/arrow/pull/13144#discussion_r886415362


##########
python/pyarrow/io.pxi:
##########
@@ -395,6 +395,42 @@ cdef class NativeFile(_Weakrefable):
 
         return PyObject_to_object(obj)
 
+    def get_stream(self, file_offset, nbytes):
+        """
+        Returns an input stream that reads a file segment independent of the
+        state of the file.

Review Comment:
   Nit:
   ```suggestion
           Return an input stream that reads a file segment independent of the
           state of the file.
   ```



##########
python/pyarrow/io.pxi:
##########
@@ -395,6 +395,42 @@ cdef class NativeFile(_Weakrefable):
 
         return PyObject_to_object(obj)
 
+    def get_stream(self, file_offset, nbytes):

Review Comment:
   This implementation seems basically correct to me, thanks.



##########
python/pyarrow/io.pxi:
##########
@@ -395,6 +395,42 @@ cdef class NativeFile(_Weakrefable):
 
         return PyObject_to_object(obj)
 
+    def get_stream(self, file_offset, nbytes):
+        """
+        Returns an input stream that reads a file segment independent of the
+        state of the file.
+
+        Allows reading portions of a random access file as an input stream
+        without interfering with each other.
+
+        Parameters
+        ----------
+        file_offset : int
+        nbytes : int
+
+        Returns
+        -------
+        stream : bytes

Review Comment:
   The annotation here is incorrect; it should be something like:
   ```suggestion
           stream : NativeFile
   ```



##########
python/pyarrow/tests/test_io.py:
##########
@@ -125,6 +125,24 @@ def test_python_file_read():
         pa.PythonFile(StringIO(), mode='r')
 
 
+def test_python_file_get_stream():

Review Comment:
   The tests would deserve expanding a bit:
   * test what happens when reading past the end of the stream
   * test what happens when `file_offset` and `nbytes` are slightly out of bounds
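
One way to enumerate the boundary cases is sketched below. It uses plain bytes slicing as a model, and the expected values rest on an assumption: that out-of-range requests are silently truncated to the file contents. The real stream may behave differently (for example by raising), which is exactly what such tests would pin down. `clamped_segment` is a hypothetical helper, not a PyArrow function.

```python
data = b'data1data2data3data4data5'
size = len(data)  # 25


def clamped_segment(file_offset, nbytes):
    # Assumption: out-of-range requests are truncated to the file contents,
    # which is how plain bytes slicing behaves. The real stream may raise.
    return data[file_offset:file_offset + nbytes]


# (file_offset, nbytes) pairs sitting just inside or just outside the file,
# mapped to the bytes expected under the truncation assumption.
cases = {
    (0, size): data,          # exactly the whole file
    (0, size + 1): data,      # nbytes one byte too large
    (size - 1, 2): b'5',      # starts in bounds, runs one byte past the end
    (size, 1): b'',           # starts exactly at end of file
    (size + 1, 1): b'',       # starts one byte past end of file
}

results = {case: clamped_segment(*case) for case in cases}
```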
   





[GitHub] [arrow] alexdesiqueira commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
alexdesiqueira commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1182707497

   @pitrou sorry about the delay as well, life got in the way :slightly_smiling_face: 




[GitHub] [arrow] pitrou commented on a diff in pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #13144:
URL: https://github.com/apache/arrow/pull/13144#discussion_r876055701


##########
python/pyarrow/tests/test_io.py:
##########
@@ -125,6 +125,22 @@ def test_python_file_read():
         pa.PythonFile(StringIO(), mode='r')
 
 
+def test_python_file_get_stream():
+    data = b'data1data2data3data4data5'
+
+    buf = BytesIO(data)
+    f = pa.PythonFile(buf, mode='r')
+
+    stream1 = f.get_stream(file_offset=0, nbytes=10)
+    stream2 = f.get_stream(file_offset=9, nbytes=16)
+
+    buf_nbytes6 = stream1.read(nbytes=6)
+    assert buf_nbytes6 == b'data1d'

Review Comment:
   Probably want to exercise this a bit more. When is the end of file reached?






[GitHub] [arrow] pitrou commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1131386376

   @alexdesiqueira Did you manage to get a working local build? It seems like you are waiting for CI to test your changes.




[GitHub] [arrow] pitrou commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1189967389

   @alexdesiqueira Yes, I can do that.




[GitHub] [arrow] pitrou commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1168844653

   @alexdesiqueira Sorry for the delay. The tests you wrote are probably good enough, except that it seems there's a mistake in them? See the comment I posted.




[GitHub] [arrow] alexdesiqueira commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
alexdesiqueira commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1131642114

   @pitrou I did, tonight :sweat_smile: I just pushed some changes I had here, sorry about that.




[GitHub] [arrow] alexdesiqueira commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
alexdesiqueira commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1130289036

   > @alexdesiqueira Did you manage to get a working PyArrow development setup? This would be easier than waiting for CI to run :-)
   
   @pitrou thanks for reviewing! You're right :) I was setting it up but had an issue with `ASan` that I couldn't solve yet:
   ```sh
   ==181460==ASan runtime does not come first in initial library list;
   you should either link runtime to your application or manually preload it with LD_PRELOAD.
   ```
   
   I'll check `ci/docker` and `python/examples/minimal_build` to set up a container for tests today.
   Thanks again!




[GitHub] [arrow] alexdesiqueira commented on a diff in pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
alexdesiqueira commented on code in PR #13144:
URL: https://github.com/apache/arrow/pull/13144#discussion_r876456543


##########
python/pyarrow/tests/test_io.py:
##########
@@ -125,6 +125,22 @@ def test_python_file_read():
         pa.PythonFile(StringIO(), mode='r')
 
 
+def test_python_file_get_stream():
+    data = b'data1data2data3data4data5'
+
+    buf = BytesIO(data)
+    f = pa.PythonFile(buf, mode='r')
+
+    stream1 = f.get_stream(file_offset=0, nbytes=10)
+    stream2 = f.get_stream(file_offset=9, nbytes=16)
+
+    buf_nbytes6 = stream1.read(nbytes=6)
+    assert buf_nbytes6 == b'data1d'

Review Comment:
   Got it. I'll get back to that. Thanks!





[GitHub] [arrow] pitrou commented on a diff in pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #13144:
URL: https://github.com/apache/arrow/pull/13144#discussion_r876054820


##########
python/pyarrow/io.pxi:
##########
@@ -395,6 +395,38 @@ cdef class NativeFile(_Weakrefable):
 
         return PyObject_to_object(obj)
 
+    def get_stream(self, file_offset, nbytes):
+        """
+        Returns an input stream that reads a file segment independent of the
+        state of the file.
+
+        Allows reading portions of a random access file as an input stream
+        without interfering with each other.
+
+        Parameters
+        ----------
+        file_offset : int
+        nbytes : int
+
+        Returns
+        -------
+        data : bytes
+        """
+        cdef:
+            shared_ptr[CInputStream] data
+            int64_t c_file_offset
+            int64_t c_nbytes
+
+        c_file_offset = file_offset
+        c_nbytes = nbytes
+
+        handle = self.get_random_access_file()
+
+        data = <shared_ptr[CInputStream]> GetResultValue(
+            handle.get().GetStream(self.read(), c_file_offset, c_nbytes))

Review Comment:
   `GetStream` being a static method, you shouldn't call it on an instance. Also, I'm curious why you're passing `self.read()` here? The goal is _not_ to read data immediately.





[GitHub] [arrow] github-actions[bot] commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1125624543

   :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.




[GitHub] [arrow] github-actions[bot] commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1125624538

   https://issues.apache.org/jira/browse/ARROW-16356




[GitHub] [arrow] alexdesiqueira commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
alexdesiqueira commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1129448888

   @pitrou this is what I could do so far. Please let me know if I'm on the right track — or how far I am from it :) Thanks!




[GitHub] [arrow] pitrou commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1148596354

   @alexdesiqueira Are you willing to update this PR?




[GitHub] [arrow] alexdesiqueira commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
alexdesiqueira commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1148745916

   @pitrou yes, but I won't have too much time this week. Let me know if that's okay.




[GitHub] [arrow] pitrou commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1189221735

   It's fine @alexdesiqueira, but there are CI failures. Are you willing to tackle these (I'm assuming you should be able to reproduce them locally)?




[GitHub] [arrow] alexdesiqueira commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
alexdesiqueira commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1189669484

   Hey @pitrou,
   I'm sorry, life got in the way and I'm postponing/dropping a lot of extra work. Could I ask you to take over? I'll assume it would be way easier for you to finish this :slightly_smiling_face: 
   Apologies again.




[GitHub] [arrow] alexdesiqueira commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
alexdesiqueira commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1194713275

   @pitrou thank you! I appreciate it. Thanks for the reviews also; I learned a lot in the process.




[GitHub] [arrow] pitrou commented on a diff in pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #13144:
URL: https://github.com/apache/arrow/pull/13144#discussion_r876053758


##########
python/pyarrow/includes/libarrow.pxd:
##########
@@ -1262,6 +1262,11 @@ cdef extern from "arrow/io/api.h" namespace "arrow::io" nogil:
                                                                   Seekable):
         CResult[int64_t] GetSize()
 
+        CResult[shared_ptr[CInputStream]] GetStream(

Review Comment:
   Note that `GetStream` is a static C++ method, so you must reflect that in the Cython declaration.





[GitHub] [arrow] alexdesiqueira commented on pull request #13144: ARROW-16356: [Python] Expose RandomAccessFile::GetStream

Posted by GitBox <gi...@apache.org>.
alexdesiqueira commented on PR #13144:
URL: https://github.com/apache/arrow/pull/13144#issuecomment-1163802255

   I'm back @pitrou :slightly_smiling_face: 
   For these tests, I'm trying to emulate [cpp/src/arrow/io/memory_test.cc:320](https://github.com/apache/arrow/blob/8daa7a4ed5629c0020dadf7325a6b523bdfc62e9/cpp/src/arrow/io/memory_test.cc#L320):
   
   ```c++
     // Read to end of each stream
     ASSERT_OK_AND_EQ(2, stream1->Read(4, buf3));
     ASSERT_EQ(0, std::memcmp(buf3, "a2", 2));
     ASSERT_OK_AND_EQ(10, stream1->Tell());
   
     ASSERT_OK_AND_EQ(0, stream1->Read(1, buf3));
     ASSERT_OK_AND_EQ(10, stream1->Tell());
   
     // stream2 had its extent limited
     ASSERT_OK_AND_ASSIGN(buf2, stream2->Read(20));
     ASSERT_TRUE(SliceBuffer(buf, 13, 12)->Equals(*buf2));
   
     ASSERT_OK_AND_ASSIGN(buf2, stream2->Read(1));
     ASSERT_EQ(0, buf2->size());
     ASSERT_OK_AND_EQ(16, stream2->Tell());
   
     ASSERT_OK(stream1->Close());
   
     // idempotent
     ASSERT_OK(stream1->Close());
     ASSERT_TRUE(stream1->closed());
   
     // Check whether closed
     ASSERT_RAISES(IOError, stream1->Tell());
     ASSERT_RAISES(IOError, stream1->Read(1));
     ASSERT_RAISES(IOError, stream1->Read(1, buf3));
   ```
   What do you think of what we have so far? I'm kind of lost on the ones that are still missing. Any ideas I could use?
   Thanks!

