Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/07/09 18:26:49 UTC

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #7692: ARROW-9321: [C++][Dataset] Populate statistics opportunistically

jorisvandenbossche commented on a change in pull request #7692:
URL: https://github.com/apache/arrow/pull/7692#discussion_r452408096



##########
File path: python/pyarrow/_dataset.pyx
##########
@@ -909,13 +909,24 @@ cdef class ParquetFileFragment(FileFragment):
 
     def __reduce__(self):
         buffer = self.buffer
+        if self.row_groups is not None:

Review comment:
       Yeah, I realized during the meeting we had that we were not yet pickling the row group IDs, and I was planning to open a JIRA / do a quick PR, but you already fixed it ;)
   
   (It didn't fail before because we simply didn't include any row group information in the serialization.)
   
   Only preserving the row group IDs (as you do here) should be sufficient for now.
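
   To illustrate what preserving only the row group IDs means, here is a minimal pure-Python sketch. `FragmentSketch`, its `path` and `row_group_ids` attributes, and the example file name are all hypothetical stand-ins and not the actual `ParquetFileFragment` implementation; the sketch only assumes that the IDs are passed back to the constructor via `__reduce__`.

```python
class FragmentSketch:
    """Hypothetical stand-in for a file fragment; not the pyarrow class."""

    def __init__(self, path, row_group_ids=None):
        self.path = path
        # None means "all row groups"; otherwise keep a plain list of IDs.
        self.row_group_ids = (
            None if row_group_ids is None else list(row_group_ids)
        )

    def __reduce__(self):
        # Return (callable, args). On load, pickle calls callable(*args),
        # so anything left out of args is lost in the round trip.
        return FragmentSketch, (self.path, self.row_group_ids)


original = FragmentSketch("part-0.parquet", row_group_ids=[0, 2])

# Rebuild the object the same way pickle does on load: call the
# returned callable with the returned arguments.
rebuild, args = original.__reduce__()
restored = rebuild(*args)
```

   If `__reduce__` returned only `(self.path,)`, the round trip would still succeed, but the restored object would silently refer to all row groups, which matches the "it didn't fail" observation above.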



