Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/21 18:12:54 UTC

[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #12909: ARROW-15796: [Python] Pickling ParquetFileFragment shouldn't fetch metadata

jorisvandenbossche commented on code in PR #12909:
URL: https://github.com/apache/arrow/pull/12909#discussion_r855459023


##########
python/pyarrow/_dataset_parquet.pyx:
##########
@@ -327,6 +331,7 @@ cdef class ParquetFileFragment(FileFragment):
     @property
     def metadata(self):
         self.ensure_complete_metadata()
+

Review Comment:
   Small nitpick: it's best to avoid purely stylistic changes like this one in code that is otherwise untouched.



##########
python/pyarrow/_dataset_parquet.pyx:
##########
@@ -301,7 +301,11 @@ cdef class ParquetFileFragment(FileFragment):
 
     def __reduce__(self):
         buffer = self.buffer
-        row_groups = [row_group.id for row_group in self.row_groups]
+        if not bool(self.parquet_file_fragment.row_groups()):

Review Comment:
   Can you add a comment clarifying that an empty `parquet_file_fragment.row_groups()` means the metadata has not been populated yet?
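
To illustrate the kind of comment being asked for, here is a minimal, self-contained sketch of the pattern. It is not the pyarrow implementation: `LazyFragment`, `_raw_row_groups`, `_ensure_metadata` and `metadata_loads` are hypothetical names invented for this example, and the "metadata load" is simulated with a hard-coded list.

```python
class LazyFragment:
    """Hypothetical stand-in for a fragment whose metadata is loaded lazily."""

    def __init__(self, path, row_groups=None):
        self.path = path
        # An empty list means the metadata has not been populated yet.
        self._raw_row_groups = list(row_groups) if row_groups else []
        self.metadata_loads = 0  # counts simulated (expensive) metadata reads

    def _ensure_metadata(self):
        # Simulates reading the file footer; assumed to be expensive.
        if not self._raw_row_groups:
            self.metadata_loads += 1
            self._raw_row_groups = [0, 1, 2]

    @property
    def row_groups(self):
        # Public accessor: triggers a metadata load when needed.
        self._ensure_metadata()
        return self._raw_row_groups

    def __reduce__(self):
        # If the raw row-group list is empty, the metadata is not yet
        # populated. Pass None instead of going through the `row_groups`
        # property, which would force a metadata load just to pickle.
        if not self._raw_row_groups:
            row_groups = None
        else:
            row_groups = list(self._raw_row_groups)
        return LazyFragment, (self.path, row_groups)
```

With this shape, `pickle.dumps(LazyFragment("data.parquet"))` should serialize only the path and `None`, and the restored object loads its metadata on first access to `row_groups`.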


