Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/21 18:12:54 UTC
[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #12909: ARROW-15796: [Python] Pickling ParquetFileFragment shouldn't fetch metadata
jorisvandenbossche commented on code in PR #12909:
URL: https://github.com/apache/arrow/pull/12909#discussion_r855459023
##########
python/pyarrow/_dataset_parquet.pyx:
##########
@@ -327,6 +331,7 @@ cdef class ParquetFileFragment(FileFragment):
     @property
     def metadata(self):
         self.ensure_complete_metadata()
+
Review Comment:
Small nitpick, but it's best to avoid such (stylistic) changes in code that is otherwise not touched.
##########
python/pyarrow/_dataset_parquet.pyx:
##########
@@ -301,7 +301,11 @@ cdef class ParquetFileFragment(FileFragment):
     def __reduce__(self):
         buffer = self.buffer
-        row_groups = [row_group.id for row_group in self.row_groups]
+        if not bool(self.parquet_file_fragment.row_groups()):
Review Comment:
Can you maybe add a comment to clarify that if `parquet_file_fragment.row_groups()` is empty, this means that the metadata information is not yet populated?
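To illustrate the kind of comment meant here, a minimal pure-Python sketch of the pattern follows. It is not the pyarrow implementation: `LazyFragment`, `metadata_loads`, and the hard-coded row group ids are hypothetical stand-ins, and the sketch assumes "not yet populated" is represented as `None` rather than as an empty list.

```python
class LazyFragment:
    """Hypothetical stand-in for a fragment whose metadata loads lazily."""

    def __init__(self, path, row_groups=None):
        self.path = path
        # None means "metadata not populated yet", not "zero row groups".
        self._row_groups = row_groups
        self.metadata_loads = 0

    def ensure_complete_metadata(self):
        # Stand-in for the expensive read of the Parquet footer.
        if self._row_groups is None:
            self.metadata_loads += 1
            self._row_groups = [0, 1, 2]

    @property
    def row_groups(self):
        self.ensure_complete_metadata()
        return self._row_groups

    def __reduce__(self):
        # Read the private field directly: going through the `row_groups`
        # property would load the metadata just to pickle the object.
        return LazyFragment, (self.path, self._row_groups)


fragment = LazyFragment("data.parquet")

# Unpickling calls the returned callable with the returned arguments,
# so this mirrors a pickle round trip without touching the metadata.
cls, args = fragment.__reduce__()
restored = cls(*args)
```

In this sketch the round trip leaves both objects unpopulated, and the metadata is only loaded when `restored.row_groups` is first accessed.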
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org