Posted to github@arrow.apache.org by "judahrand (via GitHub)" <gi...@apache.org> on 2023/10/02 09:48:11 UTC

Re: [PR] GH-35331: [Python] Expose Parquet sorting metadata [arrow]

judahrand commented on code in PR #37665:
URL: https://github.com/apache/arrow/pull/37665#discussion_r1342485832


##########
python/pyarrow/_parquet.pyx:
##########
@@ -505,6 +506,204 @@ cdef class ColumnChunkMetaData(_Weakrefable):
         return self.metadata.GetColumnIndexLocation().has_value()
 
 
+cdef class SortingColumn:
+    """
+    Sorting specification for a single column.
+
+    Returned by :meth:`RowGroupMetaData.sorting_columns` and used in
+    :class:`ParquetWriter` to specify the sort order of the data.
+
+    Parameters
+    ----------
+    column_index : int
+        Index of column that data is sorted by.
+    descending : bool, default False
+        Whether column is sorted in descending order.
+    nulls_first : bool, default False
+        Whether null values appear before valid values.
+
+    Notes
+    -----
+
+    Column indices are zero-based, refer only to leaf fields, and are in

Review Comment:
   This is made clear in the docstrings of the Parquet Writer: https://github.com/apache/arrow/blob/5db4e8e243ec8ca6724252bca7f478a0c417e76e/python/pyarrow/parquet/core.py#L885-L888
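
To illustrate why leaf-only indexing matters for nested schemas, here is a minimal sketch in plain Python. It does not use pyarrow: `leaf_paths` and the toy `schema` are hypothetical names invented for this example, and the depth-first numbering of leaves is an assumption, since the quoted docstring line is cut off before it states the ordering.

```python
# Hypothetical helper, NOT part of pyarrow: enumerate the leaf fields of a
# nested schema. Assumption: leaves are numbered in depth-first order.
def leaf_paths(fields, prefix=""):
    """Return dotted paths of all leaf fields, in depth-first order."""
    paths = []
    for name, children in fields:
        path = f"{prefix}.{name}" if prefix else name
        if children:
            # Struct-like field: it is not a leaf itself, so descend into it.
            paths.extend(leaf_paths(children, path))
        else:
            paths.append(path)
    return paths


# Toy schema as (name, children) pairs; "point" is a struct with two leaves.
schema = [
    ("id", []),
    ("point", [("x", []), ("y", [])]),
    ("label", []),
]

leaves = leaf_paths(schema)
top_level_index = [name for name, _ in schema].index("label")
leaf_index = leaves.index("label")

print(leaves)
print(top_level_index, leaf_index)
```

Under these assumptions, `label` is the third top-level field but the fourth leaf, so the `column_index` for it would differ from its top-level position.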


