Posted to github@arrow.apache.org by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/04/27 14:41:42 UTC

[GitHub] [arrow] wgtmac commented on pull request #35351: GH-35331: [C++][Parquet] Parquet Export Footer metadata SortColumns

wgtmac commented on PR #35351:
URL: https://github.com/apache/arrow/pull/35351#issuecomment-1525822169

   I was wondering whether we can reuse `arrow::compute::Ordering`: https://github.com/apache/arrow/blob/main/cpp/src/arrow/compute/ordering.h#L61-L117
   
   However, it differs slightly from `parquet::format::SortingColumn`: https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L682-L692
   
   `arrow::compute::Ordering` applies a single null placement to all sort keys, whereas Parquet allows the null placement to vary from column to column.
   
   In most cases the null placement should be consistent within the same engine, so I think we can simply reuse `arrow::compute::Ordering` and not return any sorting columns if the ones in the RowGroupMetadata specify different null placements across columns.
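
A rough sketch of the conversion I have in mind. To keep it self-contained it uses simplified stand-in types rather than the real Arrow and Parquet classes: `SortingColumn` mirrors the three fields of the Thrift struct (`column_idx`, `descending`, `nulls_first`), while `SortKey`, `Ordering`, and the helper name `ToOrdering` are hypothetical and only approximate the shape of `arrow::compute::Ordering`. Returning nothing for an empty list is also an assumption.

```cpp
#include <optional>
#include <vector>

// Stand-in for parquet::format::SortingColumn (fields as in parquet.thrift).
struct SortingColumn {
  int column_idx;    // index of the column within the row group
  bool descending;   // true if sorted in descending order
  bool nulls_first;  // true if nulls sort before non-null values
};

// Simplified stand-ins that approximate arrow::compute::Ordering (hypothetical).
enum class SortOrder { Ascending, Descending };
enum class NullPlacement { AtStart, AtEnd };

struct SortKey {
  int column_idx;
  SortOrder order;
};

struct Ordering {
  std::vector<SortKey> sort_keys;
  NullPlacement null_placement;  // one placement shared by all sort keys
};

// Converts per-column sorting metadata into a single ordering.
// Returns std::nullopt when there are no sorting columns or when the
// columns disagree on null placement, since a single Ordering cannot
// represent mixed null placement.
std::optional<Ordering> ToOrdering(const std::vector<SortingColumn>& columns) {
  if (columns.empty()) return std::nullopt;

  const bool nulls_first = columns.front().nulls_first;
  Ordering ordering;
  ordering.null_placement =
      nulls_first ? NullPlacement::AtStart : NullPlacement::AtEnd;

  for (const auto& col : columns) {
    if (col.nulls_first != nulls_first) return std::nullopt;
    ordering.sort_keys.push_back(
        {col.column_idx,
         col.descending ? SortOrder::Descending : SortOrder::Ascending});
  }
  return ordering;
}
```

With this shape, callers get an ordering only when every sorting column agrees on null placement; files written with mixed placement simply report no ordering instead of a misleading one.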
   
   WDYT? @mapleFU @wjones127 @pitrou 

