Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 19:57:48 UTC

[GitHub] [beam] damccorm opened a new issue, #20857: DataFrame API: Consider allowing partitioning by column in addition to Index

damccorm opened a new issue, #20857:
URL: https://github.com/apache/beam/issues/20857

   For some DataFrame use cases it may be beneficial to partition a dataset by column as well as by index.
   
   One example is computing correlations in a DataFrame with a very large number of columns. Each pairwise column correlation depends only on its two columns, so it would be beneficial to be able to compute the pairs on separate workers.
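
A rough sketch of the idea, using only the Python standard library rather than the Beam DataFrame API: each column pair becomes an independent task that needs just its own two columns. The names `pearson`, `pairwise_correlations`, and the toy `data` are hypothetical, and the thread pool only stands in for separate workers; this is not how Beam partitions data today.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(columns, max_workers=4):
    """Compute the correlation of every column pair as an independent task.

    `columns` maps a column name to its list of values. Each task reads only
    the two columns in its pair, so tasks could be placed on separate workers.
    """
    pairs = list(combinations(sorted(columns), 2))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # The thread pool is an assumption standing in for distributed workers.
        results = pool.map(
            lambda pair: pearson(columns[pair[0]], columns[pair[1]]), pairs
        )
        return dict(zip(pairs, results))


# Hypothetical toy data standing in for a DataFrame with many columns.
data = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],
    "c": [4.0, 3.0, 2.0, 1.0],
}
corr = pairwise_correlations(data)
```

With n columns this produces n * (n - 1) / 2 tasks, which is why a column-wise partitioning could matter when n is very large.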
   
   Imported from Jira [BEAM-12132](https://issues.apache.org/jira/browse/BEAM-12132). Original Jira may contain additional context.
   Reported by: bhulette.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org