Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2023/01/10 08:36:30 UTC

[GitHub] [iceberg] peay commented on pull request #6470: Spark: Allow specifying file format in RewriteDataFiles

peay commented on PR #6470:
URL: https://github.com/apache/iceberg/pull/6470#issuecomment-1376904416

   The motivation in https://github.com/apache/iceberg/issues/6464 is to let a streaming pipeline write Avro, since a row-based format can make sense for small but frequent micro-batches, and then compact those files to Parquet for longer-term batch analytics. This can be done today by configuring the table as Parquet and explicitly setting the streaming writer to write Avro, but that is a bit less flexible, and I think it would make sense to keep such details in the compaction settings instead.
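
   For illustration, here is a rough sketch of the workaround described above, written as small Python helpers that only build the SQL text and writer options (no Spark session is needed to run them). The catalog and table names are hypothetical. The sketch assumes the Iceberg table property `write.format.default`, the Spark write option `write-format`, and the `rewrite_data_files` procedure, and it assumes that compaction writes new files in the table's default format.

```python
# Hypothetical identifiers, for illustration only.
CATALOG = "demo"
TABLE = "db.events"


def table_default_format_sql(catalog: str, table: str, file_format: str = "parquet") -> str:
    """Build SQL that sets the table-level default write format.

    Assumes the Iceberg table property `write.format.default`.
    """
    return (
        f"ALTER TABLE {catalog}.{table} "
        f"SET TBLPROPERTIES ('write.format.default' = '{file_format}')"
    )


def streaming_writer_options(file_format: str = "avro") -> dict:
    """Build per-write options that override the table default for one writer.

    Assumes the Iceberg Spark write option `write-format`.
    """
    return {"write-format": file_format}


def compaction_sql(catalog: str, table: str) -> str:
    """Build a call to the `rewrite_data_files` procedure.

    Assumption: the rewritten files use the table's default format
    (Parquet here), which is what makes the workaround possible.
    """
    return f"CALL {catalog}.system.rewrite_data_files(table => '{table}')"


# With a live SparkSession `spark` and a streaming DataFrame `df`, the
# pieces might be wired together roughly like this (not executed here):
#
#   spark.sql(table_default_format_sql(CATALOG, TABLE))
#   (df.writeStream.format("iceberg")
#      .options(**streaming_writer_options())
#      .option("checkpointLocation", "/tmp/checkpoints/events")  # hypothetical path
#      .toTable(f"{CATALOG}.{TABLE}"))
#   spark.sql(compaction_sql(CATALOG, TABLE))
```

   The drawback is visible in the sketch: the Avro choice lives in every streaming writer's options rather than in the compaction step.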


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

