Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/09 12:31:13 UTC
[GitHub] [arrow-rs] alamb commented on issue #1666: Handling Unsupported Arrow Types in Parquet
alamb commented on issue #1666:
URL: https://github.com/apache/arrow-rs/issues/1666#issuecomment-1121038516
> I think that we should come up with a consistent approach to dealing with this kind of Arrow type.
100% agree with @viirya on this one
> The final option is the simplest to implement, the least surprising to users, and what I would propose implementing. It would break ecosystem interoperability in certain cases, but I think it is more important that we faithfully round-trip data than maintain maximal compatibility.
I agree with @jorgecarleitao and @tustvold that this (option 3) would be best for some users (such as IOx). However, as @viirya mentions, it may not be the "least surprising" behavior for people using a broader selection of ecosystem tools.
Maybe we could add an option to `WriterProperties` that lets users choose the desired tradeoff. Something like:
```rust
/// Converts data prior to writing to maximize compatibility with other
/// parquet implementations. This may result in losing precision on some data
/// types that cannot be represented by every implementation, such as
/// `Timestamp(Nanosecond)`, since the legacy parquet converted types only
/// cover millisecond and microsecond precision timestamps.
maximize_ecosystem_compatibility: bool,
```
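To make the tradeoff concrete, here is a rough, self-contained sketch of how such a flag might steer the writer. Everything in it is an assumption for illustration: the `WriterProperties` and `WriterPropertiesBuilder` structs are stand-ins rather than the real parquet crate types, `set_maximize_ecosystem_compatibility` and `prepare_timestamps` are hypothetical names, the reduced `TimeUnit` enum only mirrors the Arrow one, and defaulting the flag to `false` (faithful round-trip) is just one possible choice.

```rust
/// Hypothetical stand-in for the parquet crate's `WriterProperties`;
/// only the proposed flag is modelled here.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct WriterProperties {
    maximize_ecosystem_compatibility: bool,
}

impl WriterProperties {
    pub fn builder() -> WriterPropertiesBuilder {
        WriterPropertiesBuilder::default()
    }

    pub fn maximize_ecosystem_compatibility(&self) -> bool {
        self.maximize_ecosystem_compatibility
    }
}

/// Hypothetical builder. The flag defaults to `false`, i.e. data is
/// written as-is so that it round-trips faithfully (an assumption).
#[derive(Debug, Default)]
pub struct WriterPropertiesBuilder {
    maximize_ecosystem_compatibility: bool,
}

impl WriterPropertiesBuilder {
    pub fn set_maximize_ecosystem_compatibility(mut self, value: bool) -> Self {
        self.maximize_ecosystem_compatibility = value;
        self
    }

    pub fn build(self) -> WriterProperties {
        WriterProperties {
            maximize_ecosystem_compatibility: self.maximize_ecosystem_compatibility,
        }
    }
}

/// Reduced, illustrative version of Arrow's `TimeUnit`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TimeUnit {
    Millisecond,
    Nanosecond,
}

/// Hypothetical pre-write step: returns the unit and values that would
/// actually be handed to the column writer.
pub fn prepare_timestamps(
    props: &WriterProperties,
    unit: TimeUnit,
    values: &[i64],
) -> (TimeUnit, Vec<i64>) {
    match (props.maximize_ecosystem_compatibility(), unit) {
        // Lossy path: truncate nanoseconds to milliseconds, rounding
        // towards negative infinity so pre-epoch values stay ordered.
        (true, TimeUnit::Nanosecond) => (
            TimeUnit::Millisecond,
            values.iter().map(|v| v.div_euclid(1_000_000)).collect(),
        ),
        // Faithful path: keep the unit and values untouched.
        _ => (unit, values.to_vec()),
    }
}
```

With the flag set, `prepare_timestamps(&props, TimeUnit::Nanosecond, &values)` would hand back millisecond values and drop the sub-millisecond part; with the flag unset, the same call returns the input unchanged.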