Posted to dev@parquet.apache.org by Zoltan Ivanfi <zi...@cloudera.com.INVALID> on 2018/10/01 12:58:00 UTC
Row group layout anomalies
Hi,
PARQUET-1337 describes the problem of ending up with a drastically
different (and worse) row group layout than intended under certain
circumstances.
A few weeks ago I started tweaking the logic that controls this in a
test-driven fashion. I found that fixing one problem repeatedly led
to the discovery of another. After playing this whack-a-mole for a
while, I ended up with a much more fundamental change than I originally
intended, and there is still room (and need) for improvement.
Due to the potential impact of these changes, I have put together a design
doc that describes all the problems I could identify and some possible
fixes for them:
https://docs.google.com/document/d/1FJAVwzszZGkxZa8FtKtSbgBKm7qkS4cXuNW8hl4YKwU/edit#
If you are interested, please review and comment on the document.
Thanks,
Zoltan