Posted to dev@parquet.apache.org by Xinli shang <sh...@uber.com.INVALID> on 2019/10/28 20:54:30 UTC

PARQUET-1685:Truncate Min/Max for Statistics

Hi all,

If you know of any Parquet application that requires the min/max values in
statistics to be actual column values, please let me know. Also, if you have
any questions or concerns about this feature, feel free to comment
in the design document
<https://docs.google.com/document/d/1Mgb0dDXQJkgjouboDrGa9v06hWGJ0oPiwnmffXShQ_M>
or
in the Jira ticket PARQUET-1685
<https://issues.apache.org/jira/browse/PARQUET-1685>.

-- 
Xinli Shang

Re: PARQUET-1685:Truncate Min/Max for Statistics

Posted by Wes McKinney <we...@gmail.com>.
This seems useful to me as opt-in behavior. We implemented an
option to disable the storage of statistics in C++, but an
option to truncate large statistics also seems worthwhile.
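
To illustrate why truncation means the stored min/max may no longer be
actual column values, here is a minimal sketch in Python. It is not the
parquet-mr or parquet-cpp implementation; the function names truncate_min
and truncate_max are hypothetical, and it assumes binary values are compared
as unsigned bytes in lexicographic order. The truncated values remain valid
lower and upper bounds, which is all a reader needs for filtering.

```python
def truncate_min(value: bytes, length: int) -> bytes:
    """Return a lower bound: a prefix always sorts <= the full value."""
    return value[:length]


def truncate_max(value: bytes, length: int) -> bytes:
    """Return an upper bound no longer than `length` bytes, if one exists."""
    if len(value) <= length:
        return value  # already short enough, keep the real value
    prefix = bytearray(value[:length])
    # Trailing 0xFF bytes cannot be incremented, so drop them first.
    while prefix and prefix[-1] == 0xFF:
        prefix.pop()
    if not prefix:
        # Every kept byte was 0xFF: no shorter upper bound, keep the original.
        return value
    prefix[-1] += 1  # the incremented prefix sorts above the full value
    return bytes(prefix)


# Hypothetical column value used only for illustration.
real_value = b"parquet-statistics"
lower = truncate_min(real_value, 4)
upper = truncate_max(real_value, 4)
print(lower, upper)
```

Here the stored bounds would be b"parq" and b"parr": neither appears in the
column, but lower <= real_value <= upper still holds, so an application that
only uses the statistics for pruning is unaffected, while one that reads them
back as real values would not be.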
