Posted to dev@parquet.apache.org by Felipe Aramburu <fe...@blazingdb.com> on 2017/09/05 18:46:34 UTC

ByteArray max length statistic

Is there any way to know the maximum length of the parquet::ByteArray
values stored in a column, per row group, so that the maximum possible
string size is known in advance?

Felipe

Re: ByteArray max length statistic

Posted by Ryan Blue <rb...@netflix.com.INVALID>.
There isn't metadata in the footer for this. I think the only thing you can
do is to read the dictionary when a column is entirely dictionary-encoded.
That provides easy access, but there isn't always a dictionary. Plus, you
have to read the whole dictionary page and decode it, which defeats the
purpose of knowing how large a string may be before you allocate memory for
it.

rb
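
The dictionary approach described above could be sketched roughly as
follows. This is only an illustration, not parquet-cpp code: the ByteArray
struct is a local stand-in that mirrors the len/ptr layout of
parquet::ByteArray, MaxByteArrayLength is a hypothetical helper name, and
the sketch assumes the dictionary page has already been read and decoded
into a vector by the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Local stand-in for parquet::ByteArray (a length plus a pointer to the
// bytes), defined here so the sketch compiles without parquet-cpp.
struct ByteArray {
  uint32_t len;
  const uint8_t* ptr;
};

// Hypothetical helper: returns the length of the longest value among the
// decoded dictionary entries of one column chunk, or 0 if there are none.
// The result bounds the string size for the row group only if every data
// page in that column chunk is dictionary-encoded.
uint32_t MaxByteArrayLength(const std::vector<ByteArray>& dictionary) {
  uint32_t max_len = 0;
  for (const ByteArray& value : dictionary) {
    // A non-empty value must point at actual bytes.
    assert(value.len == 0 || value.ptr != nullptr);
    max_len = std::max(max_len, value.len);
  }
  return max_len;
}
```

Note that this still requires decoding the whole dictionary page first, so
it only helps when the dictionary is much smaller than the column data.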

On Tue, Sep 5, 2017 at 11:46 AM, Felipe Aramburu <fe...@blazingdb.com>
wrote:

> Is there any way to know the maximum length of the parquet::ByteArray
> values stored in a column, per row group, so that the maximum possible
> string size is known in advance?
>
> Felipe
>



-- 
Ryan Blue
Software Engineer
Netflix