Posted to user@orc.apache.org by Korry Douglas <ko...@me.com> on 2019/03/01 14:41:20 UTC

How to handle ColumnStatistics in C++

Hi all,

I’m trying to use the ColumnStatistics returned by Reader::getColumnStatistics().  

ColumnStatistics is a superclass and the actual type of each column statistic varies by data type.  For example, StringColumnStatistics, BinaryColumnStatistics, and BooleanColumnStatistics are all subclasses of ColumnStatistics.

What is the best way to find the actual type (subclass) of a given ColumnStatistics pointer?

Trying to use dynamic_cast is messy because Reader::getColumnStatistics() returns an ORC_UNIQUE_PTR<ColumnStatistics>.

Trying to use typeid() is problematic - the typeid() is not (for example) StringColumnStatistics as I would have expected but StringColumnStatisticsImpl (and that class is something I’m not supposed to peek at).

As far as I can see, there is no member function or data member in ColumnStatistics that will explicitly tell me the subtype.

Any suggestions?

Thanks in advance.


            — Korry

Re: How to handle ColumnStatistics in C++

Posted by Gang Wu <ga...@apache.org>.
Yes, you are right. This interface returns the column statistics of all
columns, and each column's type can be found from the type information in
the file footer.

On Fri, Mar 1, 2019 at 10:04 AM Korry Douglas <ko...@me.com> wrote:

> I think I’ve figured this out - I have to look at the column type and then
> infer which of the ColumnStatistics subtypes to use.  Is that right?
>
>
>               — Korry
>
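
The approach confirmed above (look at the column's type, then treat the statistics object as the matching subclass) can be sketched roughly as follows. This is only an illustration: the classes below are hypothetical stand-ins that mimic the shape of the ORC C++ interfaces, not the real orc:: headers, and Kind, makeStats(), describe() and demo() are invented names. The point it tries to show is that get() on the smart pointer yields a raw pointer that dynamic_cast accepts, and that the cast to the public interface succeeds even though typeid() reports the hidden Impl class.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <typeinfo>

// Hypothetical stand-ins for the public interfaces (assumed shape only).
enum class Kind { BOOLEAN, STRING, BINARY };

class ColumnStatistics {
public:
  virtual ~ColumnStatistics() = default;
  virtual uint64_t getNumberOfValues() const = 0;
};

class StringColumnStatistics : public ColumnStatistics {
public:
  virtual const std::string& getMinimum() const = 0;
  virtual const std::string& getMaximum() const = 0;
};

class BooleanColumnStatistics : public ColumnStatistics {
public:
  virtual uint64_t getTrueCount() const = 0;
};

// Stand-ins for the hidden implementation classes that typeid() reports.
class StringColumnStatisticsImpl : public StringColumnStatistics {
  std::string min_{"apple"}, max_{"pear"};
public:
  uint64_t getNumberOfValues() const override { return 2; }
  const std::string& getMinimum() const override { return min_; }
  const std::string& getMaximum() const override { return max_; }
};

class BooleanColumnStatisticsImpl : public BooleanColumnStatistics {
public:
  uint64_t getNumberOfValues() const override { return 5; }
  uint64_t getTrueCount() const override { return 3; }
};

// Hypothetical factory playing the role of Reader::getColumnStatistics().
std::unique_ptr<ColumnStatistics> makeStats(Kind kind) {
  if (kind == Kind::STRING) {
    return std::unique_ptr<ColumnStatistics>(new StringColumnStatisticsImpl());
  }
  return std::unique_ptr<ColumnStatistics>(new BooleanColumnStatisticsImpl());
}

// Choose the subclass from the column kind, then downcast the raw pointer.
// Assumes stats is non-null; falls back to the base interface on a mismatch.
std::string describe(Kind kind, const ColumnStatistics* stats) {
  switch (kind) {
    case Kind::STRING:
      if (auto* s = dynamic_cast<const StringColumnStatistics*>(stats)) {
        return "string [" + s->getMinimum() + ", " + s->getMaximum() + "]";
      }
      break;
    case Kind::BOOLEAN:
      if (auto* b = dynamic_cast<const BooleanColumnStatistics*>(stats)) {
        return "boolean true=" + std::to_string(b->getTrueCount());
      }
      break;
    default:
      break;
  }
  return "generic count=" + std::to_string(stats->getNumberOfValues());
}

// Small self-check that exercises the sketch.
void demo() {
  std::unique_ptr<ColumnStatistics> stats = makeStats(Kind::STRING);
  // The dynamic type is the Impl class, yet the interface cast still works.
  assert(typeid(*stats) == typeid(StringColumnStatisticsImpl));
  assert(describe(Kind::STRING, stats.get()) == "string [apple, pear]");
}
```

Calling demo() from a main() function runs the check. With the real library, the column kind would come from the file's type information rather than from a hand-written enum, and the null check on the dynamic_cast result guards against a column whose statistics do not match the expected subclass.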

Re: How to handle ColumnStatistics in C++

Posted by Korry Douglas <ko...@me.com>.
I think I’ve figured this out - I have to look at the column type and then infer which of the ColumnStatistics subtypes to use.  Is that right?  


              — Korry
