Posted to dev@orc.apache.org by Korry Douglas <ko...@me.com.INVALID> on 2018/11/01 20:28:57 UTC

Questions about C++ interface

Hi all - I’m an absolute ORC newbie so please forgive the novice questions.

I’m trying to use the C++ API to implement a reader.  I have two questions (to start with):

1) I’ve been reading the docs at https://orc.apache.org/docs/core-cpp.html - is there more complete documentation that describes all of the public classes that I can use?

2) My understanding is that an ORC file contains min/max indexes that can help me reduce the amount of data I have to read to satisfy a given query.  Is there a class that I can use to specify the range of values that I want to read?  Or do I have to read the statistics myself (for each stripe?) and manually inspect the min/max values?

Thanks in advance.


             — Korry

Re: Questions about C++ interface

Posted by Korry Douglas <ko...@me.com.INVALID>.
Thanks for the help - much appreciated.

          — Korry

> On Nov 1, 2018, at 7:19 PM, Xiening Dai <xn...@live.com> wrote:
> 
> 1) You can find the public headers in c++/include/orc. All the classes and methods have good documentation in the code. You can also take a look at the sample C++ code under tools/src, especially FileContents.cc and FileScan.cc. Both demonstrate usage of the C++ reader.
> 
> 2) What you are describing is a feature we call “Predicate Pushdown”. Unfortunately, it is not currently supported by the C++ reader. The Java reader does support it through the SearchArgument class. I assume the implementation would be similar for C++ when we add this support in the future.


Re: Questions about C++ interface

Posted by Xiening Dai <xn...@live.com>.
1) You can find the public headers in c++/include/orc. All the classes and methods have good documentation in the code. You can also take a look at the sample C++ code under tools/src, especially FileContents.cc and FileScan.cc. Both demonstrate usage of the C++ reader.

2) What you are describing is a feature we call “Predicate Pushdown”. Unfortunately, it is not currently supported by the C++ reader. The Java reader does support it through the SearchArgument class. I assume the implementation would be similar for C++ when we add this support in the future.
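
Until pushdown is available, the manual approach from the question (inspect each stripe's min/max and skip stripes that cannot match) might look roughly like the sketch below. This is only an illustration of the pruning logic, not ORC library code: StripeRange, mayContain and stripesToRead are hypothetical names, and the min/max values are assumed to have been copied out of each stripe's statistics for a single integer column.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one integer column's statistics in one stripe.
// In a real reader these values would come from the stripe statistics.
struct StripeRange {
    bool hasStats;    // false when no min/max was recorded for the stripe
    int64_t minimum;  // smallest value in the stripe (valid if hasStats)
    int64_t maximum;  // largest value in the stripe (valid if hasStats)
};

// Returns true when the stripe may hold a value in the closed range [lo, hi].
// A stripe without statistics can never be ruled out, so it must be read.
bool mayContain(const StripeRange& s, int64_t lo, int64_t hi) {
    if (!s.hasStats) {
        return true;
    }
    // Two closed intervals overlap when each starts before the other ends.
    return s.minimum <= hi && s.maximum >= lo;
}

// Returns the indexes of the stripes that still have to be read for [lo, hi].
std::vector<std::size_t> stripesToRead(const std::vector<StripeRange>& stripes,
                                       int64_t lo, int64_t hi) {
    std::vector<std::size_t> keep;
    for (std::size_t i = 0; i < stripes.size(); ++i) {
        if (mayContain(stripes[i], lo, hi)) {
            keep.push_back(i);
        }
    }
    return keep;
}
```

For example, given four stripes covering 0..99, 100..199, one with no statistics, and 200..299, a query for values in 150..220 keeps stripes 1, 2 and 3 and skips stripe 0. Note that min/max can only prove a stripe irrelevant; rows in the remaining stripes still have to be filtered individually.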

