Posted to dev@orc.apache.org by Praveen Krishna <pr...@tutanota.com> on 2019/06/06 11:14:42 UTC
Pluggable index for ORC
Hi all,
Can we have a custom stream (a custom StreamKind enum value) that holds an index for some columns? We are trying to index a VARCHAR column in an ORC file using Lucene and store the index as part of that VARCHAR column. For a VARCHAR column we currently have three streams:
1. Present stream
2. Data stream
3. Length stream
It would be better if we could have an Index stream, or a StreamKind that represents an index chunk, so that in the future an index for some columns can be computed and stored as part of that column.
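To make the request concrete, here is a minimal sketch of what it might look like. It is a standalone Java model, not the ORC API: the class ColumnStreamsSketch, the INDEX kind, and the varcharStreams helper are all hypothetical names used only for illustration, and the index payload is treated as opaque bytes (for example, a serialized Lucene index).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of per-column streams; this is NOT the real ORC API.
final class ColumnStreamsSketch {

    // The three stream kinds listed above for a VARCHAR column, plus a
    // hypothetical INDEX kind that this message is asking for.
    enum StreamKind { PRESENT, DATA, LENGTH, INDEX }

    // A stream is a kind plus opaque bytes that belong to one column.
    static final class Stream {
        final int columnId;
        final StreamKind kind;
        final byte[] payload;

        Stream(int columnId, StreamKind kind, byte[] payload) {
            this.columnId = columnId;
            this.kind = kind;
            this.payload = payload;
        }
    }

    // Builds the streams for one VARCHAR column. The INDEX stream is only
    // emitted when the caller supplies index bytes, so columns without an
    // index keep exactly the three streams they have today.
    static List<Stream> varcharStreams(int columnId, byte[] present, byte[] data,
                                       byte[] lengths, byte[] indexBytes) {
        List<Stream> streams = new ArrayList<>();
        streams.add(new Stream(columnId, StreamKind.PRESENT, present));
        streams.add(new Stream(columnId, StreamKind.DATA, data));
        streams.add(new Stream(columnId, StreamKind.LENGTH, lengths));
        if (indexBytes != null) {
            streams.add(new Stream(columnId, StreamKind.INDEX, indexBytes));
        }
        return streams;
    }

    public static void main(String[] args) {
        byte[] empty = new byte[0];
        // Column 1 carries an index; the bytes here are placeholder data.
        for (Stream s : varcharStreams(1, empty, empty, empty, new byte[] {1, 2, 3})) {
            System.out.println("column " + s.columnId + ": " + s.kind
                    + " (" + s.payload.length + " bytes)");
        }
    }
}
```

The point of the sketch is that the index travels with the column it describes, next to the Present, Data, and Length streams, rather than in a separate file.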
Regards,
Praveen Krishna D
Re: Pluggable index for ORC
Posted by Dain Sundstrom <da...@iq80.com>.
It would be nice if there were some reserved space in the enums for experiments like this.
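As a rough illustration of the idea, a reserved range could work something like the sketch below. Everything in it is an assumption: the range 1000 to 1999, the class ReservedStreamKinds, and the read/skip/fail policy are hypothetical and are not taken from the ORC format or its readers.

```java
import java.util.Set;

// Hypothetical sketch of a reserved id range for experimental stream kinds;
// the numbers and names are illustrative and NOT part of the ORC format.
final class ReservedStreamKinds {

    // Assumed bounds of the range set aside for experiments.
    static final int EXPERIMENTAL_MIN = 1000;
    static final int EXPERIMENTAL_MAX = 1999;

    // What a reader does when it encounters a stream kind id.
    enum Action { READ, SKIP, FAIL }

    static boolean isExperimental(int kindId) {
        return kindId >= EXPERIMENTAL_MIN && kindId <= EXPERIMENTAL_MAX;
    }

    // Known kinds are read; unknown kinds inside the reserved range are
    // skipped so older readers still work; anything else is an error.
    static Action classify(int kindId, Set<Integer> knownKinds) {
        if (knownKinds.contains(kindId)) {
            return Action.READ;
        }
        return isExperimental(kindId) ? Action.SKIP : Action.FAIL;
    }

    public static void main(String[] args) {
        Set<Integer> known = Set.of(0, 1, 2); // placeholder ids for known kinds
        for (int id : new int[] {1, 1000, 500}) {
            System.out.println(id + " -> " + classify(id, known));
        }
    }
}
```

With a policy like this, a writer could attach an experimental index stream without breaking readers that have never heard of it.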
-dain
----
Dain Sundstrom
Co-founder @ Presto Software Foundation, Co-creator of Presto (https://prestosql.io)