Posted to dev@orc.apache.org by "Panagiotis Garefalakis (Jira)" <ji...@apache.org> on 2020/01/27 16:57:00 UTC

[jira] [Created] (ORC-593) Allow row level Skipping

Panagiotis Garefalakis created ORC-593:
------------------------------------------

             Summary: Allow row level Skipping
                 Key: ORC-593
                 URL: https://issues.apache.org/jira/browse/ORC-593
             Project: ORC
          Issue Type: Improvement
            Reporter: Panagiotis Garefalakis
             Fix For: 1.5.8


Currently, ORC supports filtering at the file, stripe, and row-group levels.

There is an ongoing effort to add finer-grained row-level filters using filter predicates passed through Reader.Options, tracked in [#ORC-577].

However, there are still cases where a framework implementing its own TreeReader wants to skip particular rows to avoid expensive type decoding, e.g. for the DecimalColumnVector or Decimal64ColumnVector types.

In this ticket I propose extending the TreeReader abstract class with an additional nextVector method:
{code:java}
abstract void nextVector(ColumnVector previous,
                         boolean[] isNull,
                         boolean[] skipRows,
                         final int batchSize);
{code}
Subclasses implementing this method will be able to use the skipRows array to avoid expensive decoding for rows that are not needed.
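
To illustrate the idea, here is a minimal, self-contained sketch of how a subclass might honour skipRows. It does not use the real ORC classes: SketchDecimalVector and SketchDecimalReader are hypothetical stand-ins, and the sketch assumes one encoded (unscaled) value per row and that skipRows[r] == true means "do not materialize row r". The actual encoding and semantics in ORC may differ.

```java
import java.math.BigDecimal;

/** Small demo driver; all names in this file are hypothetical. */
final class SkipRowsSketch {
    public static void main(String[] args) {
        SketchDecimalReader reader =
            new SketchDecimalReader(new long[] {125, 250, 375}, 2);
        SketchDecimalVector batch = new SketchDecimalVector(3);
        // Ask the reader to skip the middle row of the batch.
        reader.nextVector(batch, null, new boolean[] {false, true, false}, 3);
        System.out.println("decoded rows: " + reader.decodeCount);
        for (int r = 0; r < 3; r++) {
            System.out.println("row " + r + ": " + batch.vector[r]);
        }
    }
}

/** Stand-in for a decimal column vector (not the ORC/Hive class). */
final class SketchDecimalVector {
    final BigDecimal[] vector;
    final boolean[] isNull;

    SketchDecimalVector(int size) {
        this.vector = new BigDecimal[size];
        this.isNull = new boolean[size];
    }
}

/** Stand-in for a TreeReader subclass that decodes decimals. */
final class SketchDecimalReader {
    private final long[] encoded; // assumed: one unscaled value per row
    private final int scale;
    private int position = 0;
    int decodeCount = 0;          // counts the expensive decode steps

    SketchDecimalReader(long[] encoded, int scale) {
        this.encoded = encoded;
        this.scale = scale;
    }

    void nextVector(SketchDecimalVector previous,
                    boolean[] isNull,
                    boolean[] skipRows,
                    final int batchSize) {
        for (int r = 0; r < batchSize; r++) {
            // The stream position advances for every row, skipped or not.
            long raw = encoded[position++];
            if (isNull != null && isNull[r]) {
                previous.isNull[r] = true;
                continue;
            }
            if (skipRows != null && skipRows[r]) {
                continue; // avoid building the BigDecimal for this row
            }
            previous.vector[r] = BigDecimal.valueOf(raw, scale);
            decodeCount++;
        }
    }
}
```

In this sketch the reader still consumes the encoded value of a skipped row so that later rows stay aligned; only the costly object construction is avoided. A skipped row is left unset in the output vector, so the caller must not read it.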



--
This message was sent by Atlassian Jira
(v8.3.4#803005)