Posted to issues@impala.apache.org by "Csaba Ringhofer (Jira)" <ji...@apache.org> on 2020/03/30 21:59:00 UTC

[jira] [Created] (IMPALA-9579) Read support for binary in ORC

Csaba Ringhofer created IMPALA-9579:
---------------------------------------

             Summary: Read support for binary in ORC
                 Key: IMPALA-9579
                 URL: https://issues.apache.org/jira/browse/IMPALA-9579
             Project: IMPALA
          Issue Type: Sub-task
          Components: Backend
            Reporter: Csaba Ringhofer


ORC has its own BINARY type, which differs from STRING/VARCHAR/CHAR in that BINARY:
- doesn't use dictionary encoding
- doesn't store min/max values in the statistics

The C++ library uses the same ColumnReader for BINARY as for STRING, so the implementation effort should be minimal: https://github.com/apache/orc/blob/a9ec6a2e39ed71ef8a2d874df14700956aa847be/c%2B%2B/src/ColumnReader.cc#L1752
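
As a rough illustration of why sharing the STRING reader should be enough, here is a minimal sketch that assumes values arrive as a start pointer plus an explicit byte length per row. FakeStringBatch and CopyValues are hypothetical names made up for this example; they are not types or functions from the ORC library or from Impala. The point is only that a reader which honors the length field handles BINARY values, including ones with embedded zero bytes, the same way it handles STRING values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a pointer-plus-length column batch.
// This is NOT the ORC library's type; it only mimics the general layout.
struct FakeStringBatch {
  std::vector<const char*> data;  // start of each value
  std::vector<int64_t> length;    // byte length of each value
  std::vector<char> not_null;     // 1 if the row has a value, 0 if NULL
};

// Copies every non-NULL value as raw bytes. Using the explicit length
// (rather than strlen) preserves embedded zero bytes, which BINARY
// values may contain.
std::vector<std::string> CopyValues(const FakeStringBatch& batch) {
  assert(batch.data.size() == batch.length.size());
  assert(batch.data.size() == batch.not_null.size());
  std::vector<std::string> out;
  for (std::size_t i = 0; i < batch.data.size(); ++i) {
    if (!batch.not_null[i]) continue;  // skip NULL rows
    out.emplace_back(batch.data[i],
                     static_cast<std::size_t>(batch.length[i]));
  }
  return out;
}
```

Because the copy never relies on a terminating zero byte, the same loop would serve both STRING and BINARY columns under these assumptions.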


