Posted to issues@hawq.apache.org by "Oleksandr Diachenko (JIRA)" <ji...@apache.org> on 2017/04/04 19:07:41 UTC
[jira] [Updated] (HAWQ-1404) PXF to leverage file-level stats of ORC file and emit records for COUNT(*)
[ https://issues.apache.org/jira/browse/HAWQ-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Oleksandr Diachenko updated HAWQ-1404:
--------------------------------------
Fix Version/s: 2.3.0.0-incubating
> PXF to leverage file-level stats of ORC file and emit records for COUNT(*)
> --------------------------------------------------------------------------
>
> Key: HAWQ-1404
> URL: https://issues.apache.org/jira/browse/HAWQ-1404
> Project: Apache HAWQ
> Issue Type: Improvement
> Components: PXF
> Reporter: Oleksandr Diachenko
> Assignee: Oleksandr Diachenko
> Fix For: 2.3.0.0-incubating
>
>
> When a user issues a COUNT(*) query without a WHERE clause, PXF should be able to leverage the file-level stats of an ORC file and emit the given number of records back to HAWQ, avoiding reading the actual tuples from disk. This should be a first step in enabling PXF to use ORC stats (file, stripe, and row group levels) so we can improve a wider range of aggregate queries.
> So whenever PXF receives "count" as the value of the AGG-TYPE parameter, it should optimize the query by emitting tuples using ORC file-level stats.
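
The behaviour described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not PXF's actual code: OrcFileStats, emit_count_tuples, and the has_filter flag are hypothetical names, and the row count is assumed to have been read from the ORC file footer already.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass(frozen=True)
class OrcFileStats:
    """Hypothetical stand-in for file-level metadata taken from an ORC footer."""
    number_of_rows: int


def emit_count_tuples(
    stats: OrcFileStats, agg_type: Optional[str], has_filter: bool
) -> Optional[Iterator[Tuple]]:
    """Return empty placeholder tuples for COUNT(*), or None to signal a normal read.

    The shortcut applies only when the aggregate type is "count" and the query
    has no WHERE clause (has_filter is False); otherwise the caller is expected
    to fall back to reading real tuples from disk.
    """
    if agg_type is None or agg_type.lower() != "count" or has_filter:
        return None
    # One empty tuple per row recorded in the file-level stats; no data is read.
    return (() for _ in range(stats.number_of_rows))


if __name__ == "__main__":
    stats = OrcFileStats(number_of_rows=5)
    tuples = emit_count_tuples(stats, "count", has_filter=False)
    print(sum(1 for _ in tuples))
```

Emitting placeholder tuples rather than a single precomputed number keeps the counting itself on the HAWQ side, which matches the "emit given number of records" wording of the issue.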
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)