Posted to dev@hive.apache.org by "Selina Zhang (JIRA)" <ji...@apache.org> on 2015/02/04 01:47:35 UTC

[jira] [Created] (HIVE-9573) Lazy load partitions for SELECT LIMIT type query

Selina Zhang created HIVE-9573:
----------------------------------

             Summary: Lazy load partitions for SELECT LIMIT type query
                 Key: HIVE-9573
                 URL: https://issues.apache.org/jira/browse/HIVE-9573
             Project: Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Selina Zhang
            Assignee: Selina Zhang


Some tools, such as HUE, use

SELECT * FROM table LIMIT 100;

to grab sample content from a table. For a table with a large number of partitions, this causes a large number of partition objects to be loaded, which slows down HS2 and occasionally even causes an OOM.

My solution is to lazily load partition objects in FetchOperator for this type of query. Instead of retrieving the full partition objects in PartitionPruner, we can retrieve only the partition names and load the partition objects lazily when they are needed at execution time (for local jobs only).
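
To make the idea concrete, here is a minimal sketch of the lazy-loading pattern. It is not the patch and does not use Hive's actual classes: Partition, LazyPartitionIterator, fetchWithLimit and the loader function are all hypothetical names standing in for the partition object, the fetch-side iteration, and the metastore lookup.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Illustrative sketch only; none of these names are Hive classes.
final class LazyPartitionSketch {

    /** Hypothetical stand-in for a fully loaded partition object. */
    static final class Partition {
        final String name;
        final List<String> rows;

        Partition(String name, List<String> rows) {
            this.name = name;
            this.rows = rows;
        }
    }

    /** Holds only partition names and loads each partition when it is requested. */
    static final class LazyPartitionIterator implements Iterator<Partition> {
        private final Iterator<String> names;
        private final Function<String, Partition> loader; // assumed lookup by name
        private int loaded = 0;

        LazyPartitionIterator(List<String> partitionNames,
                              Function<String, Partition> loader) {
            this.names = partitionNames.iterator();
            this.loader = loader;
        }

        @Override
        public boolean hasNext() {
            return names.hasNext();
        }

        @Override
        public Partition next() {
            if (!names.hasNext()) {
                throw new NoSuchElementException();
            }
            loaded++;
            return loader.apply(names.next());
        }

        /** Number of partition objects actually materialized so far. */
        int loadedCount() {
            return loaded;
        }
    }

    /** Collects up to limit rows and stops loading partitions once the limit is met. */
    static List<String> fetchWithLimit(LazyPartitionIterator partitions, int limit) {
        List<String> out = new ArrayList<>();
        while (out.size() < limit && partitions.hasNext()) {
            for (String row : partitions.next().rows) {
                if (out.size() >= limit) {
                    break;
                }
                out.add(row);
            }
        }
        return out;
    }
}
```

With this shape, a LIMIT 100 query over a table with thousands of partitions would only materialize as many partition objects as it takes to produce 100 rows; the rest stay as names.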

I have a patch ready, but would like to hear more suggestions. Thanks!




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)