Posted to dev@hive.apache.org by "Christoph Lipka (JIRA)" <ji...@apache.org> on 2015/06/02 14:05:17 UTC

[jira] [Created] (HIVE-10891) Limited fetch on partitioned table can eat up all heap

Christoph Lipka created HIVE-10891:
--------------------------------------

             Summary: Limited fetch on partitioned table can eat up all heap
                 Key: HIVE-10891
                 URL: https://issues.apache.org/jira/browse/HIVE-10891
             Project: Hive
          Issue Type: Bug
          Components: Physical Optimizer
    Affects Versions: 1.1.0
            Reporter: Christoph Lipka


When running a query like
{code}
select *
from partitioned_table
where not_the_partition_key_column = "xyz"
limit 100
{code}
Hive executes it in memory. For all but the smallest tables, this quickly consumes the entire heap and crashes the server.
If the limit clause is omitted, an MR job is started and the query runs without memory issues. One can also work around this problem by extending the query with an additional condition on the partition key, like
{code}
select *
from partitioned_table a
where a.not_the_partition_key_column = "xyz"
and a.partition_key_column = (select b.partition_key_column from partitioned_table b)
limit 100
{code}
In this case Hive also creates an MR job.
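
A further workaround may be possible, though it has not been verified against this issue. It assumes that the in-memory execution is caused by Hive's fetch task conversion, which this report does not confirm. Under that assumption, disabling the conversion for the session via the hive.fetch.task.conversion property should force an MR job for the original query:
{code}
-- Assumption: the in-memory execution is caused by fetch task conversion.
-- Setting the property to none disables the conversion for this session.
set hive.fetch.task.conversion=none;

select *
from partitioned_table
where not_the_partition_key_column = "xyz"
limit 100;
{code}
If this works, it would keep the query itself unchanged, at the cost of starting an MR job even for small fetches in the same session.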



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)