Posted to dev@hive.apache.org by "Christoph Lipka (JIRA)" <ji...@apache.org> on 2015/06/02 14:05:17 UTC
[jira] [Created] (HIVE-10891) Limited fetch on partitioned table can eat up all heap
Christoph Lipka created HIVE-10891:
--------------------------------------
Summary: Limited fetch on partitioned table can eat up all heap
Key: HIVE-10891
URL: https://issues.apache.org/jira/browse/HIVE-10891
Project: Hive
Issue Type: Bug
Components: Physical Optimizer
Affects Versions: 1.1.0
Reporter: Christoph Lipka
When running a query such as
{code}
select *
from partitioned_table
where not_the_partition_key_column = "xyz"
limit 100
{code}
it is executed in memory instead of as an MR job. For all but the smallest tables, this quickly consumes the entire heap and crashes the server.
If the limit clause is omitted, an MR job is started and the query runs without memory issues. One can also work around the problem by extending the query with a condition on the partition key column, for example:
{code}
select *
from partitioned_table a
where a.not_the_partition_key_column = "xyz"
and a.partition_key_column = (select b.partition_key_column from partitioned_table b)
limit 100
{code}
In this case Hive also creates an MR job.
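A configuration-based workaround may also be worth trying. The sketch below assumes that the in-memory execution comes from Hive's fetch task conversion, which this report does not confirm. hive.fetch.task.conversion is an existing Hive setting, but whether changing it avoids the heap problem described here is untested.
{code}
-- Assumption: the limited query is being converted into an in-memory fetch task.
-- Disabling the conversion for this session should make Hive plan a regular job instead.
set hive.fetch.task.conversion=none;

select *
from partitioned_table
where not_the_partition_key_column = "xyz"
limit 100;
{code}
If this helps, it would have the advantage of leaving the query itself unchanged, unlike the subquery workaround.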