Posted to dev@hive.apache.org by "Sarvesh Sakalanaga (JIRA)" <ji...@apache.org> on 2013/06/12 19:37:20 UTC

[jira] [Commented] (HIVE-4666) Count(*) over tpch lineitem ORC results in Error: Java heap space

    [ https://issues.apache.org/jira/browse/HIVE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13681421#comment-13681421 ] 

Sarvesh Sakalanaga commented on HIVE-4666:
------------------------------------------

This happens because count(*) requests that ORC load every column. The TPC-H lineitem schema has 16 columns, and loading all of them into memory leads to an out-of-memory error.
Two changes are required to fix this bug:
1) Optimize count(*) to count(1), which should be addressed by this bug.
2) Optimize count(1) to not read all columns. This is HIVE-4113 (https://issues.apache.org/jira/browse/HIVE-4113).

A temporary workaround is to increase the Java heap size (property mapred.child.java.opts in mapred-site.xml).
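
As a rough sketch of that workaround: the same property can typically also be overridden for a single Hive session with SET, instead of editing mapred-site.xml cluster-wide. The -Xmx2048m value below is an assumption chosen only for illustration, not a tested recommendation; the heap size actually needed depends on the cluster and the data.

{noformat}
-- Assumed heap size for illustration only; adjust for your cluster.
SET mapred.child.java.opts=-Xmx2048m;
INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output'  SELECT Count(*) AS count_order FROM  lineitem_orc;
{noformat}

A session-level override only affects jobs launched from that session, so it is easier to undo than a cluster-wide change.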

                
> Count(*) over tpch lineitem ORC results in Error: Java heap space
> -----------------------------------------------------------------
>
>                 Key: HIVE-4666
>                 URL: https://issues.apache.org/jira/browse/HIVE-4666
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: vectorization-branch
>            Reporter: Tony Murphy
>            Assignee: Sarvesh Sakalanaga
>             Fix For: vectorization-branch
>
>         Attachments: output
>
>
> Executing the following query over an ORC TPC-H lineitem table fails with Error: Java heap space
> {noformat}
> INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output'  SELECT Count(*) AS count_order FROM  lineitem_orc
> {noformat}
> The lineitem table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira