Posted to dev@hive.apache.org by "Ning Zhang (JIRA)" <ji...@apache.org> on 2010/08/04 20:09:18 UTC
[jira] Created: (HIVE-1511) Hive plan serialization is slow
Hive plan serialization is slow
-------------------------------
Key: HIVE-1511
URL: https://issues.apache.org/jira/browse/HIVE-1511
Project: Hadoop Hive
Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Ning Zhang
As reported by Edward Capriolo:
For reference I did this as a test case....
SELECT * FROM src where
key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0
OR key=0 OR key=0 OR key=0 OR
key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0
OR key=0 OR key=0 OR key=0 OR
...(100 more of these)
No OOM but I gave up after the test case did not go anywhere for about
2 minutes.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1511) Hive plan serialization is slow
Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895343#action_12895343 ]
Ning Zhang commented on HIVE-1511:
----------------------------------
The issue seems to be that we serialize the plan by writing directly to an HDFS file. We should probably serialize it locally first and then write the result to HDFS.
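A minimal sketch of that suggestion, under stated assumptions: java.beans.XMLEncoder is used here only as a stand-in serializer, and a plain OutputStream stands in for the HDFS output stream. The class and method names (BufferedPlanWriter, serializeToBuffer, writePlan) are hypothetical and are not Hive APIs. The idea is to encode the whole plan into memory and hand the destination one large write instead of many small ones.

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

public class BufferedPlanWriter {

    // Encode the plan entirely in memory so that the encoder's many small
    // writes never reach the (potentially slow) destination stream.
    static byte[] serializeToBuffer(Object plan) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the encoder flushes it and writes the closing XML tag.
        try (XMLEncoder encoder = new XMLEncoder(buffer)) {
            encoder.writeObject(plan);
        }
        return buffer.toByteArray();
    }

    // Hand the finished bytes to the destination in a single write.
    // "destination" is an assumption: any OutputStream, for example one
    // opened on a remote file system.
    static void writePlan(Object plan, OutputStream destination) throws IOException {
        byte[] bytes = serializeToBuffer(plan);
        destination.write(bytes);
        destination.flush();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for a query plan with many repeated predicates.
        List<String> fakePlan = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            fakePlan.add("key=0");
        }

        ByteArrayOutputStream standIn = new ByteArrayOutputStream();
        writePlan(fakePlan, standIn);
        System.out.println(standIn.size() + " bytes written in one call");
    }
}
```

Whether this removes the slowdown depends on where the time is actually spent; if the encoder itself is the bottleneck, buffering the output will not help much.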
[jira] Commented: (HIVE-1511) Hive plan serialization is slow
Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895352#action_12895352 ]
Edward Capriolo commented on HIVE-1511:
---------------------------------------
It may also be worth finding a clever way to remove duplicate expressions that evaluate to the same result, such as the repeated key=0 terms.
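A rough sketch of that idea, assuming the OR terms are available as plain strings. A real optimizer would compare expression trees rather than text, and should only merge deterministic expressions. The class and method names (DisjunctDeduper, dedupe, toPredicate) are hypothetical and are not part of Hive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DisjunctDeduper {

    // Drop repeated OR terms while keeping first-seen order.
    // "A OR A" is equivalent to "A", so this preserves the predicate's meaning
    // as long as each term is deterministic.
    static List<String> dedupe(List<String> disjuncts) {
        Set<String> seen = new LinkedHashSet<>();
        for (String term : disjuncts) {
            seen.add(term.trim());
        }
        return new ArrayList<>(seen);
    }

    // Rebuild the WHERE predicate from the unique terms.
    static String toPredicate(List<String> disjuncts) {
        return String.join(" OR ", dedupe(disjuncts));
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("key=0", "key=0", "key=1", "key=0");
        System.out.println(toPredicate(terms)); // prints "key=0 OR key=1"
    }
}
```

Applied to the test case in this issue, every repeated key=0 term would collapse into a single comparison before the plan is serialized.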