Posted to dev@hive.apache.org by "Asim Jalis (JIRA)" <ji...@apache.org> on 2013/12/06 17:26:38 UTC

[jira] [Commented] (HIVE-549) Parallel Execution Mechanism

    [ https://issues.apache.org/jira/browse/HIVE-549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841393#comment-13841393 ] 

Asim Jalis commented on HIVE-549:
---------------------------------

I'm curious: why did you decide to go with false by default?

Could you summarize the offline discussion?

If setting it to true is safe, then it seems like it would always speed up the query. The only downside I can see is that a query might use more processor resources on each node.
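
To make the trade-off concrete, here is a minimal sketch of the idea, not Hive's actual implementation. It assumes two independent stages (like the two sides of a UNION ALL) and compares running them one after the other with running them at the same time. All names here (run_branch, union_all_sequential, union_all_parallel) are hypothetical, and the sleep merely stands in for the time a real job would take.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_branch(name, rows, delay=0.1):
    """Hypothetical stand-in for one independent stage of a query."""
    time.sleep(delay)  # simulate the time a real job would take
    return [(name, row) for row in rows]


def union_all_sequential(branches):
    """Run each branch in turn and concatenate the results."""
    out = []
    for name, rows in branches:
        out.extend(run_branch(name, rows))
    return out


def union_all_parallel(branches, max_workers=2):
    """Run the branches concurrently; neither depends on the other."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_branch, name, rows) for name, rows in branches]
        out = []
        # Collect in submission order so the output matches the sequential run.
        for future in futures:
            out.extend(future.result())
    return out


branches = [("left", [1, 2]), ("right", [3])]

start = time.perf_counter()
sequential_result = union_all_sequential(branches)
sequential_time = time.perf_counter() - start

start = time.perf_counter()
parallel_result = union_all_parallel(branches)
parallel_time = time.perf_counter() - start

print(sequential_result == parallel_result)
print(f"sequential: {sequential_time:.2f}s, parallel: {parallel_time:.2f}s")
```

In this toy version the parallel run finishes in roughly the time of the slowest branch, but it occupies more workers at once, which is the resource cost I mention above.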

> Parallel Execution Mechanism
> ----------------------------
>
>                 Key: HIVE-549
>                 URL: https://issues.apache.org/jira/browse/HIVE-549
>             Project: Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Adam Kramer
>            Assignee: Chaitanya Mishra
>              Labels: hive-appu
>             Fix For: 0.5.0
>
>         Attachments: HIVE549-v7.patch
>
>
> In a massively parallel database system, it would be awesome to also parallelize some of the mapreduce phases that our data needs to go through.
> One example that just occurred to me is UNION ALL: when you union two SELECT statements, effectively you could run those statements in parallel. There's no situation (that I can think of, but I don't have a formal proof) in which the left statement would rely on the right statement, or vice versa. So, they could be run at the same time...and perhaps they should be. Or, perhaps there should be a way to make this happen...PARALLEL UNION ALL? PUNION ALL?



--
This message was sent by Atlassian JIRA
(v6.1#6144)