Posted to user@hive.apache.org by drichelson <dr...@tendrilinc.com> on 2012/06/20 02:04:47 UTC

Executing multiple queries in parallel from the one .hql file

I have multiple statements in a single .hql file that I am calling via an Oozie action.
Most of these statements can be executed in parallel (they do not depend on each other). I already have the parallel execution flag set to true, although I have yet to see multiple Hive MR jobs running at once.

Hive is running them all sequentially.

Without breaking out each statement into its own Oozie action, I'd like to run most of them in parallel. Any ideas?

To be clear, I am not looking to increase the number of mappers/reducers for each task, but to increase the number of MapReduce jobs running at once, as there are typically free slots on the cluster that are not being used.



 


Re: Executing multiple queries in parallel from the one .hql file

Posted by "Tucker, Matt" <Ma...@disney.com>.
Hi,

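Statements in a query file are executed serially. The parallelization flag (hive.exec.parallel) only applies within a single query: when Hive parses a query, the independent stages of that query are executed in parallel. It does not cause separate statements in the file to run concurrently.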

If the queries are completely independent of each other, it may be better to split them into separate files and set up multiple Oozie actions. If queries rely on the result sets of prior queries, you're best off keeping them in a single file, or writing logic outside of Hive to manage the order of execution.
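
As a rough illustration of managing execution outside of Hive, here is a minimal Python sketch that assumes the independent statements have already been split into separate files. It starts one `hive -f <file>` process per file, with a cap on how many run at once. The function name `run_hql_files`, the `max_parallel` default and the file names are hypothetical, and it is not something Hive or Oozie provides.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_hql_files(paths, max_parallel=4, command=("hive", "-f")):
    """Run each query file in its own process, at most max_parallel at a time.

    Returns a dict mapping each path to the exit code of its process.
    `command` is the prefix placed before the file path; the default
    assumes the Hive CLI is on the PATH.
    """
    def run_one(path):
        # Each file gets its own Hive CLI session, so its jobs are
        # submitted independently of the other files.
        return subprocess.run([*command, path]).returncode

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        codes = list(pool.map(run_one, paths))
    return dict(zip(paths, codes))


# Hypothetical usage (file names are placeholders):
#   results = run_hql_files(["report_a.hql", "report_b.hql"], max_parallel=2)
#   failed = [path for path, code in results.items() if code != 0]
```

Threads are sufficient here because each worker only waits on an external process. Queries that depend on earlier results should stay together in one file so that they still run in order.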


