Posted to user@hive.apache.org by Sha Liu <li...@hotmail.com> on 2013/06/20 21:59:26 UTC

Run queries from external files as subqueries

Hi,
While working on some complex queries with multiple levels of subqueries, I'm wondering if it is possible in Hive to refactor these subqueries into different files and instruct the enclosing query to execute these files. This way these subqueries can potentially be reused by other queries or just run by themselves.
Thanks,
Sha Liu

Re: Run queries from external files as subqueries

Posted by Jan Dolinár <do...@gmail.com>.
A quick and dirty way to do such a thing would be to use some kind of
preprocessor. To avoid writing one, you could use e.g. the one from GCC,
with just a little help from sed:

    gcc -E -x c query.hql -o- | sed '/^#/d' > preprocessed.hql
    hive -f preprocessed.hql

Here query.hql can contain, for example, something like

    SELECT * FROM (
        #include "subquery.hql"
    ) t
    WHERE id = 1;

The includes can be nested and repeated as much as necessary. As a bonus,
you could also use #define for repeated parts of code and/or #ifdef to
build different queries based on parameters passed to gcc ;-)
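
If GCC is not at hand, writing a small preprocessor yourself is not much work either. Below is a minimal sketch in Python, assuming that each include sits on its own line in the form #include "file" and that paths are relative to the including file. The name expand_includes is made up for this illustration, and the sketch handles only #include (no #define or #ifdef).

```python
import re
from pathlib import Path

# Matches a line that consists only of: #include "some/file.hql"
INCLUDE_RE = re.compile(r'^\s*#include\s+"([^"]+)"\s*$')


def expand_includes(path, _stack=()):
    """Return the text of `path` with each #include line replaced by the
    recursively expanded contents of the file it names."""
    path = Path(path).resolve()
    if path in _stack:
        # Guard against a file that includes itself, directly or indirectly.
        raise ValueError(f"circular #include of {path}")
    out = []
    for line in path.read_text().splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            # Resolve the included path relative to the including file.
            included = path.parent / match.group(1)
            out.append(expand_includes(included, _stack + (path,)))
        else:
            out.append(line)
    return "\n".join(out)
```

Writing the result of expand_includes("query.hql") to preprocessed.hql and running hive -f preprocessed.hql then works as with the gcc variant. Note that a file used as a subquery must not end with a semicolon, so a file that should also run by itself needs the terminator supplied some other way.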

Best regards,
Jan Dolinar


On Thu, Jun 20, 2013 at 10:09 PM, Bertrand Dechoux <de...@gmail.com> wrote:

> I am afraid that there is no automatic way of doing so. But that would be
> the same answer whether the question is about hive or any relational
> database.
> (I would be glad to have counter examples.)
>
> You might want to look at Oozie in order to manage the workflow. But the
> creation of the workflow is indeed manual.
> http://oozie.apache.org/
>
> Regards
>
> Bertrand
>
> --
> Bertrand Dechoux
>

Re: Run queries from external files as subqueries

Posted by Bertrand Dechoux <de...@gmail.com>.
I am afraid that there is no automatic way of doing so. But that would be
the same answer whether the question is about hive or any relational
database.
(I would be glad to have counter examples.)

You might want to look at Oozie in order to manage the workflow. But the
creation of the workflow is indeed manual.
http://oozie.apache.org/

Regards

Bertrand




On Thu, Jun 20, 2013 at 9:59 PM, Sha Liu <li...@hotmail.com> wrote:

> Hi,
>
> While working on some complex queries with multiple levels of subqueries,
> I'm wondering if it is possible in Hive to refactor these subqueries into
> different files and instruct the enclosing query to execute these files.
> This way these subqueries can potentially be reused by other queries or
> just run by themselves.
>
> Thanks,
> Sha Liu
>



-- 
Bertrand Dechoux