Posted to dev@hive.apache.org by Bhavesh Shah <bh...@gmail.com> on 2012/01/05 05:55:20 UTC

How to write Block of queries in Hive?

Hello,
I am new to Hive. I want to write a block of queries in Hive so that one
query passes its result to the next one, as in SQL.

I have also visited one link given below:
http://karmasphere.com/ksc/hive-user-defined-functions.html

On that page I was looking for functions, but I found the block below and I
don't understand the following clauses:

USING 'map_script'
USING 'reduce_script'

in the following block:


FROM (
 FROM pv_users
 MAP ( pv_users.userid, pv_users.date )
 USING 'map_script'
 AS c1, c2, c3
 DISTRIBUTE BY c2
 SORT BY c2, c1) map_output
 INSERT OVERWRITE TABLE pv_users_reduced
 REDUCE ( map_output.c1, map_output.c2, map_output.c3 )
 USING 'reduce_script'
 AS date, count;


Please, can anyone explain what the scripts are used for and how to write a
block of queries in Hive?




-- 
Regards,
Bhavesh Shah

Re: How to write Block of queries in Hive?

Posted by Bhavesh Shah <bh...@gmail.com>.
Hello Aniket,
Thanks for the explanation.

I have one more question. In SQL we can write multiple queries in which
one query is executed and its result is given to another query as input.
Can we write something like that in Hive?
I have also tried custom scripts in Hive, but I do not understand how to
use them in a block of queries (multiple queries).




Thanks and Regards,
Bhavesh Shah

On Thu, Jan 5, 2012 at 11:43 AM, Aniket Mokashi <an...@gmail.com> wrote:

> Hi Bhavesh,
>
> [moving discussion to hive user list]
>
> I would suggest sending your question to the Hive user list in order to
> reach a broader audience.
>
> As per my understanding, map_script and reduce_script in the query are
> custom scripts that run as streaming jobs. You are asking Hive to run
> map_script as the mapper on 2 columns (userid, date) to generate 3 new
> columns: c1, c2, c3. After this, Hive will distribute the rows to reducers
> based on c2 and sort them by c2 and then c1 within each reducer.
> 'reduce_script' will consume these 3 columns and generate 2 columns
> (date, count) to store in pv_users_reduced.
>
> Hope it helps.
>
> Thanks,
> Aniket
>
> On Wed, Jan 4, 2012 at 8:55 PM, Bhavesh Shah <bhavesh25shah@gmail.com
> >wrote:
>
> > Hello,
> > I am new to hive. I want to write block of queries in Hive so that one
> > query give result to another one like in SQL.
> >
> > I have also visited one link given below:
> > http://karmasphere.com/ksc/hive-user-defined-functions.html
> >
> > In above link I am looking for functions but I get below one and I dont
> > understand following things:
> >
> > USING 'map_script'
> > USING 'reduce_script'
> >
> > in following block:
> >
> >
> > FROM (
> >  FROM pv_users
> >  MAP ( pv_users.userid, pv_users.date )
> >  USING 'map_script'
> >  AS c1, c2, c3
> >  DISTRIBUTE BY c2
> >  SORT BY c2, c1) map_output
> >  INSERT OVERWRITE TABLE pv_users_reduced
> >  REDUCE ( map_output.c1, map_output.c2, map_output.c3 )
> >  USING 'reduce_script'
> >  AS date, count;
> >
> >
> > Pls can anyone tell what is the use of scripts and how to write block
> > of queries in hive?
> >
> >
> >
> >
> > --
> > Regards,
> > Bhavesh Shah
> >
>
>
>
> --
> "...:::Aniket:::... Quetzalco@tl"
>

Re: How to write Block of queries in Hive?

Posted by Aniket Mokashi <an...@gmail.com>.
Hi Bhavesh,

[moving discussion to hive user list]

I would suggest sending your question to the Hive user list in order to
reach a broader audience.

As per my understanding, map_script and reduce_script in the query are
custom scripts that run as streaming jobs. You are asking Hive to run
map_script as the mapper on 2 columns (userid, date) to generate 3 new
columns: c1, c2, c3. After this, Hive will distribute the rows to reducers
based on c2 and sort them by c2 and then c1 within each reducer.
'reduce_script' will consume these 3 columns and generate 2 columns
(date, count) to store in pv_users_reduced.
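
To make this concrete, here is a rough sketch of what the two scripts might
do, written in Python. Everything in it is an assumption for illustration:
the thread does not show the real scripts, so the function names, the
meaning given to c1, c2 and c3 (userid, date, and a constant 1), and the
sample rows are all hypothetical. It relies only on the fact that Hive
streaming passes rows to a script as tab-separated text lines on stdin and
reads tab-separated lines back from stdout.

```python
from itertools import groupby


def map_rows(lines):
    """Hypothetical map_script logic: (userid, date) -> (c1, c2, c3).

    Assumption: c1 = userid, c2 = date, c3 = a constant 1 to be summed later.
    """
    for line in lines:
        userid, date = line.rstrip("\n").split("\t")
        yield "\t".join([userid, date, "1"])


def reduce_rows(lines):
    """Hypothetical reduce_script logic: (c1, c2, c3) -> (date, count).

    Relies on rows with the same c2 arriving together, which is what
    DISTRIBUTE BY c2 plus SORT BY c2, c1 provides within one reducer.
    """
    rows = (line.rstrip("\n").split("\t") for line in lines)
    for date, group in groupby(rows, key=lambda row: row[1]):
        yield "\t".join([date, str(sum(int(row[2]) for row in group))])


# Made-up sample input standing in for rows of pv_users (userid, date).
sample = ["u1\t2012-01-04\n", "u2\t2012-01-04\n", "u1\t2012-01-05\n"]

mapped = list(map_rows(sample))

# Stand-in for the shuffle Hive performs between the two scripts:
# order by c2, then c1 (a single reducer is assumed here).
shuffled = sorted(mapped, key=lambda l: (l.split("\t")[1], l.split("\t")[0]))

reduced = list(reduce_rows(shuffled))

for out in reduced:
    print(out)
```

In a real job each function would live in its own script file that loops
over sys.stdin and prints each result, and the files would be shipped to the
cluster with ADD FILE before running the query.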

Hope it helps.

Thanks,
Aniket

On Wed, Jan 4, 2012 at 8:55 PM, Bhavesh Shah <bh...@gmail.com> wrote:

> Hello,
> I am new to hive. I want to write block of queries in Hive so that one
> query give result to another one like in SQL.
>
> I have also visited one link given below:
> http://karmasphere.com/ksc/hive-user-defined-functions.html
>
> In above link I am looking for functions but I get below one and I dont
> understand following things:
>
> USING 'map_script'
> USING 'reduce_script'
>
> in following block:
>
>
> FROM (
>  FROM pv_users
>  MAP ( pv_users.userid, pv_users.date )
>  USING 'map_script'
>  AS c1, c2, c3
>  DISTRIBUTE BY c2
>  SORT BY c2, c1) map_output
>  INSERT OVERWRITE TABLE pv_users_reduced
>  REDUCE ( map_output.c1, map_output.c2, map_output.c3 )
>  USING 'reduce_script'
>  AS date, count;
>
>
> Pls can anyone tell what is the use of scripts and how to write block
> of queries in hive?
>
>
>
>
> --
> Regards,
> Bhavesh Shah
>



-- 
"...:::Aniket:::... Quetzalco@tl"
