Posted to common-user@hadoop.apache.org by Kumar Jayapal <kj...@gmail.com> on 2015/06/03 23:30:22 UTC

Parallel running jobs

Hi,


I have a question regarding jobs running in YARN.

I have SET hive.exec.parallel=true;

When I submit this command:


FROM EMPLOYER_STAGE
INSERT OVERWRITE TABLE EMPLOYER PARTITION (FISCAL_YEAR = 2015, FISCAL_PERIOD = 01)
  SELECT * WHERE FISCAL_YEAR = 2014 AND FISCAL_PERIOD = 08
INSERT OVERWRITE TABLE EMPLOYER PARTITION (FISCAL_YEAR = 2016, FISCAL_PERIOD = 01)
  SELECT * WHERE FISCAL_YEAR = 2014 AND FISCAL_PERIOD = 08
INSERT OVERWRITE TABLE EMPLOYER PARTITION (FISCAL_YEAR = 2017, FISCAL_PERIOD = 01)
  SELECT * WHERE FISCAL_YEAR = 2014 AND FISCAL_PERIOD = 08;



Do the submitted jobs run in parallel in YARN? If so, will they have
different application IDs?

What I see is one app_id and 3 (map) tasks. Does that mean it is running in
parallel?



Thanks
Jay

Re: Parallel running jobs

Posted by Mark Gao <ma...@gmail.com>.
Hi Jay,

You can split this SQL into 3 scripts, like this:
--script1.sql
FROM EMPLOYER_STAGE
INSERT OVERWRITE TABLE  EMPLOYER  PARTITION (FISCAL_YEAR = 2015,
FISCAL_PERIOD = 01) SELECT * WHERE FISCAL_YEAR = 2014 AND
FISCAL_PERIOD = 08;

--script2.sql
FROM EMPLOYER_STAGE
INSERT OVERWRITE TABLE  EMPLOYER  PARTITION (FISCAL_YEAR = 2016,
FISCAL_PERIOD = 01) SELECT * WHERE FISCAL_YEAR = 2014 AND
FISCAL_PERIOD = 08;

--script3.sql
FROM EMPLOYER_STAGE
INSERT OVERWRITE TABLE  EMPLOYER  PARTITION (FISCAL_YEAR = 2017,
FISCAL_PERIOD = 01) SELECT * WHERE FISCAL_YEAR = 2014 AND
FISCAL_PERIOD = 08;

Then run the hive commands in parallel, like this:

hive -f script1.sql &
hive -f script2.sql &
hive -f script3.sql

There will then be 3 map jobs running in parallel, instead of 3 tasks
within one map job running sequentially.
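
The same idea can be wrapped in a small helper that also waits for all three runs and reports failures. This is only a sketch: the function name run_hive_scripts_in_parallel and the HIVE_CMD variable are made up for illustration, and it assumes bash and that the hive CLI is on your PATH.

```shell
#!/usr/bin/env bash
# HIVE_CMD is a hypothetical override hook; it defaults to the hive CLI.
HIVE_CMD="${HIVE_CMD:-hive}"

# Start one hive process per script file, then wait for every one of them.
# Returns 0 only if all of the processes exited successfully.
run_hive_scripts_in_parallel() {
  local pids=() failed=0 script pid
  for script in "$@"; do
    "$HIVE_CMD" -f "$script" &    # run in the background
    pids+=("$!")                  # remember the PID so we can wait on it
  done
  for pid in "${pids[@]}"; do
    wait "$pid" || failed=1       # note any non-zero exit status
  done
  return "$failed"
}
```

After sourcing the file, call it as: run_hive_scripts_in_parallel script1.sql script2.sql script3.sql
While it runs, "yarn application -list -appStates RUNNING" in another terminal should show whether each hive process got its own application ID. I would expect it to, since each process submits its own jobs, but check on your cluster.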


