Posted to user@hadoop.apache.org by xeon <xe...@gmail.com> on 2014/01/27 06:39:04 UTC

Performance in running jobs at the same time

Hi,

1 - I installed Hadoop MRv2 in virtual machines. When the jobs are
running, I try to list them with "hadoop jobs -list", but the command
takes a long time to execute. I assume this happens because of the
performance of the VM. I just wonder how it behaves on big machines.
Does anyone have an idea whether Hadoop commands take long to launch
while jobs are executing?


2 - I want to run several jobs at the same time. How can I configure
the maximum number of jobs that I can run at the same time?


3 - Is there a calculation of how many jobs I can run at the same time
for a specific environment, similar to the guidance on how many reduces
we should set in our jobs?

Thanks,

-- 
Best regards,

Re: Performance in running jobs at the same time

Posted by sudhakara st <su...@gmail.com>.
1 - I installed Hadoop MRv2 in VirtualMachines. When the jobs are
running, I try to list them with "hadoop jobs -list", but it takes lots
of time for the command being executed. This happens because of the
performance of the VM. I just wonder how it works with big machines.
Does anyone have an idea if it takes long to launch Hadoop commands
while executing jobs.

*>> Getting job information involves communication with the
ResourceManager/ApplicationMaster. Because the resources (CPU, memory)
available in your VM are too low, the hadoop command may be taking a
long time to get the job information.*

2 - I want to run several jobs at the same time. How can I configure
the maximum number of jobs that I can run at the same time?

*>> Once you submit your job to the RM, the scheduler decides how to run
it, based on which scheduler you use and on the resources available in
your cluster. You have to write or customize a scheduler to control the
submission order or the number of jobs running at any instant.*

3 - Is there a calculation of how many jobs I can run at the same time
for specific environment similar to how many reduces should we set in
our jobs?

*>> If you have a clear idea of how much data your jobs are going to
process, how many resources they are going to use, and how many total
resources are available in the cluster, then you can work out how many
jobs can run at any instant. That is possible when you handle only a
fixed data set in every cycle; in a real environment it is not possible
to calculate these things for each job in each run.*

*>> In Hadoop 2 the RM takes care of all resource management, so you need
not take special care of these things. If you need the jobs processed in
order, then look into an Oozie-like tool to control the order of MR
jobs.*
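
As a rough illustration of the fixed-data-set case in answer 3, the estimate could be sketched as below. This is only a back-of-the-envelope sketch, not anything YARN itself computes: the function name, the cluster size, and the per-job footprint are all hypothetical numbers chosen for the example, and it assumes every job needs the same fixed amount of memory and vcores for its whole lifetime.

```python
def max_concurrent_jobs(cluster_memory_mb, cluster_vcores,
                        job_memory_mb, job_vcores):
    """Upper bound on identical jobs that fit in the cluster at once.

    The scarcer resource (memory or vcores) sets the limit.
    """
    if job_memory_mb <= 0 or job_vcores <= 0:
        raise ValueError("per-job resource needs must be positive")
    by_memory = cluster_memory_mb // job_memory_mb
    by_vcores = cluster_vcores // job_vcores
    return min(by_memory, by_vcores)


# Hypothetical cluster: 4 worker nodes, each offering 8192 MB and 4 vcores.
cluster_memory_mb = 4 * 8192   # 32768 MB in total
cluster_vcores = 4 * 4         # 16 vcores in total

# Hypothetical job: one 1536 MB ApplicationMaster container plus four
# 1024 MB task containers, one vcore per container.
job_memory_mb = 1536 + 4 * 1024   # 5632 MB
job_vcores = 1 + 4                # 5 vcores

print(max_concurrent_jobs(cluster_memory_mb, cluster_vcores,
                          job_memory_mb, job_vcores))
```

With these made-up numbers memory would allow 5 jobs but vcores only 3, so the estimate is 3. In a real cluster the footprint changes from job to job and run to run, which is why the estimate only holds for a fixed workload.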


On Mon, Jan 27, 2014 at 11:09 AM, xeon <xe...@gmail.com> wrote:

> Hi,
>
> 1 - I installed Hadoop MRv2 in VirtualMachines. When the jobs are
> running, I try to list them with "hadoop jobs -list", but it takes lots
> of time for the command being executed. This happens because of the
> performance of the VM. I just wonder how it works with big machines.
> Does anyone have an idea if it takes long to launch Hadoop commands
> while executing jobs?
>
>
> 2 - I want to run several jobs at the same time. How can I configure
> the maximum number of jobs that I can run at the same time?
>
>
> 3 - Is there a calculation of how many jobs I can run at the same time
> for specific environment similar to how many reduces should we set in
> our jobs?
>
> Thanks,
>
> --
> Best regards,
>



-- 

Regards,
...Sudhakara.st
