Posted to mapreduce-user@hadoop.apache.org by Roy Smith <ro...@panix.com> on 2013/01/11 05:02:13 UTC
How to interpret the progress meter?
I'm running a job that looks like it's going to take about 12 hours on 4 EC2 instances. I don't really understand the "complete" percentages reported by http://localhost:9100/jobtasks.jsp. They are extremely non-linear. For my reduce steps, they ramp up to 40-60% in just a few minutes, then take hours to slowly inch their way up the rest of the way to 100%.
What does the "complete" percentage really mean?
--
Roy Smith
roy@panix.com
Re: How to interpret the progress meter?
Posted by Mahesh Balija <ba...@gmail.com>.
Hi Smith,
In my experience, the actual processing usually happens in the first
40% to roughly 70%; the remainder is devoted to writing/flushing the
data to the output files, and that part may take more time.
Best,
Mahesh Balija,
Calsoft Labs.
On Fri, Jan 11, 2013 at 9:32 AM, Roy Smith <ro...@panix.com> wrote:
> I'm running a job that looks like it's going to take about 12 hours on 4
> EC2 instances. I don't really understand the "complete" percentages
> reported by http://localhost:9100/jobtasks.jsp. They are extremely
> non-linear. For my reduce steps, they ramp up to 40-60% in just a few
> minutes, then take hours to slowly inch their way up the rest of the way to
> 100%.
>
> What does the "complete" percentage really mean?
>
> --
> Roy Smith
> roy@panix.com
>
>
Re: How to interpret the progress meter?
Posted by Harsh J <ha...@cloudera.com>.
The map-side percentage is whatever the map's record reader reports as
its progress. The reduce side is divided into 3 phases of ~33% each:
shuffle (fetch data), sort, and finally user code (reduce). It is
normal to see jumps between these values, depending on how much work
each phase has to do.
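
As a rough illustration of the three-way split described above, the
sketch below maps an overall reduce percentage back to a phase and to
the fraction of that phase that is done. It assumes an exactly even
one-third share per phase; the function name reduce_phase and the
phase labels are hypothetical and are not part of any Hadoop API.

```python
def reduce_phase(progress):
    """Map overall reduce progress (0.0 to 1.0) to (phase, fraction of phase done).

    Assumes three equally weighted phases: shuffle, sort, reduce.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be between 0.0 and 1.0")
    phases = ("shuffle", "sort", "reduce")
    share = 1.0 / len(phases)  # each phase covers one third of the meter
    # Clamp the index so that progress == 1.0 still falls in the last phase.
    index = min(int(progress / share), len(phases) - 1)
    within = (progress - index * share) / share
    return phases[index], within


if __name__ == "__main__":
    for p in (0.10, 0.50, 0.70, 1.00):
        phase, within = reduce_phase(p)
        print(f"{p:.0%} overall: {phase} phase, {within:.0%} of that phase")
```

Under this assumption, a reducer showing 70% overall would be only
about 10% of the way through the user reduce code, so the meter says
little about how much wall-clock time remains.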
On Fri, Jan 11, 2013 at 9:32 AM, Roy Smith <ro...@panix.com> wrote:
> I'm running a job that looks like it's going to take about 12 hours on 4 EC2
> instances. I don't really understand the "complete" percentages reported by
> http://localhost:9100/jobtasks.jsp. They are extremely non-linear. For my
> reduce steps, they ramp up to 40-60% in just a few minutes, then take hours
> to slowly inch their way up the rest of the way to 100%.
>
> What does the "complete" percentage really mean?
>
> --
> Roy Smith
> roy@panix.com
>
--
Harsh J