Posted to mapreduce-user@hadoop.apache.org by Adam Shook <as...@clearedgeit.com> on 2011/09/22 21:41:20 UTC

FairScheduler Local Task Restriction

Hello All,

I have recently switched my small Hadoop dev cluster (v0.20.1) to use the FairScheduler.  I have a maximum of 128 map slots available and recently noticed that my jobs seem to use at most 16 at any given time (the job I am looking at in particular runs for about 15 minutes) - they are also all data-local map tasks.  I searched around a bit and discovered that mapred.fairscheduler.locality.delay may be to blame.  I set it to 0 in mapred-site.xml, copied the file around to my nodes, and tried running another job.  It still uses only 16 tasks.
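For reference, the override in question as a mapred-site.xml fragment (illustrative; the Fair Scheduler runs inside the JobTracker and reads its settings from the JobTracker's configuration, so copies on the worker nodes typically have no effect -- an assumption worth verifying against your exact 0.20.x build):

```xml
<!-- mapred-site.xml on the JobTracker node; read at JobTracker startup. -->
<property>
  <name>mapred.fairscheduler.locality.delay</name>
  <value>0</value>
</property>
```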

Does it require a cluster restart?  Is it something totally different?  Should I not set this value to zero?

Thanks!

-- Adam

RE: FairScheduler Local Task Restriction

Posted by Adam Shook <as...@clearedgeit.com>.
I don't have much control over the cluster configuration, but I will speak with those who do.  As far as the input splits are concerned, I will look into some custom CombinedInputFormat work.

Thank you both very much!

--  Adam



From: Adam Shook [mailto:ashook@clearedgeit.com]
Sent: Thursday, September 22, 2011 4:17 PM
To: mapreduce-user@hadoop.apache.org
Subject: RE: FairScheduler Local Task Restriction

It's an 8-node cluster and all 8 TaskTrackers are being used.  Each tracker has a maximum of 16 map tasks, and each seems to be running two at a time.  Map tasks take 10 seconds from start to finish.  Is it possible that they are just completing faster than they can be created, and it just seems to stick around 16?

-- Adam

From: GOEKE, MATTHEW (AG/1000) [mailto:matthew.goeke@monsanto.com]
Sent: Thursday, September 22, 2011 4:06 PM
To: mapreduce-user@hadoop.apache.org
Subject: RE: FairScheduler Local Task Restriction

If you dig into the job history on the web UI, can you confirm whether it is the same 16 TaskTracker slots that are getting the map tasks? It's a long shot, but it could be that the work actually is distributing across your cluster and some other issue is cropping up. Also, how long does each of your map tasks take?

Matt


RE: FairScheduler Local Task Restriction

Posted by "GOEKE, MATTHEW (AG/1000)" <ma...@monsanto.com>.
I don't think this is hurting you in this particular instance, but you are *most likely* oversubscribing your boxes with this configuration (a monitoring system like Ganglia will be able to confirm this). You will want at least one dedicated physical core for the DN and TT daemons (some would argue one core per daemon), and with your current configuration you are mapping a slot per logical core (12 cores * 2 for HT = 24 logical cores vs. 16 + 8 = 24 slots). This might not have much of an impact under your normal job loads, but it definitely could bite you if you start swapping because several heavy tasks land at the same time.

Also, Joey's follow-up is still a valid point, because you can consume more data per input split. You are not restricted to the block size if you tune the right parameters/classes.

Matt
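To make the arithmetic above concrete, a throwaway sketch (plain Java, not Hadoop code; the helper name is made up) of the core-vs-slot budget:

```java
// Toy arithmetic for the core-vs-slot budget described above; not Hadoop code.
public class SlotMath {

    // Logical cores left over for the DataNode/TaskTracker daemons once every
    // map and reduce slot is busy. Negative means the box is oversubscribed.
    static int coresLeftForDaemons(int physicalCores, boolean hyperThreaded,
                                   int mapSlots, int reduceSlots) {
        int logicalCores = hyperThreaded ? physicalCores * 2 : physicalCores;
        return logicalCores - (mapSlots + reduceSlots);
    }

    public static void main(String[] args) {
        // 3 quad-core HT'ed CPUs -> 12 physical / 24 logical cores; 16 + 8 = 24 slots.
        System.out.println(coresLeftForDaemons(12, true, 16, 8));  // 0: nothing spare
        // Dropping to 12 map slots would leave 4 logical cores for the daemons.
        System.out.println(coresLeftForDaemons(12, true, 12, 8));  // 4
    }
}
```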


RE: FairScheduler Local Task Restriction

Posted by Adam Shook <as...@clearedgeit.com>.
Each box has three quad-core HT'ed CPUs (seems weird, I know) for a total of 12 hyper-threaded cores per machine.  16 map slots and 8 reduce slots each.

-- Adam


RE: FairScheduler Local Task Restriction

Posted by "GOEKE, MATTHEW (AG/1000)" <ma...@monsanto.com>.
Just to confirm your configuration, how many logical cores do these boxes actually have (I am assuming dual quad core HT'ed)? Do you not have any reduce slots allocated?

Matt


RE: FairScheduler Local Task Restriction

Posted by Adam Shook <as...@clearedgeit.com>.
It is enabled -- I'll try disabling it.  The job runs over a large number of files, each roughly the size of an HDFS block, so using fewer splits isn't much of an option.
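For what it's worth, the combining idea amounts to bin-packing many block-sized files into fewer, fatter splits. A self-contained toy sketch of that packing (plain Java; it only mimics the bookkeeping of a combining input format, not the actual Hadoop API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy sketch: greedily pack file sizes into combined splits of bounded size,
// the way a combining input format reduces the number of map tasks.
public class SplitPacking {

    static List<List<Long>> pack(long[] fileSizes, long maxSplitBytes) {
        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : fileSizes) {
            if (!current.isEmpty() && currentBytes + size > maxSplitBytes) {
                splits.add(current);          // close the split before it overflows
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        long block = 64L << 20;               // 64 MB, a common block size of the era
        long[] files = new long[128];
        Arrays.fill(files, block);            // 128 block-sized files -> 128 map tasks
        // Allowing 4 blocks per split cuts the job to 32 longer-lived map tasks.
        System.out.println(pack(files, 4 * block).size()); // 32
    }
}
```

Fewer, longer-lived tasks also sidestep the scheduling-rate issue discussed earlier in the thread.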



Re: FairScheduler Local Task Restriction

Posted by Joey Echeverria <jo...@cloudera.com>.
Do you have assignmultiple enabled in the Fair Scheduler? Even that
may not be able to keep up if the tasks are only taking 10 seconds.
Is there any way you could run the job with fewer splits?
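For reference, the knob Joey mentions is a JobTracker-side boolean in the stock 0.20-era Fair Scheduler; an illustrative mapred-site.xml fragment:

```xml
<!-- mapred-site.xml on the JobTracker: let the Fair Scheduler hand out
     more than one task per TaskTracker heartbeat. -->
<property>
  <name>mapred.fairscheduler.assignmultiple</name>
  <value>true</value>
</property>
```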


-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434

RE: FairScheduler Local Task Restriction

Posted by Adam Shook <as...@clearedgeit.com>.
Okay, I put in a Thread.sleep to test my theory, and it will run all 128 at a time - they are just completing too quickly.  I guess there is no way around it, unless someone knows how to make the scheduler schedule faster...

-- Adam
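A back-of-envelope model of why short tasks cap out below the slot count (the one-task-per-tracker-per-heartbeat assumption is a simplification, not FairScheduler internals):

```java
// Toy model: if each TaskTracker is offered roughly one new task per heartbeat,
// steady-state concurrency per tracker is about taskSeconds / heartbeatSeconds,
// capped by the tracker's slot count. A simplification, not scheduler internals.
public class SchedulerModel {

    static int steadyStateTasks(int trackers, int slotsPerTracker,
                                double taskSeconds, double heartbeatSeconds) {
        double perTracker = Math.min(slotsPerTracker, taskSeconds / heartbeatSeconds);
        return (int) Math.floor(trackers * perTracker);
    }

    public static void main(String[] args) {
        // 8 trackers, 16 slots each, 10 s tasks: a ~5 s effective assignment
        // interval would explain the observed ~16 concurrent maps (2 per tracker).
        System.out.println(steadyStateTasks(8, 16, 10, 5));   // 16
        // Long (Thread.sleep'ed) tasks saturate all 128 slots, as observed.
        System.out.println(steadyStateTasks(8, 16, 600, 5));  // 128
    }
}
```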

From: Adam Shook [mailto:ashook@clearedgeit.com]
Sent: Thursday, September 22, 2011 4:17 PM
To: mapreduce-user@hadoop.apache.org
Subject: RE: FairScheduler Local Task Restriction

It's an 8 node cluster and all 8 task trackers are being used.  Each tracker has 16 max map tasks.  Each tracker seems to be running two at a time.  Map tasks take 10 seconds from start to finish.  Is it possible that they are just completing faster than they can be created and it just seems to stick around 16?

-- Adam

From: GOEKE, MATTHEW (AG/1000) [mailto:matthew.goeke@monsanto.com]
Sent: Thursday, September 22, 2011 4:06 PM
To: mapreduce-user@hadoop.apache.org
Subject: RE: FairScheduler Local Task Restriction

If you dig into the job history on the web-ui, can you confirm whether it is the same 16 tasktracker slots that are getting the map tasks? Long shot, but it could be that it is actually distributing across your cluster and there is some other issue springing up. Also, how long does each of your map tasks take?

Matt

From: Adam Shook [mailto:ashook@clearedgeit.com]
Sent: Thursday, September 22, 2011 2:41 PM
To: mapreduce-user@hadoop.apache.org
Subject: FairScheduler Local Task Restriction

Hello All,

I have recently switched my small Hadoop dev cluster (v0.20.1) to use the FairScheduler.  I have a max of 128 map tasks available and recently noticed that my jobs seem to use a maximum of 16 at any given time (the job I am looking at in particular runs for about 15 minutes) - they are also all data-local map tasks.  I searched around a bit and discovered that mapred.fairscheduler.locality.delay may be to blame.  I set it to 0 in mapred-site.xml, copied the file around to my nodes, and tried running another job.  It still runs only 16 tasks at a time.

Does it require a cluster restart?  Is it something totally different?  Should I not set this value to zero?

Thanks!

-- Adam
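For reference, the setting discussed above would look like this in mapred-site.xml (a sketch). Note that scheduler properties are read by the JobTracker, so a JobTracker restart is typically needed for a change to take effect - copying the file to the worker nodes alone generally won't do it.

```xml
<!-- mapred-site.xml on the JobTracker node (sketch) -->
<property>
  <name>mapred.fairscheduler.locality.delay</name>
  <value>0</value>
</property>
```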
This e-mail message may contain privileged and/or confidential information, and is intended to be received only by persons entitled
to receive such information. If you have received this e-mail in error, please notify the sender immediately. Please delete it and
all attachments from any servers, hard drives or any other media. Other use of this e-mail by you is strictly prohibited.

All e-mails and attachments sent and received are subject to monitoring, reading and archival by Monsanto, including its
subsidiaries. The recipient of this e-mail is solely responsible for checking for the presence of "Viruses" or other "Malware".
Monsanto, along with its subsidiaries, accepts no liability for any damage caused by any such code transmitted by or accompanying
this e-mail or any attachment.


The information contained in this email may be subject to the export control laws and regulations of the United States, potentially
including but not limited to the Export Administration Regulations (EAR) and sanctions regulations issued by the U.S. Department of
Treasury, Office of Foreign Asset Controls (OFAC).  As a recipient of this information you are obligated to comply with all
applicable U.S. export laws and regulations.
________________________________

No virus found in this message.
Checked by AVG - www.avg.com<http://www.avg.com>
Version: 10.0.1410 / Virus Database: 1520/3912 - Release Date: 09/22/11


RE: FairScheduler Local Task Restriction

Posted by Adam Shook <as...@clearedgeit.com>.
It's an 8 node cluster and all 8 task trackers are being used.  Each tracker has 16 max map tasks.  Each tracker seems to be running two at a time.  Map tasks take 10 seconds from start to finish.  Is it possible that they are just completing faster than they can be created and it just seems to stick around 16?
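A rough sanity check of the numbers above, assuming the JobTracker hands each tasktracker at most one map task per heartbeat and an effective heartbeat interval of around 5 seconds (both assumptions, not measured values):

```python
# Little's law: tasks in flight = assignment rate x task duration.
# Assumed: one task assigned per tasktracker heartbeat, ~5 s heartbeat.
def steady_state_tasks(task_duration_s, heartbeat_s, trackers):
    per_tracker = task_duration_s / heartbeat_s  # tasks in flight per tracker
    return per_tracker * trackers

print(steady_state_tasks(10, 5, 8))  # → 16.0 concurrent map tasks
```

Under those assumptions, 10-second tasks on 8 trackers would hover right around 16 in flight - consistent with tasks finishing about as fast as they can be handed out.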

-- Adam

From: GOEKE, MATTHEW (AG/1000) [mailto:matthew.goeke@monsanto.com]
Sent: Thursday, September 22, 2011 4:06 PM
To: mapreduce-user@hadoop.apache.org
Subject: RE: FairScheduler Local Task Restriction

If you dig into the job history on the web-ui, can you confirm whether it is the same 16 tasktracker slots that are getting the map tasks? Long shot, but it could be that it is actually distributing across your cluster and there is some other issue springing up. Also, how long does each of your map tasks take?

Matt

From: Adam Shook [mailto:ashook@clearedgeit.com]
Sent: Thursday, September 22, 2011 2:41 PM
To: mapreduce-user@hadoop.apache.org
Subject: FairScheduler Local Task Restriction

Hello All,

I have recently switched my small Hadoop dev cluster (v0.20.1) to use the FairScheduler.  I have a max of 128 map tasks available and recently noticed that my jobs seem to use a maximum of 16 at any given time (the job I am looking at in particular runs for about 15 minutes) - they are also all data-local map tasks.  I searched around a bit and discovered that mapred.fairscheduler.locality.delay may be to blame.  I set it to 0 in mapred-site.xml, copied the file around to my nodes, and tried running another job.  It still runs only 16 tasks at a time.

Does it require a cluster restart?  Is it something totally different?  Should I not set this value to zero?

Thanks!

-- Adam

RE: FairScheduler Local Task Restriction

Posted by "GOEKE, MATTHEW (AG/1000)" <ma...@monsanto.com>.
If you dig into the job history on the web-ui, can you confirm whether it is the same 16 tasktracker slots that are getting the map tasks? Long shot, but it could be that it is actually distributing across your cluster and there is some other issue springing up. Also, how long does each of your map tasks take?

Matt

From: Adam Shook [mailto:ashook@clearedgeit.com]
Sent: Thursday, September 22, 2011 2:41 PM
To: mapreduce-user@hadoop.apache.org
Subject: FairScheduler Local Task Restriction

Hello All,

I have recently switched my small Hadoop dev cluster (v0.20.1) to use the FairScheduler.  I have a max of 128 map tasks available and recently noticed that my jobs seem to use a maximum of 16 at any given time (the job I am looking at in particular runs for about 15 minutes) - they are also all data-local map tasks.  I searched around a bit and discovered that mapred.fairscheduler.locality.delay may be to blame.  I set it to 0 in mapred-site.xml, copied the file around to my nodes, and tried running another job.  It still runs only 16 tasks at a time.

Does it require a cluster restart?  Is it something totally different?  Should I not set this value to zero?

Thanks!

-- Adam