Posted to common-dev@hadoop.apache.org by "Bryan Pendleton (JIRA)" <ji...@apache.org> on 2006/02/14 21:49:08 UTC

[jira] Commented: (HADOOP-38) default splitter should incorporate fs block size

    [ http://issues.apache.org/jira/browse/HADOOP-38?page=comments#action_12366381 ] 

Bryan Pendleton commented on HADOOP-38:
---------------------------------------

The idea seems sound, but is block size the best unit? There's a certain overhead for each additional task added to a job - for jobs with really large input, this could produce really long task lists. Is there going to be any code for pre-replicating blocks? Maybe in sequences, so there'd be a natural "first choice" node for chunkings larger than one block? Obviously, as datanodes come and go this might not always work ideally, but it could help in the 80% case.

> default splitter should incorporate fs block size
> -------------------------------------------------
>
>          Key: HADOOP-38
>          URL: http://issues.apache.org/jira/browse/HADOOP-38
>      Project: Hadoop
>         Type: Improvement
>   Components: mapred
>     Reporter: Doug Cutting

>
> By default, the file splitting code should operate as follows.
>   inputs are <file>*, numMapTasks, minSplitSize, fsBlockSize
>   output is <file,start,length>*
>   totalSize = sum of all file sizes;
>   desiredSplitSize = totalSize / numMapTasks;
>   if (desiredSplitSize > fsBlockSize)             /* new */
>     desiredSplitSize = fsBlockSize;
>   if (desiredSplitSize < minSplitSize)
>     desiredSplitSize = minSplitSize;
>   chop input files into desiredSplitSize chunks & return them
> In other words, numMapTasks is a desired minimum.  We'll try to chop the input into at least numMapTasks chunks, each ideally a single fs block.
> If there's not enough input data to create numMapTasks tasks, each with an entire block, then we'll permit tasks whose input is smaller than a filesystem block, down to a minimum split size.
> This handles cases where:
>   - each input record takes a lot of time to process.  In this case we want to make sure we use all of the cluster.  Thus it is important to permit splits smaller than the fs block size.
>   - input I/O dominates.  In this case we want to permit the placement of tasks on hosts where their data is local.  This is only possible if splits are fs block size or smaller.
> Are there other common cases that this algorithm does not handle well?
> The part marked 'new' above is not currently implemented, but I'd like to add it.
> Does this sound reasonable?
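
As a rough illustration, here is a minimal, self-contained Java sketch of the
split-size arithmetic described above (hypothetical names; this is not the
actual Hadoop splitter code, only the rule from the description):

    // Minimal sketch of the proposed split-size rule.  Names are
    // illustrative only, not the actual Hadoop splitter API.
    public class SplitSizeSketch {

        static long computeSplitSize(long totalSize, int numMapTasks,
                                     long minSplitSize, long fsBlockSize) {
            long desired = totalSize / numMapTasks;
            if (desired > fsBlockSize) {    // the part marked 'new':
                desired = fsBlockSize;      // cap splits at one fs block
            }
            if (desired < minSplitSize) {   // never go below the floor
                desired = minSplitSize;
            }
            return desired;
        }

        public static void main(String[] args) {
            long mb = 1024L * 1024L;
            // 10 GB of input and 100 requested maps gives 102.4 MB per
            // split, which the new cap reduces to one 64 MB block:
            long split = computeSplitSize(10240L * mb, 100, 1L * mb, 64L * mb);
            System.out.println(split / mb + " MB per split");   // 64 MB per split
        }
    }

With these example numbers the job ends up with about 160 map tasks
(10240 MB / 64 MB) rather than the 100 requested, which is the "desired
minimum" behavior described above.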

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


Re: [jira] Commented: (HADOOP-38) default splitter should incorporate fs block size

Posted by Doug Cutting <cu...@apache.org>.
Eric Baldeschwieler wrote:
> You may simply want to specify the input size per job (maybe in
> blocks?) and let the framework sort things out.

You can achieve that in my proposal by increasing the minSplitSize to 
something larger than the block size.  So that's already possible.  All 
that I'm suggesting is that the default is to try to make things one 
block per split, unless that results in too few splits.
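
In terms of the hypothetical computeSplitSize sketch above (again,
illustrative only, not the real splitter API), this works because the
minSplitSize floor is applied after the block-size cap:

    // Same 10 GB input and 100 requested maps as before, but with
    // minSplitSize raised to 256 MB: the cap first pulls the split
    // down to 64 MB, then the floor lifts it to 256 MB, so each
    // split spans four 64 MB blocks (40 map tasks instead of 160).
    long split = computeSplitSize(10240L * mb, 100, 256L * mb, 64L * mb);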

> A possible optimization would be to read discontinuous blocks into
> one map job if you want to pump several blocks' worth of data into
> each job.  Given the map/reduce mechanism, this should work, yes?

I think you mean multiple blocks per task.  That has potential 
restartability issues, since it's a lot like bigger blocks.  And the 
tasktracker still has to have some representation of every block in 
memory, so I'm not sure it makes the datastructure much smaller, which 
is my primary concern with large numbers of tasks.

A reason to usually keep the number of map tasks much greater than the 
number of CPUs is to reduce the impact of restarting map tasks, since 
most user computation is done while mapping.

A reason to usually keep the number of reduce tasks only slightly 
greater than the number of CPUs is to use all resources while not 
generating too many output files, since these might be combined with the 
output of other maps to form inputs, and we'd rather have fewer large 
inputs to split up than many smaller ones.
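
For illustration only (these numbers are not from the thread): on a 100-CPU
cluster one might run on the order of 1,000 map tasks, so a restarted map
redoes only about 0.1% of the mapping work, but only 100-200 reduce tasks,
yielding 100-200 output files rather than 1,000.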

Doug

Re: [jira] Commented: (HADOOP-38) default splitter should incorporate fs block size

Posted by Eric Baldeschwieler <er...@yahoo-inc.com>.
You may simply want to specify the input size per job (maybe in
blocks?) and let the framework sort things out.

A possible optimization would be to read discontinuous blocks into
one map job if you want to pump several blocks' worth of data into
each job.  Given the map/reduce mechanism, this should work, yes?

On Feb 14, 2006, at 12:49 PM, Bryan Pendleton (JIRA) wrote:

> [...]