Posted to common-dev@hadoop.apache.org by Marco Nicosia <ma...@yahoo-inc.com> on 2007/05/03 00:50:19 UTC

Suggest: 0.12.4 release?

Hadoop-a-loops,

My name's Marco Nicosia, and I run a few Hadoop clusters within Yahoo!.

I'd like to suggest cutting a 0.12.4 release for two reasons:

1] My Hadoop clusters need, NOW, patches that are currently scheduled for 0.13.
2] These patches would go a long way towards "stabilizing" the 0.12 branch.

Having a (more) stable 0.12 branch would be good both for those who are waiting
for 0.13 to stabilize and for those users who habitually lag one release
behind.

I'm including a listing of the JIRA bugs whose fixes we'd like to use before 0.13
stabilizes. What do you all think of scheduling these bugs against 0.12.4, and
rolling that release between the time that 0.13 code-freezes and the time it is
deemed "stable"?

Each of these bugs has a patch uploaded (though possibly not yet in patch-avail
state) that we intend to apply against our copy of 0.12.3:

HDFS issues:
HADOOP-1255 - Name-node falls into infinite loop trying to remove a dead node
HADOOP-1297 - datanode sending block reports to namenode once every second
HADOOP-1312 - heartbeat monitor goes away
HADOOP-1189 - Still seeing unexpected 'No space left on device' exceptions

Map/Reduce job hangs:
HADOOP-1152 - Reduce task hang failing in MapOutputCopier.copyOutput

And:
HADOOP-1183 - MapTask completion not recorded properly at the Reducer's end 
(Owen uneasy?)

-- OR --

HADOOP-1270 - Randomize the fetch of map outputs
	From Devaraj:
         ) Although h-1270 addresses a different problem, under the
         ) covers it also solves the problem of old map events
         ) (corresponding to the failing fetches) overwriting new map
         ) events. This is because the data structure, knownOutputs,
         ) has been made a List there (earlier it was a Map), so
         ) MapOutputLocations will get appended rather than being
         ) overwritten.
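
To illustrate Devaraj's point, here is a minimal sketch of the Map-versus-List
difference. It is not the actual Hadoop code: MapEvent is a hypothetical
stand-in for MapOutputLocation, and I'm assuming the old Map was keyed by map
id. Only the name knownOutputs comes from the patch discussion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only, not the ReduceTask implementation.
class KnownOutputsSketch {
    // Hypothetical stand-in for MapOutputLocation.
    static final class MapEvent {
        final int mapId;    // which map task produced the output
        final String host;  // where the output would be fetched from
        MapEvent(int mapId, String host) {
            this.mapId = mapId;
            this.host = host;
        }
    }

    // Map keyed by map id (assumed): a later put() for the same id
    // replaces whatever was recorded before it.
    static Map<Integer, MapEvent> recordInMap(List<MapEvent> events) {
        Map<Integer, MapEvent> knownOutputs = new HashMap<>();
        for (MapEvent e : events) {
            knownOutputs.put(e.mapId, e);
        }
        return knownOutputs;
    }

    // List: every event is appended, so nothing is overwritten.
    static List<MapEvent> recordInList(List<MapEvent> events) {
        List<MapEvent> knownOutputs = new ArrayList<>();
        for (MapEvent e : events) {
            knownOutputs.add(e);
        }
        return knownOutputs;
    }

    public static void main(String[] args) {
        // Arrival order: the new location first, then a late, stale event
        // for the same map task (the one whose fetches were failing).
        List<MapEvent> arrivals = Arrays.asList(
            new MapEvent(7, "new-host"),
            new MapEvent(7, "old-host"));

        System.out.println("Map keeps:  " + recordInMap(arrivals).get(7).host);
        System.out.println("List keeps: " + recordInList(arrivals).size() + " events");
    }
}
```

With the Map, the stale event replaces the new one for map 7; with the List,
both events are kept, so the new location is still there to be tried.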

Finally, desired, but unlikely to be available soon:
HADOOP-1300 - deletion of excess replicas does not take into account 'rack-locality'

Comments?

-- 
    Marco Nicosia
    Kryptonite Grid