Posted to common-user@hadoop.apache.org by Alan Horowitz <al...@yahoo.com> on 2008/07/04 07:29:28 UTC
nested for loops
I'm a newbie, so feel free to RTFM me if this is old hat: what's the best way to do a nested for loop in Hadoop? Specifically, let's say I've got a list of elements, and I want to do an all-against-all comparison. The standard nested for loop would be:
for i in 1..10:
    for j in i..10:
        doSomething(myList[i], myList[j])
Are there any good ways to do this in Hadoop?
I assume I could do the full all-against-all with a nested map:
map1< key elements, value elements >
map2<key elements, value elements>
Is there any way to avoid the duplicate calculations, i.e. to compare each pair only once, as the inner loop starting at i does? This is a pretty common code pattern, so I figure someone has thought this through before.
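To make the duplicate work concrete, here is a minimal plain-Python sketch (not Hadoop code) comparing the two enumerations. The function names unique_pairs and all_pairs are made up for illustration, and the sketch assumes the comparison is symmetric, so (a, b) and (b, a) count as the same work.

```python
from itertools import combinations_with_replacement

def unique_pairs(items):
    """Each unordered pair once, including (a, a), like the loop with j starting at i."""
    return list(combinations_with_replacement(items, 2))

def all_pairs(items):
    """Full all-against-all: every ordered pair, so symmetric pairs appear twice."""
    return [(a, b) for a in items for b in items]

my_list = list(range(1, 11))          # ten elements, as in the loop above
print(len(unique_pairs(my_list)))     # 55 comparisons
print(len(all_pairs(my_list)))        # 100 comparisons
```

For n elements the full version does n*n comparisons versus n*(n+1)/2 for the triangular one, so close to half the work is redundant as n grows.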
(Also, is it kosher to have a map function that calls another map function or will that mess up the scheduler?)