Posted to common-user@hadoop.apache.org by Alan Horowitz <al...@yahoo.com> on 2008/07/04 07:29:28 UTC

nested for loops

I'm a newbie, so feel free to tell me to RTFM if this is old hat: what's the best way to do a nested for loop in Hadoop? Specifically, let's say I've got a list of elements, and I want to do an all-against-all comparison. The standard nested for loop would be:

for i in 1..10:
    for j in i..10:
        doSomething(myList[i],myList[j])

Are there any good ways to do this in Hadoop?

I assume I could do the full all-against-all with a nested map:
map1< key elements, value elements >
    map2<key elements, value elements>

Is there any way to avoid the duplicate calculations? This is a pretty common code pattern, so I figure someone has thought this through before.
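
To make the duplication concrete, here is a minimal sketch in plain Python (not Hadoop code; the function names and the sample list are made up for illustration). It compares the pairs the nested loop above visits, where j starts at i, with the pairs a full all-against-all would visit:

```python
from itertools import combinations_with_replacement, product


def unique_pairs(items):
    """Pairs (items[i], items[j]) with i <= j, as in the nested loop."""
    return list(combinations_with_replacement(items, 2))


def all_pairs(items):
    """Every ordered pair, as a naive all-against-all would produce."""
    return list(product(items, repeat=2))


# Placeholder data standing in for myList: 10 elements.
my_list = list(range(1, 11))

print(len(unique_pairs(my_list)))  # 55 comparisons
print(len(all_pairs(my_list)))     # 100 comparisons
```

So for 10 elements the full cross product does 100 comparisons where 55 would do, assuming doSomething(a, b) and doSomething(b, a) give the same answer.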

(Also, is it kosher to have a map function that calls another map function or will that mess up the scheduler?)