Posted to common-user@hadoop.apache.org by brien colwell <xc...@gmail.com> on 2009/10/17 00:14:24 UTC

map-side join with directories

hi all,

Regarding CompositeInputFormat: in my experience, when I pass a
directory as an input, the entries from the files in that directory are
not joined. Entries join as expected when I pass each individual file as
a separate input. Is this the expected behavior? I would expect both
join expressions below to give the same result.


Path dirPath = new Path("hdfs://some/dir");
Path[] list = allFilesInDir(dirPath);

// This join expression does not join entries:
CompositeInputFormat.compose("outer", SequenceFileInputFormat.class,
    new Path[]{dirPath});

// Does join entries:
CompositeInputFormat.compose("outer", SequenceFileInputFormat.class, list);
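
To make the difference between the two calls concrete, here is a minimal sketch of the join expression string each one would build. It does not use Hadoop itself: JoinExprSketch is a hypothetical stand-in, and it assumes that compose emits one tbl(<input format class>,"<path>") source per path, wrapped as op(...). The part-0000x file names are also only illustrative. If that assumption holds, the directory call yields an expression with a single source, while the file list yields one source per file.

```java
import java.util.StringJoiner;

// Hypothetical stand-in for CompositeInputFormat.compose(op, class, paths).
// Assumption: the real method emits op(tbl(<class>,"<path>"),...),
// with exactly one tbl(...) source per path it is given.
final class JoinExprSketch {

    // Builds the join expression string from an operator name, an input
    // format class name, and the list of input paths.
    static String compose(String op, String inputFormatClass, String... paths) {
        StringJoiner sources = new StringJoiner(",", op + "(", ")");
        for (String p : paths) {
            sources.add("tbl(" + inputFormatClass + ",\"" + p + "\")");
        }
        return sources.toString();
    }

    public static void main(String[] args) {
        String fmt = "org.apache.hadoop.mapred.SequenceFileInputFormat";

        // Directory as the only input: a single source in the expression.
        System.out.println(compose("outer", fmt, "hdfs://some/dir"));

        // Each file as its own input (illustrative file names):
        // one source per file.
        System.out.println(compose("outer", fmt,
                "hdfs://some/dir/part-00000",
                "hdfs://some/dir/part-00001"));
    }
}
```

Under this reading, the directory form names only one join source, so there would be no second source for its entries to join against. Whether the files inside a single source are meant to be joined with each other is exactly the open question.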


regards,
Brien