Posted to common-user@hadoop.apache.org by brien colwell <xc...@gmail.com> on 2009/10/17 00:14:24 UTC
map-side join with directories
Hi all,
Regarding CompositeInputFormat: in my experience, when a directory is
given as an input, the entries from the files in that directory do not
join. Entries join as expected when each individual file is given as a
separate input. Is this the expected behavior? I would expect both join
expressions below to give the same result.
Path dirPath = new Path("hdfs://some/dir");
Path[] list = allFilesInDir(dirPath);

// This join expression does not join entries:
CompositeInputFormat.compose("outer", SequenceFileInputFormat.class, new Path[]{dirPath});

// This one does join entries:
CompositeInputFormat.compose("outer", SequenceFileInputFormat.class, list);
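
To make the difference between the two calls concrete, here is a minimal sketch of the join expression string each one is understood to produce. It does not call Hadoop: the class JoinExprSketch and its methods are hypothetical stand-ins that only mimic the op(tbl(format,"path"),...) string format, and the part-00000 and part-00001 file names are made up for illustration. If the real compose behaves this way, the directory form names a single join source while the file-list form names one source per file, which may be why only the latter joins.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for CompositeInputFormat.compose(op, inf, paths).
// It only mimics the assumed string format; it does not use Hadoop.
class JoinExprSketch {

    // Builds op(tbl(format,"p1"),tbl(format,"p2"),...): one tbl(...) per path.
    static String compose(String op, String inputFormatName, String... paths) {
        List<String> sources = new ArrayList<>();
        for (String p : paths) {
            sources.add("tbl(" + inputFormatName + ",\"" + p + "\")");
        }
        return op + "(" + String.join(",", sources) + ")";
    }

    // Counts the join sources (tbl entries) in an expression built above.
    static int sourceCount(String expr) {
        int count = 0;
        for (int i = expr.indexOf("tbl("); i >= 0; i = expr.indexOf("tbl(", i + 1)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String fmt = "org.apache.hadoop.mapred.SequenceFileInputFormat";
        // Directory form: a single path, so a single source.
        String byDir = compose("outer", fmt, "hdfs://some/dir");
        // File-list form: illustrative file names, one source per file.
        String byFiles = compose("outer", fmt,
                "hdfs://some/dir/part-00000", "hdfs://some/dir/part-00001");
        System.out.println(byDir + " : " + sourceCount(byDir) + " source(s)");
        System.out.println(byFiles + " : " + sourceCount(byFiles) + " source(s)");
    }
}
```

Printing the string returned by the real compose call for both inputs would confirm or refute this assumption directly.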
regards,
Brien