Posted to common-user@hadoop.apache.org by Jason Venner <ja...@attributor.com> on 2007/12/28 20:53:42 UTC
FileUtil.copyMerge and SequenceFiles. 0-15.1
Is it safe to use this to generate a single SequenceFile out of a set of
sequence files produced by reduce?
This seems to be the source of my damaged sequence files.
Re: FileUtil.copyMerge and SequenceFiles. 0-15.1
Posted by Arun C Murthy <ar...@yahoo-inc.com>.
On Fri, Dec 28, 2007 at 11:53:42AM -0800, Jason Venner wrote:
>Is it safe to use this to generate a single SequenceFile out of a set of
>sequence files produced by reduce?
>
Nope.
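FileUtil.copyMerge just concatenates the raw bytes of the src files into one large destination file. This breaks if the src files are SequenceFiles: each src file starts with its own header, so the destination ends up with multiple headers mixed in with the record data, and a reader that expects a single header at the start of the file will misread the later ones as records.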
I've opened http://issues.apache.org/jira/browse/HADOOP-2501 to cover _merge_ and other useful utilities for SequenceFiles.
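To illustrate the failure mode described above, here is a minimal, self-contained sketch. It does not use Hadoop at all: the "TOY" header, the record layout, and every class and method name in it are hypothetical stand-ins, not the real SequenceFile format or API. It only shows why byte-level concatenation of files that each carry a header goes wrong, and why a merge has to read the records out of each input and write them back under a single header. With real SequenceFiles the same idea would presumably go through SequenceFile.Reader and SequenceFile.Writer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HeaderMergeSketch {
    // Hypothetical 4-byte header for the toy format (NOT the SequenceFile header).
    static final byte[] MAGIC = {'T', 'O', 'Y', 1};

    /** Writes a toy file: one header, then length-prefixed string records. */
    public static byte[] write(List<String> records) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.write(MAGIC);
            for (String r : records) {
                out.writeUTF(r);
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads a toy file: expects exactly one header, then records until the end. */
    public static List<String> read(byte[] file) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
            byte[] header = new byte[MAGIC.length];
            in.readFully(header);
            if (!Arrays.equals(header, MAGIC)) {
                throw new IOException("bad header");
            }
            List<String> records = new ArrayList<>();
            while (in.available() > 0) {
                records.add(in.readUTF());
            }
            return records;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Byte-level concatenation, analogous in spirit to what copyMerge does. */
    public static byte[] concatBytes(byte[] a, byte[] b) {
        byte[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    /** Record-level merge: read every input, write all records under one header. */
    public static byte[] mergeRecords(List<byte[]> files) {
        List<String> all = new ArrayList<>();
        for (byte[] f : files) {
            all.addAll(read(f));
        }
        return write(all);
    }

    /** True if the whole file can be read without an error. */
    public static boolean readable(byte[] file) {
        try {
            read(file);
            return true;
        } catch (UncheckedIOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] part0 = write(Arrays.asList("k1", "k2"));
        byte[] part1 = write(Arrays.asList("k3"));
        // The second header is parsed as if it were a record, so the read fails.
        System.out.println("concatenated readable: " + readable(concatBytes(part0, part1)));
        System.out.println("merged records: " + read(mergeRecords(Arrays.asList(part0, part1))));
    }
}
```

In this toy the damage shows up as a read error when the reader reaches the second header. How a real SequenceFile reader reacts to an embedded header is not something this sketch tries to reproduce.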
>this seems to be the source of my damaged sequence files.
>
Arun