Posted to common-dev@hadoop.apache.org by "Runping Qi (JIRA)" <ji...@apache.org> on 2007/01/24 09:03:51 UTC
[jira] Updated: (HADOOP-732) SequenceFile's header should allow to store metadata in the form of key/value pairs
[ https://issues.apache.org/jira/browse/HADOOP-732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Runping Qi updated HADOOP-732:
------------------------------
Attachment: seqFileMetadata.patch
Attached is a patch for this issue.
The SequenceFile header has a new entry: a TreeMap<Text, Text> object wrapped in a class, Metadata, that implements the Writable interface. To accommodate this, the version number is bumped up to 6.
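To illustrate the idea of a sorted key/value map that serializes itself, here is a minimal sketch. It is not the code in the attached patch: the class name MetadataSketch, the helper methods toBytes/fromBytes, and the on-disk layout (an entry count followed by the pairs) are assumptions made for illustration, and it uses String with java.io.DataOutput/DataInput in place of Hadoop's Text and Writable so that it runs without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the Metadata class described above.
// Uses String and plain java.io instead of Hadoop's Text and Writable.
class MetadataSketch {
    private final TreeMap<String, String> entries = new TreeMap<>();

    void set(String name, String value) { entries.put(name, value); }
    String get(String name) { return entries.get(name); }
    int size() { return entries.size(); }

    // Assumed layout: the entry count, then each pair in sorted key order.
    void write(DataOutput out) throws IOException {
        out.writeInt(entries.size());
        for (Map.Entry<String, String> e : entries.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeUTF(e.getValue());
        }
    }

    // Replaces the current contents with the pairs read from the stream.
    void readFields(DataInput in) throws IOException {
        entries.clear();
        int count = in.readInt();
        if (count < 0) {
            throw new IOException("Invalid metadata size: " + count);
        }
        for (int i = 0; i < count; i++) {
            String key = in.readUTF();
            String value = in.readUTF();
            entries.put(key, value);
        }
    }

    // Convenience helpers (hypothetical) for round-tripping through a byte array.
    byte[] toBytes() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            write(out);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static MetadataSketch fromBytes(byte[] data) {
        try {
            MetadataSketch result = new MetadataSketch();
            result.readFields(new DataInputStream(new ByteArrayInputStream(data)));
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        MetadataSketch original = new MetadataSketch();
        original.set("record.ddl", "example DDL text"); // illustrative key only
        MetadataSketch copy = MetadataSketch.fromBytes(original.toBytes());
        System.out.println(copy.get("record.ddl"));
    }
}
```

Keeping the pairs in a TreeMap means they are always written in the same key order, so two files with equal metadata get byte-identical metadata sections.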
The Reader class has a new member variable for the metadata, and a method is added to return the metadata object. The new code can still read files written in older versions.
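One way a reader can stay compatible with older files is to check the version before attempting to read the metadata. The sketch below shows that pattern under simplifying assumptions: the class VersionedHeaderSketch and its two methods are hypothetical, and the header is reduced to a single version byte followed by the metadata pairs, which is far simpler than the real SequenceFile header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified header: one version byte, then (version 6 and later only)
// an entry count followed by key/value pairs.
class VersionedHeaderSketch {
    // The version at which metadata appears, per the message above.
    static final int METADATA_VERSION = 6;

    static byte[] writeHeader(int version, Map<String, String> metadata) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeByte(version);
            if (version >= METADATA_VERSION) {
                TreeMap<String, String> sorted = new TreeMap<>(metadata);
                out.writeInt(sorted.size());
                for (Map.Entry<String, String> e : sorted.entrySet()) {
                    out.writeUTF(e.getKey());
                    out.writeUTF(e.getValue());
                }
            }
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Old-version headers carry no metadata, so the reader returns an empty map for them.
    static TreeMap<String, String> readMetadata(byte[] header) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(header));
            int version = in.readUnsignedByte();
            TreeMap<String, String> metadata = new TreeMap<>();
            if (version >= METADATA_VERSION) {
                int count = in.readInt();
                for (int i = 0; i < count; i++) {
                    String key = in.readUTF();
                    String value = in.readUTF();
                    metadata.put(key, value);
                }
            }
            return metadata;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("owner", "example"); // illustrative key only
        System.out.println(readMetadata(writeHeader(5, meta)).size());
        System.out.println(readMetadata(writeHeader(6, meta)).size());
    }
}
```

Returning an empty map for old files, rather than null, lets callers treat both cases the same way.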
New constructors are added to the various Writer classes, taking a metadata object as their last parameter. New static createWriter functions with metadata as the last parameter are also introduced. They are all backward compatible. A new unit test is added to TestSequenceFile to test writing and reading sequence files with metadata.
All unit tests passed.
> SequenceFile's header should allow to store metadata in the form of key/value pairs
> -----------------------------------------------------------------------------------
>
> Key: HADOOP-732
> URL: https://issues.apache.org/jira/browse/HADOOP-732
> Project: Hadoop
> Issue Type: New Feature
> Components: io
> Reporter: Runping Qi
> Assigned To: Runping Qi
>
> The sequence file currently stores a fixed list of metadata attributes, such as key/value class names,
> compression method, etc. To make sequence files more self-describing, the format should allow storing a list of key/value pairs. One particular attribute of interest indicates whether the key/value classes are actually Hadoop record classes
> and, if so, stores the DDLs for the records. This way, we may create tools to extract the DDL from a sequence file and
> then generate the necessary classes. It also makes it possible to provide an interpretive version of Hadoop record.
> This way, even in the situation where Hadoop or the application does not have the necessary classes,
> a sequence file of Hadoop records can be read and deserialized "interpretively".
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.