Posted to common-dev@hadoop.apache.org by "Milind Bhandarkar (JIRA)" <ji...@apache.org> on 2007/03/06 20:59:24 UTC

[jira] Commented: (HADOOP-1031) Enhancements to Hadoop record I/O - Part 2

    [ https://issues.apache.org/jira/browse/HADOOP-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12478546 ] 

Milind Bhandarkar commented on HADOOP-1031:
-------------------------------------------

I would like to add another proposed change to record I/O here. Currently, hadoop.record.RecordReader and RecordWriter act as factories for the various InputArchive and OutputArchive implementations, respectively. In the original design, this was done in order to keep tight control over the serialization formats. This has proven to be counterproductive. For wider usage of record I/O, users should be able to plug in their own serialization formats. The proposed changes make that possible. They are as follows:

1. Eliminate current record.RecordReader and record.RecordWriter.

2. Rename InputArchive to RecordReader, and OutputArchive to RecordWriter.

3. Rename the various archives accordingly, e.g. BinaryInputArchive -> BinaryRecordReader, etc.
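
To illustrate the intent of the three steps above, here is a minimal sketch of what the renamed interfaces might look like from a user's point of view. It is not the actual patch: the interfaces are cut down to two methods each, and LineRecordWriter, Point, roundTrip, and toLines are hypothetical names made up for this example. The point it tries to show is that, without a factory in the way, a record serializes itself against the interface, so a user-supplied format only has to implement that interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class RecordIoSketch {
    // Reduced stand-in for the interface currently called OutputArchive (assumption: two methods only).
    interface RecordWriter {
        void writeInt(int value, String tag) throws IOException;
        void writeString(String value, String tag) throws IOException;
    }

    // Reduced stand-in for the interface currently called InputArchive.
    interface RecordReader {
        int readInt(String tag) throws IOException;
        String readString(String tag) throws IOException;
    }

    // Plays the role of BinaryOutputArchive after the rename in step 3.
    static class BinaryRecordWriter implements RecordWriter {
        private final DataOutputStream out;
        BinaryRecordWriter(DataOutputStream out) { this.out = out; }
        public void writeInt(int value, String tag) throws IOException { out.writeInt(value); }
        public void writeString(String value, String tag) throws IOException { out.writeUTF(value); }
    }

    // Plays the role of BinaryInputArchive after the rename in step 3.
    static class BinaryRecordReader implements RecordReader {
        private final DataInputStream in;
        BinaryRecordReader(DataInputStream in) { this.in = in; }
        public int readInt(String tag) throws IOException { return in.readInt(); }
        public String readString(String tag) throws IOException { return in.readUTF(); }
    }

    // Hypothetical user-defined format ("tag=value" lines); no factory has to know about it.
    static class LineRecordWriter implements RecordWriter {
        private final StringBuilder out;
        LineRecordWriter(StringBuilder out) { this.out = out; }
        public void writeInt(int value, String tag) { out.append(tag).append('=').append(value).append('\n'); }
        public void writeString(String value, String tag) { out.append(tag).append('=').append(value).append('\n'); }
    }

    // Hypothetical record; it depends only on the interfaces, never on a concrete format.
    static class Point {
        int x;
        String label;
        Point(int x, String label) { this.x = x; this.label = label; }
        void serialize(RecordWriter w) throws IOException {
            w.writeInt(x, "x");
            w.writeString(label, "label");
        }
        void deserialize(RecordReader r) throws IOException {
            x = r.readInt("x");
            label = r.readString("label");
        }
    }

    // Writes a record in the binary format and reads it back into a fresh instance.
    static Point roundTrip(Point p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            p.serialize(new BinaryRecordWriter(new DataOutputStream(bytes)));
            Point copy = new Point(0, "");
            copy.deserialize(new BinaryRecordReader(
                new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Serializes the same record with the user-defined line format.
    static String toLines(Point p) {
        StringBuilder sb = new StringBuilder();
        try {
            p.serialize(new LineRecordWriter(sb));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Point p = new Point(7, "origin");
        Point q = roundTrip(p);
        System.out.println(q.x + " " + q.label);
        System.out.print(toLines(p));
    }
}
```

In this sketch the caller constructs the reader or writer directly and hands it to the record, which is the control that the current factory classes keep to themselves.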

> Enhancements to Hadoop record I/O - Part 2
> ------------------------------------------
>
>                 Key: HADOOP-1031
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1031
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: record
>    Affects Versions: 0.11.2
>         Environment: All
>            Reporter: Milind Bhandarkar
>         Assigned To: Milind Bhandarkar
>
> Remaining planned enhancements to Hadoop record I/O:
> 5. Provide a 'swiggable' C binding, so that processing the generated C code with swig allows it to be used in scripting languages such as Python and Perl. 
> 7. Optimize generated write() and readFields() methods, so that they do not have to create BinaryOutputArchive or BinaryInputArchive every time these methods are called on a record. 
> 8. Implement ByteInStream and ByteOutStream for C++ runtime, as they will be needed for using Hadoop Record I/O with forthcoming C++ MapReduce framework (currently, only FileStreams are provided.) 
