Posted to common-dev@hadoop.apache.org by "Pete Wyckoff (JIRA)" <ji...@apache.org> on 2008/09/18 05:17:44 UTC

[jira] Created: (HADOOP-4199) Add Serialization for RecordIO

Add Serialization for RecordIO
------------------------------

                 Key: HADOOP-4199
                 URL: https://issues.apache.org/jira/browse/HADOOP-4199
             Project: Hadoop Core
          Issue Type: New Feature
          Components: mapred
            Reporter: Pete Wyckoff
            Priority: Minor


Implement org.apache.hadoop.io.serialization.Serialization/Serializer/Deserializer interfaces



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4199) Add Serialization for RecordIO

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pete Wyckoff updated HADOOP-4199:
---------------------------------

    Attachment: RecordIOSerialization.java

This needs some cleanup, but the unit tests pass. I will add those and the cleaned-up version soon.


> Add Serialization for RecordIO
> ------------------------------
>
>                 Key: HADOOP-4199
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4199
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Pete Wyckoff
>            Priority: Minor
>         Attachments: RecordIOSerialization.java
>
>
> Implement org.apache.hadoop.io.serialization.Serialization/Serializer/Deserializer interfaces



[jira] Commented: (HADOOP-4199) Add Serialization for RecordIO

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632216#action_12632216 ] 

Tom White commented on HADOOP-4199:
-----------------------------------

Since Record extends Writable, will WritableSerialization not work here?

> Add Serialization for RecordIO
> ------------------------------
>
>                 Key: HADOOP-4199
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4199
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Pete Wyckoff
>            Priority: Minor
>         Attachments: HADOOP-4199.0.txt, RecordIOSerialization.java
>
>
> Implement org.apache.hadoop.io.serialization.Serialization/Serializer/Deserializer interfaces



[jira] Commented: (HADOOP-4199) Add Serialization for RecordIO

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633086#action_12633086 ] 

Pete Wyckoff commented on HADOOP-4199:
--------------------------------------

I guess this JIRA brings up the larger point that although a Serializer/Deserializer may be on something that extends Writable, you may still need more information than that, i.e., the format (Binary, CSV, ...) for RecordIO. This isn't a great example because, admittedly, CSV isn't a very useful format for long-lived data.

So, maybe I should mark this as invalid? 

But what if I did have data in CSV format? I could never get to it with the current Serialization framework and SequenceFileRecordReader. I would actually have to define my own SequenceFileRecordReader that knows not to add the default Writable Serialization implementation?
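
To make the concern above concrete, here is a minimal sketch of a lookup that picks a serialization purely by class. It assumes the framework returns the first registered serialization whose accept() returns true for the class, which is my understanding of how the factory behaves. All types below are simplified local stand-ins written for this illustration, not the real Hadoop classes; LogRecord, CsvRecordSerialization and the name() method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified local stand-ins for illustration only; NOT the real Hadoop classes.
final class SerializationLookupSketch {

    /** Stand-in for the Writable interface (methods omitted). */
    interface Writable {}

    /** Stand-in for a RecordIO record; as in the thread, a Record is a Writable. */
    static abstract class Record implements Writable {}

    /** Hypothetical record type used only in this example. */
    static final class LogRecord extends Record {}

    /** Stand-in for a Serialization: only the class-based accept() check is modeled. */
    interface Serialization {
        boolean accept(Class<?> c);
        String name(); // hypothetical helper so the example can show which one was chosen
    }

    /** Accepts anything that is a Writable, which includes every Record. */
    static final class WritableSerialization implements Serialization {
        public boolean accept(Class<?> c) { return Writable.class.isAssignableFrom(c); }
        public String name() { return "writable-binary"; }
    }

    /** Hypothetical CSV serialization that accepts Record subclasses. */
    static final class CsvRecordSerialization implements Serialization {
        public boolean accept(Class<?> c) { return Record.class.isAssignableFrom(c); }
        public String name() { return "record-csv"; }
    }

    /** Returns the name of the first registered serialization accepting c, or null if none does. */
    static String select(List<Serialization> registered, Class<?> c) {
        for (Serialization s : registered) {
            if (s.accept(c)) {
                return s.name();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Writable serialization registered first: it claims the record class.
        List<Serialization> writableFirst = new ArrayList<>();
        writableFirst.add(new WritableSerialization());
        writableFirst.add(new CsvRecordSerialization());
        System.out.println(select(writableFirst, LogRecord.class)); // prints "writable-binary"

        // CSV serialization registered first: only the ordering changed.
        List<Serialization> csvFirst = new ArrayList<>();
        csvFirst.add(new CsvRecordSerialization());
        csvFirst.add(new WritableSerialization());
        System.out.println(select(csvFirst, LogRecord.class)); // prints "record-csv"
    }
}
```

Under that assumption, the class alone cannot express "this file holds CSV records": the outcome depends only on which serialization is registered first, so a reader that always registers the Writable one up front would never reach a CSV implementation.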



> Add Serialization for RecordIO
> ------------------------------
>
>                 Key: HADOOP-4199
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4199
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/serialization, mapred
>            Reporter: Pete Wyckoff
>            Priority: Minor
>         Attachments: HADOOP-4199.0.txt, RecordIOSerialization.java
>
>
> Implement org.apache.hadoop.io.serialization.Serialization/Serializer/Deserializer interfaces



[jira] Updated: (HADOOP-4199) Add Serialization for RecordIO

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pete Wyckoff updated HADOOP-4199:
---------------------------------

    Attachment: HADOOP-4199.0.txt

This is a full-blown patch modeled after the ThriftSerialization code.

There is no build file here, as I am waiting for ThriftSerialization to be committed so that this can link with that build file.


> Add Serialization for RecordIO
> ------------------------------
>
>                 Key: HADOOP-4199
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4199
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Pete Wyckoff
>            Priority: Minor
>         Attachments: HADOOP-4199.0.txt, RecordIOSerialization.java
>
>
> Implement org.apache.hadoop.io.serialization.Serialization/Serializer/Deserializer interfaces



[jira] Updated: (HADOOP-4199) Add Serialization for RecordIO

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-4199:
------------------------------

    Component/s: contrib/serialization

> Add Serialization for RecordIO
> ------------------------------
>
>                 Key: HADOOP-4199
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4199
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/serialization, mapred
>            Reporter: Pete Wyckoff
>            Priority: Minor
>         Attachments: HADOOP-4199.0.txt, RecordIOSerialization.java
>
>
> Implement org.apache.hadoop.io.serialization.Serialization/Serializer/Deserializer interfaces



[jira] Commented: (HADOOP-4199) Add Serialization for RecordIO

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632304#action_12632304 ] 

Pete Wyckoff commented on HADOOP-4199:
--------------------------------------

Yes, you are right; I forgot that a Record is a Writable. But what about clearing the object? How will that work for these?

Also, how will non-binary streams work? Admittedly, I didn't implement that, but shouldn't we support CSV and anything else Record supports?
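
One possible reading of the clearing question is that a deserializer may be handed an existing object to refill, so any state left over from the previous record has to be reset first. The sketch below illustrates that reading only. The Deserializer interface here is a local stand-in shaped like the open/deserialize/close contract, TagRecord and CsvTagDeserializer are hypothetical, and returning null at end of stream is a simplification for this example rather than a claim about the real framework.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustration only: local stand-ins, not the real Hadoop or RecordIO classes.
final class ReuseClearingSketch {

    /** Hypothetical record with a repeated field. */
    static final class TagRecord {
        final List<String> tags = new ArrayList<>();
    }

    /** Local stand-in with an open/deserialize/close shape. */
    interface Deserializer<T> {
        void open(InputStream in) throws IOException;
        T deserialize(T reuse) throws IOException; // may refill and return the passed-in object
        void close() throws IOException;
    }

    /** Hypothetical deserializer reading one comma-separated record per line. */
    static final class CsvTagDeserializer implements Deserializer<TagRecord> {
        private final boolean clearBeforeFill;
        private BufferedReader reader;

        CsvTagDeserializer(boolean clearBeforeFill) {
            this.clearBeforeFill = clearBeforeFill;
        }

        public void open(InputStream in) {
            reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        }

        public TagRecord deserialize(TagRecord reuse) throws IOException {
            String line = reader.readLine();
            if (line == null) {
                return null; // end of stream (simplification for this sketch)
            }
            TagRecord record = (reuse != null) ? reuse : new TagRecord();
            if (clearBeforeFill) {
                record.tags.clear(); // drop fields left over from the previous record
            }
            for (String field : line.split(",")) {
                record.tags.add(field);
            }
            return record;
        }

        public void close() throws IOException {
            reader.close();
        }
    }

    /** Reads every line into one reused record and returns the tags held after the last line. */
    static List<String> readAll(String csv, boolean clearBeforeFill) {
        try {
            CsvTagDeserializer d = new CsvTagDeserializer(clearBeforeFill);
            d.open(new ByteArrayInputStream(csv.getBytes(StandardCharsets.UTF_8)));
            TagRecord reused = new TagRecord();
            while (d.deserialize(reused) != null) {
                // keep refilling the same object, as a record reader might
            }
            d.close();
            return new ArrayList<>(reused.tags);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAll("a,b\nc\n", false)); // prints [a, b, c]: stale fields leak
        System.out.println(readAll("a,b\nc\n", true));  // prints [c]
    }
}
```

Whether a generic Writable-based serialization gets this right depends on each record's own read logic resetting its fields, which is presumably why the question matters for generated record classes.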



> Add Serialization for RecordIO
> ------------------------------
>
>                 Key: HADOOP-4199
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4199
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Pete Wyckoff
>            Priority: Minor
>         Attachments: HADOOP-4199.0.txt, RecordIOSerialization.java
>
>
> Implement org.apache.hadoop.io.serialization.Serialization/Serializer/Deserializer interfaces
