Posted to dev@hive.apache.org by "Phabricator (Commented) (JIRA)" <ji...@apache.org> on 2012/04/02 02:06:29 UTC

[jira] [Commented] (HIVE-2711) Make the header of RCFile unique

    [ https://issues.apache.org/jira/browse/HIVE-2711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243910#comment-13243910 ] 

Phabricator commented on HIVE-2711:
-----------------------------------

omalley has commented on the revision "HIVE-2711 [jira] Make the header of RCFile unique".

  Ashutosh,

  My point is that RCFile was *always* distinct from Sequence Files. RCFile was forked from Sequence File when the Sequence File version was 6, so nothing before version 6 can possibly be an RCFile.

  Headers:
    Sequence Files: SEQ1, SEQ2, SEQ3, SEQ4, SEQ5, SEQ6
    RCFiles: SEQ6, RCF1

  Also note that SEQ5 was last written by Hadoop 0.10 back in Feb 2007, a year and a half before Hive was created.
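
  The header table above can be sketched as a small classifier. This is only an illustration, not the code in the patch: the function name and return labels are hypothetical, and it assumes the header is three ASCII letters followed by one raw version byte (as in the 'SEQ\06' form mentioned in the issue description).

```python
def classify_magic(header: bytes) -> str:
    """Classify a file by its first four bytes (hypothetical helper).

    Assumes the layout is three ASCII magic letters plus one raw
    version byte, e.g. b"SEQ\x06" or b"RCF\x01".
    """
    if len(header) < 4:
        return "unknown"

    magic, version = header[:3], header[3]

    # New-style RCFile header proposed in this issue.
    if magic == b"RCF" and version == 1:
        return "rcfile"

    if magic == b"SEQ":
        # RCFile forked at version 6, so versions 1-5 can only be
        # Sequence Files.
        if 1 <= version <= 5:
            return "sequencefile"
        # Version 6 is shared by Sequence Files and old RCFiles; the
        # magic alone cannot tell them apart.
        if version == 6:
            return "sequencefile-or-legacy-rcfile"

    return "unknown"
```

  Only the SEQ6 case is ambiguous, which is why a reader still has to accept it for RCFiles while new files carry RCF1.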

REVISION DETAIL
  https://reviews.facebook.net/D2115

                
> Make the header of RCFile unique
> --------------------------------
>
>                 Key: HIVE-2711
>                 URL: https://issues.apache.org/jira/browse/HIVE-2711
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>         Attachments: HIVE-2711.D2115.1.patch
>
>
> The RCFile implementation was copied from Hadoop's SequenceFile and copied the 'magic' string in the header. This means that you can't use the header to distinguish between RCFiles and SequenceFiles.
> I'd propose that we create a new header for RCFiles (RCF?) to replace the current SEQ. To maintain compatibility, we'll need to continue to accept the current 'SEQ\06' and just make new files contain the new header.
