Posted to dev@pig.apache.org by "Santhosh Srinivasan (JIRA)" <ji...@apache.org> on 2009/06/11 19:57:07 UTC

[jira] Created: (PIG-842) PigStorage should support multi-byte delimiters

PigStorage should support multi-byte delimiters
-----------------------------------------------

                 Key: PIG-842
                 URL: https://issues.apache.org/jira/browse/PIG-842
             Project: Pig
          Issue Type: Improvement
          Components: impl
    Affects Versions: 0.3.0
            Reporter: Santhosh Srinivasan
             Fix For: 0.3.0


Currently, PigStorage supports only single-byte delimiters. Users have requested multi-byte delimiters. Multi-byte delimiters have performance implications: instead of looking for a single byte, PigStorage would have to look for a pattern, a la BinStorage.
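
To make the difference concrete, here is a minimal sketch of the two scanning strategies. It is not PigStorage or BinStorage code; the class and method names (DelimiterScanSketch, splitOnByte, splitOnPattern, matchesAt) are hypothetical, and the pattern search is a naive scan chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: these names are hypothetical and are not part of Pig.
final class DelimiterScanSketch {

    // Single-byte case: exactly one comparison per input byte.
    static List<String> splitOnByte(byte[] record, byte delim) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < record.length; i++) {
            if (record[i] == delim) {
                fields.add(new String(record, start, i - start, StandardCharsets.UTF_8));
                start = i + 1;
            }
        }
        // Whatever follows the last delimiter is the final field.
        fields.add(new String(record, start, record.length - start, StandardCharsets.UTF_8));
        return fields;
    }

    // Multi-byte case: at each position, try to match the whole delimiter pattern.
    static List<String> splitOnPattern(byte[] record, byte[] delim) {
        if (delim.length == 0) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> fields = new ArrayList<>();
        int start = 0;
        int i = 0;
        while (i + delim.length <= record.length) {
            if (matchesAt(record, i, delim)) {
                fields.add(new String(record, start, i - start, StandardCharsets.UTF_8));
                i += delim.length; // skip past the delimiter
                start = i;
            } else {
                i++;
            }
        }
        fields.add(new String(record, start, record.length - start, StandardCharsets.UTF_8));
        return fields;
    }

    // True if pattern occurs in data starting at offset (caller guarantees bounds).
    static boolean matchesAt(byte[] data, int offset, byte[] pattern) {
        for (int j = 0; j < pattern.length; j++) {
            if (data[offset + j] != pattern[j]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] tabbed = "a\tb\tc".getBytes(StandardCharsets.UTF_8);
        System.out.println(splitOnByte(tabbed, (byte) '\t'));

        byte[] piped = "a||b||c".getBytes(StandardCharsets.UTF_8);
        System.out.println(splitOnPattern(piped, "||".getBytes(StandardCharsets.UTF_8)));
    }
}
```

The single-byte loop does one comparison per byte, while the pattern loop may do several comparisons per position whenever a partial match occurs. That extra work is the likely source of the performance concern.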

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-842) PigStorage should support multi-byte delimiters

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720259#action_12720259 ] 

Alan Gates commented on PIG-842:
--------------------------------

I'm concerned about the performance hit of supporting multi-byte delimiters.  Before we commit to doing this in PigStorage, we should test how much it slows down reading data.  If it is significant, we should consider having a PigMultiByteStorage or something that handles multi-byte delimiter characters.  It could extend PigStorage and only differ in how it parses the records.
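
A rough way to get a first number for such a test is a small timing harness like the one below. It is only a sketch: it does not use Pig's loader classes, every name in it (DelimiterTimingSketch, countByte, countPattern, buildRecord) is hypothetical, the data is synthetic, and the timings are indicative at best because of JIT warm-up. A proper measurement would read real data through the actual load path.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical timing harness; not Pig code and not a rigorous benchmark.
final class DelimiterTimingSketch {

    // Counts occurrences of a single delimiter byte.
    static int countByte(byte[] data, byte delim) {
        int n = 0;
        for (byte b : data) {
            if (b == delim) {
                n++;
            }
        }
        return n;
    }

    // Counts non-overlapping occurrences of a multi-byte delimiter (naive scan).
    static int countPattern(byte[] data, byte[] delim) {
        if (delim.length == 0) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        int n = 0;
        int i = 0;
        while (i + delim.length <= data.length) {
            int j = 0;
            while (j < delim.length && data[i + j] == delim[j]) {
                j++;
            }
            if (j == delim.length) {
                n++;
                i += delim.length;
            } else {
                i++;
            }
        }
        return n;
    }

    // Builds a synthetic record: `fields` runs of 'x' (fields >= 1), separated by delim.
    static byte[] buildRecord(int fields, int fieldWidth, byte[] delim) {
        byte[] out = new byte[fields * fieldWidth + (fields - 1) * delim.length];
        Arrays.fill(out, (byte) 'x');
        int pos = fieldWidth;
        for (int f = 1; f < fields; f++) {
            System.arraycopy(delim, 0, out, pos, delim.length);
            pos += delim.length + fieldWidth;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] multiDelim = "||".getBytes(StandardCharsets.US_ASCII);
        byte[] singleData = buildRecord(1_000_000, 8, new byte[] {'\t'});
        byte[] multiData = buildRecord(1_000_000, 8, multiDelim);

        // Repeat a few rounds so later rounds run on JIT-compiled code.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            int single = countByte(singleData, (byte) '\t');
            long t1 = System.nanoTime();
            int multi = countPattern(multiData, multiDelim);
            long t2 = System.nanoTime();
            System.out.printf("round %d: single-byte %d ns (%d delims), multi-byte %d ns (%d delims)%n",
                    round, t1 - t0, single, t2 - t1, multi);
        }
    }
}
```

If the gap measured on real data turns out to be significant, that would argue for the separate storage class suggested above rather than changing PigStorage itself.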
