Posted to dev@pig.apache.org by "Santhosh Srinivasan (JIRA)" <ji...@apache.org> on 2009/06/11 19:57:07 UTC
[jira] Created: (PIG-842) PigStorage should support multi-byte delimiters
PigStorage should support multi-byte delimiters
-----------------------------------------------
Key: PIG-842
URL: https://issues.apache.org/jira/browse/PIG-842
Project: Pig
Issue Type: Improvement
Components: impl
Affects Versions: 0.3.0
Reporter: Santhosh Srinivasan
Fix For: 0.3.0
Currently, PigStorage supports only single-byte delimiters. Users have requested multi-byte delimiters. Multi-byte delimiters have performance implications: instead of looking for a single byte, PigStorage would have to look for a byte pattern, à la BinStorage.
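To illustrate what "looking for a pattern" means in practice, here is a minimal sketch of splitting one record on a multi-byte delimiter. It is not Pig code: the class `MultiByteSplit` and its methods `matchesAt` and `split` are hypothetical names made up for this example, and it assumes the whole record is already in memory as a byte array.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Pig: splits one record on a multi-byte delimiter.
final class MultiByteSplit {

    // Returns true if delim occurs in line starting exactly at offset pos.
    static boolean matchesAt(byte[] line, int pos, byte[] delim) {
        if (pos + delim.length > line.length) {
            return false;
        }
        for (int j = 0; j < delim.length; j++) {
            if (line[pos + j] != delim[j]) {
                return false;
            }
        }
        return true;
    }

    // Splits line into fields on every non-overlapping occurrence of delim.
    static List<byte[]> split(byte[] line, byte[] delim) {
        if (delim.length == 0) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<byte[]> fields = new ArrayList<>();
        int start = 0; // first byte of the field currently being read
        int i = 0;
        while (i < line.length) {
            // Cheap first-byte check, then the full pattern comparison.
            if (line[i] == delim[0] && matchesAt(line, i, delim)) {
                fields.add(Arrays.copyOfRange(line, start, i));
                i += delim.length;
                start = i;
            } else {
                i++;
            }
        }
        // Whatever follows the last delimiter is the final field (possibly empty).
        fields.add(Arrays.copyOfRange(line, start, line.length));
        return fields;
    }

    public static void main(String[] args) {
        byte[] line = "a||b||c".getBytes(StandardCharsets.UTF_8);
        byte[] delim = "||".getBytes(StandardCharsets.UTF_8);
        for (byte[] field : split(line, delim)) {
            System.out.println(new String(field, StandardCharsets.UTF_8));
        }
    }
}
```

With a one-byte delimiter the inner comparison collapses to the first-byte check, so the extra cost of the multi-byte case is the pattern comparison performed at each candidate position.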
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-842) PigStorage should support multi-byte delimiters
Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720259#action_12720259 ]
Alan Gates commented on PIG-842:
--------------------------------
I'm concerned about the performance hit of supporting multi-byte delimiters. Before we commit to doing this in PigStorage, we should test how much it slows down reading data. If the slowdown is significant, we should consider having a PigMultiByteStorage or something similar that handles multi-byte delimiters. It could extend PigStorage and differ only in how it parses the records.
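As a rough idea of how such a test could start, here is a small timing sketch that compares scanning for a single-byte delimiter against scanning for a two-byte one. It does not touch Pig: the class `DelimiterScanTiming`, its methods, and the record sizes are all assumptions chosen for illustration. Timings taken with `System.nanoTime()` like this are noisy, so treat the numbers as indicative only; a real measurement would read actual data through the loader, ideally with a proper benchmark harness.

```java
import java.util.Arrays;

// Hypothetical micro-benchmark, not Pig code: counts fields with both scan styles.
final class DelimiterScanTiming {

    // Single-byte scan: one comparison per input byte.
    static int countSingle(byte[] data, byte delim) {
        int fields = 1;
        for (byte b : data) {
            if (b == delim) {
                fields++;
            }
        }
        return fields;
    }

    // Multi-byte scan: compares the full pattern at each candidate position.
    static int countMulti(byte[] data, byte[] delim) {
        int fields = 1;
        int i = 0;
        while (i <= data.length - delim.length) {
            boolean match = true;
            for (int j = 0; j < delim.length; j++) {
                if (data[i + j] != delim[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                fields++;
                i += delim.length;
            } else {
                i++;
            }
        }
        return fields;
    }

    // Builds `fields` fields (fields >= 1) of `fieldLen` 'x' bytes, joined by delim.
    static byte[] buildRecord(int fields, int fieldLen, byte[] delim) {
        byte[] out = new byte[fields * fieldLen + (fields - 1) * delim.length];
        int pos = 0;
        for (int f = 0; f < fields; f++) {
            Arrays.fill(out, pos, pos + fieldLen, (byte) 'x');
            pos += fieldLen;
            if (f < fields - 1) {
                System.arraycopy(delim, 0, out, pos, delim.length);
                pos += delim.length;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] multiDelim = {'|', '|'};
        byte[] single = buildRecord(1_000_000, 8, new byte[] {'\t'});
        byte[] multi = buildRecord(1_000_000, 8, multiDelim);

        int sink = 0; // accumulate results so the JIT cannot discard the loops
        for (int warm = 0; warm < 5; warm++) {
            sink += countSingle(single, (byte) '\t');
            sink += countMulti(multi, multiDelim);
        }

        long t0 = System.nanoTime();
        sink += countSingle(single, (byte) '\t');
        long t1 = System.nanoTime();
        sink += countMulti(multi, multiDelim);
        long t2 = System.nanoTime();

        System.out.println("single-byte scan: " + (t1 - t0) / 1_000 + " us");
        System.out.println("multi-byte scan:  " + (t2 - t1) / 1_000 + " us");
        System.out.println("(sink = " + sink + ")");
    }
}
```

If the gap turns out to be small on realistic data, the pattern scan could live in PigStorage itself; if not, a separate loader along the lines suggested above would keep the single-byte path unchanged.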