Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2009/02/27 21:53:12 UTC

[jira] Created: (PIG-691) BinStorage skips tuples when ^A is present in data

BinStorage skips tuples when ^A is present in data
--------------------------------------------------

                 Key: PIG-691
                 URL: https://issues.apache.org/jira/browse/PIG-691
             Project: Pig
          Issue Type: Bug
            Reporter: Olga Natkovich
            Assignee: Pradeep Kamath


Pradeep found a problem with the BinStorage.getNext function that causes data loss. He is working on the fix.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-691) BinStorage skips tuples when ^A is present in data

Posted by "Santhosh Srinivasan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678165#action_12678165 ] 

Santhosh Srinivasan commented on PIG-691:
-----------------------------------------

I will be reviewing this patch.

> BinStorage skips tuples when ^A is present in data
> --------------------------------------------------
>
>                 Key: PIG-691
>                 URL: https://issues.apache.org/jira/browse/PIG-691
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Olga Natkovich
>            Assignee: Pradeep Kamath
>             Fix For: types_branch
>
>         Attachments: PIG-691.patch
>
>
> Pradeep found a problem with the BinStorage.getNext function that causes data loss. He is working on the fix.



[jira] Updated: (PIG-691) BinStorage skips tuples when ^A is present in data

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-691:
-------------------------------

    Attachment: PIG-691.patch



[jira] Updated: (PIG-691) BinStorage skips tuples when ^A is present in data

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-691:
-------------------------------

        Fix Version/s: types_branch
    Affects Version/s: types_branch
               Status: Patch Available  (was: Open)

BinStorage uses the byte markers RECORD_1, RECORD_2 and RECORD_3 (the bytes 0x01, 0x02, 0x03) to mark the beginning of a new record. The bug is in getNext(): the code looks for RECORD_1 and, if it finds it, looks for RECORD_2. If it fails to find RECORD_2, it goes back and searches for the entire sequence again, starting with RECORD_1. However, this fails on the following sequence: RECORD_1-RECORD_1-RECORD_2-RECORD_3. After reading the second RECORD_1 in that sequence, we should not look for RECORD_1 again but should continue by looking for RECORD_2. This is an issue only when a record in BinStorage spans two blocks and the part at the head of the second block contains the above sequence. This can happen when the last field in the record is null (null is represented by the byte 0x01, which is RECORD_1). The attached patch fixes this issue.
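
To make the failure mode concrete, here is a minimal Java sketch of the two scanning strategies described above. It is not the code from PIG-691.patch or from BinStorage itself: the class name RecordMarkerScan and the methods findMarkerBuggy and findMarkerFixed are hypothetical, and the sketch scans a byte array rather than an input stream to keep it short. Only the three marker values come from the description.

```java
// Hypothetical illustration only; not taken from BinStorage or the attached patch.
final class RecordMarkerScan {
    static final byte RECORD_1 = 0x01;
    static final byte RECORD_2 = 0x02;
    static final byte RECORD_3 = 0x03;

    /**
     * Mimics the behaviour described above: after a mismatch the byte that was
     * just read is thrown away and the search restarts at RECORD_1.
     * Returns the index just past the marker, or -1 if no marker is found.
     */
    static int findMarkerBuggy(byte[] data) {
        int i = 0;
        while (i < data.length) {
            if (data[i++] != RECORD_1) continue;
            if (i >= data.length) break;
            // If this byte is itself RECORD_1, it is consumed and forgotten.
            if (data[i++] != RECORD_2) continue;
            if (i >= data.length) break;
            if (data[i++] != RECORD_3) continue;
            return i;
        }
        return -1;
    }

    /**
     * Tracks how many marker bytes have matched so far. On a mismatch, a
     * RECORD_1 byte still counts as the start of a new marker.
     * Returns the index just past the marker, or -1 if no marker is found.
     */
    static int findMarkerFixed(byte[] data) {
        int matched = 0; // number of marker bytes matched so far (0, 1 or 2)
        for (int i = 0; i < data.length; i++) {
            byte b = data[i];
            if (matched == 1 && b == RECORD_2) {
                matched = 2;
            } else if (matched == 2 && b == RECORD_3) {
                return i + 1;
            } else {
                // Mismatch (or nothing matched yet): restart, but keep a RECORD_1.
                matched = (b == RECORD_1) ? 1 : 0;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // A trailing null field (0x01) followed by the next record's marker.
        byte[] data = {0x01, 0x01, 0x02, 0x03, 0x07};
        System.out.println("buggy: " + findMarkerBuggy(data)); // prints "buggy: -1"
        System.out.println("fixed: " + findMarkerFixed(data)); // prints "fixed: 4"
    }
}
```

On the input RECORD_1-RECORD_1-RECORD_2-RECORD_3 the first method never reports a marker, which corresponds to the skipped tuple, while the second finds it. The sketch also keeps a RECORD_1 seen after RECORD_1-RECORD_2; whether the actual patch handles that case is not stated in this thread.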



[jira] Updated: (PIG-691) BinStorage skips tuples when ^A is present in data

Posted by "Santhosh Srinivasan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Santhosh Srinivasan updated PIG-691:
------------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

The patch has been committed. Thanks for the fix, Pradeep.
