Posted to common-dev@hadoop.apache.org by "Christian Kunz (JIRA)" <ji...@apache.org> on 2006/09/05 22:48:24 UTC

[jira] Created: (HADOOP-508) random seeks using FSDataInputStream can become invalid such that reads return invalid data

random seeks using FSDataInputStream can become invalid such that reads return invalid data 
--------------------------------------------------------------------------------------------

                 Key: HADOOP-508
                 URL: http://issues.apache.org/jira/browse/HADOOP-508
             Project: Hadoop
          Issue Type: Bug
    Affects Versions: 0.5.0
            Reporter: Christian Kunz


Some of my applications using Hadoop DFS receive wrong data after certain random seeks. After some investigation I believe (without having looked at the source code of java.io.BufferedInputStream) that it boils down to this: when read(byte[] b, int off, int len) is called with an external buffer larger than the internal buffer, it reads directly into the external buffer, bypassing the internal buffer, but it does not invalidate the internal buffer by setting the variable 'count' to 0. A subsequent seek to an offset closer to the PositionCache's 'position' than the internal buffer size will then place the current position inside the internal buffer, which still contains outdated data from somewhere else in the file.
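The failure mode can be sketched with a hypothetical, much-simplified model of a seekable buffered stream (illustrative only; the class and field names below are not Hadoop's actual FSDataInputStream internals, and the underlying stream is stood in for by an in-memory byte array):

```java
// Hypothetical model of a seekable buffered stream over an in-memory "file".
// It mimics the reported bug: a read larger than the internal buffer bypasses
// that buffer but (unless invalidation is enabled) forgets to reset 'count',
// so a later seek that lands inside the buffer's assumed window serves
// stale bytes from a different file offset.
class BufferedSeekable {
    private final byte[] file;   // stands in for the underlying DFS stream
    private int filePos = 0;     // underlying stream position
    private final byte[] buf;    // assumed to hold file[filePos-count .. filePos)
    private int count = 0;       // valid bytes in buf
    private int pos = 0;         // next byte to return from buf
    private final boolean invalidateOnDirectRead;

    BufferedSeekable(byte[] file, int bufSize, boolean fixed) {
        this.file = file;
        this.buf = new byte[bufSize];
        this.invalidateOnDirectRead = fixed;
    }

    int read(byte[] b, int off, int len) {
        if (pos >= count) {                      // internal buffer exhausted
            if (len >= buf.length) {
                // Large request: copy straight from the underlying stream.
                int n = Math.min(len, file.length - filePos);
                System.arraycopy(file, filePos, b, off, n);
                filePos += n;
                if (invalidateOnDirectRead) {    // the fix described above:
                    count = 0;                   // mark the buffer stale
                    pos = 0;
                }                                // buggy path: count unchanged
                return n;
            }
            // Refill the internal buffer from the current position.
            count = Math.min(buf.length, file.length - filePos);
            System.arraycopy(file, filePos, buf, 0, count);
            filePos += count;
            pos = 0;
        }
        int n = Math.min(len, count - pos);
        System.arraycopy(buf, pos, b, off, n);
        pos += n;
        return n;
    }

    void seek(long desired) {
        // The buffer is assumed to hold the 'count' bytes ending at filePos.
        // After an un-invalidated direct read that assumption is false.
        long end = filePos;
        long start = end - count;
        if (desired >= start && desired < end) {
            pos = (int) (desired - start);       // reuse (possibly stale) buffer
        } else {
            pos = 0;
            count = 0;
            filePos = (int) desired;             // seek the underlying stream
        }
    }
}
```

With an 8-byte buffer over a file whose byte at offset i is i: read 4 bytes (fills the buffer with offsets 0-7), read 16 bytes (drains the remaining 4 buffered bytes), then read 16 bytes again (a direct copy of offsets 8-23). The seek window is now computed as [16, 24) from the stale count, so seek(20) lands inside the old buffer and the next read returns byte 4 instead of byte 20; with invalidation enabled the same sequence returns 20.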

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (HADOOP-508) random seeks using FSDataInputStream can become invalid such that reads return invalid data

Posted by "Milind Bhandarkar (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-508?page=all ]

Milind Bhandarkar updated HADOOP-508:
-------------------------------------

               Status: Patch Available  (was: Open)
        Fix Version/s: 0.7.0
    Affects Version/s: 0.6.2
                           (was: 0.5.0)

Fixed.

> random seeks using FSDataInputStream can become invalid such that reads return invalid data
> -------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-508
>                 URL: http://issues.apache.org/jira/browse/HADOOP-508
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.6.2
>            Reporter: Christian Kunz
>         Assigned To: Milind Bhandarkar
>             Fix For: 0.7.0
>
>         Attachments: hadoop-508.patch
>


[jira] Updated: (HADOOP-508) random seeks using FSDataInputStream can become invalid such that reads return invalid data

Posted by "Milind Bhandarkar (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-508?page=all ]

Milind Bhandarkar updated HADOOP-508:
-------------------------------------

    Attachment: hadoop-508.patch

Fixed.

> random seeks using FSDataInputStream can become invalid such that reads return invalid data
> -------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-508
>                 URL: http://issues.apache.org/jira/browse/HADOOP-508
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.6.2
>            Reporter: Christian Kunz
>         Assigned To: Milind Bhandarkar
>             Fix For: 0.7.0
>
>         Attachments: hadoop-508.patch
>


[jira] Updated: (HADOOP-508) random seeks using FSDataInputStream can become invalid such that reads return invalid data

Posted by "Yoram Arnon (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-508?page=all ]

Yoram Arnon updated HADOOP-508:
-------------------------------

    Component/s: dfs

> random seeks using FSDataInputStream can become invalid such that reads return invalid data
> -------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-508
>                 URL: http://issues.apache.org/jira/browse/HADOOP-508
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.5.0
>            Reporter: Christian Kunz
>


[jira] Updated: (HADOOP-508) random seeks using FSDataInputStream can become invalid such that reads return invalid data

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-508?page=all ]

Doug Cutting updated HADOOP-508:
--------------------------------

        Status: Resolved  (was: Patch Available)
    Resolution: Fixed

I just committed this.  Thanks, Milind!

> random seeks using FSDataInputStream can become invalid such that reads return invalid data
> -------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-508
>                 URL: http://issues.apache.org/jira/browse/HADOOP-508
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.6.2
>            Reporter: Christian Kunz
>         Assigned To: Milind Bhandarkar
>             Fix For: 0.7.0
>
>         Attachments: hadoop-508.patch
>


[jira] Assigned: (HADOOP-508) random seeks using FSDataInputStream can become invalid such that reads return invalid data

Posted by "Milind Bhandarkar (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-508?page=all ]

Milind Bhandarkar reassigned HADOOP-508:
----------------------------------------

    Assignee: Milind Bhandarkar

> random seeks using FSDataInputStream can become invalid such that reads return invalid data
> -------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-508
>                 URL: http://issues.apache.org/jira/browse/HADOOP-508
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.5.0
>            Reporter: Christian Kunz
>         Assigned To: Milind Bhandarkar
>
