Posted to common-dev@hadoop.apache.org by "Raghu Angadi (JIRA)" <ji...@apache.org> on 2007/04/21 02:40:15 UTC
[jira] Created: (HADOOP-1285) ChecksumFileSystem : Can't read when io.file.buffer.size < bytePerChecksum
ChecksumFileSystem : Can't read when io.file.buffer.size < bytePerChecksum
--------------------------------------------------------------------------
Key: HADOOP-1285
URL: https://issues.apache.org/jira/browse/HADOOP-1285
Project: Hadoop
Issue Type: Bug
Components: fs
Affects Versions: 0.12.3
Reporter: Raghu Angadi
It looks like ChecksumFileSystem fails to read a file when bytesPerChecksum is larger than io.file.buffer.size. The defaults for bytesPerChecksum and the buffer size are 512 and 4096, so the default config might not hit the problem.
I noticed this problem while testing block-level CRCs with different configs.
How to reproduce with the latest trunk:
Copy a text file larger than 512 bytes to DFS: bin/hadoop fs -copyFromLocal ~/tmp/x.txt x.txt
Then set io.file.buffer.size to something smaller than 512 (say 53) and try to read the file:
bin/hadoop dfs -cat x.txt
This prints only the first 53 characters.
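The same thing can be reproduced programmatically. The sketch below is illustrative only: it assumes the classpath configuration already points fs.default.name at the DFS instance and that x.txt was copied as above.
{code}
// Illustrative reproduction sketch (not part of any patch): open x.txt with a
// deliberately small io.file.buffer.size and count how many bytes come back.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SmallBufferRead {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    conf.setInt("io.file.buffer.size", 53);     // smaller than bytesPerChecksum (512)

    FileSystem fs = FileSystem.get(conf);       // assumes fs.default.name points at DFS
    FSDataInputStream in = fs.open(new Path("x.txt"));

    byte[] buf = new byte[4096];
    long total = 0;
    int n;
    while ((n = in.read(buf, 0, buf.length)) > 0) {
      total += n;
    }
    in.close();

    // With the bug present this stops at 53, even though x.txt is larger.
    System.out.println("bytes read: " + total);
  }
}
{code}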
The following code and its comment at ChecksumFileSystem.java:163 seem suspect, but I am not sure whether more changes are required:
{code}
public int read(byte b[], int off, int len) throws IOException {
  // make sure that it ends at a checksum boundary
  long curPos = getPos();
  long endPos = len + curPos/bytesPerSum*bytesPerSum;
  return readBuffer(b, off, (int)(endPos-curPos));
}
{code}
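For reference, the expression above parses as len + (curPos/bytesPerSum)*bytesPerSum, since division and multiplication bind tighter than addition. So the computed length is len minus (curPos % bytesPerSum), not a length rounded to a checksum boundary, and in this reproduction it drops to zero after the first read. A standalone illustration with the values from the steps above (just the arithmetic, not Hadoop code):
{code}
// Standalone illustration of the boundary arithmetic above (not Hadoop code).
public class BoundaryMath {
  public static void main(String[] args) {
    long bytesPerSum = 512;   // bytesPerChecksum, 512 by default
    int len = 53;             // io.file.buffer.size from the reproduction
    long curPos = 53;         // stream position after the first 53-byte read

    // Parses as len + (curPos / bytesPerSum) * bytesPerSum, i.e.
    // len + start-of-current-chunk, not an end position rounded to
    // a checksum boundary.
    long endPos = len + curPos / bytesPerSum * bytesPerSum;

    System.out.println(endPos - curPos);   // prints 0 -> the next read() asks for 0 bytes
  }
}
{code}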
[jira] Issue Comment Edited: (HADOOP-1285) ChecksumFileSystem : Can't read when io.file.buffer.size < bytePerChecksum
Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513674 ]
Raghu Angadi edited comment on HADOOP-1285 at 7/18/07 11:26 AM:
----------------------------------------------------------------
Fixed by HADOOP-1470.
was:
Fixed by HADOOP-1140.
[jira] Commented: (HADOOP-1285) ChecksumFileSystem : Can't read when io.file.buffer.size < bytePerChecksum
Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12490631 ]
Raghu Angadi commented on HADOOP-1285:
--------------------------------------
By the way, this test showed that ChecksumFS returned 53 bytes without checking the CRC for the full 512-byte chunk.
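To spell that out: ChecksumFileSystem keeps one CRC-32 per bytesPerChecksum bytes of data, so a stored sum can only be verified once the whole 512-byte chunk has been read; a 53-byte prefix says nothing about it. A rough standalone illustration (not Hadoop code):
{code}
// Rough illustration: a CRC-32 over the first 53 bytes of a 512-byte chunk is
// unrelated to the CRC-32 stored for the full chunk, so handing back 53 bytes
// without buffering the rest of the chunk skips verification entirely.
import java.util.zip.CRC32;

public class PartialChunkCrc {
  public static void main(String[] args) {
    byte[] chunk = new byte[512];
    for (int i = 0; i < chunk.length; i++) {
      chunk[i] = (byte) i;
    }

    CRC32 full = new CRC32();
    full.update(chunk, 0, 512);    // what the stored checksum entry covers

    CRC32 prefix = new CRC32();
    prefix.update(chunk, 0, 53);   // what a 53-byte read has actually seen

    System.out.println("full   = " + full.getValue());
    System.out.println("prefix = " + prefix.getValue());   // a different value
  }
}
{code}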
[jira] Resolved: (HADOOP-1285) ChecksumFileSystem : Can't read when io.file.buffer.size < bytePerChecksum
Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Raghu Angadi resolved HADOOP-1285.
----------------------------------
Resolution: Fixed
Fixed by HADOOP-1140.