Posted to common-issues@hadoop.apache.org by "Uma Maheswara Rao G (JIRA)" <ji...@apache.org> on 2012/11/09 11:58:13 UTC

[jira] [Commented] (HADOOP-8240) Allow users to specify a checksum type on create()

    [ https://issues.apache.org/jira/browse/HADOOP-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493910#comment-13493910 ] 

Uma Maheswara Rao G commented on HADOOP-8240:
---------------------------------------------

@Kihwal, 
 
 I have created a file with the checksum-disabled option, and I am seeing an ArrayIndexOutOfBoundsException.

{code}
out = fs.create(fileName, FsPermission.getDefault(), flags, fs.getConf()
	  .getInt("io.file.buffer.size", 4096), replFactor, fs
	  .getDefaultBlockSize(fileName), null, ChecksumOpt.createDisabled());
{code}

See the trace here:

{noformat}
java.lang.ArrayIndexOutOfBoundsException: 0
	at org.apache.hadoop.fs.FSOutputSummer.int2byte(FSOutputSummer.java:178)
	at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:162)
	at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:106)
	at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:92)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:54)
	at java.io.DataOutputStream.write(DataOutputStream.java:90)
	at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:261)
	at org.apache.hadoop.hdfs.TestReplication.testBadBlockReportOnTransfer(TestReplication.java:174)
{noformat}

Have I missed any other configs that need to be set?


FSOutputSummer#int2byte does not check the length of the bytes array. Do you think we should check the length and only call it when checksum bytes are present, since in the NULL CRC case there will not be any checksum bytes?

{code}
static byte[] int2byte(int integer, byte[] bytes) {
    bytes[0] = (byte)((integer >>> 24) & 0xFF);
    bytes[1] = (byte)((integer >>> 16) & 0xFF);
    bytes[2] = (byte)((integer >>>  8) & 0xFF);
    bytes[3] = (byte)((integer >>>  0) & 0xFF);
    return bytes;
  }
{code}
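A minimal sketch of the guard I have in mind (the length check inside int2byte is my suggestion, not the current FSOutputSummer code; the class name below is illustrative):

```java
// Hypothetical sketch: skip the CRC byte conversion when the checksum
// buffer is empty (NULL checksum / ChecksumOpt.createDisabled), so
// int2byte never indexes into a zero-length array.
public class ChecksumGuardSketch {

    static byte[] int2byte(int integer, byte[] bytes) {
        // Only write the 4 CRC bytes when the buffer can hold them;
        // a disabled checksum yields a zero-length array, so do nothing.
        if (bytes.length >= 4) {
            bytes[0] = (byte)((integer >>> 24) & 0xFF);
            bytes[1] = (byte)((integer >>> 16) & 0xFF);
            bytes[2] = (byte)((integer >>>  8) & 0xFF);
            bytes[3] = (byte)((integer >>>  0) & 0xFF);
        }
        return bytes;
    }

    public static void main(String[] args) {
        // CRC32 case: a 4-byte buffer is filled exactly as before.
        byte[] full = int2byte(0x01020304, new byte[4]);
        System.out.println(full[0] + "," + full[3]);

        // NULL checksum case: empty buffer, no exception is thrown.
        byte[] empty = int2byte(0x01020304, new byte[0]);
        System.out.println(empty.length);
    }
}
```

Alternatively, the caller (writeChecksumChunk) could skip the int2byte call entirely when the checksum size is zero, which would avoid the per-call branch for the normal CRC path.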

Another point: if I create a file with ChecksumOpt.createDisabled, there is no point in the DN running the block scanner on that block, because without any CRC bytes the scan can never detect the block as corrupt. The block scan would read the block unnecessarily, for no purpose. I am not sure whether I have understood this JIRA wrongly; please correct me if I am wrong.


                
> Allow users to specify a checksum type on create()
> --------------------------------------------------
>
>                 Key: HADOOP-8240
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8240
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.23.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>             Fix For: 0.23.3, 2.0.2-alpha
>
>         Attachments: hadoop-8240-branch-0.23-alone.patch.txt, hadoop-8240.patch, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-post-hadoop-8700-br2-trunk.patch.txt, hadoop-8240-trunk-branch2.patch.txt, hadoop-8240-trunk-branch2.patch.txt, hadoop-8240-trunk-branch2.patch.txt
>
>
> Per discussion in HADOOP-8060, a way for users to specify a checksum type on create() is needed. The way the FileSystem cache works makes it impossible to use dfs.checksum.type to achieve this. Also, the checksum-related API is at the FileSystem level, so we prefer something at that level, not an hdfs-specific one. The current proposal is to use CreateFlag.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira