You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Ahad Rana (JIRA)" <ji...@apache.org> on 2010/06/22 09:24:56 UTC

[jira] Created: (HADOOP-6834) TFile.append compares initial key against null lastKey

TFile.append compares initial key against null lastKey  
--------------------------------------------------------

                 Key: HADOOP-6834
                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
             Project: Hadoop Common
          Issue Type: Bug
          Components: io
    Affects Versions: 0.20.2, 0.20.1
            Reporter: Ahad Rana


The following code in TFile.KeyReigster.close: 


            byte[] lastKey = lastKeyBufferOS.getBuffer();
            int lastLen = lastKeyBufferOS.size();
            if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
                lastLen) < 0) {
              throw new IOException("Keys are not added in sorted order");
            }

compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Hong Tang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881366#action_12881366 ] 

Hong Tang commented on HADOOP-6834:
-----------------------------------

Yes, I guess LongWritable could be more defensive by throwing an exception indicating illegal format. Either way, TFile needs to be fixed.

I'd be happy to shepherd you with the patch process.

> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Ahad Rana
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889494#action_12889494 ] 

Hudson commented on HADOOP-6834:
--------------------------------

Integrated in Hadoop-Common-trunk #395 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/395/])
    HADOOP-6834. TFile.append compares initial key against null lastKey (hong tang via mahadev)


> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Ahad Rana
>            Assignee: Hong Tang
>             Fix For: 0.22.0
>
>         Attachments: hadoop-6834-20100715.patch
>
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Hong Tang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881352#action_12881352 ] 

Hong Tang commented on HADOOP-6834:
-----------------------------------

In the first append, lastLen is 0, lastKey refers to a valid byte array whose bytes are all zero. So the behavior is determined. However, I do agree that it is possible for a customary comparator, an zero-length byte array may not be a legal key type. So it would be nice to do a first-key check.

Ahad, would you be interested in providing a patch, or perhaps share with us a sample program with customized comparator that fails TFile.Writer.append?

> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Ahad Rana
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Hong Tang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888964#action_12888964 ] 

Hong Tang commented on HADOOP-6834:
-----------------------------------

@ahad, by "NULL" key, are you referring to zero-length key? It is certainly allowed by TFile (and you can even have multiple of them). 

> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Ahad Rana
>            Assignee: Hong Tang
>         Attachments: hadoop-6834-20100715.patch
>
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Hong Tang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hong Tang reassigned HADOOP-6834:
---------------------------------

    Assignee: Hong Tang

> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Ahad Rana
>            Assignee: Hong Tang
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889261#action_12889261 ] 

Mahadev konar commented on HADOOP-6834:
---------------------------------------

+1 the patch looks good.

> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Ahad Rana
>            Assignee: Hong Tang
>         Attachments: hadoop-6834-20100715.patch
>
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889268#action_12889268 ] 

Hudson commented on HADOOP-6834:
--------------------------------

Integrated in Hadoop-Common-trunk-Commit #327 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/327/])
    HADOOP-6834. TFile.append compares initial key against null lastKey (hong tang via mahadev)


> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Ahad Rana
>            Assignee: Hong Tang
>             Fix For: 0.22.0
>
>         Attachments: hadoop-6834-20100715.patch
>
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888939#action_12888939 ] 

Hadoop QA commented on HADOOP-6834:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12449597/hadoop-6834-20100715.patch
  against trunk revision 964134.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 2 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/617/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/617/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/617/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/617/console

This message is automatically generated.

> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Ahad Rana
>            Assignee: Hong Tang
>         Attachments: hadoop-6834-20100715.patch
>
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Hong Tang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hong Tang updated HADOOP-6834:
------------------------------

    Attachment: hadoop-6834-20100715.patch

Straightforward patch with a new unit test.

> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Ahad Rana
>            Assignee: Hong Tang
>         Attachments: hadoop-6834-20100715.patch
>
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated HADOOP-6834:
----------------------------------

           Status: Resolved  (was: Patch Available)
     Hadoop Flags: [Reviewed]
    Fix Version/s: 0.22.0
       Resolution: Fixed

> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Ahad Rana
>            Assignee: Hong Tang
>             Fix For: 0.22.0
>
>         Attachments: hadoop-6834-20100715.patch
>
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Ahad Rana (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881372#action_12881372 ] 

Ahad Rana commented on HADOOP-6834:
-----------------------------------

Hi Hong,

Sure, that would be great. What are the requirements these days ? Which version of hadoop should the patch target ? 

Ahad.

> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Ahad Rana
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Hong Tang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881404#action_12881404 ] 

Hong Tang commented on HADOOP-6834:
-----------------------------------

You should always submit the patch for trunk. If you are using older releases, you may also backport the patch (attach here) and request it to be committed to older branches too.

> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Ahad Rana
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Hong Tang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888963#action_12888963 ] 

Hong Tang commented on HADOOP-6834:
-----------------------------------

The javadoc warning reported by test-patch seems to be false positive. (Log: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/617/artifact/trunk/patchprocess/patchJavadocWarnings.txt).

> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Ahad Rana
>            Assignee: Hong Tang
>         Attachments: hadoop-6834-20100715.patch
>
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Hong Tang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hong Tang updated HADOOP-6834:
------------------------------

    Status: Patch Available  (was: Open)

> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.2, 0.20.1
>            Reporter: Ahad Rana
>            Assignee: Hong Tang
>         Attachments: hadoop-6834-20100715.patch
>
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Ahad Rana (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881362#action_12881362 ] 

Ahad Rana commented on HADOOP-6834:
-----------------------------------

Hi Hong,

Yes I would be happy to provide a patch. As far as triggering the failure, if you append a LongWritable with a negative value as the first key value and use the LongWritable's default Comparator, this will trigger the bug. I would be happy to provide a code snippet if required. In the case of LongWritable's comparator, it interprets the uninitialized zero bytes of the lastKey as a zero (since it does not check length), and thus this causes the out-of-order insertion exception. Either LongWritable's comparator is implemented improperly, or the contract implied by RawComparator is not clear on what happens in the case where you compare a non-zero byte array to a zero byte array. Probably both LongWritable's RawComparator should be fixed, but either way, passing in a uninitialized lastKey is going to cause problems for someone else down the road, so probably best to fix this as well ?     

 

> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Ahad Rana
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6834) TFile.append compares initial key against null lastKey

Posted by "Ahad Rana (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888949#action_12888949 ] 

Ahad Rana commented on HADOOP-6834:
-----------------------------------

I apologize Hong,  this was on my todo list, but obviously I did not get to it in time. Your patch and test case look good. One question: Is a NULL key allowed in the TFile API ? It seems that KeyRegister's close does accepts a zero length key.

Ahad

> TFile.append compares initial key against null lastKey  
> --------------------------------------------------------
>
>                 Key: HADOOP-6834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6834
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Ahad Rana
>            Assignee: Hong Tang
>         Attachments: hadoop-6834-20100715.patch
>
>
> The following code in TFile.KeyReigster.close: 
>             byte[] lastKey = lastKeyBufferOS.getBuffer();
>             int lastLen = lastKeyBufferOS.size();
>             if (tfileMeta.getComparator().compare(key, 0, len, lastKey, 0,
>                 lastLen) < 0) {
>               throw new IOException("Keys are not added in sorted order");
>             }
> compares the initial  key (passed in via  TFile.Writer.append) against a technically NULL lastKey. lastKey is not initialized until after the first call to TFile.Writer.append. The underlying RawComparator interface used for comparisons does not stipulate the proper behavior when either length 1  or length 2 is zero. In the case of LongWritable, its WritableComparator implementation does an unsafe read on the passed in byte arrays b1 and b2. Since TFile pre-allocates the buffer used for storing lastKey, this passes a valid buffer with zero count to LongWritable's comparator, which ignores length and thus produces incorrect results. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.