You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-dev@hadoop.apache.org by "Yuri Pradkin (JIRA)" <ji...@apache.org> on 2008/09/19 22:28:44 UTC

[jira] Created: (HADOOP-4226) LineRecordReader::readLine cleanup

LineRecordReader::readLine cleanup
----------------------------------

                 Key: HADOOP-4226
                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
             Project: Hadoop Core
          Issue Type: Improvement
          Components: mapred
    Affects Versions: 0.19.0
            Reporter: Yuri Pradkin
            Priority: Minor


I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  

I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment:     (was: HADOOP-4226.patch)

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment: HADOOP-4226.patch

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635084#action_12635084 ] 

Hadoop QA commented on HADOOP-4226:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12391039/HADOOP-4226.patch
  against trunk revision 699517.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3384/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3384/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3384/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3384/console

This message is automatically generated.

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632916#action_12632916 ] 

Yuri Pradkin commented on HADOOP-4226:
--------------------------------------

another snafu, sorry, reuploaded the patch

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Status: In Progress  (was: Patch Available)

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineRecordReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment: HADOOP-4226.patch

> LineRecordReader::readLine cleanup
> ----------------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Status: In Progress  (was: Patch Available)

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Work started: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HADOOP-4226 started by Yuri Pradkin.

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636022#action_12636022 ] 

Hudson commented on HADOOP-4226:
--------------------------------

Integrated in Hadoop-trunk #620 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/620/])
    . Refactor and document LineReader to make it more readily
understandable. Contributed by Yuri Pradkin.


> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634278#action_12634278 ] 

Yuri Pradkin commented on HADOOP-4226:
--------------------------------------

OK, let me respond to Hudson:

1. Tests: since this is not new code, I don't feel obligated to provide more tests.  There already exist tests for TextInputFormat that are already checking this code; these tests pass.

2. Two failed tests: I've re-run these tests on my box, they pass (except TestLeaseRecovery2 throws exception closing an already closed file, but it still passes) on my system.  What's up with that?  If you could point me to a "clean" revision, I'll be happy to generate a patch against that.  Otherwise I think we're OK here.

I might make one more little change to the code to make it a little cleaner, but otherwise I think the patch is ready.  I urge commiters to look at it and let me know what they think.  I'd hate my time to be wasted, so please comment.

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment: HADOOP-4226.patch

changed  DEFAULT_BUFFER_SIZE back to private

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Work started: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HADOOP-4226 started by Yuri Pradkin.

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment: HADOOP-4226.patch

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment: HADOOP-4226.patch

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Status: Patch Available  (was: In Progress)

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Status: Patch Available  (was: In Progress)

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineRecordReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment:     (was: LineReader.java.patch)

> LineRecordReader::readLine cleanup
> ----------------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment: HADOOP-4226.patch

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment: HADOOP-4226.patch

I think that failed test wasn't related to my changes at all, probably diffed against failing snapshot.  Here it is again with minor adjustments

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633389#action_12633389 ] 

Hadoop QA commented on HADOOP-4226:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12390601/HADOOP-4226.patch
  against trunk revision 697306.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    -1 core tests.  The patch failed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3344/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3344/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3344/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3344/console

This message is automatically generated.

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635086#action_12635086 ] 

Chris Douglas commented on HADOOP-4226:
---------------------------------------

* A quick note on policy: old patches are not deleted, as they orphan the discussion about them.
* The improvements to TestTextInputFormat are very good; thanks for these
* This is clearer and better documented than the existing code. However, in reviewing corner cases, the reasoning was sometimes unclear (e.g. deferring the increment of bytesConsumed for CR (clever, by the way)). If this could include a short summary of the strategy employed and a few more comments, it would be more readily understandable the next time a change is proposed.
* CR and LF should be final, as in the previous patch

With a little more documentation, I think this is good to commit.

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment:     (was: HADOOP-4226.patch)

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White reassigned HADOOP-4226:
---------------------------------

    Assignee: Yuri Pradkin

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635493#action_12635493 ] 

Chris Douglas commented on HADOOP-4226:
---------------------------------------

bq. I'm not a java programmer, so I could use some advice here on where would such summary go
Though we have some [standards|http://java.sun.com/docs/codeconv/html/CodeConventions.doc4.html#16838], these in particular aren't strictly followed. The vast, vast majority of users shouldn't see this, so putting it in the javadoc is probably unnecessary. A sketch of the algorithm could follow the declaration/signature of the method in a few lines of block comments. After that, the only pieces that require comments are those that aren't obvious, e.g. why bytesConsumed can't terminate the loop if we only have \r.

bq. If people were to re-use this class with different newline separators, they might find it useful to be able to change CR and LF
It's pretty specialized code... I'd much rather lock it down until we have a use case. Having non-final, public static fields is generally considered bad practice. It makes the code unverifiable in isolation and admits some very difficult bugs (and races if two classes assign different values to CR and LF).

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Status: Patch Available  (was: In Progress)

Patched against current trunk.

bq. If buffer ends in \r and the following segment starts with \r, it looks like this may not separate those lines.
I've modified tests in TestInputFormat to do additional beating on readLine.  One of the tests validates the \r\r sequence.  Buffer size varies and  at least a couple of times we should have a read in between of the two \r's.

I've fixed all other nits that you commented on.

bq. It's not obvious that it should work that way at all, but backwards compatibility is a big deal for this class.
It probably should, as it turns out \r is a newline on older macs and commodore, I think.


> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Status: Open  (was: Patch Available)

forgot to change DEFAULT_BUFFER_SIZE to private

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-4226:
----------------------------------

    Status: Open  (was: Patch Available)

bq. OK. BTW, are you planning to switch LRR to use util.LineReader?
HADOOP-4269 fixed this; trunk runs the unit tests from TextInputFormat over o.a.h.util.LineReader, as intended. Sorry, the current patch no longer applies to trunk.

bq. Only as an optimization; for this little piece of code I think it's OK. It is (or going to be) heavily used, as you say.
*nod* Fair enough. I think it makes the code slightly harder to read for a nominal performance gain that the JIT compiler should cover, but it's not significant.

bq. Sorry, fixed. It was not obvious from looking at the code.
It's not obvious that it should work that way at all, but backwards compatibility is a big deal for this class.

* Unless there are new tests in TestLineReader, that class or the redundant tests in TestTextInputFormat should be removed. A test case verifying that streams ending in \r behave like those ending in \n would be a good addition to this patch. Sorry, "this" was ambiguous in my last comment.
* The default buffer size doesn't need to be public (resolved in trunk)
* CR and LF don't need to be constants; they should be private at least
* Instead of
{noformat}
+        if (bufferLength <= bufferPosn)
+          break; // EOF
{noformat}
It might make more sense to check {{bufferLength <= 0}}, to which bufferPosn is set
* Instead of incrementing:
{noformat}
+        if (prevCharCR) { //CR + notLF, we are at notLF
+          ++newlineLength;
{noformat}
Shouldn't this always set newLineLength = 1?
* If buffer ends in \r and the following segment starts with \r, it looks like this may not separate those lines.

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4226) LineRecordReader::readLine cleanup

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632907#action_12632907 ] 

Hadoop QA commented on HADOOP-4226:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12390539/LineReader.java.patch
  against trunk revision 697306.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3338/console

This message is automatically generated.

> LineRecordReader::readLine cleanup
> ----------------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Priority: Minor
>         Attachments: LineReader.java.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4226) LineRecordReader::readLine cleanup

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632815#action_12632815 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-4226:
------------------------------------------------

Forgot to attach a file?

> LineRecordReader::readLine cleanup
> ----------------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Priority: Minor
>         Attachments: LineReader.java.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineRecordReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment:     (was: LineRecordReader.java.patch)

> LineRecordReader::readLine cleanup
> ----------------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Priority: Minor
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635415#action_12635415 ] 

Yuri Pradkin commented on HADOOP-4226:
--------------------------------------

bq. old patches are not deleted, as they orphan the discussion about them.
Sorry about that: started deleting inconsequential patches and got carried away.

bq. If this could include a short summary of the strategy employed and a few more comments, it would be more readily understandable the next time a change is proposed.
I'm not a java programmer, so I could use some advice here on where would such summary go: javadoc comments (/**...**/) or the policy is to keep them short and the longer comments should go somewhere else?

bq. CR and LF should be final, as in the previous patch
If people were to re-use this class with different newline separators, they might find it useful to be able to change CR and LF.  Maybe they should not be static and be public, so they can simply be assigned to?

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Status: Patch Available  (was: In Progress)

added comments, CR,LF are private final static.

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Status: Patch Available  (was: In Progress)

Thanks for your input.

bq.  TestTextInputFormat is testing LineRecordReader.LineReader, not o.a.h.util.LineReader, to which this patch applies.
OK.  BTW, are you planning to switch LRR to use util.LineReader?

bq. Why track str.getLength() with a local var?
Only as an optimization; for this little piece of code I think it's OK.  It is (or going to be) heavily used, as you say.

bq. This appears to regard \r as a valid character (newLineLength is reset for each char read, and only relevant when followed by \n), but the original does not.
Sorry, fixed.  It was not obvious from looking at the code.

bq. If there isn't a unit test validating this, writing it would be helpful.
Done.  Beating the hell out of it.


> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633479#action_12633479 ] 

Hadoop QA commented on HADOOP-4226:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12390667/HADOOP-4226.patch
  against trunk revision 697306.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    -1 core tests.  The patch failed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3345/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3345/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3345/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3345/console

This message is automatically generated.

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment:     (was: HADOOP-4226.patch)

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Work started: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HADOOP-4226 started by Yuri Pradkin.

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment: HADOOP-4226.patch

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633126#action_12633126 ] 

Yuri Pradkin commented on HADOOP-4226:
--------------------------------------

What do I need to do to make this patch available (for Hudson to test)?  I've marked it "in Progress", re-attached the patch.  Now I don't even have "Submit patch", the only available workflow action is "resolve issue".  This web interface is a little srewy.

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineRecordReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment: LineRecordReader.java.patch

> LineRecordReader::readLine cleanup
> ----------------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Priority: Minor
>         Attachments: LineRecordReader.java.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Status: Patch Available  (was: In Progress)

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Status: In Progress  (was: Patch Available)

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-4226:
----------------------------------

    Status: Open  (was: Patch Available)

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineRecordReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment: LineReader.java.patch

sorry, I did not realized that readLine moved to LineReader in util, deleted old patch, submitted new.

> LineRecordReader::readLine cleanup
> ----------------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Priority: Minor
>         Attachments: LineReader.java.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-4226:
----------------------------------

       Resolution: Fixed
    Fix Version/s: 0.20.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

+1 Looks good

I just committed this. Thanks, Yuri

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Comment: was deleted

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment: HADOOP-4226.patch

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634326#action_12634326 ] 

Chris Douglas commented on HADOOP-4226:
---------------------------------------

bq. I might make one more little change to the code to make it a little cleaner, but otherwise I think the patch is ready. I urge commiters to look at it and let me know what they think. I'd hate my time to be wasted, so please comment.
If you're going to change the code again, a review of the final version is the best use of everyone's time. Also: it's only been two days. This is a heavily used piece of code with a lot of corner cases, so this refactoring is risky, low-priority, and non-trivial to review. Please try to be patient.

bq. Tests: since this is not new code, I don't feel obligated to provide more tests. There already exist tests for TextInputFormat that are already checking this code; these tests pass.
TestTextInputFormat is testing LineRecordReader.LineReader, not o.a.h.util.LineReader, to which this patch applies.

* Per the [guidelines|http://wiki.apache.org/hadoop/HowToContribute], don't use tabs
* Why track str.getLength() with a local var?
* This shouldn't append stray \r characters to the end of the string. This should treat \r, \n, and \r\n as delimiters, as in the original. If there isn't a unit test validating this, writing it would be helpful.
* This appears to regard \r as a valid character (newLineLength is reset for each char read, and only relevant when followed by \n), but the original does not.

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Attachment: HADOOP-4226.patch

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Assignee: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch, HADOOP-4226.patch, HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineRecordReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Status: Patch Available  (was: Open)

> LineRecordReader::readLine cleanup
> ----------------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Priority: Minor
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4226) LineReader::readLine cleanup

Posted by "Yuri Pradkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuri Pradkin updated HADOOP-4226:
---------------------------------

    Summary: LineReader::readLine cleanup  (was: LineRecordReader::readLine cleanup)

> LineReader::readLine cleanup
> ----------------------------
>
>                 Key: HADOOP-4226
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4226
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Yuri Pradkin
>            Priority: Minor
>         Attachments: HADOOP-4226.patch
>
>
> I've been looking at HADOOP-4010 and realized that readLine is pretty convoluted.  I changed the implementation which made it hopefully a little easier to read/validate/understand.  
> I've had some problems testing it locally, so I'll submit it for Hudson to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.