Posted to common-dev@hadoop.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2008/03/18 10:03:24 UTC

[jira] Created: (HADOOP-3040) Streaming should assume an empty key if the first character on a line is a field separator (by default, tab)

Streaming should assume an empty key if the first character on a line is a field separator (by default, tab)
------------------------------------------------------------------------------------------------------------

                 Key: HADOOP-3040
                 URL: https://issues.apache.org/jira/browse/HADOOP-3040
             Project: Hadoop Core
          Issue Type: Bug
          Components: contrib/streaming
            Reporter: Amareshwari Sriramadasu
            Assignee: Amareshwari Sriramadasu
             Fix For: 0.17.0


Streaming should assume an empty key if the first character on a line is a field separator (by default, tab), and should take the rest of the line, excluding the field separator, as the value.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3040) Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab)

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3040:
--------------------------------------------

    Attachment: patch-3040.txt

The attached patch fixes the bug, which is caused by UTF8ByteArrayUtils.findNthByte starting its search of the line at position 1 instead of position 0.
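
The off-by-one can be illustrated with a small Python sketch. This is not the Hadoop source: the helper names, the signature, and the `first` parameter are hypothetical and exist only to contrast a search that begins at byte 1 with one that begins at byte 0.

```python
TAB = ord("\t")


def find_byte(buf: bytes, start: int, end: int, sep: int) -> int:
    """Return the index of the first `sep` byte in buf[start:end], or -1."""
    for i in range(start, end):
        if buf[i] == sep:
            return i
    return -1


def find_nth_byte(buf: bytes, n: int, sep: int = TAB, first: int = 0) -> int:
    """Return the index of the n-th `sep` byte in buf, or -1 if there are fewer than n.

    `first` (hypothetical) is where the search begins: 0 models the corrected
    behaviour, 1 models the off-by-one that skips the first byte of the line.
    """
    pos = first - 1
    for _ in range(n):
        pos = find_byte(buf, pos + 1, len(buf), sep)
        if pos < 0:
            return -1
    return pos


line = b"\tvalue"
# Starting at byte 1 never inspects the leading tab, so no separator is found.
print(find_nth_byte(line, 1, first=1))
# Starting at byte 0 finds the leading tab, which is what yields an empty key.
print(find_nth_byte(line, 1, first=0))
```

When the separator is reported as missing, the caller has no position at which to split the line, so a line that begins with a tab cannot produce an empty key.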



[jira] Updated: (HADOOP-3040) Streaming should assume an empty key if the first character on a line is the separator (key.value.separator.in.input.line, by default, tab)

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3040:
--------------------------------------------

    Description: Streaming should assume an empty key if the first character on a line is the separator (key.value.separator.in.input.line, by default, tab), and should take the rest of the line, excluding the separator, as the value.  (was: Streaming should assume an empty key if the first character on a line is a field separator (by default, tab), and should take the rest of the line, excluding the field separator, as the value.)
        Summary: Streaming should assume an empty key if the first character on a line is the separator (key.value.separator.in.input.line, by default, tab)  (was: Streaming should assume an empty key if the first character on a line is a field separator (by default, tab))



[jira] Updated: (HADOOP-3040) Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab)

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3040:
--------------------------------------------

    Attachment: patch-3040.txt

Added a test case whose input has three lines:
1. The first line has the form key\tvalue.
2. The second line starts with a tab, i.e. the key is empty and the rest of the line is the value.
3. The third line doesn't contain a tab, so the whole line is the key.
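
The three cases can be sketched in Python as follows. This is an illustration only, not the test from the patch: `split_key_value` is a hypothetical helper, the sample lines are made up, and treating the value as empty when no separator is present is an assumption.

```python
from typing import Tuple


def split_key_value(line: bytes, sep: bytes = b"\t") -> Tuple[bytes, bytes]:
    """Split a line at the first separator into (key, value).

    Assumption: when the line contains no separator, the whole line is the
    key and the value is empty.
    """
    pos = line.find(sep)  # the search starts at byte 0, so a leading separator is seen
    if pos < 0:
        return line, b""
    return line[:pos], line[pos + len(sep):]


# Hypothetical sample input mirroring the three lines described above.
lines = [
    b"key\tvalue",          # 1. ordinary key and value
    b"\tvalue only",        # 2. leading tab: empty key, rest of line is the value
    b"no separator here",   # 3. no tab: whole line is the key
]

for line in lines:
    print(split_key_value(line))
```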



[jira] Commented: (HADOOP-3040) Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab)

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580316#action_12580316 ] 

Amareshwari Sriramadasu commented on HADOOP-3040:
-------------------------------------------------

For information: the InputFormat handling of keys and values is fine. The problem was in the MROutputThread of PipeMapRed, i.e. at the collect in PipeMapRed.



[jira] Commented: (HADOOP-3040) Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab)

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580525#action_12580525 ] 

Hadoop QA commented on HADOOP-3040:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12378199/patch-3040.txt
against trunk revision 619744.

    @author +1.  The patch does not contain any @author tags.

    tests included -1.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/console

This message is automatically generated.



[jira] Updated: (HADOOP-3040) Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab)

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-3040:
--------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Amareshwari!



[jira] Updated: (HADOOP-3040) Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab)

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3040:
--------------------------------------------

    Description: Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab), and should take the rest of the line, excluding the separator, as the value.  (was: Streaming should assume an empty key if the first character on a line is the separator (key.value.separator.in.input.line, by default, tab), and should take the rest of the line, excluding the separator, as the value.)
        Summary: Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab)  (was: Streaming should assume an empty key if the first character on a line is the separator (key.value.separator.in.input.line, by default, tab))



[jira] Updated: (HADOOP-3040) Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab)

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3040:
--------------------------------------------

    Status: Open  (was: Patch Available)

Canceling the patch in order to add a test case.



[jira] Updated: (HADOOP-3040) Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab)

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3040:
--------------------------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (HADOOP-3040) Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab)

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3040:
--------------------------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (HADOOP-3040) Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab)

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-3040:
--------------------------------

    Release Note: If the first character on a line is the separator, an empty key is assumed and the whole line is the value (due to a bug, this was not previously the case).



[jira] Commented: (HADOOP-3040) Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab)

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581503#action_12581503 ] 

Hadoop QA commented on HADOOP-3040:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12378475/patch-3040.txt
against trunk revision 619744.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 3 new or modified tests.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/console

This message is automatically generated.



[jira] Commented: (HADOOP-3040) Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab)

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581523#action_12581523 ] 

Hudson commented on HADOOP-3040:
--------------------------------

Integrated in Hadoop-trunk #440 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/440/])
