Posted to common-dev@hadoop.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2008/03/18 10:03:24 UTC
[jira] Created: (HADOOP-3040) Streaming should assume an empty key
if the first character on a line is a field separator (by default, tab)
Streaming should assume an empty key if the first character on a line is a field separator (by default, tab)
------------------------------------------------------------------------------------------------------------
Key: HADOOP-3040
URL: https://issues.apache.org/jira/browse/HADOOP-3040
Project: Hadoop Core
Issue Type: Bug
Components: contrib/streaming
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
Fix For: 0.17.0
Streaming should assume an empty key if the first character on a line is a field separator (by default, tab), and should take the rest of the line, excluding the field separator, as the value.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-3040) Streaming should assume an empty key
if the first character on a line is the separator
(stream.map.output.field.separator, by default, tab)
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated HADOOP-3040:
--------------------------------------------
Attachment: patch-3040.txt
This patch fixes the bug. The cause is UTF8ByteArrayUtils.findNthByte, which starts scanning the line from position 1 instead of position 0, so a separator in the first position is missed.
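The off-by-one described above can be illustrated with a minimal, self-contained Java sketch. It is not the code from patch-3040.txt or from Hadoop's UTF8ByteArrayUtils; the class FindByteSketch and both method names are hypothetical, and the two methods differ only in the index at which the scan starts.

```java
import java.nio.charset.StandardCharsets;

public class FindByteSketch {
    // Hypothetical buggy variant: the scan starts at index 1, so a
    // separator in the very first byte is never seen.
    static int findByteFromOne(byte[] line, int length, byte b) {
        for (int i = 1; i < length; i++) {
            if (line[i] == b) {
                return i;
            }
        }
        return -1;
    }

    // Hypothetical fixed variant: the scan starts at index 0.
    static int findByteFromZero(byte[] line, int length, byte b) {
        for (int i = 0; i < length; i++) {
            if (line[i] == b) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // A line whose only tab is its first byte.
        byte[] line = "\tvalue only".getBytes(StandardCharsets.UTF_8);
        System.out.println(findByteFromOne(line, line.length, (byte) '\t'));  // prints -1
        System.out.println(findByteFromZero(line, line.length, (byte) '\t')); // prints 0
    }
}
```

With the scan starting at 1, the leading tab is not found, so a caller would treat the line as having no separator at all.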
[jira] Updated: (HADOOP-3040) Streaming should assume an empty key
if the first character on a line is the separator
(key.value.separator.in.input.line, by default, tab)
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated HADOOP-3040:
--------------------------------------------
Description: Streaming should assume an empty key if the first character on a line is the separator (key.value.separator.in.input.line, by default, tab), and should take the rest of the line, excluding the separator, as the value. (was: Streaming should assume an empty key if the first character on a line is a field separator (by default, tab), and should take the rest of the line, excluding the field separator, as the value.)
Summary: Streaming should assume an empty key if the first character on a line is the separator (key.value.separator.in.input.line, by default, tab) (was: Streaming should assume an empty key if the first character on a line is a field separator (by default, tab))
[jira] Updated: (HADOOP-3040) Streaming should assume an empty key
if the first character on a line is the separator
(stream.map.output.field.separator, by default, tab)
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated HADOOP-3040:
--------------------------------------------
Attachment: patch-3040.txt
Added a test case whose input has three lines:
1. The first line has key\tvalue.
2. The second line starts with a tab, i.e. the key is empty and the rest of the line is the value.
3. The third line has no tab, so the whole line is the key.
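The three cases above can be sketched in plain Java as follows. This only illustrates the expected splitting; it is not the test from the patch or the Streaming implementation. The class SplitSketch and the method splitKeyValue are hypothetical names, the sample lines are made up, and representing a missing value as an empty string is an assumption of this sketch.

```java
public class SplitSketch {
    // Splits a line at the first separator and returns {key, value}.
    // If no separator is present, the whole line is the key and the value
    // is represented here as an empty string (an assumption of this sketch).
    static String[] splitKeyValue(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, pos), line.substring(pos + 1) };
    }

    public static void main(String[] args) {
        String[] lines = { "key\tvalue", "\tonly a value", "only a key" };
        for (String line : lines) {
            String[] kv = splitKeyValue(line, '\t');
            System.out.println("key=[" + kv[0] + "] value=[" + kv[1] + "]");
        }
    }
}
```

Because indexOf searches from position 0, a leading tab yields an empty key rather than being skipped.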
[jira] Commented: (HADOOP-3040) Streaming should assume an empty
key if the first character on a line is the separator
(stream.map.output.field.separator, by default, tab)
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580316#action_12580316 ]
Amareshwari Sriramadasu commented on HADOOP-3040:
-------------------------------------------------
For information: the InputFormat handling of keys and values is fine. The problem was in the MROutputThread of PipeMapred, i.e. at the collect step in PipeMapred.
[jira] Commented: (HADOOP-3040) Streaming should assume an empty
key if the first character on a line is the separator
(stream.map.output.field.separator, by default, tab)
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580525#action_12580525 ]
Hadoop QA commented on HADOOP-3040:
-----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12378199/patch-3040.txt
against trunk revision 619744.
@author +1. The patch does not contain any @author tags.
tests included -1. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new javac compiler warnings.
release audit +1. The applied patch does not generate any new release audit warnings.
findbugs +1. The patch does not introduce any new Findbugs warnings.
core tests +1. The patch passed core unit tests.
contrib tests +1. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1999/console
This message is automatically generated.
[jira] Updated: (HADOOP-3040) Streaming should assume an empty key
if the first character on a line is the separator
(stream.map.output.field.separator, by default, tab)
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj Das updated HADOOP-3040:
--------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
I just committed this. Thanks, Amareshwari!
[jira] Updated: (HADOOP-3040) Streaming should assume an empty key
if the first character on a line is the separator
(stream.map.output.field.separator, by default, tab)
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated HADOOP-3040:
--------------------------------------------
Description: Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab), and should take the rest of the line, excluding the separator, as the value. (was: Streaming should assume an empty key if the first character on a line is the separator (key.value.separator.in.input.line, by default, tab), and should take the rest of the line, excluding the separator, as the value.)
Summary: Streaming should assume an empty key if the first character on a line is the separator (stream.map.output.field.separator, by default, tab) (was: Streaming should assume an empty key if the first character on a line is the separator (key.value.separator.in.input.line, by default, tab))
[jira] Updated: (HADOOP-3040) Streaming should assume an empty key
if the first character on a line is the separator
(stream.map.output.field.separator, by default, tab)
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated HADOOP-3040:
--------------------------------------------
Status: Open (was: Patch Available)
Canceling the patch in order to add a test case.
[jira] Updated: (HADOOP-3040) Streaming should assume an empty key
if the first character on a line is the separator
(stream.map.output.field.separator, by default, tab)
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated HADOOP-3040:
--------------------------------------------
Status: Patch Available (was: Open)
[jira] Updated: (HADOOP-3040) Streaming should assume an empty key
if the first character on a line is the separator
(stream.map.output.field.separator, by default, tab)
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated HADOOP-3040:
--------------------------------------------
Status: Patch Available (was: Open)
[jira] Updated: (HADOOP-3040) Streaming should assume an empty key
if the first character on a line is the separator
(stream.map.output.field.separator, by default, tab)
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj Das updated HADOOP-3040:
--------------------------------
Release Note: If the first character on a line is the separator, an empty key is assumed and the whole line, excluding the separator, is the value (due to a bug, this was previously not the case).
[jira] Commented: (HADOOP-3040) Streaming should assume an empty
key if the first character on a line is the separator
(stream.map.output.field.separator, by default, tab)
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581503#action_12581503 ]
Hadoop QA commented on HADOOP-3040:
-----------------------------------
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12378475/patch-3040.txt
against trunk revision 619744.
@author +1. The patch does not contain any @author tags.
tests included +1. The patch appears to include 3 new or modified tests.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new javac compiler warnings.
release audit +1. The applied patch does not generate any new release audit warnings.
findbugs +1. The patch does not introduce any new Findbugs warnings.
core tests +1. The patch passed core unit tests.
contrib tests +1. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2033/console
[jira] Commented: (HADOOP-3040) Streaming should assume an empty
key if the first character on a line is the separator
(stream.map.output.field.separator, by default, tab)
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581523#action_12581523 ]
Hudson commented on HADOOP-3040:
--------------------------------
Integrated in Hadoop-trunk #440 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/440/])