Posted to mapreduce-issues@hadoop.apache.org by "Gelesh (JIRA)" <ji...@apache.org> on 2012/08/03 17:58:03 UTC
[jira] [Created] (MAPREDUCE-4512) TextInputFormat delimiter bug:-
Input Text portion ends with & Delimiter starts with same char/char
sequence
Gelesh created MAPREDUCE-4512:
---------------------------------
Summary: TextInputFormat delimiter bug:- Input Text portion ends with & Delimiter starts with same char/char sequence
Key: MAPREDUCE-4512
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4512
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/mumak, mr-am, mrv1, mrv2, task
Affects Versions: 2.0.0-alpha
Environment: Lynux
Reporter: Gelesh
Fix For: 0.20.204.0
TextInputFormat delimiter bug scenario: the input text contains a character sequence whose first character matches the first character of the delimiter, and whose remaining characters match the entire delimiter from its starting position.
e.g. delimiter = "record";
and Text = record 1:- name = "Gelesh" e mail = gelesh.hadoop@gmail.com Location Bangalore record 2: name = sdf .. location =Bangalorrecord 3: name ....
Here the string "=Bangalorrecord 3: " satisfies two conditions:
1) It contains the delimiter "record".
2) The character / character sequence immediately before the delimiter (i.e. 'r') matches the first character (or character sequence) of the delimiter (i.e. "=Bangalor" ends with, and the delimiter starts with, the same character / character sequence 'r').
Here the delimiter is skipped.
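The scenario above can be reproduced with a simplified single-pass matcher. This is only a sketch of the kind of logic that would skip the delimiter, not Hadoop's actual LineReader code; the class name NaiveDelimiterScan and the method splitNaive are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class NaiveDelimiterScan {

    // Hypothetical simplified matcher (not the Hadoop source): on a mismatch
    // it resets the match counter but never re-checks the current character
    // against the start of the delimiter.
    static List<String> splitNaive(String text, String delimiter) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int matched = 0; // number of delimiter characters matched so far
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            current.append(c);
            if (c == delimiter.charAt(matched)) {
                matched++;
                if (matched == delimiter.length()) {
                    // Full delimiter seen: drop it from the record and emit.
                    current.setLength(current.length() - delimiter.length());
                    records.add(current.toString());
                    current.setLength(0);
                    matched = 0;
                }
            } else {
                // The flaw: the character that broke the partial match may
                // itself start the delimiter, but it is not reconsidered.
                matched = 0;
            }
        }
        records.add(current.toString());
        return records;
    }

    public static void main(String[] args) {
        // "Bangalor" ends with 'r', and the delimiter "record" starts with 'r'.
        List<String> skipped = splitNaive("location =Bangalorrecord 3: name", "record");
        System.out.println(skipped.size()); // prints 1: the delimiter was skipped

        // Control case: no clash before the delimiter, so it is found.
        List<String> found = splitNaive("location =Bangalore record 3: name", "record");
        System.out.println(found.size()); // prints 2
    }
}
```

The trailing 'r' of "Bangalor" starts a partial match, the next 'r' breaks it, and because that second 'r' is not re-examined as a possible start of "record", the rest of the delimiter ("ecord") never matches.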
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4512) TextInputFormat delimiter
bug:- Input Text portion ends with & Delimiter starts with same char/char
sequence
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428214#comment-13428214 ]
Hadoop QA commented on MAPREDUCE-4512:
--------------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12539059/MAPREDUCE-4512.txt
against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2706//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2706//console
This message is automatically generated.
> TextInputFormat delimiter bug:- Input Text portion ends with & Delimiter starts with same char/char sequence
> -------------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4512
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: contrib/mumak, mr-am, mrv1, mrv2, task
> Affects Versions: 2.0.0-alpha
> Environment: Lynux
> Reporter: Gelesh
> Labels: patch
> Fix For: 0.20.204.0
>
> Attachments: MAPREDUCE-4512.txt
>
> Original Estimate: 1m
> Remaining Estimate: 1m
[jira] [Commented] (MAPREDUCE-4512) TextInputFormat delimiter
bug:- Input Text portion ends with & Delimiter starts with same char/char
sequence
Posted by "Bhallamudi Venkata Siva Kamesh (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429142#comment-13429142 ]
Bhallamudi Venkata Siva Kamesh commented on MAPREDUCE-4512:
-----------------------------------------------------------
Please update the patch with a Testcase.
[jira] [Commented] (MAPREDUCE-4512) TextInputFormat delimiter
bug:- Input Text portion ends with & Delimiter starts with same char/char
sequence
Posted by "Sonu Prathap (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428988#comment-13428988 ]
Sonu Prathap commented on MAPREDUCE-4512:
-----------------------------------------
I am also facing a similar issue. Please help me re-create the fixed code using the patch.
[jira] [Updated] (MAPREDUCE-4512) TextInputFormat delimiter bug:-
Input Text portion ends with & Delimiter starts with same char/char
sequence
Posted by "Gelesh (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gelesh updated MAPREDUCE-4512:
------------------------------
Status: Patch Available (was: Open)
A one-line code change in LineReader should be enough. Tested.
If there are any issues, please let me know so that I can help further:
gelesh.hadoop@gmail.com
[jira] [Updated] (MAPREDUCE-4512) TextInputFormat delimiter bug:-
Input Text portion ends with & Delimiter starts with same char/char
sequence
Posted by "Gelesh (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gelesh updated MAPREDUCE-4512:
------------------------------
Description:
TextInputFormat delimiter bug scenario: the input text contains a character sequence whose first character matches the first character of the delimiter, and whose remaining characters match the entire delimiter from its starting position.
e.g. delimiter = "record";
and Text =" record 1:- name = Gelesh e mail = gelesh.hadoop@gmail.com Location Bangalore record 2: name = sdf .. location =Bangalorrecord 3: name .... "
Here the string "=Bangalorrecord 3: " satisfies two conditions:
1) It contains the delimiter "record".
2) The character / character sequence immediately before the delimiter (i.e. 'r') matches the first character (or character sequence) of the delimiter (i.e. "=Bangalor" ends with, and the delimiter starts with, the same character / character sequence 'r').
Here the program does not detect the delimiter, so the map receives an improper value text that still contains the delimiter.
Environment: Linux (was: Lynux)
Affects Version/s: 0.20.204.0, 0.21.0, 1.0.3
Test case:
Input file text:
record 1 name: Java Location:UAErecord 2 name:Gelesh Location:Bangalorrecord 3 name Hadoop Location:Kerala
Delimiter = "record"
Expected values in map:
1 name: Java Location:UAE
2 name:Gelesh Location:Bangalor
3 name Hadoop Location:Kerala
Actual values received in map:
1 name: Java Location:UAE
2 name:Gelesh Location:Bangalorrecord 3 name Hadoop Location:Kerala
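The expected values in this test case can be checked with a small standalone scanner. This is a sketch of one possible correction (rewinding to just after the start of a failed partial match), and it is not claimed to be the change in the attached patch; the class name FixedDelimiterScan and the method splitRecords are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class FixedDelimiterScan {

    // Hypothetical corrected matcher: when a partial match fails, scanning
    // resumes one character after the position where that partial match
    // began, so an overlapping delimiter start is not lost.
    static List<String> splitRecords(String text, String delimiter) {
        List<String> records = new ArrayList<>();
        int recordStart = 0; // index where the current record begins
        int matched = 0;     // number of delimiter characters matched so far
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delimiter.charAt(matched)) {
                matched++;
                if (matched == delimiter.length()) {
                    // Emit everything before the delimiter as one record.
                    records.add(text.substring(recordStart, i + 1 - delimiter.length()));
                    recordStart = i + 1;
                    matched = 0;
                }
            } else if (matched > 0) {
                // Rewind to the start of the failed partial match; the loop's
                // i++ then moves to the character right after it.
                i -= matched;
                matched = 0;
            }
        }
        records.add(text.substring(recordStart));
        return records;
    }

    public static void main(String[] args) {
        String input = "record 1 name: Java Location:UAE"
                + "record 2 name:Gelesh Location:Bangalor"
                + "record 3 name Hadoop Location:Kerala";
        for (String record : splitRecords(input, "record")) {
            // The input starts with the delimiter, so the first record is
            // empty; it is skipped here for readability.
            if (!record.isEmpty()) {
                System.out.println(record.trim());
            }
        }
    }
}
```

Each rewind resumes strictly after the position where the failed match started, so the scan always makes progress, at the cost of re-reading a few characters. A real reader working on buffered input would also have to handle a delimiter that spans two buffer fills, which this sketch ignores.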
[jira] [Updated] (MAPREDUCE-4512) TextInputFormat delimiter bug:-
Input Text portion ends with & Delimiter starts with same char/char
sequence
Posted by "Gelesh (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gelesh updated MAPREDUCE-4512:
------------------------------
Attachment: MAPREDUCE-4512.txt
Just a one-line code change at LineRecord. Tested. In case there is any issue, please mail me at gelesh.hadoop@gmail.com