
Posted to common-issues@hadoop.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2012/08/06 17:09:02 UTC

[jira] [Moved] (HADOOP-8654) TextInputFormat delimiter bug:- Input Text portion ends with & Delimiter starts with same char/char sequence

     [ https://issues.apache.org/jira/browse/HADOOP-8654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe moved MAPREDUCE-4512 to HADOOP-8654:
-----------------------------------------------

          Component/s:     (was: mrv1)
                           (was: mr-am)
                           (was: mrv2)
                           (was: contrib/mumak)
                           (was: task)
                       util
        Fix Version/s:     (was: 0.20.204.0)
     Target Version/s: 2.2.0-alpha  (was: 2.0.0-alpha)
    Affects Version/s:     (was: 2.0.0-alpha)
                           (was: 1.0.3)
                           (was: 0.20.204.0)
                           (was: 0.21.0)
                       0.20.204.0
                       1.0.3
                       0.21.0
                       2.0.0-alpha
                  Key: HADOOP-8654  (was: MAPREDUCE-4512)
              Project: Hadoop Common  (was: Hadoop Map/Reduce)
    
> TextInputFormat delimiter  bug:- Input Text portion ends with & Delimiter starts with same char/char sequence
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8654
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8654
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 2.0.0-alpha, 0.21.0, 1.0.3, 0.20.204.0
>         Environment: Linux
>            Reporter: Gelesh
>              Labels: patch
>         Attachments: MAPREDUCE-4512.txt
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> TextInputFormat misses the delimiter when the input text immediately before it ends with the same character (or character sequence) that the delimiter starts with.
> e.g. delimiter = "record";
> and Text = " record 1:- name = Gelesh e mail = gelesh.hadoop@gmail.com Location Bangalore record 2: name = sdf  ..  location =Bangalorrecord 3: name .... "
> Here the string "=Bangalorrecord 3: " satisfies two conditions:
> 1) It contains the delimiter "record".
> 2) The character (or character sequence) immediately before the delimiter, here 'r', matches the first character (or character sequence) of the delimiter; that is, "=Bangalor" ends with the same 'r' that the delimiter starts with.
> In this case the program does not detect the delimiter, so the value passed to the map contains text that should have been split and still includes the delimiter.
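
The scenario described above can be reproduced outside Hadoop with a small sketch. This is not the TextInputFormat or LineReader source; the class and method names below are hypothetical, and the assumed cause (a matcher that resets its progress on a mismatch without re-testing the current character) is an illustration of how such a miss can happen, not a confirmed diagnosis. The second method shows one simple way to avoid the miss by rewinding to just after the start of the failed attempt.

```java
// Hypothetical sketch, not Hadoop code. Both methods assume a non-empty delimiter.
final class DelimiterScanSketch {

    // Assumed flawed matcher: on a mismatch it only resets the match counter,
    // so a character that could begin a new match is never re-tested.
    static int naiveIndexOf(String text, String delim) {
        int matched = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delim.charAt(matched)) {
                matched++;
                if (matched == delim.length()) {
                    return i - delim.length() + 1; // start of the delimiter
                }
            } else {
                matched = 0; // flaw: text.charAt(i) may itself start the delimiter
            }
        }
        return -1;
    }

    // Same scan, but a failed partial match rewinds to the character right
    // after the position where that attempt began.
    static int rewindingIndexOf(String text, String delim) {
        int matched = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delim.charAt(matched)) {
                matched++;
                if (matched == delim.length()) {
                    return i - delim.length() + 1;
                }
            } else if (matched > 0) {
                i -= matched; // the loop's i++ then lands one past the attempt's start
                matched = 0;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "location =Bangalorrecord 3: name";
        String delim = "record";
        System.out.println("naive:     " + naiveIndexOf(text, delim));
        System.out.println("rewinding: " + rewindingIndexOf(text, delim));
        System.out.println("indexOf:   " + text.indexOf(delim));
    }
}
```

On the reporter's fragment, the naive scan consumes the final 'r' of "=Bangalor" as the start of a match, fails on the next 'r', and never reconsiders it, so it reports no delimiter; the rewinding scan agrees with String.indexOf.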

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira