Posted to mapreduce-issues@hadoop.apache.org by "Xiaoming Shi (JIRA)" <ji...@apache.org> on 2011/03/13 04:16:27 UTC

[jira] Created: (MAPREDUCE-2380) Multiple replace function call can be replaced with a single for loop to improve performance

Multiple replace function call can be replaced with a single for loop to improve performance 
---------------------------------------------------------------------------------------------

                 Key: MAPREDUCE-2380
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2380
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: tools/rumen
    Affects Versions: 0.21.0
            Reporter: Xiaoming Shi


{noformat}
./hadoop-0.21.0/mapred/src/tools/org/apache/hadoop/tools/rumen/LoggedTaskAttempt.java  line:362
./chukwa-0.4.0/src/java/org/apache/hadoop/chukwa/datacollection/writer/localfs/LocalWriter.java   line:249
{noformat}

Four consecutive replace() calls are made to remove the special characters. This is 3+ times slower than using a single for loop
to replace them all in one pass.

{noformat}
e.g.
 - str = str.replace('a', '#');
 - str = str.replace('b', '%');

 + // Single pass: map each character once instead of rescanning per replace().
 + StringBuilder sb = new StringBuilder(str.length());
 + for (int i = 0; i < str.length(); i++) {
 +     char c = str.charAt(i);
 +     if (c == 'a') {
 +         sb.append('#');
 +     } else if (c == 'b') {
 +         sb.append('%');
 +     } else {
 +         sb.append(c);
 +     }
 + }
 + str = sb.toString();
{noformat}
This is the same problem as the MySQL bug: http://bugs.mysql.com/bug.php?id=45699
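
For reference, the proposal can be sketched as a small self-contained class that puts the chained and single-pass variants side by side. This is only an illustration under stated assumptions: the class name SinglePassReplace, the method names, and the 'a'/'b' mappings are hypothetical and are not taken from LoggedTaskAttempt.java or LocalWriter.java.

```java
// Hypothetical sketch; names and character mappings are illustrative only.
final class SinglePassReplace {

    // Baseline: each replace() call scans the entire string again.
    static String chained(String s) {
        return s.replace('a', '#').replace('b', '%');
    }

    // Single pass: visit each character once and append its replacement.
    static String singlePass(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case 'a': sb.append('#'); break;
                case 'b': sb.append('%'); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "abracadabra";
        System.out.println(chained(input));    // prints "#%r#c#d#%r#"
        System.out.println(singlePass(input)); // prints "#%r#c#d#%r#"
    }
}
```

The two variants agree only when no replacement character is itself the target of a later replace() call. If an earlier replacement produced a character that a later call rewrites, the chained form would rewrite it again, while the single pass would not, so that should be checked before swapping one for the other.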

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira