You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tomcat.apache.org by re...@apache.org on 2004/10/19 01:16:36 UTC

cvs commit: jakarta-tomcat-connectors/util/java/org/apache/tomcat/util/buf StringCache.java CharChunk.java ByteChunk.java

remm        2004/10/18 16:16:36

  Modified:    util/java/org/apache/tomcat/util/buf CharChunk.java
                        ByteChunk.java
  Added:       util/java/org/apache/tomcat/util/buf StringCache.java
  Log:
  - Refactor toString for the buffers, and add a cache.
  - The cache works in two phases:
    - First phase is heavily synchronized, and keeps statistics on String usage
    - When first phase is done (after a number of calls to toString), a cache array is generated; this might be a rather expensive operation
    - During the second phase, unsynchronized lookups in the static cache are done to try to avoid expensive toString conversions
  - If the cache is registered in JMX (later ...), an operation exists to get back to the beginning of the first phase. This could be useful
    after installing new applications on the fly, which could have different Strign requirements.
  - I think it works really well for ByteChunk -> String, since this is a quite expensive operation (note: some of these conversions could
    be optimized by assuming US-ASCII encoding, which I'll do for the session ID cookie value since it's so commonly used - and the String
    is not cacheable, obviously - but doing the trick everywhere would lead to major problems). For CharChunk, it's less evident, as
    it is a matter of allocating a String, a char array and then using an arraycopy to move over the chars.
  - This is configured using system properties, for example in the catalina.properties file. Byte and char can be enabled separately.
  
  Revision  Changes    Path
  1.16      +6 -0      jakarta-tomcat-connectors/util/java/org/apache/tomcat/util/buf/CharChunk.java
  
  Index: CharChunk.java
  ===================================================================
  RCS file: /home/cvs/jakarta-tomcat-connectors/util/java/org/apache/tomcat/util/buf/CharChunk.java,v
  retrieving revision 1.15
  retrieving revision 1.16
  diff -u -r1.15 -r1.16
  --- CharChunk.java	29 Aug 2004 17:14:41 -0000	1.15
  +++ CharChunk.java	18 Oct 2004 23:16:36 -0000	1.16
  @@ -489,7 +489,13 @@
       // -------------------- Conversion and getters --------------------
   
       public String toString() {
  +        return StringCache.toString(this);
  +    }
  +    
  +    public String toStringInternal() {
   	if( buff==null ) return null;
  +    //System.out.println("CC toString: " + new String( buff, start, end-start));
  +    //Thread.dumpStack();
   	return new String( buff, start, end-start);
       }
   
  
  
  
  1.22      +36 -22    jakarta-tomcat-connectors/util/java/org/apache/tomcat/util/buf/ByteChunk.java
  
  Index: ByteChunk.java
  ===================================================================
  RCS file: /home/cvs/jakarta-tomcat-connectors/util/java/org/apache/tomcat/util/buf/ByteChunk.java,v
  retrieving revision 1.21
  retrieving revision 1.22
  diff -u -r1.21 -r1.22
  --- ByteChunk.java	29 Aug 2004 17:14:41 -0000	1.21
  +++ ByteChunk.java	18 Oct 2004 23:16:36 -0000	1.22
  @@ -166,6 +166,11 @@
       public void setEncoding( String enc ) {
   	this.enc=enc;
       }
  +    public String getEncoding() {
  +        if (enc == null)
  +            enc=DEFAULT_CHARACTER_ENCODING;
  +        return enc;
  +    }
   
       /**
        * Returns the message bytes.
  @@ -448,28 +453,37 @@
       // -------------------- Conversion and getters --------------------
   
       public String toString() {
  -	if (null == buff) {
  -	    return null;
  -	}
  -	String strValue=null;
  -	try {
  -	    if( enc==null ) enc=DEFAULT_CHARACTER_ENCODING;
  -	    return new String( buff, start, end-start, enc );
  -	    /*
  -	      Does not improve the speed too much on most systems,
  -	      it's safer to use the "clasical" new String().
  -
  -	      Most overhead is in creating char[] and copying,
  -	      the internal implementation of new String() is very close to
  -	      what we do. The decoder is nice for large buffers and if
  -	      we don't go to String ( so we can take advantage of reduced GC)
  -
  -	      // Method is commented out, in:
  -	      return B2CConverter.decodeString( enc );
  -	    */
  -	} catch (java.io.IOException e) {
  -	    return null;  // XXX 
  -	}
  +        if (null == buff) {
  +            return null;
  +        } else if (end-start == 0) {
  +            return "";
  +        }
  +        return StringCache.toString(this);
  +    }
  +    
  +    public String toStringInternal() {
  +        String strValue=null;
  +        try {
  +            if( enc==null ) enc=DEFAULT_CHARACTER_ENCODING;
  +            strValue = new String( buff, start, end-start, enc );
  +            /*
  +             Does not improve the speed too much on most systems,
  +             it's safer to use the "clasical" new String().
  +             
  +             Most overhead is in creating char[] and copying,
  +             the internal implementation of new String() is very close to
  +             what we do. The decoder is nice for large buffers and if
  +             we don't go to String ( so we can take advantage of reduced GC)
  +             
  +             // Method is commented out, in:
  +              return B2CConverter.decodeString( enc );
  +              */
  +        } catch (java.io.IOException e) {
  +            // FIXME 
  +        }
  +        //System.out.println("BC toString: " + strValue);
  +        //Thread.dumpStack();
  +        return strValue;
       }
   
       public int getInt()
  
  
  
  1.1                  jakarta-tomcat-connectors/util/java/org/apache/tomcat/util/buf/StringCache.java
  
  Index: StringCache.java
  ===================================================================
  /*
   *  Copyright 1999-2004 The Apache Software Foundation
   *
   *  Licensed under the Apache License, Version 2.0 (the "License");
   *  you may not use this file except in compliance with the License.
   *  You may obtain a copy of the License at
   *
   *      http://www.apache.org/licenses/LICENSE-2.0
   *
   *  Unless required by applicable law or agreed to in writing, software
   *  distributed under the License is distributed on an "AS IS" BASIS,
   *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   *  See the License for the specific language governing permissions and
   *  limitations under the License.
   */
  
  package org.apache.tomcat.util.buf;
  
  import java.util.ArrayList;
  import java.util.HashMap;
  import java.util.Iterator;
  import java.util.TreeMap;
  
  /**
   * This class implements a String cache for ByteChunk and CharChunk.
   *
   * @author Remy Maucherat
   */
  public class StringCache {
  
  
      private static org.apache.commons.logging.Log log=
          org.apache.commons.logging.LogFactory.getLog( StringCache.class );
      
      
      // ------------------------------------------------------- Static Variables
  
      
      /**
       * Enabled ?
       */
      protected static boolean byteEnabled = 
          ("true".equals(System.getProperty("tomcat.util.buf.StringCache.byte.enabled", "false")));
  
      
      protected static boolean charEnabled = 
          ("true".equals(System.getProperty("tomcat.util.buf.StringCache.char.enabled", "false")));
  
      
      protected static int trainThreshold = 
          Integer.parseInt(System.getProperty("tomcat.util.buf.StringCache.trainThreshold", "20000"));
      
  
      protected static int cacheSize = 
          Integer.parseInt(System.getProperty("tomcat.util.buf.StringCache.cacheSize", "1000"));
      
  
      /**
       * Statistics hash map for byte chunk.
       */
      protected static HashMap bcStats = new HashMap(cacheSize);
  
      
      /**
       * toString count for byte chunk.
       */
      protected static int bcCount = 0;
      
      
      /**
       * Cache for byte chunk.
       */
      protected static ByteEntry[] bcCache = null;
      
  
      /**
       * Statistics hash map for char chunk.
       */
      protected static HashMap ccStats = new HashMap(cacheSize);
  
  
      /**
       * toString count for char chunk.
       */
      protected static int ccCount = 0; 
      
  
      /**
       * Cache for char chunk.
       */
      protected static CharEntry[] ccCache = null;
  
      
      /**
       * Access count.
       */
      protected static int accessCount = 0;
      
  
      /**
       * Hit count.
       */
      protected static int hitCount = 0;
      
  
      // ------------------------------------------------------------ Properties
  
      
      /**
       * @return Returns the cacheSize.
       */
      public static int getCacheSize() {
          return cacheSize;
      }
      
      
      /**
       * @param cacheSize The cacheSize to set.
       */
      public static void setCacheSize(int cacheSize) {
          StringCache.cacheSize = cacheSize;
      }
  
      
      /**
       * @return Returns the enabled.
       */
      public static boolean getByteEnabled() {
          return byteEnabled;
      }
      
      
      /**
       * @param enabled The enabled to set.
       */
      public static void setByteEnabled(boolean byteEnabled) {
          StringCache.byteEnabled = byteEnabled;
      }
      
      
      /**
       * @return Returns the enabled.
       */
      public static boolean getCharEnabled() {
          return charEnabled;
      }
      
      
      /**
       * @param enabled The enabled to set.
       */
      public static void setCharEnabled(boolean charEnabled) {
          StringCache.charEnabled = charEnabled;
      }
      
      
      /**
       * @return Returns the trainThreshold.
       */
      public static int getTrainThreshold() {
          return trainThreshold;
      }
      
      
      /**
       * @param trainThreshold The trainThreshold to set.
       */
      public static void setTrainThreshold(int trainThreshold) {
          StringCache.trainThreshold = trainThreshold;
      }
  
      
      /**
       * @return Returns the accessCount.
       */
      public static int getAccessCount() {
          return accessCount;
      }
      
      
      /**
       * @return Returns the hitCount.
       */
      public static int getHitCount() {
          return hitCount;
      }
  
      
      // -------------------------------------------------- Public Static Methods
  
      
      public static void reset() {
          hitCount = 0;
          accessCount = 0;
          synchronized (bcStats) {
              bcCache = null;
              bcCount = 0;
          }
          synchronized (ccStats) {
              ccCache = null;
              ccCount = 0;
          }
      }
      
      
      public static String toString(ByteChunk bc) {
  
          // If the cache is null, then either caching is disabled, or we're
          // still training
          if (bcCache == null) {
              String value = bc.toStringInternal();
              if (byteEnabled) {
                  // If training, everything is synced
                  synchronized (bcStats) {
                      // If the cache has been generated on a previous invocation
                      // while waiting fot the lock, just return the toString value
                      // we just calculated
                      if (bcCache != null) {
                          return value;
                      }
                      // Two cases: either we just exceeded the train count, in which
                      // case the cache must be created, or we just update the count for
                      // the string
                      if (bcCount > trainThreshold) {
                          long t1 = System.currentTimeMillis();
                          // Sort the entries according to occurrence
                          TreeMap tempMap = new TreeMap();
                          Iterator entries = bcStats.keySet().iterator();
                          while (entries.hasNext()) {
                              ByteEntry entry = (ByteEntry) entries.next();
                              int[] countA = (int[]) bcStats.get(entry);
                              Integer count = new Integer(countA[0]);
                              // Add to the list for that count
                              ArrayList list = (ArrayList) tempMap.get(count);
                              if (list == null) {
                                  // Create list
                                  list = new ArrayList();
                                  tempMap.put(count, list);
                              }
                              list.add(entry);
                          }
                          // Allocate array of the right size
                          int size = bcStats.size();
                          if (size > cacheSize) {
                              size = cacheSize;
                          }
                          bcCache = new ByteEntry[size];
                          // Fill it up using an alphabetical order
                          // and a dumb insert sort
                          ByteChunk tempChunk = new ByteChunk();
                          int n = 0;
                          while (n < size) {
                              Object key = tempMap.lastKey();
                              ArrayList list = (ArrayList) tempMap.get(key);
                              ByteEntry[] list2 = 
                                  (ByteEntry[]) list.toArray(new ByteEntry[list.size()]);
                              for (int i = 0; i < list.size() && n < size; i++) {
                                  ByteEntry entry = (ByteEntry) list.get(i);
                                  tempChunk.setBytes(entry.name, 0, entry.name.length);
                                  int insertPos = findClosest(tempChunk, n);
                                  if (insertPos == n) {
                                      bcCache[n + 1] = entry;
                                  } else {
                                      System.arraycopy(bcCache, insertPos + 1, bcCache, 
                                              insertPos + 2, n - insertPos - 1);
                                      bcCache[insertPos + 1] = entry;
                                  }
                                  n++;
                              }
                              tempMap.remove(key);
                          }
                          bcCount = 0;
                          bcStats.clear();
                          if (log.isDebugEnabled()) {
                              long t2 = System.currentTimeMillis();
                              log.debug("ByteCache generation time: " + (t2 - t1) + "ms");
                          }
                      } else {
                          bcCount++;
                          // Allocate new ByteEntry for the lookup
                          ByteEntry entry = new ByteEntry();
                          entry.value = value;
                          int[] count = (int[]) bcStats.get(entry);
                          if (count == null) {
                              int end = bc.getEnd();
                              int start = bc.getStart();
                              // Create byte array and copy bytes
                              entry.name = new byte[bc.getLength()];
                              System.arraycopy(bc.getBuffer(), start, entry.name, 0, end - start);
                              // Set encoding
                              entry.enc = bc.getEncoding();
                              // Initialize occurrence count to one 
                              count = new int[1];
                              count[0] = 1;
                              // Set in the stats hash map
                              bcStats.put(entry, count);
                          } else {
                              count[0] = count[0] + 1;
                          }
                      }
                  }
              }
              return value;
          } else {
              accessCount++;
              // Find the corresponding String
              String result = find(bc);
              if (result == null) {
                  return bc.toStringInternal();
              }
              // Note: We don't care about safety for the stats
              hitCount++;
              return result;
          }
          
      }
  
  
      public static String toString(CharChunk cc) {
          
          // If the cache is null, then either caching is disabled, or we're
          // still training
          if (ccCache == null) {
              String value = cc.toStringInternal();
              if (charEnabled) {
                  // If training, everything is synced
                  synchronized (ccStats) {
                      // If the cache has been generated on a previous invocation
                      // while waiting fot the lock, just return the toString value
                      // we just calculated
                      if (ccCache != null) {
                          return value;
                      }
                      // Two cases: either we just exceeded the train count, in which
                      // case the cache must be created, or we just update the count for
                      // the string
                      if (ccCount > trainThreshold) {
                          long t1 = System.currentTimeMillis();
                          // Sort the entries according to occurrence
                          TreeMap tempMap = new TreeMap();
                          Iterator entries = ccStats.keySet().iterator();
                          while (entries.hasNext()) {
                              CharEntry entry = (CharEntry) entries.next();
                              int[] countA = (int[]) ccStats.get(entry);
                              Integer count = new Integer(countA[0]);
                              // Add to the list for that count
                              ArrayList list = (ArrayList) tempMap.get(count);
                              if (list == null) {
                                  // Create list
                                  list = new ArrayList();
                                  tempMap.put(count, list);
                              }
                              list.add(entry);
                          }
                          // Allocate array of the right size
                          int size = ccStats.size();
                          if (size > cacheSize) {
                              size = cacheSize;
                          }
                          ccCache = new CharEntry[size];
                          // Fill it up using an alphabetical order
                          // and a dumb insert sort
                          CharChunk tempChunk = new CharChunk();
                          int n = 0;
                          while (n < size) {
                              Object key = tempMap.lastKey();
                              ArrayList list = (ArrayList) tempMap.get(key);
                              CharEntry[] list2 = 
                                  (CharEntry[]) list.toArray(new CharEntry[list.size()]);
                              for (int i = 0; i < list.size() && n < size; i++) {
                                  CharEntry entry = (CharEntry) list.get(i);
                                  tempChunk.setChars(entry.name, 0, entry.name.length);
                                  int insertPos = findClosest(tempChunk, n);
                                  if (insertPos == n) {
                                      ccCache[n + 1] = entry;
                                  } else {
                                      System.arraycopy(ccCache, insertPos + 1, ccCache, 
                                              insertPos + 2, n - insertPos - 1);
                                      ccCache[insertPos + 1] = entry;
                                  }
                                  n++;
                              }
                              tempMap.remove(key);
                          }
                          ccCount = 0;
                          ccStats.clear();
                          if (log.isDebugEnabled()) {
                              long t2 = System.currentTimeMillis();
                              log.debug("CharCache generation time: " + (t2 - t1) + "ms");
                          }
                      } else {
                          ccCount++;
                          // Allocate new CharEntry for the lookup
                          CharEntry entry = new CharEntry();
                          entry.value = value;
                          int[] count = (int[]) ccStats.get(entry);
                          if (count == null) {
                              int end = cc.getEnd();
                              int start = cc.getStart();
                              // Create char array and copy chars
                              entry.name = new char[cc.getLength()];
                              System.arraycopy(cc.getBuffer(), start, entry.name, 0, end - start);
                              // Initialize occurrence count to one 
                              count = new int[1];
                              count[0] = 1;
                              // Set in the stats hash map
                              ccStats.put(entry, count);
                          } else {
                              count[0] = count[0] + 1;
                          }
                      }
                  }
              }
              return value;
          } else {
              accessCount++;
              // Find the corresponding String
              String result = find(cc);
              if (result == null) {
                  return cc.toStringInternal();
              }
              // Note: We don't care about safety for the stats
              hitCount++;
              return result;
          }
          
      }
      
      
      // ----------------------------------------------------- Protected Methods
  
  
      /**
       * Compare given byte chunk with byte array.
       * Return -1, 0 or +1 if inferior, equal, or superior to the String.
       */
      protected static final int compare(ByteChunk name, byte[] compareTo) {
          int result = 0;
  
          byte[] b = name.getBuffer();
          int start = name.getStart();
          int end = name.getEnd();
          int len = compareTo.length;
  
          if ((end - start) < len) {
              len = end - start;
          }
          for (int i = 0; (i < len) && (result == 0); i++) {
              if (b[i + start] > compareTo[i]) {
                  result = 1;
              } else if (b[i + start] < compareTo[i]) {
                  result = -1;
              }
          }
          if (result == 0) {
              if (compareTo.length > (end - start)) {
                  result = -1;
              } else if (compareTo.length < (end - start)) {
                  result = 1;
              }
          }
          return result;
      }
  
      
      /**
       * Find an entry given its name in the cache and return the associated String.
       */
      protected static final String find(ByteChunk name) {
          int pos = findClosest(name, bcCache.length);
          if ((pos < 0) || (compare(name, bcCache[pos].name) != 0)
                  || !(name.getEncoding().equals(bcCache[pos].enc))) {
              return null;
          } else {
              return bcCache[pos].value;
          }
      }
  
      
      /**
       * Find an entry given its name in a sorted array of map elements.
       * This will return the index for the closest inferior or equal item in the
       * given array.
       */
      protected static final int findClosest(ByteChunk name, int len) {
  
          int a = 0;
          int b = len - 1;
  
          // Special cases: -1 and 0
          if (b == -1) {
              return -1;
          }
          
          if (compare(name, bcCache[0].name) < 0) {
              return -1;
          }         
          if (b == 0) {
              return 0;
          }
  
          int i = 0;
          while (true) {
              i = (b + a) / 2;
              int result = compare(name, bcCache[i].name);
              if (result == 1) {
                  a = i;
              } else if (result == 0) {
                  return i;
              } else {
                  b = i;
              }
              if ((b - a) == 1) {
                  int result2 = compare(name, bcCache[b].name);
                  if (result2 < 0) {
                      return a;
                  } else {
                      return b;
                  }
              }
          }
  
      }
  
  
      /**
       * Compare given char chunk with char array.
       * Return -1, 0 or +1 if inferior, equal, or superior to the String.
       */
      protected static final int compare(CharChunk name, char[] compareTo) {
          int result = 0;
  
          char[] c = name.getBuffer();
          int start = name.getStart();
          int end = name.getEnd();
          int len = compareTo.length;
  
          if ((end - start) < len) {
              len = end - start;
          }
          for (int i = 0; (i < len) && (result == 0); i++) {
              if (c[i + start] > compareTo[i]) {
                  result = 1;
              } else if (c[i + start] < compareTo[i]) {
                  result = -1;
              }
          }
          if (result == 0) {
              if (compareTo.length > (end - start)) {
                  result = -1;
              } else if (compareTo.length < (end - start)) {
                  result = 1;
              }
          }
          return result;
      }
  
      
      /**
       * Find an entry given its name in the cache and return the associated String.
       */
      protected static final String find(CharChunk name) {
          int pos = findClosest(name, ccCache.length);
          if ((pos < 0) || (compare(name, ccCache[pos].name) != 0)) {
              return null;
          } else {
              return ccCache[pos].value;
          }
      }
  
      
      /**
       * Find an entry given its name in a sorted array of map elements.
       * This will return the index for the closest inferior or equal item in the
       * given array.
       */
      protected static final int findClosest(CharChunk name, int len) {
  
          int a = 0;
          int b = len - 1;
  
          // Special cases: -1 and 0
          if (b == -1) {
              return -1;
          }
          
          if (compare(name, ccCache[0].name) < 0 ) {
              return -1;
          }         
          if (b == 0) {
              return 0;
          }
  
          int i = 0;
          while (true) {
              i = (b + a) / 2;
              int result = compare(name, ccCache[i].name);
              if (result == 1) {
                  a = i;
              } else if (result == 0) {
                  return i;
              } else {
                  b = i;
              }
              if ((b - a) == 1) {
                  int result2 = compare(name, ccCache[b].name);
                  if (result2 < 0) {
                      return a;
                  } else {
                      return b;
                  }
              }
          }
  
      }
  
  
      // -------------------------------------------------- ByteEntry Inner Class
  
  
      protected static class ByteEntry {
  
          public byte[] name = null;
          public String enc = null;
          public String value = null;
  
          public String toString() {
              return value;
          }
          public int hashCode() {
              return value.hashCode();
          }
          public boolean equals(Object obj) {
              if (obj instanceof ByteEntry) {
                  return value.equals(((ByteEntry) obj).value);
              }
              return false;
          }
          
      }
  
  
      // -------------------------------------------------- CharEntry Inner Class
  
  
      protected static class CharEntry {
  
          public char[] name = null;
          public String value = null;
  
          public String toString() {
              return value;
          }
          public int hashCode() {
              return value.hashCode();
          }
          public boolean equals(Object obj) {
              if (obj instanceof CharEntry) {
                  return value.equals(((CharEntry) obj).value);
              }
              return false;
          }
          
      }
  
  
  }
  
  
  

---------------------------------------------------------------------
To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: tomcat-dev-help@jakarta.apache.org


Re: cvs commit: jakarta-tomcat-connectors/util/java/org/apache/tomcat/util/buf StringCache.java CharChunk.java ByteChunk.java

Posted by Remy Maucherat <re...@apache.org>.
remm@apache.org wrote:

>remm        2004/10/18 16:16:36
>
>  Modified:    util/java/org/apache/tomcat/util/buf CharChunk.java
>                        ByteChunk.java
>  Added:       util/java/org/apache/tomcat/util/buf StringCache.java
>  Log:
>  - Refactor toString for the buffers, and add a cache.
>  - The cache works in two phases:
>    - First phase is heavily synchronized, and keeps statistics on String usage
>    - When first phase is done (after a number of calls to toString), a cache array is generated; this might be a rather expensive operation
>    - During the second phase, unsynchronized lookups in the static cache are done to try to avoid expensive toString conversions
>  - If the cache is registered in JMX (later ...), an operation exists to get back to the beginning of the first phase. This could be useful
>    after installing new applications on the fly, which could have different Strign requirements.
>  - I think it works really well for ByteChunk -> String, since this is a quite expensive operation (note: some of these conversions could
>    be optimized by assuming US-ASCII encoding, which I'll do for the session ID cookie value since it's so commonly used - and the String
>    is not cacheable, obviously - but doing the trick everywhere would lead to major problems). For CharChunk, it's less evident, as
>    it is a matter of allocating a String, a char array and then using an arraycopy to move over the chars.
>  - This is configured using system properties, for example in the catalina.properties file. Byte and char can be enabled separately.  
>
More optimizations:
- Optimize getting session id from the session id cookies: there might 
be more that one cookie, and all use straight byte -> String conversion 
(although they are conversion friendly US-ASCII encoded); this will also 
avoid keeping track of too much stuff in the byte cache
- Enable the cache for ByteChunk by default (so that it gets tested, and 
optimizing away some of these expensive conversions could be quite useful)

Rémy


---------------------------------------------------------------------
To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: tomcat-dev-help@jakarta.apache.org