Posted to fop-dev@xmlgraphics.apache.org by Andreas L Delmelle <a_...@pandora.be> on 2007/07/20 06:56:21 UTC

Caching CommonHyphenation (was: Re: The effect of the property cache ...)

On Jul 19, 2007, at 00:36, Andreas L Delmelle wrote:

>
> On Jul 18, 2007, at 23:18, Jeremias Maerki wrote:
> <snip />
>> - One of the easiest candidates for another flyweight is probably
>> CommonHyphenation (56K instances, 2.3MB in my example). The few member
>> variables could probably just be concatenated to a String (to be used
>> as the key).
>
> Interesting idea, will look into that asap.

FWIW:
Looked a bit closer at this, and it suddenly struck me that all the  
base Property types, apart from CharacterProperty which I overlooked  
as a possible candidate, were already cached:

StringProperty -> language, country, script
NumberProperty -> hyphenation-push/remain-character-count
EnumProperty -> hyphenate

CharacterProperty(*) -> hyphenation-character

(*) now also added, see http://svn.apache.org/viewvc?view=rev&rev=557814

This means we currently end up in the strange situation where  
different/separate CommonHyphenation instances are generated from  
identical sets of base Property instances.

Maybe the CommonHyphenation bundle could store references to the  
original properties themselves instead of duplicating their content/ 
value and storing them as primitives? By itself, this should be  
roughly the same in terms of overall memory consumption: replacement  
of some primitives with references.

In that case, one of the additional benefits of the individual  
Property caching is that you can now actually avoid calls to  
StringProperty.equals() in the rest of the code. "identity" means the  
same as "equality" here, so the fastest possible implementation for  
CommonHyphenation.equals() would then come to look like:

public final class CommonHyphenation {
...
public final StringProperty language;
public final StringProperty script;
public final StringProperty country;
public final EnumProperty hyphenate;
...
public boolean equals(Object obj) {
   if (obj == this) {
     return true;
   }
   if (obj instanceof CommonHyphenation) {
     CommonHyphenation ch = (CommonHyphenation) obj;
     return (ch.language == this.language
          && ch.script == this.script
          && ch.country == this.country
          && ch.hyphenate == this.hyphenate
          && ...);
   }
   return false;
}
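
To make the "identity means equality" point concrete, here is a minimal, self-contained sketch. CachedString and Bundle are hypothetical stand-ins (not the FOP classes) for a cached property type and for the bundle built from such properties; the only assumption is that every component is obtained through a canonicalizing getInstance().

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a cached property type such as StringProperty. */
final class CachedString {
    private static final Map<String, CachedString> CACHE = new HashMap<>();
    private final String value;

    private CachedString(String value) {
        this.value = value;
    }

    /** Returns the single canonical instance for the given value. */
    static synchronized CachedString getInstance(String value) {
        return CACHE.computeIfAbsent(value, CachedString::new);
    }

    String getString() {
        return value;
    }
}

/** Hypothetical bundle whose equals() relies on its parts being canonical. */
final class Bundle {
    final CachedString language;
    final CachedString country;

    Bundle(CachedString language, CachedString country) {
        this.language = language;
        this.country = country;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) {
            return true;
        }
        if (!(obj instanceof Bundle)) {
            return false;
        }
        Bundle other = (Bundle) obj;
        // Reference comparison suffices: equal values share one instance.
        return other.language == this.language && other.country == this.country;
    }

    @Override
    public int hashCode() {
        // Identity hashes stay consistent with the identity-based equals().
        return 31 * System.identityHashCode(language)
                + System.identityHashCode(country);
    }
}

class IdentityEqualsDemo {
    public static void main(String[] args) {
        // new String("en") is a distinct String object, yet maps to the same
        // canonical CachedString as the literal "en".
        Bundle a = new Bundle(CachedString.getInstance(new String("en")),
                              CachedString.getInstance("US"));
        Bundle b = new Bundle(CachedString.getInstance("en"),
                              CachedString.getInstance("US"));
        System.out.println(a.equals(b));
    }
}
```

The shortcut only holds as long as nobody can create a component outside the cache, which is why the constructor in the sketch is private.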

One thing that cannot be avoided is the multiple calls to  
PropertyList.get() to get to the properties that are needed to  
perform the check for a flyweight bundle. Maybe the initial  
assignments can be moved into the getInstance() method, so they  
become part of the static code. getInstance() would get a  
PropertyList as argument, while the private constructor signature is  
altered to accept all the base properties as parameters.

The key in the Map could be a composite String, but could also again  
be the CommonHyphenation itself, if a decent hashCode()  
implementation is added.
The benefit of using the instance itself is that the key in a  
WeakHashMap is automatically released after the last object referring  
to it has been cleared. Using a key other than the instance itself  
would make WeakHashMap unusable, since the keys are in that case not  
referenced directly by any object. The key cannot be embedded in the  
instance itself, since that would prevent the entire entry from ever  
being released...
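
As an illustration of the WeakHashMap point, a minimal sketch of a canonicalizing cache keyed by the instance itself. WeakCanonicalizer is a hypothetical helper, not FOP code. One detail worth keeping in mind: java.util.WeakHashMap holds its values through ordinary strong references, so a value that refers back to its own key keeps the entry alive; the sketch therefore wraps the value in a WeakReference.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical helper: returns one shared instance per set of equal objects. */
final class WeakCanonicalizer<T> {
    // Key: the canonical instance (weakly held by the map).
    // Value: a weak reference to the same instance, so that the value
    // does not keep its own key strongly reachable.
    private final Map<T, WeakReference<T>> cache = new WeakHashMap<>();

    synchronized T canonicalize(T candidate) {
        WeakReference<T> ref = cache.get(candidate);
        T existing = (ref == null) ? null : ref.get();
        if (existing != null) {
            return existing;
        }
        cache.put(candidate, new WeakReference<>(candidate));
        return candidate;
    }
}

class WeakCanonicalizerDemo {
    public static void main(String[] args) {
        WeakCanonicalizer<String> strings = new WeakCanonicalizer<>();
        String first = strings.canonicalize(new String("en-US"));
        String second = strings.canonicalize(new String("en-US"));
        // Both lookups yield the instance that was registered first.
        System.out.println(first == second);
    }
}
```

The lookup depends on equals() and hashCode() of the cached type, so the cached class needs decent implementations of both. When an entry is actually dropped is up to the garbage collector, so the sketch makes no claim about timing.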

The properties themselves being immutable and final, I guess it does  
no harm to expose them as public members. Only a handful of places in  
TextLM and LineLM would need a slight adjustment to compensate for  
the lost getString() and getEnum() conversions. Maybe for  
convenience, if really needed, accessors could be added like:

public String language() {
   return language.getString();
}
...
public boolean hyphenate() {
   return (hyphenate.getEnum() == EN_TRUE);
}

Opinions?
For the interested parties: full CommonHyphenation below, following  
roughly the same principles as the Property caching.

Cheers

Andreas

--- Sample code ---
public final class CommonHyphenation {

     private static final Map cache =
         java.util.Collections.synchronizedMap(new java.util.WeakHashMap());

     private int hash = 0;

     /** The "language" property */
     private final StringProperty language;

     /** The "country" property */
     private final StringProperty country;

     /** The "script" property */
     private final StringProperty script;

     /** The "hyphenate" property */
     private final EnumProperty hyphenate;

     /** The "hyphenation-character" property */
     private final CharacterProperty hyphenationCharacter;

     /** The "hyphenation-push-character-count" property */
     private final NumberProperty hyphenationPushCharacterCount;

     /** The "hyphenation-remain-character-count" property*/
     private final NumberProperty hyphenationRemainCharacterCount;

     /**
      * Construct a CommonHyphenation object holding the given properties
      *
      */
     private CommonHyphenation(StringProperty language,
                               StringProperty country,
                               StringProperty script,
                               EnumProperty hyphenate,
                               CharacterProperty hyphenationCharacter,
                               NumberProperty hyphenationPushCharacterCount,
                               NumberProperty hyphenationRemainCharacterCount) {
         this.language = language;
         this.country = country;
         this.script = script;
         this.hyphenate = hyphenate;
         this.hyphenationCharacter = hyphenationCharacter;
         this.hyphenationPushCharacterCount = hyphenationPushCharacterCount;
         this.hyphenationRemainCharacterCount = hyphenationRemainCharacterCount;
     }

     /**
      * Gets the canonical <code>CommonHyphenation</code> instance corresponding
      * to the values of the related properties present on the given
      * <code>PropertyList</code>
      *
      * @param propertyList  the <code>PropertyList</code>
      */
     public static CommonHyphenation getInstance(PropertyList propertyList)
             throws PropertyException {
         StringProperty language =
             (StringProperty) propertyList.get(Constants.PR_LANGUAGE);
         StringProperty country =
             (StringProperty) propertyList.get(Constants.PR_COUNTRY);
         StringProperty script =
             (StringProperty) propertyList.get(Constants.PR_SCRIPT);
         EnumProperty hyphenate =
             (EnumProperty) propertyList.get(Constants.PR_HYPHENATE);
         CharacterProperty hyphenationCharacter =
             (CharacterProperty) propertyList.get(
                 Constants.PR_HYPHENATION_CHARACTER);
         NumberProperty hyphenationPushCharacterCount =
             (NumberProperty) propertyList.get(
                 Constants.PR_HYPHENATION_PUSH_CHARACTER_COUNT);
         NumberProperty hyphenationRemainCharacterCount =
             (NumberProperty) propertyList.get(
                 Constants.PR_HYPHENATION_REMAIN_CHARACTER_COUNT);

         CommonHyphenation instance = new CommonHyphenation(
                                 language,
                                 country,
                                 script,
                                 hyphenate,
                                 hyphenationCharacter,
                                 hyphenationPushCharacterCount,
                                 hyphenationRemainCharacterCount);

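         // WeakHashMap holds its values strongly, so the value is wrapped in
         // a WeakReference; otherwise it would keep its own key reachable.
         java.lang.ref.WeakReference ref =
             (java.lang.ref.WeakReference) cache.get(instance);
         CommonHyphenation cachedInstance =
             (ref == null) ? null : (CommonHyphenation) ref.get();
         if (cachedInstance == null) {
             cache.put(instance, new java.lang.ref.WeakReference(instance));
         } else {
             instance = cachedInstance;
         }
         return instance;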

     }

     /** @return the "language" property as a String */
     public String language() {
         return language.getString();
     }

     /** @return the "country" property as a String */
     public String country() {
         return country.getString();
     }

     /** @return the "script" property as a String */
     public String script() {
         return script.getString();
     }

     /** @return the "hyphenate" property as a boolean */
     public boolean hyphenate() {
         return (hyphenate.getEnum() == Constants.EN_TRUE);
     }

     /** @return the "hyphenation-character" property as a char */
     public char hyphenationCharacter() {
         return hyphenationCharacter.getCharacter();
     }

     /** @return the "hyphenation-push-character-count" property as an int */
     public int hyphenationPushCharacterCount() {
         return hyphenationPushCharacterCount.getNumber().intValue();
     }

     /** @return the "hyphenation-remain-character-count" property as an int */
     public int hyphenationRemainCharacterCount() {
         return hyphenationRemainCharacterCount.getNumber().intValue();
     }

     /** {@inheritDoc} */
     public boolean equals(Object obj) {
         if (obj == this) {
             return true;
         }
         if (obj instanceof CommonHyphenation) {
             CommonHyphenation ch = (CommonHyphenation) obj;
             return (ch.language == this.language
                     && ch.country == this.country
                     && ch.script == this.script
                     && ch.hyphenate == this.hyphenate
                     && ch.hyphenationCharacter == this.hyphenationCharacter
                     && ch.hyphenationPushCharacterCount
                         == this.hyphenationPushCharacterCount
                     && ch.hyphenationRemainCharacterCount
                         == this.hyphenationRemainCharacterCount);
         }
         return false;
     }

     /** {@inheritDoc} */
     public int hashCode() {
         if (hash == 0) {
             // Compute into a local; redeclaring "hash" here would shadow
             // the field, and the cached value would never be stored.
             int h = 7;
             h = 31 * h + (language == null ? 0 : language.hashCode());
             h = 31 * h + (script == null ? 0 : script.hashCode());
             h = 31 * h + (country == null ? 0 : country.hashCode());
             h = 31 * h + (hyphenate == null ? 0 : hyphenate.hashCode());
             h = 31 * h + (hyphenationCharacter == null
                     ? 0 : hyphenationCharacter.hashCode());
             h = 31 * h + (hyphenationPushCharacterCount == null
                     ? 0 : hyphenationPushCharacterCount.hashCode());
             h = 31 * h + (hyphenationRemainCharacterCount == null
                     ? 0 : hyphenationRemainCharacterCount.hashCode());
             hash = h;
         }
         return hash;
     }

}
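
A note on the lazily cached hashCode(): the idiom is easy to get subtly wrong, because a local variable declared with the same name as the field shadows it, and the method then keeps returning the field's initial 0. A minimal sketch of the idiom on a hypothetical two-field class (LazyHashPair is made up for this example, not part of FOP):

```java
import java.util.Objects;

/** Hypothetical immutable pair with a lazily computed, cached hash code. */
final class LazyHashPair {
    private final String first;
    private final String second;
    private int hash; // 0 means "not computed yet"

    LazyHashPair(String first, String second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) {
            return true;
        }
        if (!(obj instanceof LazyHashPair)) {
            return false;
        }
        LazyHashPair other = (LazyHashPair) obj;
        return Objects.equals(first, other.first)
                && Objects.equals(second, other.second);
    }

    @Override
    public int hashCode() {
        int h = hash;            // read the field once
        if (h == 0) {
            h = 7;               // local has a different name than the field
            h = 31 * h + Objects.hashCode(first);
            h = 31 * h + Objects.hashCode(second);
            hash = h;            // store the result so later calls are cheap
        }
        return h;
    }
}

class LazyHashDemo {
    public static void main(String[] args) {
        LazyHashPair a = new LazyHashPair("en", "US");
        LazyHashPair b = new LazyHashPair("en", "US");
        System.out.println(a.hashCode() == b.hashCode());
    }
}
```

Since all fields are final, recomputing the hash in a race between two threads is harmless: both arrive at the same value.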


Re: Caching CommonHyphenation (was: Re: The effect of the property cache ...)

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Jul 20, 2007, at 09:19, Jeremias Maerki wrote:

>> <snip />
>> This means we currently end up in the strange situation where
>> different/separate CommonHyphenation instances are generated from
>> identical sets of base Property instances.
>
> Raises the question for me if for properties without dynamic context-based
> evaluation the property evaluation could be streamlined to directly
> return the primitive values instead of simple container objects like
> NumberProperty. throw new NotEnoughTimeRightNowException();

The 'evaluation' here is precisely triggered by the calls to  
PropertyList.get().
Before those calls in the CommonHyphenation constructor, the base  
properties might not even exist yet (if they were not specified on  
the FO that is bound to the CommonHyphenation).

Trading the NumberProperty for an int... No idea if that's feasible  
without a thorough revision. The entire property resolution mechanism  
currently depends on the generic Property return type. That design  
would obviously have to be abandoned for this handful of cases...

<snip />
>> public boolean hyphenate() {
>>    return (hyphenate.getEnum() == EN_TRUE);
>
> Well, I'd prefer Bean-style getters, i.e. getLanguage(),
> isHyphenationEnabled()

No problem. That was only by means of an example.

>
>>
>> Opinions?
>> For the interested parties: full CommonHyphenation below, following
>> roughly the same principles as the Property caching.
>
> "hash" should probably be transient here because it's a cached value.

Checked this out, and transient seems to be only applicable in a  
serialization context. If the object is never serialized, adding that  
keyword would seem to have zero effect... unless I'm missing something.
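
For reference, a small self-contained sketch of what transient does and does not do. HashHolder is a hypothetical class invented for this example; the round trip uses the standard java.io object streams. Within a running JVM the field behaves like any other; only after a serialization round trip does it come back as 0.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical serializable class with a cached, transient hash. */
final class HashHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String name;
    private transient int hash; // skipped by Java serialization

    HashHolder(String name) {
        this.name = name;
    }

    /** Exposes the raw cached value, for demonstration only. */
    int cachedHash() {
        return hash;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof HashHolder && ((HashHolder) obj).name.equals(name);
    }

    @Override
    public int hashCode() {
        if (hash == 0) {
            hash = name.hashCode();
        }
        return hash;
    }
}

class TransientDemo {
    /** Serializes the object to a byte array and reads it back. */
    static HashHolder roundTrip(HashHolder original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (HashHolder) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HashHolder original = new HashHolder("en");
        original.hashCode();                      // fills the cache
        HashHolder copy = roundTrip(original);
        System.out.println(original.cachedHash() != 0); // still cached in memory
        System.out.println(copy.cachedHash() == 0);     // not carried over
    }
}
```

So for an object that is never serialized, the keyword documents intent but changes nothing at run time.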



Cheers

Andreas

Re: Caching CommonHyphenation (was: Re: The effect of the property cache ...)

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 20.07.2007 06:56:21 Andreas L Delmelle wrote:
> On Jul 19, 2007, at 00:36, Andreas L Delmelle wrote:
> 
> >
> > On Jul 18, 2007, at 23:18, Jeremias Maerki wrote:
> > <snip />
> >> - One of the easiest candidates for another flyweight is probably
> >> CommonHyphenation (56K instances, 2.3MB in my example). The few member
> >> variables could probably just be concatenated to a String (to be used
> >> as the key).
> >
> > Interesting idea, will look into that asap.
> 
> FWIW:
> Looked a bit closer at this, and it suddenly struck me that all the  
> base Property types, apart from CharacterProperty which I overlooked  
> as a possible candidate, were already cached:
> 
> StringProperty -> language, country, script
> NumberProperty -> hyphenation-push/remain-character-count
> EnumProperty -> hyphenate
> 
> CharacterProperty(*) -> hyphenation-character
> 
> (*) now also added, see http://svn.apache.org/viewvc?view=rev&rev=557814
> 
> This means we currently end up in the strange situation where  
> different/separate CommonHyphenation instances are generated from  
> identical sets of base Property instances.

Raises the question for me if for properties without dynamic context-based
evaluation the property evaluation could be streamlined to directly
return the primitive values instead of simple container objects like
NumberProperty. throw new NotEnoughTimeRightNowException();

> Maybe the CommonHyphenation bundle could store references to the  
> original properties themselves instead of duplicating their content/ 
> value and storing them as primitives? By itself, this should be  
> roughly the same in terms of overall memory consumption: replacement  
> of some primitives with references.
> 
> In that case, one of the additional benefits of the individual  
> Property caching is that you can now actually avoid calls to  
> StringProperty.equals() in the rest of the code. "identity" means the  
> same as "equality" here, so the fastest possible implementation for  
> CommonHyphenation.equals() would then come to look like:
> 
> public final class CommonHyphenation {
> ...
> public final StringProperty language;
> public final StringProperty script;
> public final StringProperty country;
> public final EnumProperty hyphenate;
> ...
> public boolean equals(Object obj) {
>    if (obj == this) {
>      return true;
>    }
>    if (obj instanceof CommonHyphenation) {
>      CommonHyphenation ch = (CommonHyphenation) obj;
>      return (ch.language == this.language
>           && ch.script == this.script
>           && ch.country == this.country
>           && ch.hyphenate == this.hyphenate
>           && ...)
>    }
>    return false;
> }
> 
> One thing that cannot be avoided is the multiple calls to  
> PropertyList.get() to get to the properties that are needed to  
> perform the check for a flyweight bundle. Maybe the initial  
> assignments can be moved into the getInstance() method, so they  
> become part of the static code. getInstance() would get a  
> PropertyList as argument, while the private constructor signature is  
> altered to accept all the base properties as parameters.
> 
> The key in the Map could be a composite String, but could also again  
> be the CommonHyphenation itself, if a decent hashCode()  
> implementation is added.
> The benefit of using the instance itself is that the key in a  
> WeakHashMap is automatically released after the last object referring  
> to it has been cleared. Using a key other than the instance itself  
> would make WeakHashMap unusable, since the keys are in that case not  
> referenced directly by any object. The key cannot be embedded in the  
> instance itself, since that would prevent the entire entry from ever  
> being released...
> 
> The properties themselves being immutable and final, I guess it does  
> no harm to expose them as public members. Only a handful of places in  
> TextLM and LineLM would need a slight adjustment to compensate for  
> the lost getString() and getEnum() conversions. Maybe for  
> convenience, if really needed, accessors could be added like:
> 
> public String language() {
>    return language.getString();
> }
> ...
> public boolean hyphenate() {
>    return (hyphenate.getEnum() == EN_TRUE);

Well, I'd prefer Bean-style getters, i.e. getLanguage(),
isHyphenationEnabled()

> 
> Opinions?
> For the interested parties: full CommonHyphenation below, following  
> roughly the same principles as the Property caching.

"hash" should probably be transient here because it's a cached value.

> Cheers
> 
> Andreas
> 
> --- Sample code ---
> <snip />



Jeremias Maerki