You are viewing a plain text version of this content. The canonical link for it is here.
Posted to fop-dev@xmlgraphics.apache.org by Jeremias Maerki <de...@jeremias-maerki.ch> on 2007/07/18 22:17:44 UTC

The effect of the property cache (or: playing around with a new toy)

I read this week about SAP publishing a free Memory Analyzer. I first
tried it on a OutOfMemory heap dump of an commercial formatter which we
use in that project that consumed so much of my time lately. It was an
immediate eye opener.

So I wanted to play with this really great tool a little more and ran a
big document which I knew would cause an OutOfMemoryError at 64MB heap
size. Here's what came out:

Class Name                                                             |Objects  |Shallow Heap |Retained Heap |Perm 
---------------------------------------------------------------------------------------------------------------------
org.apache.fop.fo.properties.CondLengthProperty                        |  315'760|    7'578'240|     7'578'240|     
java.util.WeakHashMap$Entry                                            |  121'873|    4'874'920|     8'861'568|     
org.apache.fop.fo.flow.Block                                           |   26'391|    4'855'944|    54'554'584|     
org.apache.fop.fo.properties.SpaceProperty                             |   87'592|    4'204'416|     7'160'880|     
org.apache.fop.fo.properties.CommonBorderPaddingBackground             |   78'940|    3'789'120|    16'419'968|     
org.apache.fop.fo.properties.KeepProperty                              |  110'661|    3'541'152|     3'541'152|     
org.apache.fop.fo.FOText                                               |   30'461|    2'924'256|    52'488'224|     
org.apache.fop.fo.properties.CommonFont                                |   56'859|    2'729'232|     3'638'976|     
org.xml.sax.helpers.LocatorImpl                                        |  109'431|    2'626'344|     2'626'344|     
org.apache.fop.fo.flow.TableCell                                       |   23'154|    2'593'248|    54'296'024|     
org.apache.fop.fo.properties.CondLengthProperty[]                      |   78'940|    2'526'080|    10'102'592|     
org.apache.fop.fo.properties.CommonBorderPaddingBackground$BorderInfo[]|   78'940|    2'526'080|     2'526'080|     
org.apache.fop.fo.properties.NumberProperty                            |   98'304|    2'359'296|     3'932'184|     
org.apache.fop.fo.properties.CommonHyphenation                         |   56'852|    2'274'080|     2'274'080|     
char[]                                                                 |   63'420|    1'931'488|     1'931'488|     
org.apache.fop.fo.properties.LengthRangeProperty                       |   37'843|    1'513'720|     1'513'736|     
org.apache.fop.fo.flow.TableColumn                                     |   14'687|    1'409'952|     4'829'912|     
org.apache.fop.fo.properties.CommonMarginBlock                         |   30'592|    1'223'680|     4'160'448|     
org.apache.fop.fo.FONode[]                                             |   44'346|    1'064'304|    55'383'904|     
org.apache.fop.fo.properties.PercentLength                             |   26'402|    1'056'080|     1'689'728|     
java.lang.Double                                                       |   57'931|      926'896|       926'968|     
java.lang.String[]                                                     |   57'188|      920'096|       943'640|     
org.apache.fop.fo.properties.CommonRelativePosition                    |   26'391|      844'512|       844'512|     
java.lang.String                                                       |   33'367|      800'808|     1'794'048|     
java.lang.Integer                                                      |   45'526|      728'416|       729'032|     
2'221 more...                                                          |         |             |              |     
Total of 2'246 entries                                                 |1'818'755|   67'134'104|              |     
---------------------------------------------------------------------------------------------------------------------

Immediately shows that Andreas' recent change (rev 554091) for switching
off the SAX Locators can save a lot of memory.

I remembered Andreas working on a property cache a couple of weeks ago.
Seems to work fine on EnumProperty:

Look for class: .*EnumProperty.*
Class Name                                     |Objects |Shallow Heap |Retained Heap |Perm 
--------------------------------------------------------------------------------------------
org.apache.fop.fo.properties.EnumProperty$Maker|      94|        4'512|              |     
org.apache.fop.fo.properties.EnumProperty      |     182|        4'368|              |     
Total of 2 entries                             |     276|        8'880|              |     
--------------------------------------------------------------------------------------------

Then I noticed that NumberProperty (which uses the property cache)
has 98304 instances which couldn't be right. The problem: just using
that WeakHashMap isn't enough without the key objects implementing
hashCode() and equals() (see JDK javadocs). So I added an implementation
of the two methods to NumberProperty and here's what results:

After the change:
Look for class: .*NumberProperty.*
Class Name                                                |Objects |Shallow Heap |Retained Heap |Perm 
-------------------------------------------------------------------------------------------------------
org.apache.fop.fo.properties.NumberProperty$Maker         |      34|        1'632|              |     
org.apache.fop.fo.properties.NumberProperty               |      10|          240|              |     
org.apache.fop.fo.flow.TableFObj$ColumnNumberPropertyMaker|       1|           48|              |     
Total of 3 entries                                        |      45|        1'920|              |     
-------------------------------------------------------------------------------------------------------

Saves 2.3MB of memory for the instances of just one class!!!! What I
didn't look into is the penalty on performance, though. I can imagine
that could be quite a bit.

Here's the diff for the change:

Index: C:/Dev/FOP/main/xml-fop-patching/src/java/org/apache/fop/fo/properties/NumberProperty.java
===================================================================
--- C:/Dev/FOP/main/xml-fop-patching/src/java/org/apache/fop/fo/properties/NumberProperty.java	(revision 557343)
+++ C:/Dev/FOP/main/xml-fop-patching/src/java/org/apache/fop/fo/properties/NumberProperty.java	(working copy)
@@ -217,4 +217,33 @@
         return Color.black;
     }
 
+    /** {@inheritDoc} */
+    public int hashCode() {
+        //Multiply by 100 to get better buckets for Double instances
+        return (int)(number.doubleValue() * 100);
+    }
+
+    /** {@inheritDoc} */
+    public boolean equals(Object obj) {
+        if (this == obj) {
+            return true;
+        }
+        if (obj == null) {
+            return false;
+        }
+        if (getClass() != obj.getClass()) {
+            return false;
+        }
+        final NumberProperty other = (NumberProperty)obj;
+        if (number == null) {
+            if (other.number != null) {
+                return false;
+            }
+        } else if (!number.equals(other.number)) {
+            return false;
+        }
+        return true;
+    }
+
+    
 }

Food for thought...

Jeremias Maerki


Re: The effect of the property cache (or: playing around with a new toy)

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Jul 30, 2007, at 04:26, Manuel Mall wrote:

> On Thursday 19 July 2007 07:28, Andreas L Delmelle wrote:
>> On Jul 19, 2007, at 01:10, Andreas L Delmelle wrote:
>>
> <snip />
>>
>> BTW: the following is perfectly legitimate
>>
>> Object o = new Object();
>>
>> Just wondering if anyone can think of a use-case...
>>
>
> If you want a class internal object to synchronize on.

:-)
Yep, also occurred to me in the meantime. Thanks for the reply anyway.

Cheers

Andreas


Re: The effect of the property cache (or: playing around with a new toy)

Posted by Manuel Mall <ma...@apache.org>.
On Thursday 19 July 2007 07:28, Andreas L Delmelle wrote:
> On Jul 19, 2007, at 01:10, Andreas L Delmelle wrote:
>
<snip />
>
> BTW: the following is perfectly legitimate
>
> Object o = new Object();
>
> Just wondering if anyone can think of a use-case...
>

If you want a class internal object to synchronize on.

>
> Cheers
>
> Andreas

Manuel

Re: The effect of the property cache (or: playing around with a new toy)

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Jul 19, 2007, at 01:10, Andreas L Delmelle wrote:

<snip />

> Hmm, made me wonder:
>
>   Number oneInt = new Integer(1);
>   Number oneDouble = new Double(1.0);
>   boolean check = (oneInt.hashCode() != oneDouble.hashCode());
>
> => (check == true)
>
> Can this be relied upon?

BTW: the following is perfectly legitimate

Object o = new Object();

Just wondering if anyone can think of a use-case...


Cheers

Andreas


Re: The effect of the property cache (or: playing around with a new toy)

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Jul 19, 2007, at 01:10, Andreas L Delmelle wrote:

> <snip />
>   Number oneInt = new Integer(1);
>   Number oneDouble = new Double(1.0);
>   boolean check = (oneInt.hashCode() != oneDouble.hashCode());
>
> => (check == true)
>
> Can this be relied upon?

AFAICT, it can.
Integer.hashCode() will simply return the int value, while  
Double.hashCode() uses Double.doubleToLongBits(), which should always  
produce different results, even if the numeric value is the same.

The used implementations for equals(), I would presume to be safe: we  
can count on the fact that

Integer oneInt = new Integer(1);
check = (oneInt.equals(new Double(1)));

check will always be false here.

FTM, I have committed the change using the Number-subclass hashCode()  
implementations.


Cheers

Andreas

Re: The effect of the property cache (or: playing around with a new toy)

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Jul 18, 2007, at 23:31, Jeremias Maerki wrote:

<snip />
   [Me:]
>> All we really need to uniquely identify a NumberProperty is the
>> number-member, I think...
>
> I dont' think that works, since Number doesn't implement hashCode 
> (), for
> example.

Right, but its concrete subclasses in the java.lang package do.
Since the Number class itself is abstract, number will in fact have a  
runtime type of Integer or Double, so we would rely on the subclass  
implementation here... Hmm, made me wonder:

   Number oneInt = new Integer(1);
   Number oneDouble = new Double(1.0);
   boolean check = (oneInt.hashCode() != oneDouble.hashCode());

=> (check == true)

Can this be relied upon?

> BTW, I don't know if we should do anything about the "specVal" member
> variable in Property. I think it's mostly set to null.

Looking a bit closer, I think that member could even be dropped  
entirely. The only place where this value is accessed by means of  
getSpecifiedValue() is in LineHeightPropertyMaker. I'd be surprised  
if that cannot be avoided.


Cheers

Andreas

Re: The effect of the property cache (or: playing around with a new toy)

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 18.07.2007 23:05:57 Andreas L Delmelle wrote:
> On Jul 18, 2007, at 22:17, Jeremias Maerki wrote:
> 
> > So I wanted to play with this really great tool a little more and  
> > ran a
> > big document which I knew would cause an OutOfMemoryError at 64MB heap
> > size. Here's what came out:
> <snip />
> 
> Interesting figures!
> Did you have any chance to run a comparative test with say, FOP 0.93?

No, that need to wait until after my holidays.

One thing, I'm planning is to create a tool that downloads older FOP
revisions in a one-month rhythm to see how performance evolved over time.
I want to do this because I suspect we noticably lost performance
somewhere during the last 2 years. That could also involve taking memory
snapshots for each version to find out about memory development.

> Strange results for TableColumn, though, or are those 14000 instances  
> normal given the input (lots of smaller nested tables)?

No, just lots of small tables (an example from Luis Ferro in 2006).

> <snip />
> > Immediately shows that Andreas' recent change (rev 554091) for  
> > switching
> > off the SAX Locators can save a lot of memory.
> 
> Well, that was also part of Richard Wheeldon's patch (Bugzilla 41044)  
> containing the basic idea for the property cache. The patch itself  
> ultimately got committed in a few parts, and very much trimmed down,  
> but Richard deserves most of the credit. I only applied a razor to   
> his patch. :-)

Oops. Thanks, Richard!!!

> > I remembered Andreas working on a property cache a couple of weeks  
> > ago.
> > Seems to work fine on EnumProperty:
> 
> This should be already the case in 0.93. I seem to remember Simon  
> having applied Richard's initial patch shortly before the 0.93 release.
> 
> > Look for class: .*EnumProperty.*
> > Class Name                                     |Objects |Shallow  
> > Heap |Retained Heap |Perm
> > ---------------------------------------------------------------------- 
> > ----------------------
> > org.apache.fop.fo.properties.EnumProperty$Maker|      94|         
> > 4'512|              |
> > org.apache.fop.fo.properties.EnumProperty      |     182|         
> > 4'368|              |
> > Total of 2 entries                             |     276|         
> > 8'880|              |
> > ---------------------------------------------------------------------- 
> > ----------------------
> >
> > Then I noticed that NumberProperty (which uses the property cache)
> > has 98304 instances which couldn't be right. The problem: just using
> > that WeakHashMap isn't enough without the key objects implementing
> > hashCode() and equals() (see JDK javadocs).
> 
> Ouch! A painful oversight on my part. Good thing you checked. :)
> 
> As to the implementations, I'm wondering, since NumberProperty is  
> actually only a proxy for an underlying java.lang.Number, whether it  
> wouldn't be more convenient to simply reuse those:
> 
> hashCode() {
>    return number.hashCode();
> }
> 
> equals(Object o) {
>    if (o instanceof NumberProperty
>         && o != null) {
>      return this.number.equals(
>               ((NumberProperty) o).number);
>    } else {
>      return false;
>    }
> 
> All we really need to uniquely identify a NumberProperty is the  
> number-member, I think...

I dont' think that works, since Number doesn't implement hashCode(), for
example.

BTW, I don't know if we should do anything about the "specVal" member
variable in Property. I think it's mostly set to null.

Jeremias Maerki


Re: The effect of the property cache (or: playing around with a new toy)

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Jul 18, 2007, at 22:17, Jeremias Maerki wrote:

> So I wanted to play with this really great tool a little more and  
> ran a
> big document which I knew would cause an OutOfMemoryError at 64MB heap
> size. Here's what came out:
<snip />

Interesting figures!
Did you have any chance to run a comparative test with say, FOP 0.93?

Strange results for TableColumn, though, or are those 14000 instances  
normal given the input (lots of smaller nested tables)?

<snip />
> Immediately shows that Andreas' recent change (rev 554091) for  
> switching
> off the SAX Locators can save a lot of memory.

Well, that was also part of Richard Wheeldon's patch (Bugzilla 41044)  
containing the basic idea for the property cache. The patch itself  
ultimately got committed in a few parts, and very much trimmed down,  
but Richard deserves most of the credit. I only applied a razor to   
his patch. :-)

> I remembered Andreas working on a property cache a couple of weeks  
> ago.
> Seems to work fine on EnumProperty:

This should be already the case in 0.93. I seem to remember Simon  
having applied Richard's initial patch shortly before the 0.93 release.

> Look for class: .*EnumProperty.*
> Class Name                                     |Objects |Shallow  
> Heap |Retained Heap |Perm
> ---------------------------------------------------------------------- 
> ----------------------
> org.apache.fop.fo.properties.EnumProperty$Maker|      94|         
> 4'512|              |
> org.apache.fop.fo.properties.EnumProperty      |     182|         
> 4'368|              |
> Total of 2 entries                             |     276|         
> 8'880|              |
> ---------------------------------------------------------------------- 
> ----------------------
>
> Then I noticed that NumberProperty (which uses the property cache)
> has 98304 instances which couldn't be right. The problem: just using
> that WeakHashMap isn't enough without the key objects implementing
> hashCode() and equals() (see JDK javadocs).

Ouch! A painful oversight on my part. Good thing you checked. :)

As to the implementations, I'm wondering, since NumberProperty is  
actually only a proxy for an underlying java.lang.Number, whether it  
wouldn't be more convenient to simply reuse those:

hashCode() {
   return number.hashCode();
}

equals(Object o) {
   if (o instanceof NumberProperty
        && o != null) {
     return this.number.equals(
              ((NumberProperty) o).number);
   } else {
     return false;
   }

All we really need to uniquely identify a NumberProperty is the  
number-member, I think...


Cheers

Andreas

Re: The effect of the property cache (or: playing around with a new toy)

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 19.07.2007 00:36:40 Andreas L Delmelle wrote:
<snip/>
> > It may make more sense to have a
> > FlyWeightFactory (per object) per rendering run. Somebody (don't
> > remember who, sorry) wrote that FOP cannot bring the CPU to 100%,
> > probably due to synchronization issues. Could be worth investigating
> > since this would mean we give away CPU cicles here.
> 
> Could indeed be interesting to know if it's really the  
> synchronization or other aspects of FOP's architecture that would be  
> the cause of that. Using statics and the above utility method of  
> course makes performance greatly dependent on the cleverness of the  
> JVM implementation.
> Other than that, /if/ any synchronization should be performed, then  
> it would seem better to do so on the lower level: with the original  
> patch, multiple calls to the fetch() method would be prevented from  
> being executed concurrently, even if none of them altered the  
> underlying Map.

In that context, I've recently started reading
http://www.amazon.com/Java-Concurrency-Practice-Brian-Goetz/dp/0321349601/
which is a very enlightening book. I can only recommend it. 

<snip/>

Jeremias Maerki


Re: Caching CommonHyphenation (was: Re: The effect of the property cache ...)

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Jul 20, 2007, at 09:19, Jeremias Maerki wrote:

>> <snip />
>> This means we currently end up in the strange situation where
>> different/separate CommonHyphenation instances are generated from
>> identical sets of base Property instances.
>
> Raises the question for me if for properties without dynamic  
> context-based
> evaluation the property evaluation could be streamlined to directly
> return the primitive values instead of simple container objects like
> NumberProperty. throw new NotEnoughTimeRightNowException();

The 'evaluation' here is precisely triggered by the calls to  
PropertyList.get().
Before those calls in the CommonHyphenation constructor, the base  
properties might not even exist yet (if they were not specified on  
the FO that is bound to the CommonHyphenation).

Trading the NumberProperty for an int... No idea if that's feasible  
without a thorough revision. The entire property resolution mechanism  
currently depends on the generic Property return type. That design  
would obviously have to be abandoned for this handful of cases...

<snip />
>> public boolean hyphenate() {
>>    return (hyphenate.getEnum() == EN_TRUE);
>
> Well, I'd prefer Bean-style getters, i.e. getLanguage(),
> isHyphenationEnabled()

No problem. That was only by means of an example.

>
>>
>> Opinions?
>> For the interested parties: full CommonHyphenation below, following
>> roughly the same principles as the Property caching.
>
> "hash" should probably be transient here because it's a cached value.

Checked this out, and transient seems to be only applicable in a  
serialization context. If the object is never serialized, adding that  
keyword would seem to have zero effect... unless I'm missing something.



Cheers

Andreas

Re: Caching CommonHyphenation (was: Re: The effect of the property cache ...)

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 20.07.2007 06:56:21 Andreas L Delmelle wrote:
> On Jul 19, 2007, at 00:36, Andreas L Delmelle wrote:
> 
> >
> > On Jul 18, 2007, at 23:18, Jeremias Maerki wrote:
> > <snip />
> >> - One of the easiest candidates for another flyweight is probably
> >> CommonHyphenation (56K instances, 2.3MB in my example). The few  
> >> member
> >> variables could probably just be concatenated to a String (to be  
> >> used as
> >> the key).
> >
> > Interesting idea, will look into that asap.
> 
> FWIW:
> Looked a bit closer at this, and it suddenly struck me that all the  
> base Property types, apart from CharacterProperty which I overlooked  
> as a possible candidate, were already cached:
> 
> StringProperty -> language, country, script
> NumberProperty -> hyphenation-push/remain-character-count
> EnumProperty -> hyphenate
> 
> CharacterProperty(*) -> hyphenation-character
> 
> (*) now also added, see http://svn.apache.org/viewvc?view=rev&rev=557814
> 
> This means we currently end up in the strange situation where  
> different/separate CommonHyphenation instances are generated from  
> identical sets of base Property instances.

Raises the question for me if for properties without dynamic context-based
evaluation the property evaluation could be streamlined to directly
return the primitive values instead of simple container objects like
NumberProperty. throw new NotEnoughTimeRightNowException();

> Maybe the CommonHyphenation bundle could store references to the  
> original properties themselves instead of duplicating their content/ 
> value and storing them as primitives? By itself, this should be  
> roughly the same in terms of overall memory consumption: replacement  
> of some primitives with references.
> 
> In that case, one of the additional benefits of the individual  
> Property caching is that you can now actually avoid calls to  
> StringProperty.equals() in the rest of the code. "identity" means the  
> same as "equality" here, so the fastest possible implementation for  
> CommonHyphenation.equals() would then come to look like:
> 
> public final class CommonHyphenation {
> ...
> public final StringProperty language;
> public final StringProperty script;
> public final StringProperty country;
> public final EnumProperty hyphenate;
> ...
> public boolean equals(Object obj) {
>    if (obj == this) {
>      return true;
>    }
>    if (obj instanceof CommonHyphenation) {
>      CommonHyphenation ch = (CommonHyphenation) obj;
>      return (ch.language == this.language
>           && ch.script == this.script
>           && ch.country == this.country
>           && ch.hyphenate == this.hyphenate
>           && ...)
>    }
>    return false;
> }
> 
> One thing that cannot be avoided is the multiple calls to  
> PropertyList.get() to get to the properties that are needed to  
> perform the check for a flyweight bundle. Maybe the initial  
> assignments can be moved into the getInstance() method, so they  
> become part of the static code. getInstance() would get a  
> PropertyList as argument, while the private constructor signature is  
> altered to accept all the base properties as parameters.
> 
> The key in the Map could be a composite String, but could also again  
> be the CommonHyphenation itself, if a decent hashCode()  
> implementation is added.
> The benefit of using the instance itself is that the key in a  
> WeakHashMap is automatically released after the last object referring  
> to it has been cleared. Using a key other than the instance itself  
> would make WeakHashMap unusable, since the keys are in that case not  
> referenced directly by any object. The key cannot be embedded in the  
> instance itself, since that would prevent the entire entry from ever  
> being released...
> 
> The properties themselves being immutable and final, I guess it does  
> no harm to expose them as public members. Only a handful of places in  
> TextLM and LineLM would need a slight adjustment to compensate for  
> the lost getString() and getEnum() conversions. Maybe for  
> convenience, if really needed, accessors could be added like:
> 
> public String language() {
>    return language.getString();
> }
> ...
> public boolean hyphenate() {
>    return (hyphenate.getEnum() == EN_TRUE);

Well, I'd prefer Bean-style getters, i.e. getLanguage(),
isHyphenationEnabled()

> 
> Opinions?
> For the interested parties: full CommonHyphenation below, following  
> roughly the same principles as the Property caching.

"hash" should probably be transient here because it's a cached value.

> Cheers
> 
> Andreas
> 
> --- Sample code ---
> public final class CommonHyphenation {
> 
>      private static final Map cache =  
> java.util.Collections.synchronizedMap(
>                                          new java.util.WeakHashMap());
> 
>      private int hash = 0;
> 
>      /** The "language" property */
>      private final StringProperty language;
> 
>      /** The "country" property */
>      private final StringProperty country;
> 
>      /** The "script" property */
>      private final StringProperty script;
> 
>      /** The "hyphenate" property */
>      private final EnumProperty hyphenate;
> 
>      /** The "hyphenation-character" property */
>      private final CharacterProperty hyphenationCharacter;
> 
>      /** The "hyphenation-push-character-count" property */
>      private final NumberProperty hyphenationPushCharacterCount;
> 
>      /** The "hyphenation-remain-character-count" property*/
>      private final NumberProperty hyphenationRemainCharacterCount;
> 
>      /**
>       * Construct a CommonHyphenation object holding the given  
> properties
>       *
>       */
>      private CommonHyphenation(StringProperty language,
>                                StringProperty country,
>                                StringProperty script,
>                                EnumProperty hyphenate,
>                                CharacterProperty hyphenationCharacter,
>                                NumberProperty  
> hyphenationPushCharacterCount,
>                                NumberProperty  
> hyphenationRemainCharacterCount) {
>          this.language = language;
>          this.country = country;
>          this.script = script;
>          this.hyphenate = hyphenate;
>          this.hyphenationCharacter = hyphenationCharacter;
>          this.hyphenationPushCharacterCount =  
> hyphenationPushCharacterCount;
>          this.hyphenationRemainCharacterCount =  
> hyphenationRemainCharacterCount;
>      }
> 
>      /**
>       * Gets the canonical <code>CommonHyphenation</code> instance  
> corresponding
>       * to the values of the related properties present on the given
>       * <code>PropertyList</code>
>       *
>       * @param propertyList  the <code>PropertyList</code>
>       */
>      public static CommonHyphenation getInstance(PropertyList  
> propertyList) throws PropertyException {
>          StringProperty language =
>              (StringProperty) propertyList.get(Constants.PR_LANGUAGE);
>          StringProperty country =
>              (StringProperty) propertyList.get(Constants.PR_COUNTRY);
>          StringProperty script =
>              (StringProperty) propertyList.get(Constants.PR_SCRIPT);
>          EnumProperty hyphenate =
>              (EnumProperty) propertyList.get(Constants.PR_HYPHENATE);
>          CharacterProperty hyphenationCharacter =
>              (CharacterProperty) propertyList.get 
> (Constants.PR_HYPHENATION_CHARACTER);
>          NumberProperty hyphenationPushCharacterCount =
>              (NumberProperty) propertyList.get 
> (Constants.PR_HYPHENATION_PUSH_CHARACTER_COUNT);
>          NumberProperty hyphenationRemainCharacterCount =
>              (NumberProperty) propertyList.get 
> (Constants.PR_HYPHENATION_REMAIN_CHARACTER_COUNT);
> 
>          CommonHyphenation instance = new CommonHyphenation(
>                                  language,
>                                  country,
>                                  script,
>                                  hyphenate,
>                                  hyphenationCharacter,
>                                  hyphenationPushCharacterCount,
>                                  hyphenationRemainCharacterCount);
> 
>          Object cachedInstance = cache.get(instance);
>          if (cachedInstance == null) {
>              cache.put(instance, instance);
>          } else {
>              instance = (CommonHyphenation) cachedInstance;
>          }
>          return instance;
> 
>      }
> 
>      /** @return the "lanuage" property as a String */
>      public String language() {
>          return language.getString();
>      }
> 
>      /** @return the "country" property as a String */
>      public String country() {
>          return country.getString();
>      }
> 
>      /** @return the "script" property as a String */
>      public String script() {
>          return script.getString();
>      }
> 
>      /** @return the "hyphenate" property as a boolean */
>      public boolean hyphenate() {
>          return (hyphenate.getEnum() == Constants.EN_TRUE);
>      }
> 
>      /** @return the "hyphenation-character" property as a char */
>      public char hyphenationCharacter() {
>          return hyphenationCharacter.getCharacter();
>      }
> 
>      /** @return the "hyphenation-push-character-count" property as  
> an int */
>      public int hyphenationPushCharacterCount() {
>          return hyphenationPushCharacterCount.getNumber().intValue();
>      }
> 
>      /** @return the "hyphenation-remain-character-count" property as  
> an int */
>      public int hyphenationRemainCharacterCount() {
>          return hyphenationRemainCharacterCount.getNumber().intValue();
>      }
> 
>      /** {@inheritDoc */
>      public boolean equals(Object obj) {
>          if (obj == this) {
>              return true;
>          }
>          if (obj instanceof CommonHyphenation) {
>              CommonHyphenation ch = (CommonHyphenation) obj;
>              return (ch.language == this.language
>                      && ch.country == this.country
>                      && ch.script == this.script
>                      && ch.hyphenate == this.hyphenate
>                      && ch.hyphenationCharacter ==  
> this.hyphenationCharacter
>                      && ch.hyphenationPushCharacterCount ==  
> this.hyphenationPushCharacterCount
>                      && ch.hyphenationRemainCharacterCount ==  
> this.hyphenationRemainCharacterCount);
>          }
>          return false;
>      }
> 
>      /** {@inheritDoc} */
>      public int hashCode() {
>          if (hash == 0) {
>              int hash = 7;
>              hash = 31 * hash + (language == null ? 0 :  
> language.hashCode());
>              hash = 31 * hash + (script == null ? 0 : script.hashCode 
> ());
>              hash = 31 * hash + (country == null ? 0 :  
> country.hashCode());
>              hash = 31 * hash + (hyphenate == null ? 0 :  
> hyphenate.hashCode());
>              hash = 31 * hash +
>                  (hyphenationCharacter == null ? 0 :  
> hyphenationCharacter.hashCode());
>              hash = 31 * hash +
>                  (hyphenationPushCharacterCount == null ? 0 :  
> hyphenationPushCharacterCount.hashCode());
>              hash = 31 * hash +
>                  (hyphenationRemainCharacterCount == null ? 0 :  
> hyphenationRemainCharacterCount.hashCode());
>          }
>          return hash;
>      }
> 
> }



Jeremias Maerki


Caching CommonHyphenation (was: Re: The effect of the property cache ...)

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Jul 19, 2007, at 00:36, Andreas L Delmelle wrote:

>
> On Jul 18, 2007, at 23:18, Jeremias Maerki wrote:
> <snip />
>> - One of the easiest candidates for another flyweight is probably
>> CommonHyphenation (56K instances, 2.3MB in my example). The few  
>> member
>> variables could probably just be concatenated to a String (to be  
>> used as
>> the key).
>
> Interesting idea, will look into that asap.

FWIW:
Looked a bit closer at this, and it suddenly struck me that all the  
base Property types, apart from CharacterProperty which I overlooked  
as a possible candidate, were already cached:

StringProperty -> language, country, script
NumberProperty -> hyphenation-push/remain-character-count
EnumProperty -> hyphenate

CharacterProperty(*) -> hyphenation-character

(*) now also added, see http://svn.apache.org/viewvc?view=rev&rev=557814

This means we currently end up in the strange situation where  
different/separate CommonHyphenation instances are generated from  
identical sets of base Property instances.

Maybe the CommonHyphenation bundle could store references to the  
original properties themselves instead of duplicating their content/ 
value and storing them as primitives? By itself, this should be  
roughly the same in terms of overall memory consumption: replacement  
of some primitives with references.

In that case, one of the additional benefits of the individual  
Property caching is that you can now actually avoid calls to  
StringProperty.equals() in the rest of the code. "identity" means the  
same as "equality" here, so the fastest possible implementation for  
CommonHyphenation.equals() would then come to look like:

public final class CommonHyphenation {
...
public final StringProperty language;
public final StringProperty script;
public final StringProperty country;
public final EnumProperty hyphenate;
...
public boolean equals(Object obj) {
   if (obj == this) {
     return true;
   }
   if (obj instanceof CommonHyphenation) {
     CommonHyphenation ch = (CommonHyphenation) obj;
     return (ch.language == this.language
          && ch.script == this.script
          && ch.country == this.country
          && ch.hyphenate == this.hyphenate
          && ...)
   }
   return false;
}

One thing that cannot be avoided is the multiple calls to  
PropertyList.get() to get to the properties that are needed to  
perform the check for a flyweight bundle. Maybe the initial  
assignments can be moved into the getInstance() method, so they  
become part of the static code. getInstance() would get a  
PropertyList as argument, while the private constructor signature is  
altered to accept all the base properties as parameters.

The key in the Map could be a composite String, but could also again  
be the CommonHyphenation itself, if a decent hashCode()  
implementation is added.
The benefit of using the instance itself is that the key in a  
WeakHashMap is automatically released after the last object referring  
to it has been cleared. Using a key other than the instance itself  
would make WeakHashMap unusable, since the keys are in that case not  
referenced directly by any object. The key cannot be embedded in the  
instance itself, since that would prevent the entire entry from ever  
being released...

The properties themselves being immutable and final, I guess it does  
no harm to expose them as public members. Only a handful of places in  
TextLM and LineLM would need a slight adjustment to compensate for  
the lost getString() and getEnum() conversions. Maybe for  
convenience, if really needed, accessors could be added like:

public String language() {
   return language.getString();
}
...
public boolean hyphenate() {
   return (hyphenate.getEnum() == EN_TRUE);


Opinions?
For the interested parties: full CommonHyphenation below, following  
roughly the same principles as the Property caching.

Cheers

Andreas

--- Sample code ---
public final class CommonHyphenation {

     private static final Map cache =  
java.util.Collections.synchronizedMap(
                                         new java.util.WeakHashMap());

     private int hash = 0;

     /** The "language" property */
     private final StringProperty language;

     /** The "country" property */
     private final StringProperty country;

     /** The "script" property */
     private final StringProperty script;

     /** The "hyphenate" property */
     private final EnumProperty hyphenate;

     /** The "hyphenation-character" property */
     private final CharacterProperty hyphenationCharacter;

     /** The "hyphenation-push-character-count" property */
     private final NumberProperty hyphenationPushCharacterCount;

     /** The "hyphenation-remain-character-count" property*/
     private final NumberProperty hyphenationRemainCharacterCount;

     /**
      * Construct a CommonHyphenation object holding the given  
properties
      *
      */
     private CommonHyphenation(StringProperty language,
                               StringProperty country,
                               StringProperty script,
                               EnumProperty hyphenate,
                               CharacterProperty hyphenationCharacter,
                               NumberProperty  
hyphenationPushCharacterCount,
                               NumberProperty  
hyphenationRemainCharacterCount) {
         this.language = language;
         this.country = country;
         this.script = script;
         this.hyphenate = hyphenate;
         this.hyphenationCharacter = hyphenationCharacter;
         this.hyphenationPushCharacterCount =  
hyphenationPushCharacterCount;
         this.hyphenationRemainCharacterCount =  
hyphenationRemainCharacterCount;
     }

     /**
      * Gets the canonical <code>CommonHyphenation</code> instance  
corresponding
      * to the values of the related properties present on the given
      * <code>PropertyList</code>
      *
      * @param propertyList  the <code>PropertyList</code>
      */
     public static CommonHyphenation getInstance(PropertyList  
propertyList) throws PropertyException {
         StringProperty language =
             (StringProperty) propertyList.get(Constants.PR_LANGUAGE);
         StringProperty country =
             (StringProperty) propertyList.get(Constants.PR_COUNTRY);
         StringProperty script =
             (StringProperty) propertyList.get(Constants.PR_SCRIPT);
         EnumProperty hyphenate =
             (EnumProperty) propertyList.get(Constants.PR_HYPHENATE);
         CharacterProperty hyphenationCharacter =
             (CharacterProperty) propertyList.get 
(Constants.PR_HYPHENATION_CHARACTER);
         NumberProperty hyphenationPushCharacterCount =
             (NumberProperty) propertyList.get 
(Constants.PR_HYPHENATION_PUSH_CHARACTER_COUNT);
         NumberProperty hyphenationRemainCharacterCount =
             (NumberProperty) propertyList.get 
(Constants.PR_HYPHENATION_REMAIN_CHARACTER_COUNT);

         CommonHyphenation instance = new CommonHyphenation(
                                 language,
                                 country,
                                 script,
                                 hyphenate,
                                 hyphenationCharacter,
                                 hyphenationPushCharacterCount,
                                 hyphenationRemainCharacterCount);

         Object cachedInstance = cache.get(instance);
         if (cachedInstance == null) {
             cache.put(instance, instance);
         } else {
             instance = (CommonHyphenation) cachedInstance;
         }
         return instance;

     }

     /** @return the "lanuage" property as a String */
     public String language() {
         return language.getString();
     }

     /** @return the "country" property as a String */
     public String country() {
         return country.getString();
     }

     /** @return the "script" property as a String */
     public String script() {
         return script.getString();
     }

     /** @return the "hyphenate" property as a boolean */
     public boolean hyphenate() {
         return (hyphenate.getEnum() == Constants.EN_TRUE);
     }

     /** @return the "hyphenation-character" property as a char */
     public char hyphenationCharacter() {
         return hyphenationCharacter.getCharacter();
     }

     /** @return the "hyphenation-push-character-count" property as  
an int */
     public int hyphenationPushCharacterCount() {
         return hyphenationPushCharacterCount.getNumber().intValue();
     }

     /** @return the "hyphenation-remain-character-count" property as  
an int */
     public int hyphenationRemainCharacterCount() {
         return hyphenationRemainCharacterCount.getNumber().intValue();
     }

     /** {@inheritDoc */
     public boolean equals(Object obj) {
         if (obj == this) {
             return true;
         }
         if (obj instanceof CommonHyphenation) {
             CommonHyphenation ch = (CommonHyphenation) obj;
             return (ch.language == this.language
                     && ch.country == this.country
                     && ch.script == this.script
                     && ch.hyphenate == this.hyphenate
                     && ch.hyphenationCharacter ==  
this.hyphenationCharacter
                     && ch.hyphenationPushCharacterCount ==  
this.hyphenationPushCharacterCount
                     && ch.hyphenationRemainCharacterCount ==  
this.hyphenationRemainCharacterCount);
         }
         return false;
     }

     /** {@inheritDoc} */
     public int hashCode() {
         if (hash == 0) {
             int hash = 7;
             hash = 31 * hash + (language == null ? 0 :  
language.hashCode());
             hash = 31 * hash + (script == null ? 0 : script.hashCode 
());
             hash = 31 * hash + (country == null ? 0 :  
country.hashCode());
             hash = 31 * hash + (hyphenate == null ? 0 :  
hyphenate.hashCode());
             hash = 31 * hash +
                 (hyphenationCharacter == null ? 0 :  
hyphenationCharacter.hashCode());
             hash = 31 * hash +
                 (hyphenationPushCharacterCount == null ? 0 :  
hyphenationPushCharacterCount.hashCode());
             hash = 31 * hash +
                 (hyphenationRemainCharacterCount == null ? 0 :  
hyphenationRemainCharacterCount.hashCode());
         }
         return hash;
     }

}


Re: The effect of the property cache (or: playing around with a new toy)

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Jul 18, 2007, at 23:18, Jeremias Maerki wrote:

> Some additional thoughts before going to bed:
>
> - I think the static synchronized WeakHashMap should be avoided.  
> There's
> a lot of synchronization that is performed. I suspect this will have a
> bad effect on performance.

I have always assumed it would have some effect that could be  
interpreted as bad.
Richard's initial version of the PropertyCache synchronized the fetch 
() method, and if I remember and interpret correctly, this caused a  
noticeable slowdown in performing the jUnit tests. Once I decided to  
use Collections.synchronizedMap(), the speed went back to normal.

> It may make more sense to have a
> FlyWeightFactory (per object) per rendering run. Somebody (don't
> remember who, sorry) wrote that FOP cannot bring the CPU to 100%,
> probably due to synchronization issues. Could be worth investigating
> since this would mean we give away CPU cicles here.

Could indeed be interesting to know if it's really the  
synchronization or other aspects of FOP's architecture that would be  
the cause of that. Using statics and the above utility method of  
course makes performance greatly dependent on the cleverness of the  
JVM implementation.
Other than that, /if/ any synchronization should be performed, then  
it would seem better to do so on the lower level: with the original  
patch, multiple calls to the fetch() method would be prevented from  
being executed concurrently, even if none of them altered the  
underlying Map.

> - It's probably not the most effective way to use the Property  
> itself as
> the key into the Map, i.e. creating throw-away instances excessively.
> Working with specialized FlyWeightFactories could be more  
> efficient. See
> http://en.wikipedia.org/wiki/Flyweight_pattern

Also my initial thought: if the Property is simply a wrapper around  
an int or a String, then I'd assume that it would be better to check  
for an existing instance corresponding to that base value.
Then again, I do seem to remember Richard pointing out that creating  
a lot of temporary instances is quasi irrelevant in a language like  
Java, especially the later JVM versions. IIRC, it has to do with the  
fact that the throw-away instances never have to leave the stack -- 
which by itself already saves the time it takes to reserve addresses  
in the heap and transfer the instances to them.

This factor does become important once you have a situation where a  
pass through the constructor triggers a method call for each of a  
handful of members, like in the case of the Common-bundles.
In case of the immediate Property subclasses, however, the  
constructors themselves are actually very light, so it should not be  
much of a problem. The bulk of the 'construction' work is carried by  
the Makers and the PropertyParser.

> - One of the easiest candidates for another flyweight is probably
> CommonHyphenation (56K instances, 2.3MB in my example). The few member
> variables could probably just be concatenated to a String (to be  
> used as
> the key).

Interesting idea, will look into that asap.

> - It could be worth looking into the difference between HashMap and
> TreeMap. Maybe one is more suitable than the other. I haven't  
> looked at
> the difference, yet.

No idea here either. Come to think of it, I've never really looked at  
TreeMap before at all. I'd have to check for specific use-cases first  
to see what the specific use could be here. There's no WeakTreeMap  
yet, though... 8-)


Cheers

Andreas


Re: The effect of the property cache (or: playing around with a new toy)

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
Some additional thoughts before going to bed:

- I think the static synchronized WeakHashMap should be avoided. There's
a lot of synchronization that is performed. I suspect this will have a
bad effect on performance. It may make more sense to have a
FlyWeightFactory (per object) per rendering run. Somebody (don't
remember who, sorry) wrote that FOP cannot bring the CPU to 100%,
probably due to synchronization issues. Could be worth investigating
since this would mean we give away CPU cicles here.
- It's probably not the most effective way to use the Property itself as
the key into the Map, i.e. creating throw-away instances excessively.
Working with specialized FlyWeightFactories could be more efficient. See
http://en.wikipedia.org/wiki/Flyweight_pattern
- One of the easiest candidates for another flyweight is probably
CommonHyphenation (56K instances, 2.3MB in my example). The few member
variables could probably just be concatenated to a String (to be used as
the key).
- It could be worth looking into the difference between HashMap and
TreeMap. Maybe one is more suitable than the other. I haven't looked at
the difference, yet.

Jeremias Maerki


Re: The effect of the property cache (or: playing around with a new toy)

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Jul 18, 2007, at 22:17, Jeremias Maerki wrote:

FWIW: going over those initial figures again, and

> Class  
> Name                                                             | 
> Objects  |Shallow Heap |Retained Heap |Perm
> ---------------------------------------------------------------------- 
> -----------------------------------------------
> org.apache.fop.fo.properties.CondLengthProperty                        
>  |  315'760|    7'578'240|     7'578'240|

These still seem to eating up quite some memory. I've also been  
looking at those, but the big bugger here is the Length-component,  
which can be a PercentLength, and as such is currently non-eligible  
for caching... :(

If this could somehow be circumvented, we could easily reach the  
stadium of caching entire CommonBorderPaddingBackground bundles.

> org.apache.fop.fo.flow.Block                                           
>  |   26'391|    4'855'944|    54'554'584|
> org.apache.fop.fo.properties.SpaceProperty                             
>  |   87'592|    4'204'416|     7'160'880|
> org.apache.fop.fo.properties.CommonBorderPaddingBackground             
>  |   78'940|    3'789'120|    16'419'968|
> org.apache.fop.fo.properties.KeepProperty                              
>  |  110'661|    3'541'152|     3'541'152|

KeepProperty's also one I might look into very soon. The three  
components are all either enums or absolute numbers, so caching those  
should prove to be quite doable. As to SpaceProperty, again the  
possibility of PercentLengths prevents caching from having any use at  
all. The main blocker here is that there's a reference to the FObj  
stored in the implied LengthBase, which makes each PercentLength  
specific to the FObj it was created for...

As a closing note for this one:
How about scrapping the CommonHyphenation, CommonFont etc. entirely,  
and binding the applicable properties to the FONodes directly instead?

In the layout-code, you would not have:

boolean hyphenate = ((Block) fo).getCommonHyphenation 
().isHyphenateEnabled();

but simply:

boolean hyphenate = ((Block) fo).isHyphenateEnabled();


Seems much cleaner. Although I'm still not entirely happy about the  
property access, this already looks much more readable and avoids the  
overhead of the bundle-instances themselves altogether...


Cheers

Andreas