Posted to dev@santuario.apache.org by "raul-info@r-bg.com" <ra...@r-bg.com> on 2004/05/02 00:32:37 UTC

Speed optimizations

Hello,

I have been working on some optimizations to the XML security library. 
They reduce the memory footprint and speed up the signing and 
verification of big XML documents. The speed-up varies enormously 
between test cases. Some simple ones (verifying a signed Liberty 
LogoutRequest wrapped in a SOAP envelope) finish in half the time, the 
opensaml assertion test improved by less than 10%, and a modified 
opensaml POSTProfile (where only the assertions are signed) got a 
15%-20% speed-up. Our Liberty implementation case got a 40% speed-up 
over the 1.1 release. So it is very document dependent.

The changes are a big patch file of 168,243 bytes that can be 
summarized in four parts:

    * C14N rewrite: I saw that a big part of the time goes into
      canonicalization. This is our biggest time hog, so I have more
      or less rewritten it using a namespace table approach. As I
      descend the document nodes I fill a namespace table with the
      xmlns definitions and whether or not they have been rendered in
      the c14n tree.
      The old method relies on having the doc tree circumvented (i.e.
      all the namespaces of a parent are copied into all of its
      children) and then looks upward in the tree (using the
      getParentNode() DOM function) to see whether the current
      namespace has already been rendered. This method is memory
      expensive (it creates a new attribute for every namespace
      defined in a parent node) and not very efficient for big
      document trees. The new one seems better in both respects, but
      the speed improvement depends on the specific DOM tree.
      These changes make it possible to get rid of a lot of calls to
      circumventBug2650.
      These changes make up 60% of the patch and touch only:
      org/apache/xml/security/c14n/implementations/*

    * Remove all unnecessary use of the XPath API: Using XPath seems to
      be slow and memory hungry, so I have tried to replace all the
      XPath calls with their DOM equivalents. This decreases memory
      usage a little and improves speed a little (the bigger ones were
      already changed in 1.1). These changes impact: (See (1))
    * Avoid using XMLSignatureInput with nodesets: I have tried not to
      use XMLSignatureInput with a set of nodes except in the XPath
      transformation. Normally the reference specifies a node, not a
      nodeset; a nodeset is needed only if there is an XPath
      transformation. Using a node instead of a nodeset is better for
      memory and permits canonicalizing with c14nSubtree instead of
      c14nXpathnodeset; the latter needs to visit every node in the
      document tree, the former only the node and its children.
      The problem is that if the enveloped signature transformation is
      used, a nodeset is needed again. To fix this, I have added a new
      field to XMLSignatureInput and new functions to the c14n classes
      to specify an excluded node in the subtree (this way the c14n
      simply skips the excluded node when it finds it). This change
      impacts: (See (2))
    * Miscellaneous changes: In
      org/apache/xml/security/signature/Reference.java some time is
      spent calculating the _transformsInput field on every dereference
      (when signing or verifying a reference). This field is seldom
      used (only for debugging purposes), so it is better to calculate
      it only when required.
      Another small change can be made in
      org/apache/xml/security/signature/SignedInfo.java to skip the
      canonicalization, parsing and importing of the SignedInfo
      element if safe c14n methods are used (the safe ones are the
      standard ones).
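
To make the namespace table idea from the C14N item concrete, here is a
minimal Java sketch. It is not the code from the patch: the class
NamespaceTable and its methods enterElement, leaveElement, declare and
render are hypothetical names chosen for illustration. The table records
which prefix bindings are in scope and which have already been written by
an output ancestor, and undoes both when the walk leaves an element, so no
upward lookups or copied attributes are needed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of a namespace table; not the patch's class. */
final class NamespaceTable {
    /** prefix -> URI currently in scope ("" is the default namespace prefix). */
    private final Map<String, String> inScope = new HashMap<>();
    /** prefix -> URI last written to the output by an ancestor-or-self element. */
    private final Map<String, String> rendered = new HashMap<>();
    /** One undo log per open element, replayed in reverse on leaveElement(). */
    private final Deque<List<Runnable>> undo = new ArrayDeque<>();

    /** Call when the walk descends into an element. */
    void enterElement() {
        undo.push(new ArrayList<>());
    }

    /** Call when the walk leaves an element: restores the parent's view. */
    void leaveElement() {
        List<Runnable> log = undo.pop();
        for (int i = log.size() - 1; i >= 0; i--) {
            log.get(i).run();
        }
    }

    /** Records an xmlns declaration found on the current element. */
    void declare(String prefix, String uri) {
        set(inScope, prefix, uri);
    }

    /**
     * Returns the URI that must be written for this prefix on the current
     * element, or null if the prefix is unbound or an output ancestor has
     * already written the same binding. Marks the binding as rendered.
     */
    String render(String prefix) {
        String uri = lookup(inScope, prefix);
        if (uri == null || uri.equals(lookup(rendered, prefix))) {
            return null;
        }
        set(rendered, prefix, uri);
        return uri;
    }

    /** An absent default namespace is treated as the empty URI. */
    private static String lookup(Map<String, String> map, String prefix) {
        String uri = map.get(prefix);
        return (uri == null && prefix.isEmpty()) ? "" : uri;
    }

    /** Updates a map and logs how to undo the update for the current element. */
    private void set(Map<String, String> map, String key, String value) {
        String old = map.put(key, value);
        undo.peek().add(() -> {
            if (old == null) {
                map.remove(key);
            } else {
                map.put(key, old);
            }
        });
    }
}
```

A canonicalizer built on this would call enterElement() on the way down,
declare() for each xmlns attribute, render() for each prefix it wants to
output, and leaveElement() on the way back up.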

 

So if there is some interest I can contribute these changes back, so 
they can be peer reviewed and fixed if needed. Or if anyone wants to 
test without the fuss of recompiling, I can send the .jar file with the 
modified library.

 

Regards,

 

Raul Benito


P.S. Sorry for the big email in HTML, but it reads better.


1:

    * org/apache/xml/security/keys/content/KeyValue.java
    * org/apache/xml/security/keys/content/RetrievalMethod.java
    * org/apache/xml/security/keys/keyresolver/implementations/DSAKeyValueResolver.java
    * org/apache/xml/security/keys/keyresolver/implementations/RSAKeyValueResolver.java
    * org/apache/xml/security/keys/keyresolver/implementations/X509CertificateResolver.java
    * org/apache/xml/security/keys/keyresolver/implementations/X509SKIResolver.java
    * org/apache/xml/security/keys/keyresolver/implementations/X509SubjectNameResolver.java
    * org/apache/xml/security/signature/SignatureProperties.java
    * org/apache/xml/security/signature/XMLSignature.java
    * org/apache/xml/security/transforms/Transforms.java (already used a
      DOM method; only refactored to invoke a common one).
    * org/apache/xml/security/utils/ElementProxy.java
    * org/apache/xml/security/utils/SignatureElementProxy.java
    * org/apache/xml/security/utils/XMLUtils.java (Added the functions
      to do the DOM searches).
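
The kind of DOM search that can stand in for a simple XPath child lookup
might look like the following sketch. DomSearch, selectChild and parse are
hypothetical names, not the functions added to XMLUtils in the patch; only
standard org.w3c.dom and JAXP calls are used.

```java
import java.io.StringReader;
import java.util.Objects;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper for illustration; not the XMLUtils code from the patch. */
final class DomSearch {

    /**
     * Returns the index-th (0-based) child element of parent with the given
     * namespace URI and local name, or null if there is no such child.
     * Only the direct children are visited; no XPath engine is involved.
     */
    static Element selectChild(Node parent, String nsUri, String localName, int index) {
        int seen = 0;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            boolean match = n.getNodeType() == Node.ELEMENT_NODE
                    && localName.equals(n.getLocalName())
                    && Objects.equals(nsUri, n.getNamespaceURI());
            if (match && seen++ == index) {
                return (Element) n;
            }
        }
        return null;
    }

    /** Parses a namespace-aware DOM from a string (test convenience only). */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }
}
```

For example, selectChild(signatureElement,
"http://www.w3.org/2000/09/xmldsig#", "KeyInfo", 0) returns the first
ds:KeyInfo child by walking the sibling list once.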


2:

    * org/apache/xml/security/signature/XMLSignatureInput.java       
    * org/apache/xml/security/transforms/implementations/TransformC14N.java
    * org/apache/xml/security/transforms/implementations/TransformC14NExclusive.java
    * org/apache/xml/security/transforms/implementations/TransformC14NExclusiveWithComments.java
    * org/apache/xml/security/transforms/implementations/TransformEnvelopedSignature.java
      (and also moved some methods to XMLUtils)
    * org/apache/xml/security/utils/IdResolver.java
    * org/apache/xml/security/utils/SignatureElementProxy.java
    * org/apache/xml/security/utils/resolver/implementations/ResolverFragment.java
    * org/apache/xml/security/utils/resolver/implementations/ResolverXPointer.java
    * org/apache/xml/security/utils/XMLUtils.java (Added some methods)
    * (and the c14n implementations, but these have been taken into
      account above).
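
The excluded-node idea behind these changes can be sketched as a plain
subtree walk. This is only an illustration under assumed names
(SubtreeWalk, visitedElements, parse); it records the element names it
visits rather than producing canonical output, but it shows that the walk
touches only the apex node and its descendants and simply skips the
excluded node, e.g. the enveloped Signature element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical illustration; not the c14n code from the patch. */
final class SubtreeWalk {

    /**
     * Walks the subtree rooted at apex in document order and returns the
     * names of the elements visited. The exclude node (may be null) and
     * everything below it are skipped.
     */
    static List<String> visitedElements(Node apex, Node exclude) {
        List<String> out = new ArrayList<>();
        walk(apex, exclude, out);
        return out;
    }

    private static void walk(Node node, Node exclude, List<String> out) {
        if (exclude != null && node.isSameNode(exclude)) {
            return; // skip the excluded node together with its whole subtree
        }
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.add(node.getNodeName());
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            walk(c, exclude, out);
        }
    }

    /** Parses a namespace-aware DOM from a string (test convenience only). */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }
}
```

No nodeset is built at any point: the only extra state is the single
reference to the node that must be left out.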


Re: Speed optimizations

Posted by Davanum Srinivas <da...@gmail.com>.
awesome!!...please open up a bugzilla bug report and attach the "cvs
diff -u" output.

thx,
dims


----- Original Message -----
From: raul-info@r-bg.com <ra...@r-bg.com>
Date: Sun, 02 May 2004 00:32:37 +0200
Subject: Speed optimizations
To: security-dev@xml.apache.org





  
  





Re: Speed optimizations

Posted by Martin Labarthe Dubois <du...@consist.com.ar>.
I tested the patch and I must say that the memory usage improvements are huge, and therefore so are the speed gains.





  ----- Original Message ----- 
  From: raul-info@r-bg.com 
  To: security-dev@xml.apache.org 
  Sent: Saturday, May 01, 2004 7:32 PM
  Subject: Speed optimizations





Re: Speed optimizations

Posted by "raul-info@r-bg.com" <ra...@r-bg.com>.
Berin Lautenbach wrote:

>> Hello,
>>
>> The changes are a big patch file of 168,243 bytes that can 
>> be summarized in four parts:
>
>
> Send them my way!

It still needs some clean-up, but I think it is a good starting point.



Re: Speed optimizations

Posted by "raul-info@r-bg.com" <ra...@r-bg.com>.
Berin Lautenbach wrote:

>> Hello,
>>
>> The changes are a big patch file of 168,243 bytes that can 
>> be summarized in four parts:
>
>
> Send them my way!
>
I have already sent them to you.

>>
>>     * C14N rewrite: I saw that a big part of the time goes into
>>       canonicalization. This is our biggest time hog, so I have more
>>       or less rewritten it using a namespace table approach. As I
>>       descend the document nodes I fill a namespace table with the
>>       xmlns definitions and whether or not they have been rendered in
>>       the c14n tree.
>>       The old method relies on having the doc tree circumvented (i.e.
>>       all the namespaces of a parent are copied into all of its
>>       children) and then looks upward in the tree (using the
>>       getParentNode() DOM function) to see whether the current
>>       namespace has already been rendered. This method is memory
>>       expensive (it creates a new attribute for every namespace
>>       defined in a parent node) and not very efficient for big
>>       document trees. The new one seems better in both respects, but
>>       the speed improvement depends on the specific DOM tree.
>>       These changes make it possible to get rid of a lot of calls to
>>       circumventBug2650.
>>       These changes make up 60% of the patch and touch only:
>>       org/apache/xml/security/c14n/implementations/*
>
>
> How did you get past the XPath selection problem?
>
> I started doing the same approach in the C++ code but found it fails, 
> because you have to have cascaded the name spaces for certain XPath 
> transformations to work.  There is a test case from Merlin that fails 
> beautifully.  Unless you put each namespace in each element, you don't 
> know whether the XPath expression will select it or not, so you don't 
> know whether to render it or not.
>
Yes, I had the same problem. My first approach was to circumvent the 
document when XPath is used and mark the XMLSignatureInput as 
circumvented, then check in c14nXpathnodeset whether it had been 
circumvented or not and act appropriately.

But after adding the exclude node and related changes, I saw that 
c14nXpathnodeset is mostly called because XPath was needed. So I added 
the restriction that c14nXpathnodeset must be called with a circumvented 
DOM. c14nSubtree can be called without this.


Re: Speed optimizations

Posted by Berin Lautenbach <be...@wingsofhermes.org>.
> Hello,
> 
> The changes are a big patch file of 168,243 bytes that can be 
> summarized in four parts:

Send them my way!

> 
>     * C14N rewrite: I saw that a big part of the time goes into
>       canonicalization. This is our biggest time hog, so I have more
>       or less rewritten it using a namespace table approach. As I
>       descend the document nodes I fill a namespace table with the
>       xmlns definitions and whether or not they have been rendered in
>       the c14n tree.
>       The old method relies on having the doc tree circumvented (i.e.
>       all the namespaces of a parent are copied into all of its
>       children) and then looks upward in the tree (using the
>       getParentNode() DOM function) to see whether the current
>       namespace has already been rendered. This method is memory
>       expensive (it creates a new attribute for every namespace
>       defined in a parent node) and not very efficient for big
>       document trees. The new one seems better in both respects, but
>       the speed improvement depends on the specific DOM tree.
>       These changes make it possible to get rid of a lot of calls to
>       circumventBug2650.
>       These changes make up 60% of the patch and touch only:
>       org/apache/xml/security/c14n/implementations/*

How did you get past the XPath selection problem?

I started doing the same approach in the C++ code but found it fails, 
because you have to have cascaded the name spaces for certain XPath 
transformations to work.  There is a test case from Merlin that fails 
beautifully.  Unless you put each namespace in each element, you don't 
know whether the XPath expression will select it or not, so you don't 
know whether to render it or not.

Either way - great stuff!

Cheers,
	Berin