Posted to commons-dev@ws.apache.org by Ruwan Linton <ru...@gmail.com> on 2006/09/04 09:01:09 UTC

Regarding WSCOMMONS-82 - DOMHASH implementation on AXIOM

Hi,

This is regarding the Jira,

Key: WSCOMMONS-82
URL: http://issues.apache.org/jira/browse/WSCOMMONS-82

I am going to add a DOMHASH implementation to AXIOM. DOMHASH is a digest
generation algorithm that produces a digest value identifying the content of
a given XML node. It is needed to compare two XML nodes for equality of
their XML content.

I am in the process of implementing a caching module for Axis2, and it needs
to keep the digest value of the SOAP request's OM representation together
with the outgoing SOAP response, so that if a request with the same digest
value arrives again I can simply send the stored response without
recalculating it. To do so, I need a good mechanism that does not count
comments and differing namespace declarations when generating the digest
value.

I can't use any of the existing XML comparison mechanisms like XMLUnit,
since all of those take comments into account as well. So I have implemented
the DOMHASH algorithm in axiom-api. I have added this functionality through
a helper class, since I realized that changing the API is not practical.
The helper class is available in patch.txt, and the class is
'org.apache.axiom.om.util.DigestGenerator'. You can get the digest value by
calling the getDigest method with the OMNode or OMDocument and the digest
algorithm (MD5 or SHA1).
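
To make the idea concrete, here is a minimal sketch of a DOMHASH-inspired
digest. It is an illustration only: it works on a plain W3C DOM tree from
the JDK rather than on AXIOM, it does not follow the exact byte layout of
the DOMHASH specification, and it is not the code in patch.txt. The class
name DomDigestSketch and its methods are made up for this example. It
ignores comments, skips xmlns declarations, and identifies elements and
attributes by namespace URI plus local name instead of by prefix.
Whitespace-only text is not normalized, so it still affects the result.

```java
import java.io.ByteArrayInputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical helper: a simplified, DOMHASH-inspired digest over a DOM tree.
final class DomDigestSketch {

    // Parses with namespace awareness so URIs and local names are available.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("unparsable XML", e);
        }
    }

    // algorithm is a JDK MessageDigest name such as "MD5" or "SHA-1".
    static byte[] digest(Document doc, String algorithm) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            update(md, doc.getDocumentElement());
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException(e);
        }
    }

    private static void update(MessageDigest md, Element e) {
        put(md, 'E', expandedName(e));
        // Sort attributes by expanded name so their order does not matter;
        // skip xmlns declarations, since only the resolved URI should count.
        TreeMap<String, String> attrs = new TreeMap<>();
        NamedNodeMap map = e.getAttributes();
        for (int i = 0; i < map.getLength(); i++) {
            Node a = map.item(i);
            if (!XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(a.getNamespaceURI())) {
                attrs.put(expandedName(a), a.getNodeValue());
            }
        }
        for (Map.Entry<String, String> en : attrs.entrySet()) {
            put(md, 'A', en.getKey());
            put(md, 'V', en.getValue());
        }
        // Coalesce text around skipped comments so "a<!--x-->b" equals "ab".
        StringBuilder text = new StringBuilder();
        for (Node c = e.getFirstChild(); c != null; c = c.getNextSibling()) {
            short type = c.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                text.append(c.getNodeValue());
            } else if (type == Node.ELEMENT_NODE) {
                flush(md, text);
                update(md, (Element) c);
            } // comments and processing instructions are ignored
        }
        flush(md, text);
        put(md, 'Z', ""); // end-of-element marker
    }

    private static void flush(MessageDigest md, StringBuilder text) {
        if (text.length() > 0) {
            put(md, 'T', text.toString());
            text.setLength(0);
        }
    }

    // "{namespace-uri}local-name", independent of the prefix used.
    private static String expandedName(Node n) {
        String uri = n.getNamespaceURI();
        String local = n.getLocalName() != null ? n.getLocalName() : n.getNodeName();
        return "{" + (uri == null ? "" : uri) + "}" + local;
    }

    // Tag byte + 4-byte length + UTF-8 bytes keeps the fed fields unambiguous.
    private static void put(MessageDigest md, char tag, String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        md.update((byte) tag);
        md.update(ByteBuffer.allocate(4).putInt(b.length).array());
        md.update(b);
    }
}
```

With this sketch, two requests that differ only in prefix choice or in
comments give the same digest, while a change in any text value or
attribute value gives a different one.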

Thanks,
Ruwan.

Re: Regarding WSCOMMONS-82 - DOMHASH implementation on AXIOM

Posted by Ruwan Linton <ru...@gmail.com>.
Hi Hasalaka,

Comments inline...

On 9/4/06, Hasalaka Waravita <ha...@gmail.com> wrote:
>
> Ruwan
> One question and a comment
> 1) Do you consider doing client side caching as well? Considering most
> business scenarios, it can be useful too.


Of course, but not at this moment.

> 2) Performance of the digest algo is not linear with the size of the String,
> so it would be useful to decide whether to compute the hash (and cache) or
> not depending on the size of the input first


Well, then that depends on comparing the time it takes to 'compute the
output and construct the OM tree' with the time for 'digest generation and
searching the Map'. But it will be a bit complex. Will see. A good point.
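
One way to picture that trade-off is a response cache keyed by the request
digest that skips hashing and caching above a size threshold. This is only
a sketch under assumptions: the class name DigestCacheSketch, the
maxCacheableBytes threshold and the use of raw request bytes as digest
input are all invented for illustration. The real module would key on the
DOMHASH of the OM tree, and a sensible threshold would have to be measured.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache: maps a request digest to a previously computed response.
final class DigestCacheSketch {
    private final Map<String, String> responses = new ConcurrentHashMap<>();
    private final int maxCacheableBytes; // assumed tuning knob, not from Axis2

    DigestCacheSketch(int maxCacheableBytes) {
        this.maxCacheableBytes = maxCacheableBytes;
    }

    // Returns the cached response for an equal request, else invokes the service.
    String handle(byte[] request, Function<byte[], String> service) {
        if (request.length > maxCacheableBytes) {
            // Too large: skip digest generation and caching entirely.
            return service.apply(request);
        }
        return responses.computeIfAbsent(key(request), k -> service.apply(request));
    }

    // Hex-encoded SHA-1 of the request bytes, used as the Map key.
    private static String key(byte[] data) {
        try {
            byte[] d = MessageDigest.getInstance("SHA-1").digest(data);
            StringBuilder sb = new StringBuilder(d.length * 2);
            for (byte b : d) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The service function is invoked once for repeated small requests and every
time for requests above the threshold.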


Thanks,
Ruwan

Re: Regarding WSCOMMONS-82 - DOMHASH implementation on AXIOM

Posted by Hasalaka Waravita <ha...@gmail.com>.
Ruwan
One question and a comment
1) Do you consider doing client side caching as well? Considering most
business scenarios, it can be useful too.
2) Performance of the digest algo is not linear with the size of the String,
so it would be useful to decide whether to compute the hash (and cache) or
not depending on the size of the input first



Re: Regarding WSCOMMONS-82 - DOMHASH implementation on AXIOM

Posted by Ruwan Linton <ru...@gmail.com>.
Hi Ruchith,

On 9/4/06, Ruchith Fernando <ru...@gmail.com> wrote:
>
> Hi Ruwan,
>
> On 9/4/06, Ruwan Linton <ru...@gmail.com> wrote:
> > Hi Ruchith,
> >
> > Yes of course, but I have several concerns.
> >
> > 1. I am going to use this digest value for caching, and if generating
> > the digest value takes more time than invoking the method itself then
> > it is useless. Since we need to convert the OMElement to DOM in this
> > approach (C14N), it will have some performance drawbacks, I guess.
>
> Hmm... DOOM is the same as OM ... and it has all properties of OM in
> addition to the DOM behaviour... maybe the additional conversion step
> using the StAX events from the partially read SOAP envelope and the
> incoming stream might be a bit of an overhead.
>
> >
> > 2. How about comments? My implementation of DOMHASH (that's the normal
> > behaviour of DOMHASH as well) doesn't take comments into account, which
> > is required for checking the equality of two OMNodes, for caching.
>
> Well .. you can do C14N ignoring comments:
>
>         Document doc = DocumentBuilderFactory.newInstance()
>                 .newDocumentBuilder().parse("/path/to/your/xml");
>
>         byte[] out = Canonicalizer.getInstance(
>                 Canonicalizer.ALGO_ID_C14N_EXCL_OMIT_COMMENTS)
>                 .canonicalizeSubtree(doc.getDocumentElement());
>
> >
> > 3. I don't know whether I am right or wrong here (I don't have a deep
> > understanding of C14N), but it takes namespace declarations into
> > account, and we need to neglect namespace declarations and substitute
> > the namespace URI instead.
>
> Good point ... Agreed !
>
> In the case where two XML elements carry different prefixes in
> namespace declarations we cannot use C14N to compare.


Yes, exactly. If two clients send requests with the same namespace URI but
with different namespace prefix declarations, then a C14N-based comparison
is going to fail for caching. Thanks for the comments.
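
A small illustration of the prefix problem, using only the JDK DOM parser.
The class and method names are made up for this example. Canonical XML
keeps the prefixes as written, so the two elements below serialize
differently, yet their expanded names (namespace URI plus local name) are
identical, which is what a prefix-insensitive digest relies on.

```java
import java.io.StringReader;
import java.util.Objects;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo: qualified names differ by prefix, expanded names do not.
final class PrefixVsUriSketch {

    // Parses the XML with namespace awareness and returns the root element.
    static Element root(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("unparsable XML", e);
        }
    }

    // Compares "prefix:local", which is what a serialized form exposes.
    static boolean sameQualifiedName(Element a, Element b) {
        return a.getNodeName().equals(b.getNodeName());
    }

    // Compares namespace URI and local name, ignoring the prefix.
    static boolean sameExpandedName(Element a, Element b) {
        return Objects.equals(a.getNamespaceURI(), b.getNamespaceURI())
                && Objects.equals(a.getLocalName(), b.getLocalName());
    }
}
```

For <a:ping xmlns:a="urn:example"/> and <b:ping xmlns:b="urn:example"/>,
sameQualifiedName is false while sameExpandedName is true.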

> Thanks,
> Ruchith

Re: Regarding WSCOMMONS-82 - DOMHASH implementation on AXIOM

Posted by Ruchith Fernando <ru...@gmail.com>.
Hi Ruwan,

On 9/4/06, Ruwan Linton <ru...@gmail.com> wrote:
> Hi Ruchith,
>
> Yes of course, but I have several concerns.
>
> 1. I am going to use this digest value for caching stuff, and if the digest
> value generation time takes more time than invoking the method itself then
> it is useless. Since we need to convert OMElement to DOM in this approach
> (C14N) it will have some performance drawbacks I guess.

Hmm... DOOM is the same as OM ... and it has all properties of OM in
addition to the DOM behaviour... maybe the additional conversion step
using the StAX events from the partially read SOAP envelope and the
incoming stream might be a bit of an overhead.

>
> 2. How about comments? My implementation of DOMHASH (that's the normal
> behaviour of DOMHASH as well) doesn't take comments into account, which is
> required for checking the equality of two OMNodes, for caching.

Well .. you can do C14N ignoring comments:

        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse("/path/to/your/xml");

        byte[] out = Canonicalizer.getInstance(
                Canonicalizer.ALGO_ID_C14N_EXCL_OMIT_COMMENTS)
                .canonicalizeSubtree(doc.getDocumentElement());

>
> 3. I don't know whether I am right or wrong here (I don't have a deep
> understanding of C14N), but it takes namespace declarations into account,
> and we need to neglect namespace declarations and substitute the namespace
> URI instead.

Good point ... Agreed !

In the case where two XML elements carry different prefixes in
namespace declarations we cannot use C14N to compare.

Thanks,
Ruchith

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@ws.apache.org
For additional commands, e-mail: commons-dev-help@ws.apache.org


Re: Regarding WSCOMMONS-82 - DOMHASH implementation on AXIOM

Posted by Ruwan Linton <ru...@gmail.com>.
Hi Ruchith,

Yes of course, but I have several concerns.

1. I am going to use this digest value for caching, and if generating the
digest value takes more time than invoking the method itself then it is
useless. Since we need to convert the OMElement to DOM in this approach
(C14N), it will have some performance drawbacks, I guess.

2. How about comments? My implementation of DOMHASH (that's the normal
behaviour of DOMHASH as well) doesn't take comments into account, which is
required for checking the equality of two OMNodes, for caching.

3. I don't know whether I am right or wrong here (I don't have a deep
understanding of C14N), but it takes namespace declarations into account,
and we need to neglect namespace declarations and substitute the namespace
URI instead.

BTW, I have done the implementation, and I think it is better to use
something in AXIOM rather than something external (for performance).

Thanks,
Ruwan.


Re: Regarding WSCOMMONS-82 - DOMHASH implementation on AXIOM

Posted by Ruchith Fernando <ru...@gmail.com>.
Hi Ruwan,

If you are looking to compare two semantically equal XML elements, how
about using C14N and then calculating the digest of the C14Ned XML? If you
convert the OMElement to DOOM then you can use the C14N implementation
available in XML-Security for this.

Thanks,
Ruchith



-- 
www.ruchith.org
