Posted to commons-dev@ws.apache.org by Glen Daniels <gl...@thoughtcraft.com> on 2007/03/07 21:57:35 UTC

Re: [Axiom] What's the best way to get the actual child text from an OMElement? (PROPOSAL)

Hi David, Chinthaka:

I think you're definitely on to something here, David.  There is no way 
we should be doing any kind of QName resolving unless and until someone 
asks for the value as a QName, IMO.  It's definitely a performance hit 
(esp. potentially walking up the NS stack), and it's just unnecessary - 
the API as it stands is not very intuitive, and I'd love to see it get 
cleaned up a bit.

I think getText() should be changed to do no extra processing and simply 
return the actual String content of the node.  The only time we should 
do QName processing is when getTextAsQName() is called.  The 
getNamespace() API should go away entirely.
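
To make that concrete, here is a minimal sketch of the behaviour I have in 
mind.  It is not Axiom code: NsScope, TextNode and LazyQNameDemo are 
hypothetical stand-ins invented for illustration, and only 
javax.xml.namespace.QName is a real class.  getText() hands back the 
stored string untouched, and the walk up the namespace scopes happens 
only inside getTextAsQName().

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;

/** Hypothetical stand-in for the namespace declarations in scope on an element. */
class NsScope {
    private final NsScope parent;
    private final Map<String, String> declarations = new HashMap<>();
    int lookups = 0; // counts resolve() calls on this scope, for demonstration only

    NsScope(NsScope parent) {
        this.parent = parent;
    }

    NsScope declare(String prefix, String uri) {
        declarations.put(prefix, uri);
        return this;
    }

    /** Walks up the scope chain; returns null if the prefix is not declared. */
    String resolve(String prefix) {
        lookups++;
        for (NsScope s = this; s != null; s = s.parent) {
            String uri = s.declarations.get(prefix);
            if (uri != null) {
                return uri;
            }
        }
        return null;
    }
}

/** Hypothetical text node; an assumption for illustration, not Axiom's OMText. */
class TextNode {
    private final String value;
    private final NsScope scope;

    TextNode(String value, NsScope scope) {
        this.value = value;
        this.scope = scope;
    }

    /** Returns the content as stored: no indexOf(':'), no namespace lookup. */
    String getText() {
        return value;
    }

    /** Resolves the prefix only when the caller explicitly asks for a QName. */
    QName getTextAsQName() {
        String trimmed = value.trim();
        int colon = trimmed.indexOf(':');
        String prefix = colon < 0 ? "" : trimmed.substring(0, colon);
        String local = colon < 0 ? trimmed : trimmed.substring(colon + 1);
        String uri = scope.resolve(prefix);
        if (uri == null) {
            if (prefix.isEmpty()) {
                return new QName(local); // unprefixed, no default namespace in scope
            }
            throw new IllegalStateException("Undeclared prefix: " + prefix);
        }
        return new QName(uri, local, prefix);
    }
}

class LazyQNameDemo {
    public static void main(String[] args) {
        NsScope envelope = new NsScope(null)
                .declare("wsa", "http://www.w3.org/2005/08/addressing");
        NsScope header = new NsScope(envelope);

        // A URI-valued header: the raw string comes back and no lookup is made.
        TextNode action = new TextNode("urn:example:action", header);
        System.out.println(action.getText());
        System.out.println("lookups so far: " + header.lookups);

        // A QName-valued text: the scope walk happens only on request.
        TextNode subcode = new TextNode("wsa:DestinationUnreachable", header);
        System.out.println(subcode.getTextAsQName());
    }
}
```

With this split, a handler reading wsa:To or wsa:Action calls getText() 
and never pays for a prefix lookup, while code that really expects a 
QName opts in and gets an error if the prefix is undeclared.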

There are also a few other cleanup tasks that should happen in there, 
like pulling out the localName constant into OMConstants instead of 
having a copy of it in every OMText....

Thoughts?  Anyone mind if I go ahead and do this?

--Glen

David Illsley wrote:
> Hi,
> My apologies, I've dug further into the code and I misunderstood what
> it was doing. Happily the first concern I had turns out not to be a
> problem (I thought it was concatenating namespace:localpart rather
> than re-concatenating prefix:localpart).
> 
> I do still think that there may be an inefficiency in there due to
> getText() processing URIs (e.g. wsa:MessageID, wsa:Action) as
> potential QNames, doing a namespace lookup, breaking apart the string
> and then reconcatenating it... but I'm not sure how much of a hit that
> really is?
> 
> Sorry for the confusion,
> David
> 
> On 07/03/07, Eran Chinthaka <ch...@opensource.lk> wrote:
>> Hi David,
>>
>> I don't fully understand what your concern is, but I can try to
>> give some insight into the methods you proposed.
>>
>> I introduced the getText util method to get the aggregation of all the
>> text that can be found under an OMElement. A couple of things qualify
>> as OMText; MTOMized input streams and namespace-qualified text are
>> among them.
>>
>> So wherever you introduce those methods, you will end up writing the
>> same code that OMText contains now.
>>
>> I am just trying to comment on your proposals. If you can give me some
>> more hints on the problem at hand, I might be able to help as I know a
>> little bit of both Axiom and addressing :).
>>
>> Thanks,
>> Chinthaka
>>
>> David Illsley wrote:
>> > Hi all,
>> > I've been looking at the work that occurs for OMElement.getText()
>> > which is currently used by the Axis2 addressing handlers to extract
>> > the (text) values of certain headers (wsa:To, wsa:Action,
>> > wsa:MessageID).
>> >
>> > I've noticed that this delegates to OMTextImpl.getText() which then
>> > does some fancy processing to identify QNames and expand
>> > prefix:localname to namespace:localname
>> >
>> > The values in these headers are generally http or urn URIs, hence
>> > getText() finds the ":" and looks for an associated namespace.
>> >
>> > This leads to a couple of concerns:
>> > 1. The value returned may not be what we're looking for if an
>> > xmlns:http="" or xmlns:urn="" is in scope. This is a very real
>> > problem.
>> > 2. In the 99% case where no such declarations are in scope, the
>> > string is being indexOf()'d and a lookup done when there is probably
>> > nothing to find.
>> >
>> > It appears that I could access the value directly by calling
>> > OMText.getTextCharacters()? However, I'd then have to replicate the
>> > code in OMElement to find the OMText node which I'd prefer not to do.
>> > Would the Axiom team consider adding a char[] getTextCharacters() or
>> > String getActualText() to OMElement?
>> >
>> > Cheers,
>> > David
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: commons-dev-unsubscribe@ws.apache.org
>> For additional commands, e-mail: commons-dev-help@ws.apache.org
>>
>>
> 
> 


Re: [Axiom] What's the best way to get the actual child text from an OMElement? (PROPOSAL)

Posted by Davanum Srinivas <da...@gmail.com>.
Please tread gently :) +1

-- dims


-- 
Davanum Srinivas :: http://wso2.org/ :: Oxygen for Web Services Developers
