Posted to dev@santuario.apache.org by Vishal Mahajan <vm...@amberpoint.com> on 2006/11/05 16:02:23 UTC

Re: PATCH: String Comparison in ElementProxy

Hi Raul,

Could you think of a way to fix the problem without a significant
performance hit? This is clearly a blocking issue for me. If we
eventually decide to change the comparisons to use String.equals(), I
could give you the list of all relevant occurrences in the code.

Thanks,
Vishal

Raul Benito wrote:
> Hi Sean,
>
> The penalty is paid when the strings are not equal but have the same
> length and share a long common prefix. That is sadly a common
> situation with namespace URIs: they are more or less equal in length
> and start with the same http://.../... or urn:....: prefix. That is
> why Xerces and other DOM implementations intern namespace URIs.
>
> I have profiled it and it takes a lot of time.
> My point is that all the parsers I know intern these strings (or did
> when I wrote the implementation). This is an old commit, 8 months old
> (it is true that it is not yet in an official release), and dropping
> == costs a measurable hit on small messages (the kind found in XML
> protocols).
>
> So I will first check other options (changing the configuration of
> the offending parser with a feature
> [http://xerces.apache.org/xerces2-j/features.html]).
> If that does not work I will change from == to equals, but I will
> leave this as a last resort.
>
> On 10/5/06, Sean Mullan <Se...@sun.com> wrote:
>> String.equals will work for both interned and non-interned Strings,
>> since it first checks if they are a reference to the same object. So
>> using String.equals seems safer and should be comparable performance I
>> would think. But maybe I'm missing something?
>>
>> --Sean
>>
>> Vishal Mahajan wrote:
>> > Do others also have views on this discussion?
>> >
>> > Thanks,
>> > Vishal
>> >
>> > Vishal Mahajan wrote:
>> >> Hi Raul,
>> >>
>> >> The parser that I am working with clearly doesn't intern element
>> >> namespace strings, which is the reason I ran into this problem. And
>> >> actually I am not sure whether it's a good idea for a parser to
>> >> intern element namespace strings, given that there could be a huge
>> >> number of elements being parsed and there's a potential risk of
>> >> running out of memory. Also, you mention that Xerces might be
>> >> interning namespace strings, but looking at their code I was unable
>> >> to find that. Can you point me to the relevant piece of code?
>> >>
>> >> Thanks,
>> >>
>> >> Vishal
>> >>
>> >> Raul Benito wrote:
>> >>> Vishal, the problem is that this code is called a gazillion times,
>> >>> and even though it seems a small thing, it adds up to a lot of
>> >>> accumulated time. I have even thought about removing this check
>> >>> altogether or controlling it by a property.
>> >>> Perhaps there is a feature in your DOM parser that interns the
>> >>> namespaces. I have tested with several DOM parsers (xerces,
>> >>> xmlbeans, jaxb) and in all of them the namespace strings are
>> >>> interned.
>> >>> If you are not able to toggle the behavior, we can begin to think
>> >>> about other possibilities (generate code on the fly, or create an
>> >>> interface with one implementation or the other and let the JVM
>> >>> inline it). But I think that will be the last resort.
>> >>>
>> >>> Regards,
>> >>> Raul
>> >>>
>> >>> On 10/2/06, Vishal Mahajan <vm...@amberpoint.com> wrote:
>> >>>> Any signature verification was failing for me, and I have a
>> >>>> different DOM implementation in my environment, so probably you
>> >>>> are right. It was such a basic error that it had to be something
>> >>>> like this. In any case, I think we should keep string comparison
>> >>>> safe.
>> >>>>
>> >>>> Vishal
>> >>>>
>> >>>> Raul Benito wrote:
>> >>>> > Hi Vishal,
>> >>>> >
>> >>>> > The namespace strings are interned, at least in Xerces.
>> >>>> >
>> >>>> > Can you post the code that is failing?
>> >>>> >
>> >>>> > On 10/2/06, Vishal Mahajan <vm...@amberpoint.com> wrote:
>> >>>> >> This problem was not allowing successful creation of
>> >>>> >> signature space elements. Fix attached.
>> >>>> >>
>> >>>> >> Vishal
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >> Index: ElementProxy.java
>> >>>> >> ===================================================================
>> >>>> >> --- ElementProxy.java   (revision 451991)
>> >>>> >> +++ ElementProxy.java   (working copy)
>> >>>> >> @@ -281,7 +281,7 @@
>> >>>> >>
>> >>>> >>        String localnameIS = this._constructionElement.getLocalName();
>> >>>> >>        String namespaceIS = this._constructionElement.getNamespaceURI();
>> >>>> >> -      if ((namespaceSHOULDBE!=namespaceIS) ||
>> >>>> >> +      if (!namespaceSHOULDBE.equals(namespaceIS) ||
>> >>>> >>         !localnameSHOULDBE.equals(localnameIS) ) {
>> >>>> >>           Object exArgs[] = { namespaceIS +":"+ localnameIS,
>> >>>> >>             namespaceSHOULDBE +":"+ localnameSHOULDBE};
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >
>> >>>> >
>> >>>>
>> >>>>
>> >>>>
>> >>>
>> >>>
>> >>
>> >>
>> >
>> >
>>
>>
>
>
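
The difference between the two comparisons discussed in this thread can
be sketched as follows. This is a minimal illustration, not code from
the library: the class and method names are hypothetical, the namespace
URI is only an example value, and new String(...) stands in for a
parser that hands back namespace URIs without interning them.

```java
final class NamespaceCompare {
    // Example value only; a compile-time constant, so it lives in the string pool.
    static final String EXPECTED = "http://www.w3.org/2000/09/xmldsig#";

    // Reference comparison: only true when both variables point to the same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Content comparison: true for equal characters, interned or not.
    static boolean sameContent(String a, String b) {
        return a != null && a.equals(b);
    }

    public static void main(String[] args) {
        // A distinct object with identical content, as a non-interning parser might return.
        String fromParser = new String(EXPECTED);

        System.out.println(sameReference(EXPECTED, fromParser));          // false
        System.out.println(sameContent(EXPECTED, fromParser));            // true
        // Interning maps the copy back to the pooled instance.
        System.out.println(sameReference(EXPECTED, fromParser.intern())); // true
    }
}
```

With a parser that interns namespace URIs the first comparison would
also succeed, which is why the == check only fails on some DOM
implementations.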



Re: PATCH: String Comparison in ElementProxy

Posted by Raul Benito <ra...@apache.org>.
Hi Vishal,
This is going to be one of the features for 1.4.1, along with the
DocumentBuilder polling.
I want to take out all the hard-coded namespace comparisons and let the
API user decide how strict the element checking should be: namespace
checking can be disabled altogether, or done with == or equals().
I will post my ideas for discussion when I finish with the 1.4 release.
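
One possible shape for such a user-selectable check is sketched below.
It is only an assumption about how the three options (no check, ==,
equals()) could be modelled; the enum NamespaceCheck, its constants and
the matches() method are hypothetical names and do not describe any
released API.

```java
import java.util.Objects;

// Hypothetical policy type: one constant per option mentioned above.
enum NamespaceCheck {
    // Namespace checking disabled: every element is accepted.
    NONE {
        @Override
        boolean matches(String expected, String actual) {
            return true;
        }
    },
    // Fast path that is only correct when the parser interns namespace URIs.
    IDENTITY {
        @Override
        boolean matches(String expected, String actual) {
            return expected == actual;
        }
    },
    // Safe path: content comparison that also tolerates null namespaces.
    EQUALS {
        @Override
        boolean matches(String expected, String actual) {
            return Objects.equals(expected, actual);
        }
    };

    abstract boolean matches(String expected, String actual);
}

final class NamespaceCheckDemo {
    public static void main(String[] args) {
        String expected = "http://www.w3.org/2000/09/xmldsig#"; // example value
        String actual = new String(expected); // simulates a non-interned URI

        for (NamespaceCheck check : NamespaceCheck.values()) {
            System.out.println(check + " -> " + check.matches(expected, actual));
        }
    }
}
```

With a non-interned URI, IDENTITY rejects the element while NONE and
EQUALS accept it, so the default choice decides whether parsers that do
not intern namespaces work out of the box.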



-- 
http://r-bg.com