Posted to j-dev@xerces.apache.org by Dave Brosius <db...@qis.net> on 2004/07/08 04:29:47 UTC

Using == on strings

I ran across this in org.apache.xerces.dom.DeferredDocumentImpl

Shouldn't this be 

            if (getChunkValue(fNodeName, achunk, aindex).equals( name)) {

?





    public String getAttribute(int elemIndex, String name) {
        if (elemIndex == -1 || name == null) {
            return null;
        }
        int echunk = elemIndex >> CHUNK_SHIFT;
        int eindex = elemIndex & CHUNK_MASK;
        int attrIndex = getChunkIndex(fNodeExtra, echunk, eindex);
        while (attrIndex != -1) {
            int achunk = attrIndex >> CHUNK_SHIFT;
            int aindex = attrIndex & CHUNK_MASK;
            if (getChunkValue(fNodeName, achunk, aindex) == name) {
                return getChunkValue(fNodeValue, achunk, aindex);
            }
            attrIndex = getChunkIndex(fNodePrevSib, achunk, aindex);
        }
        return null;
    }
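
For readers less familiar with the distinction this question turns on, here is a minimal sketch of it. It is not Xerces code, and the class and method names are made up for illustration. It shows that == compares references while equals() compares contents, and that String.intern() makes the two agree.

```java
// Illustration only: not part of Xerces. All names here are hypothetical.
class StringIdentityDemo {

    /** Reference comparison: true only if a and b are the same object. */
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "id";          // string literals are interned by the JVM
        String copy = new String("id"); // a distinct object with equal content

        System.out.println(sameReference(literal, copy));          // false
        System.out.println(literal.equals(copy));                  // true
        System.out.println(sameReference(literal, copy.intern())); // true
    }
}
```

Whether == is safe therefore depends entirely on whether both operands are guaranteed to be canonical (interned) instances.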

Re: Using == on strings

Posted by db...@qis.net.
ah! ok... silly me.

Quoting Sandy Gao <sa...@ca.ibm.com>:

> But when two strings are not equal, == gives the "false" answer immediately,
> while equals() still has to do more work before it can return false.
> (Think about "aaaaaaaaaaaaaaaaaaaaaaaaaa" and "aaaaaaaaaaaaaaaaaaaaaaaaab":
> same length, differing only in the last character.)




---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-dev-help@xml.apache.org


Re: Using == on strings

Posted by Joseph Kesselman <ke...@us.ibm.com>.



Typical implementation of String.equals() (like most hashable objects) is:

      1. Compare ==, in case they're the same object.
      2. Compare cached hashcodes; if they don't match they aren't the same
         value.
      3. Then compare content in case two values just happened to have the
         same hash.
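
The three steps listed above might look like this for a small immutable value class. This is a sketch under assumptions: Token is a hypothetical class invented for illustration, not JDK or Xerces code. As far as I can tell from the JDK sources, java.lang.String.equals() itself checks identity and then length before comparing characters, rather than comparing cached hash codes, so read this as the general pattern for hashable objects.

```java
import java.util.Arrays;

// Hypothetical value class illustrating the pattern; not JDK or Xerces code.
final class Token {
    private final char[] chars;
    private final int hash; // computed once, since the content never changes

    Token(String text) {
        this.chars = text.toCharArray();
        this.hash = Arrays.hashCode(chars);
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;                         // same object
        }
        if (!(other instanceof Token)) {
            return false;                        // null or a different type
        }
        Token that = (Token) other;
        if (hash != that.hash) {
            return false;                        // different hashes, different values
        }
        return Arrays.equals(chars, that.chars); // rule out a hash collision
    }
}
```

Only the first check is as cheap as a bare ==; everything after it is work that a reference comparison skips.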

Using ==, when you know the strings are interned or otherwise unique
values, does potentially save some cycles for the call-and-return. A good
JIT compiler *may* be able to give close to the same performance by doing a
bit of slightly-intelligent inlining... but if you're in an inner loop, the
hand-optimization may be fully justified. Especially if performance
analysers are telling you it really does make a difference.

But it may be worth tossing a comment or two into the code to reassure
folks that == really was intended.
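
A comment of the kind suggested here could be as short as the one in this sketch. AttributeNames and indexOf are hypothetical names used only for illustration; this is not the Xerces source.

```java
// Hypothetical helper showing an "== is intended" comment; not Xerces code.
class AttributeNames {

    /**
     * Returns the index of name in names, or -1 if it is absent.
     * Both the array entries and name must already be interned.
     */
    static int indexOf(String[] names, String name) {
        for (int i = 0; i < names.length; i++) {
            // Intentional reference comparison: every name reaching this
            // method is interned, so == is equivalent to equals() here.
            if (names[i] == name) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] names = { "id", "class", "href" }; // literals are interned
        System.out.println(indexOf(names, "class"));                      // 1
        System.out.println(indexOf(names, new String("class")));          // -1
        System.out.println(indexOf(names, new String("class").intern())); // 1
    }
}
```

Stating the precondition in the Javadoc as well as at the comparison tells both callers and later maintainers what the method relies on.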

______________________________________
Joe Kesselman, IBM Next-Generation Web Technologies: XML, XSL and more.
"The world changed profoundly and unpredictably the day Tim Berners Lee
got bitten by a radioactive spider." -- Rafe Culpin, in r.m.filk




Re: Using == on strings

Posted by Sandy Gao <sa...@ca.ibm.com>.



But when two strings are not equal, == gives the "false" answer immediately,
while equals() still has to do more work before it can return false.
(Think about "aaaaaaaaaaaaaaaaaaaaaaaaaa" and "aaaaaaaaaaaaaaaaaaaaaaaaab":
same length, differing only in the last character.)
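
To make the cost concrete, here is a small sketch that counts how many characters a content comparison has to examine for two such strings. contentEquals is a hypothetical stand-in for the character-by-character phase of an equals() method, written for illustration; it is not the JDK source.

```java
import java.util.Arrays;

// Illustration only; names are hypothetical and this is not JDK code.
class UnequalStringsDemo {
    static int comparisons; // characters examined by the last contentEquals call

    static boolean contentEquals(String a, String b) {
        comparisons = 0;
        if (a.length() != b.length()) {
            return false; // cheap exit, but not available when lengths match
        }
        for (int i = 0; i < a.length(); i++) {
            comparisons++;
            if (a.charAt(i) != b.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        char[] chars = new char[26];
        Arrays.fill(chars, 'a');
        String a = new String(chars); // 26 times 'a'
        chars[25] = 'b';
        String b = new String(chars); // 25 times 'a', then 'b'

        System.out.println(a == b);              // false, one reference comparison
        System.out.println(contentEquals(a, b)); // false, after scanning all characters
        System.out.println(comparisons);         // 26
    }
}
```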

Sandy Gao
Software Developer, IBM Canada
(1-905) 413-3255
sandygao@ca.ibm.com


dbrosius@qis.net wrote on 07/09/2004 12:36:07 PM:

> Ok,
>
>     Seems like a silly design choice as String.equals does this
>
>     public boolean equals(Object anObject) {
>         if (this == anObject) {
>             return true;
>         }
>
>         ....
>         ....
>
>
>
> There are over 30 String comparisons using == throughout the code, btw.
>
> But thanks for clearing that up.




Re: Using == on strings

Posted by db...@qis.net.
Ok,

    Seems like a silly design choice as String.equals does this

    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }

        ....
        ....



There are over 30 String comparisons using == throughout the code, btw.

But thanks for clearing that up.


Quoting Michael Glavassevich <mr...@ca.ibm.com>:

> Hello Dave,
> 
> If you've been looking through the source code you'll notice that in many
> places we compare strings by reference instead of calling equals(). For
> performance reasons Xerces keeps a table of unique strings for XML names
> (elements, attributes, entities, etc.) and namespace names (URIs). These
> strings have all been interned (by calling String.intern()), so unique
> references for names and namespace names are passed through the parser's
> components. Methods which are part of the public API cannot take advantage
> of this because a user may pass a reference to some other string object.
> However, this method is not part of the DOM API. I did a search for
> references to this method using Eclipse and it seems that it's not being
> called anywhere.
> 
> Thanks.






Re: Using == on strings

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Hello Dave,

If you've been looking through the source code you'll notice that in many
places we compare strings by reference instead of calling equals(). For
performance reasons Xerces keeps a table of unique strings for XML names
(elements, attributes, entities, etc.) and namespace names (URIs). These
strings have all been interned (by calling String.intern()), so unique
references for names and namespace names are passed through the parser's
components. Methods which are part of the public API cannot take advantage
of this because a user may pass a reference to some other string object.
However, this method is not part of the DOM API. I did a search for
references to this method using Eclipse and it seems that it's not being
called anywhere.
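
The scheme described above can be sketched roughly as follows. This is a minimal illustration that assumes a hypothetical NameTable class built directly on String.intern(); it is not the actual Xerces symbol table, whose implementation is not shown in this thread. The public lookup here interns its argument before delegating, which is one possible way to handle arbitrary caller strings (calling equals() would be another).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the Xerces implementation.
class NameTable {
    private final List<String> names = new ArrayList<>();
    private final List<String> values = new ArrayList<>();

    /** Returns the canonical instance for a name; equal names share it. */
    static String symbol(String name) {
        return name.intern();
    }

    /** Stores the value under the canonical instance of the name. */
    void put(String name, String value) {
        names.add(symbol(name));
        values.add(value);
    }

    /** Internal lookup: the caller must pass a canonical (interned) name. */
    String getInternal(String internedName) {
        for (int i = 0; i < names.size(); i++) {
            if (names.get(i) == internedName) { // identity is intended here
                return values.get(i);
            }
        }
        return null;
    }

    /** Public lookup: callers may pass any String, so canonicalize first. */
    String get(String name) {
        return name == null ? null : getInternal(symbol(name));
    }

    public static void main(String[] args) {
        NameTable table = new NameTable();
        table.put("href", "index.html");
        String userName = new String("href"); // equal content, different object
        System.out.println(table.getInternal(userName)); // null
        System.out.println(table.get(userName));         // index.html
    }
}
```

The first lookup fails because the caller's string is not the canonical instance, which is exactly why a method exposed to users cannot rely on == alone.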

Thanks.

"Dave Brosius" <db...@qis.net> wrote on 07/07/2004 10:29:47 PM:

> I ran across this in org.apache.xerces.dom.DeferredDocumentImpl
> 
> Shouldn't this be 
> 
>             if (getChunkValue(fNodeName, achunk, aindex).equals( name)) {
> 
> ?
> 
> 
>     public String getAttribute(int elemIndex, String name) {
>         if (elemIndex == -1 || name == null) {
>             return null;
>         }
>         int echunk = elemIndex >> CHUNK_SHIFT;
>         int eindex = elemIndex & CHUNK_MASK;
>         int attrIndex = getChunkIndex(fNodeExtra, echunk, eindex);
>         while (attrIndex != -1) {
>             int achunk = attrIndex >> CHUNK_SHIFT;
>             int aindex = attrIndex & CHUNK_MASK;
>             if (getChunkValue(fNodeName, achunk, aindex) == name) {
>                 return getChunkValue(fNodeValue, achunk, aindex);
>             }
>             attrIndex = getChunkIndex(fNodePrevSib, achunk, aindex);
>         }
>         return null;
>     }

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org