Posted to j-dev@xerces.apache.org by Dave Brosius <db...@mebigfatguy.com> on 2008/01/09 07:01:06 UTC

Re: Interning strategy

Greetings, I was perusing old mailing list emails, and stumbled onto the 
following email sent some time ago :)

Luckily, from a quick perusal of the code, it appears that the email still 
applies.

I have a question about the implementation of SymbolTable.

As expected, it appears to me that it hashes to find a bucket, then walks 
the chain of pointers from the bucket to find a string that is 'equals'.

Only if it doesn't exist is a new one added. All of this makes sense.

The question I have, then, is why, when you add an entry,

public Entry(String symbol, Entry next) {
    this.symbol = symbol.intern();
    characters = new char[symbol.length()];
    symbol.getChars(0, characters.length, characters, 0);
    this.next = next;
}

does the code intern the string? Isn't the point of this class to stop 
pollution of the constant pool and perm gen (besides allowing for alternate 
hashing)?
Given that the one String that lives in the SymbolTable is returned, I would 
think intern is redundant.
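
For readers following along, the lookup described above can be sketched roughly as follows. This is a minimal illustration and not the Xerces source: the class name MiniSymbolTable, the use of String.hashCode() for bucket selection, and the fixed bucket count are all assumptions made for the example. Only the intern() call in the Entry constructor mirrors the code quoted in this message.

```java
/** Minimal chained-bucket symbol table; a sketch, not the Xerces implementation. */
final class MiniSymbolTable {

    /** One link in a bucket's chain. */
    private static final class Entry {
        final String symbol;
        final Entry next;

        Entry(String symbol, Entry next) {
            // Mirrors the quoted constructor: the stored instance is the interned one.
            this.symbol = symbol.intern();
            this.next = next;
        }
    }

    private final Entry[] buckets;

    MiniSymbolTable(int bucketCount) {
        if (bucketCount <= 0) {
            throw new IllegalArgumentException("bucketCount must be positive");
        }
        buckets = new Entry[bucketCount];
    }

    /** Returns the canonical instance for the given string, adding it if absent. */
    String addSymbol(String symbol) {
        // Assumption: String.hashCode() stands in for whatever hash the real table uses.
        int bucket = (symbol.hashCode() & 0x7FFFFFFF) % buckets.length;

        // Walk the chain looking for an entry that is 'equals'.
        for (Entry e = buckets[bucket]; e != null; e = e.next) {
            if (e.symbol.equals(symbol)) {
                return e.symbol;
            }
        }

        // Not found: add a new entry at the head of the chain.
        Entry created = new Entry(symbol, buckets[bucket]);
        buckets[bucket] = created;
        return created.symbol;
    }
}
```

In this sketch, two calls to addSymbol with equal but distinct String objects return the same instance even without the intern() call; the intern() only matters if callers outside the table compare against strings obtained elsewhere, such as literals.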

thanks,
dave

----- Original Message ----- 
From: "Michael Glavassevich" <mr...@ca.ibm.com>
To: <j-...@xerces.apache.org>
Sent: Sunday, July 24, 2005 11:57 AM
Subject: Re: Interning strategy


Elliotte Harold <el...@metalab.unc.edu> wrote on 07/22/2005 09:35:02 PM:

> Suppose I turn on interning in the parser by setting the SAX property
> http://xml.org/sax/features/string-interning to true. Will Xerces simply
> invoke the String.intern() method on the strings it creates or does it
> do something fancier like maintaining its own pool of string constants
> and reuse those?

It maintains a pool. See org.apache.xerces.util.SymbolTable, specifically
the addSymbol() methods.

> -- 
> Elliotte Rusty Harold  elharo@metalab.unc.edu
> XML in a Nutshell 3rd Edition Just Published!
> http://www.cafeconleche.org/books/xian3/
> http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
> For additional commands, e-mail: j-dev-help@xerces.apache.org
>

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org 




Re: Question regarding getting xml document encoding type

Posted by Nathan Beyer <nb...@gmail.com>.
Try the Locator2 interface -
http://java.sun.com/javase/6/docs/api/org/xml/sax/ext/Locator2.html
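
A minimal sketch of that suggestion, assuming the parser hands the handler a locator that implements Locator2 (not every parser does, so the code checks). The class name EncodingProbe and the helper encodingOf are made up for this example. Note that getEncoding() reports the encoding in effect for the entity, which matches the declaration only when nothing external overrides it, and the value is read at the first startElement on the assumption that the XML declaration has been processed by then.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.ext.Locator2;
import org.xml.sax.helpers.DefaultHandler;

/** Captures the encoding reported by the parser's locator; a sketch only. */
final class EncodingProbe extends DefaultHandler {
    private Locator locator;
    private String encoding;
    private boolean seenRoot;

    @Override
    public void setDocumentLocator(Locator locator) {
        // The parser supplies the locator before any other callback.
        this.locator = locator;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (!seenRoot) {
            seenRoot = true;
            // Locator2 is an optional extension, so test before casting.
            if (locator instanceof Locator2) {
                encoding = ((Locator2) locator).getEncoding();
            }
        }
    }

    /** Returns the reported encoding, or null if the parser does not expose it. */
    static String encodingOf(byte[] xml) {
        try {
            EncodingProbe probe = new EncodingProbe();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new ByteArrayInputStream(xml)), probe);
            return probe.encoding;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

Passing the document as a byte stream rather than a character stream matters here: with a Reader the parser never sees the original bytes, so the reported encoding may not reflect the declaration.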

-Nathan

On Jan 9, 2008 4:57 PM, Huynh, Lynn T. <ly...@unisys.com> wrote:
> Hi,
> I'm using SAX to parse XML documents, and have a need to know the
> encoding type that was declared in the XML document.  For example, if
> the XML document has this:
> <?xml version="1.0" encoding="UTF-8" ?>
>
> I would like to receive the "UTF-8". I'm a newbie to XML parsing.  I've
> searched, but have not been able to find a way to get this information.
> Could you please suggest what I can do to get it?
>
> Thank you in advance.
> Lynn Huynh


Question regarding getting xml document encoding type

Posted by "Huynh, Lynn T." <ly...@unisys.com>.
Hi,
I'm using SAX to parse XML documents, and have a need to know the
encoding type that was declared in the XML document.  For example, if
the XML document has this:
<?xml version="1.0" encoding="UTF-8" ?> 

I would like to receive the "UTF-8". I'm a newbie to XML parsing.  I've
searched, but have not been able to find a way to get this information.
Could you please suggest what I can do to get it?

Thank you in advance.
Lynn Huynh



Re: Interning strategy

Posted by Dave Brosius <db...@mebigfatguy.com>.
So then, if I understand you correctly, the SymbolTable is used 
primarily to

1) avoid the synchronization cost of intern,

and secondarily to

2) allow for possible alternative hash algorithms.

The SoftReferenceSymbolTable, in addition, allows the temporary symbol table 
buttressing to be released, but does not do anything about intern bloat.

It's a shame that Xerces doesn't instead just allow for the installation of 
a custom interning manager, such as

public SAXParser(Interner interner);

with a default implementation (among others) being just a simple identity 
hash map, and with the user code then needing to reference that interner:

public class MyContentManager extends DefaultHandler {
    String lookForNode = interner.intern("TheNodeNameIWant");
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        if (localName == lookForNode)
            dosomethinguseful();
    }
}
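
To make the proposal concrete, here is one way such a pluggable interner might look. Everything in it is hypothetical: Xerces has no Interner interface or SAXParser(Interner) constructor, and the names Interner and MapInterner exist only for this sketch. It also swaps the identity hash map mentioned above for an ordinary value-keyed HashMap, since the lookup has to match strings that are equal but not the same object.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical extension point from the proposal above; not a Xerces API. */
interface Interner {
    String intern(String s);
}

/** Canonicalizes strings by value in a private pool instead of String.intern(). */
final class MapInterner implements Interner {
    // Assumption: single-threaded use; a shared interner would need synchronization.
    private final Map<String, String> pool = new HashMap<>();

    @Override
    public String intern(String s) {
        // putIfAbsent returns the existing canonical instance, or null if s was just added.
        String canonical = pool.putIfAbsent(s, s);
        return canonical != null ? canonical : s;
    }
}
```

Because the pool is an ordinary object, its strings become collectable once the interner itself is unreachable, which is the property the proposal is after.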

But OK, thanks, I think I understand now.
dave

----- Original Message ----- 
From: "Michael Glavassevich" <mr...@ca.ibm.com>
To: <j-...@xerces.apache.org>
Sent: Wednesday, January 09, 2008 10:52 PM
Subject: Re: Interning strategy


> Hi Dave,
>
> The strings didn't need to be interned for Xerces' internals to work
> correctly (though the code has since evolved to depend on that now). It's
> just cheaper to do the intern once and cache it in the SymbolTable than to
> do it later, possibly multiple times at the API layer. Some history here
> [1] if you're interested.
>
> Thanks.
>
> [1] http://issues.apache.org/jira/browse/XERCESJ-6
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@ca.ibm.com
> E-mail: mrglavas@apache.org


Re: Interning strategy

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Hi Dave,

The strings didn't need to be interned for Xerces' internals to work
correctly (though the code has since evolved to depend on that now). It's
just cheaper to do the intern once and cache it in the SymbolTable than to
do it later, possibly multiple times at the API layer. Some history here
[1] if you're interested.

Thanks.

[1] http://issues.apache.org/jira/browse/XERCESJ-6

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

"Dave Brosius" <db...@apache.org> wrote on 01/09/2008 10:27:20 PM:

> Clearly based on your response, and the fact that the Soft referenced table
> also interns, i completely misunderstood (and still do) what the SymbolTable
> class is used for.
>
> I guess i'll have to take another attempt at understanding what it is being
> used for.


Re: Interning strategy

Posted by Dave Brosius <db...@apache.org>.
Clearly, based on your response and the fact that the soft-referenced table 
also interns, I completely misunderstood (and still do) what the SymbolTable 
class is used for.

I guess I'll have to take another attempt at understanding what it is being 
used for.


----- Original Message ----- 
From: "Michael Glavassevich" <mr...@ca.ibm.com>
To: <j-...@xerces.apache.org>
Sent: Wednesday, January 09, 2008 4:16 PM
Subject: Re: Interning strategy


> Hi Dave,
>
> It's being interned for the application. Allows your SAX content handler to
> compare the names of elements, attributes, etc... using reference
> comparison [1] instead of equals for better performance. There's an
> alternate implementation of the SymbolTable [2] which is more sensitive to
> memory usage. It allows interned strings to be garbage collected if they're
> only reachable through the SymbolTable.
>
> Thanks.
>
> [1] http://xerces.apache.org/xerces2-j/features.html#string-interning
> [2]
> http://xerces.apache.org/xerces2-j/javadocs/xerces2/org/apache/xerces/util/SoftReferenceSymbolTable.html
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@ca.ibm.com
> E-mail: mrglavas@apache.org


Re: Interning strategy

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Hi Dave,

It's being interned for the application. Allows your SAX content handler to
compare the names of elements, attributes, etc... using reference
comparison [1] instead of equals for better performance. There's an
alternate implementation of the SymbolTable [2] which is more sensitive to
memory usage. It allows interned strings to be garbage collected if they're
only reachable through the SymbolTable.
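
A short sketch of what that reference comparison looks like from the application side, assuming a parser that honors the SAX string-interning feature. The class name ItemCounter, the element name "item", and the helper countItems are made up for the example; the feature URI is the one discussed in this thread.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Counts "item" elements using == instead of equals(); a sketch only. */
final class ItemCounter extends DefaultHandler {
    // String literals are interned by the JVM, so this is the canonical instance.
    private static final String ITEM = "item";
    private int count;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Reference comparison: only valid because the parser reports interned names.
        if (localName == ITEM) {
            count++;
        }
    }

    static int countItems(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so that localName is populated
            XMLReader reader = factory.newSAXParser().getXMLReader();
            // Request interned names; a parser that cannot comply throws here.
            reader.setFeature("http://xml.org/sax/features/string-interning", true);
            ItemCounter handler = new ItemCounter();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.count;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

If the feature cannot be enabled, the safe fallback is to compare with equals(), since == would then silently miss matches.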

Thanks.

[1] http://xerces.apache.org/xerces2-j/features.html#string-interning
[2]
http://xerces.apache.org/xerces2-j/javadocs/xerces2/org/apache/xerces/util/SoftReferenceSymbolTable.html

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

"Dave Brosius" <db...@mebigfatguy.com> wrote on 01/09/2008 01:01:06 AM:

> Greetings, I was perusing old mailing list emails, and stumbled onto the
> following email sent some time ago :)
>
> Luckily, from a quick perusal of the code, it appears that the email still
> applies.
>
> I have a question about the implementation of SymbolTable.
>
> As expected, it appears to me that it hashes to find a bucket, then walks
> the chain of pointers from the bucket to find a string that is 'equals'.
>
> Only if it doesn't exist is a new one added. All of this makes sense.
>
> The question I have, then, is why, when you add an entry,
>
> public Entry(String symbol, Entry next) {
>     this.symbol = symbol.intern();
>     characters = new char[symbol.length()];
>     symbol.getChars(0, characters.length, characters, 0);
>     this.next = next;
> }
>
> does the code intern the string? Isn't the point of this class to stop
> pollution of the constant pool and perm gen (besides allowing for alternate
> hashing)?
> Given that the one String that lives in the SymbolTable is returned, I would
> think intern is redundant.
>
> thanks,
> dave