Posted to p-dev@xerces.apache.org by Brian Faull <bf...@mitre.org> on 2003/11/15 02:54:02 UTC

DOMInputSource vs InputSource (was Re: Memory Leak?)

Jason, et al,

Trying to implement the solution you list -- but I need to take input from
a local variable. My previous attempt used a XercesDOMParser rather than
DOMBuilder... and XercesDOMParser::parse was AbstractDOMParser::parse,
which took an InputSource (can be StdIn*, MemBuf*, LocalFile*, ...).
DOMBuilder::parse rather takes a DOMInputSource, which seems that it may
only be a file or URL natively.

There is a handy class called Wrapper4InputSource which isa DOMInputSource
which can allow an InputSource to be used as a DOMInputSource. However...
Wrapper4InputSource doesn't appear to be available in XML::Xerces. I
didn't see any tests with a DOMInputSource used, either... Have I missed
something, or...?
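
For illustration, here is a minimal sketch of what parsing from a local
variable might look like if Wrapper4InputSource were exposed to Perl. The
XML::Xerces::Wrapper4InputSource class and its one-argument constructor are
assumptions (as noted above, the class is not in XML::Xerces at this point).
The MemBufInputSource and DOMBuilder calls follow the memtest.pl excerpt
quoted below.

```perl
use strict;
use warnings;
use XML::Xerces;

my $xml = '<doc><item>text</item></doc>';   # XML held in a local variable

my $impl = XML::Xerces::DOMImplementationRegistry::getDOMImplementation('LS');
my $parser = $impl->createDOMBuilder(
    $XML::Xerces::DOMImplementationLS::MODE_SYNCHRONOUS, '');

eval {
    # MemBufInputSource is an InputSource over an in-memory buffer.
    my $is = XML::Xerces::MemBufInputSource->new($xml);

    # ASSUMPTION: Wrapper4InputSource is wrapped for Perl and takes the
    # InputSource as its only required argument, as in Xerces-C.
    my $dom_is = XML::Xerces::Wrapper4InputSource->new($is);

    # DOMBuilder::parse takes a DOMInputSource, which the wrapper provides.
    my $doc = $parser->parse($dom_is);
    # ... manipulate or serialize $doc here ...
};
XML::Xerces::error($@) if $@;

# Release the documents owned by the builder once they are no longer needed.
$parser->resetDocumentPool();
```

In Xerces-C the wrapper adopts the InputSource by default, so a Perl binding
would have to make sure the underlying object is not freed twice. How that
would be handled is not settled here.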

-brian


"Jason E. Stewart" wrote:
> 
> Chris Cheung <ch...@clc.cuhk.edu.hk> writes:
> 
> >   I need to use Xerces-Perl to repeatedly
> >
> >     - parse XML to DOM tree
> >     - manipulate the DOM tree
> >     - pretty-print the DOM tree
> >
> > It seems that serious memory leak occurs. I do not hold any reference to
> > the Perl Xerces object used in the programme and hence the Perl objects
> > should be DESTROYed after use, but memory usage keeps on increasing as
> > reported by the OS.
> 
> Hi Chris,
> 
> So I think I've solved this. First use DOMBuilder, and second ensure
> that resetDocumentPool() gets called after you are finished with the
> DOM tree. Here's some code stolen from my memtest.pl:
> 
> my $impl = XML::Xerces::DOMImplementationRegistry::getDOMImplementation('LS');
> my $parser = $impl->createDOMBuilder($XML::Xerces::DOMImplementationLS::MODE_SYNCHRONOUS,'');
> 
> sub validate_builder {
>   my $xml = shift ;
>   my $parser = shift;
> 
>   eval {
>     $parser->setFeature("$XML::Xerces::XMLUni::fgDOMNamespaces", $namespace) ;
>     $parser->setFeature("$XML::Xerces::XMLUni::fgXercesSchema", $schema) ;
>     $parser->setFeature("$XML::Xerces::XMLUni::fgXercesSchemaFullChecking", $schema) ;
> #    $parser->setFeature("http://apache.org/xml/features/validation-error-as-fatal", $validate) ;
>   };
>   XML::Xerces::error($@) if $@;
> 
>   eval {
>     $parser->setFeature("$XML::Xerces::XMLUni::fgDOMValidation", $validate) ;
> #    $parser->setFeature("$XML::Xerces::XMLUni::fgDOMValidateIfSchema", 0) ;
>   };
>   XML::Xerces::error($@) if $@;
> 
>   eval {
>     # my $is = XML::Xerces::MemBufInputSource->new($xml);
>     # my $is = XML::Xerces::LocalFileInputSource->new($OPTIONS{file});
>     $parser->parseURI($OPTIONS{file}) ;
>   } ;
>   XML::Xerces::error($@) if $@;
>   $parser->resetDocumentPool();
> }
> 
> Using this code I'm not getting any perceptible leaks with 2.3.0-4.
> 
> Let me know if this helps,
> jas.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: xerces-p-dev-unsubscribe@xml.apache.org
> For additional commands, e-mail: xerces-p-dev-help@xml.apache.org

-- 
Brian Faull
Senior Integrated Electronics Engineer
D620 - Communications and Networking
The MITRE Corporation
202 Burlington Road, MS E015
Bedford, MA 01730-1420
V:781.271.5736  F:781.271.8875
mailto:bfaull@mitre.org


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-p-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-p-dev-help@xml.apache.org


Re: DOMInputSource vs InputSource (was Re: Memory Leak?)

Posted by "Jason E. Stewart" <ja...@openinformatics.com>.
Brian Faull <bf...@mitre.org> writes:

> > So since I don't know how to do automatic wrapping of the docs, I've
> > been trying to generate as many example files as possible that show
> > how to use the API.
> 
> ... often more helpful anyway! The steal-and-modify programming paradigm
> is much easier than the figure-it-out-yourself technique. :)

;-)

Yup.

> > Ok, will do. What error do you get when you run:
> > 
> >   svn co http://svn.apache.org/repos/asf/xml/xerces-p/trunk/
> 
>   svn: RA layer request failed
>   svn: REPORT request failed on '/repos/asf/!svn/vcc/default'
>   svn: REPORT of '/repos/asf/!svn/vcc/default': 400 Bad Request
> (http://svn.apache.org)
> 
> > Are you behind a firewall? If so you need to set up your proxy
> > attributes in ~/.subversion.
> 
> Already did... am behind firewall -- modified proxy settings accordingly
> in ~/.subversion/servers which gives the above error. Just tried it
> without proxy settings, gives "could not connect to server" as
> expected...

This sounds like your HTTP proxy is blocking some of the protocol
commands that SVN needs (but web browsers don't). See the SVN FAQ for
details. 

> On subversion.tigris.org, it suggests using the form
>   $ svn co <URL> <project_name>

Only if you want to rename the directory created by the checkout.

> Did you (or where did you?) post the tarball? (sorry... not trying to be
> impatient, no hurry, just couldn't find it...)

No, I'm still running memory tests to figure out where the problem
lies. Once I post the tarball, I'll message the list.

Cheers,
jas.



Re: DOMInputSource vs InputSource (was Re: Memory Leak?)

Posted by Brian Faull <bf...@mitre.org>.
"Jason E. Stewart" wrote:
<snip>
> The issue I have always had is maintenance of static documentation -
> our docs would be based on the Xerces-C API, and if it changes our docs
> need to change. It's one thing if you're documenting your *own* API -
> if you change the code, you change the docs. But in this case it's
> monitoring and changing someone else's docs, and that rubs me the wrong
> way.

Agree completely -- no blame from here!

> Unfortunately, while there are excellent automatic documentation tools
> (like the doxygen system used by Xerces-C), there are no systems (that
> I'm aware of) that will auto-convert from one API (in C++) to another
> API (in Perl); SWIG only handles wrapping of the
> code. So, for example, I would want any method in Xerces that requires
> an XMLCh* (their unicode string), to be documented in Perl as simply
> taking a scalar (I do a bit under the hood to ensure that you can pass
> any scalar value, string, int, or float and have it handled properly).

Sounds like a darn good idea. Also sounds like a difficult task, to get
auto-documentation to get the semantics of arguments right in such a
loosely-typed language as Perl...

> So since I don't know how to do automatic wrapping of the docs, I've
> been trying to generate as many example files as possible that show
> how to use the API.

... often more helpful anyway! The steal-and-modify programming paradigm
is much easier than the figure-it-out-yourself technique. :)

> > > I'll add it and commit it to the SVN repository. You can either grab
> > > it from there or I can make a dev snapshot tarball.
> >
> > Installed and like svn, but don't seem to have the right configuration yet
> > -- can't snarf the repository. If you would post a tarball at your
> > convenience, that would be super.
> 
> Ok, will do. What error do you get when you run:
> 
>   svn co http://svn.apache.org/repos/asf/xml/xerces-p/trunk/

  svn: RA layer request failed
  svn: REPORT request failed on '/repos/asf/!svn/vcc/default'
  svn: REPORT of '/repos/asf/!svn/vcc/default': 400 Bad Request
(http://svn.apache.org)

> Are you behind a firewall? If so you need to set up your proxy
> attributes in ~/.subversion.

Already did... am behind firewall -- modified proxy settings accordingly
in ~/.subversion/servers which gives the above error. Just tried it
without proxy settings, gives "could not connect to server" as expected...

On subversion.tigris.org, it suggests using the form
  $ svn co <URL> <project_name>

Or using port 81 rather than 80, or using SSL (https)... also tried
those... no luck... 

Did you (or where did you?) post the tarball? (sorry... not trying to be
impatient, no hurry, just couldn't find it...)

Thanks again,
-brian




Re: DOMInputSource vs InputSource (was Re: Memory Leak?)

Posted by "Jason E. Stewart" <ja...@openinformatics.com>.
Brian Faull <bf...@mitre.org> writes:

> > Aha! I wondered what that class was for (blush) ...
> 
> So did I. It was mind-numbing looking at the API docs for DOMInputSource,
> InputSource, and Wrapper4* classes for both... :-\ I'm no OO expert,
> but... hmmm...

Yes... At least the Xerces-C team is better at documentation than I
have been. Making POD docs for XML-Xerces has been on the TODO list
for, well, ever since the project started!

The issue I have always had is maintenance of static documentation -
our docs would be based on the Xerces-C API, and if it changes our docs
need to change. It's one thing if you're documenting your *own* API -
if you change the code, you change the docs. But in this case it's
monitoring and changing someone else's docs, and that rubs me the wrong
way.

Unfortunately, while there are excellent automatic documentation tools
(like the doxygen system used by Xerces-C), there are no systems (that
I'm aware of) that will auto-convert from one API (in C++) to another
API (in Perl); SWIG only handles wrapping of the
code. So, for example, I would want any method in Xerces that requires
an XMLCh* (their unicode string), to be documented in Perl as simply
taking a scalar (I do a bit under the hood to ensure that you can pass
any scalar value, string, int, or float and have it handled properly).

So since I don't know how to do automatic wrapping of the docs, I've
been trying to generate as many example files as possible that show
how to use the API.

> > I'll add it and commit it to the SVN repository. You can either grab
> > it from there or I can make a dev snapshot tarball.
> 
> Installed and like svn, but don't seem to have the right configuration yet
> -- can't snarf the repository. If you would post a tarball at your
> convenience, that would be super. 

Ok, will do. What error do you get when you run:

  svn co http://svn.apache.org/repos/asf/xml/xerces-p/trunk/

Are you behind a firewall? If so you need to set up your proxy
attributes in ~/.subversion.

Cheers,
jas.



Re: DOMInputSource vs InputSource (was Re: Memory Leak?)

Posted by Brian Faull <bf...@mitre.org>.
"Jason E. Stewart" wrote:
> 
> Brian Faull <bf...@mitre.org> writes:
> 
> > Trying to implement the solution you list -- but I need to take
> > input from a local variable. My previous attempt used a
> > XercesDOMParser rather than DOMBuilder... and XercesDOMParser::parse
> > was AbstractDOMParser::parse, which took an InputSource (can be
> > StdIn*, MemBuf*, LocalFile*, ...).  DOMBuilder::parse rather takes a
> > DOMInputSource, which seems that it may only be a file or URL
> > natively.
> >
> > There is a handy class called Wrapper4InputSource which isa
> > DOMInputSource which can allow an InputSource to be used as a
> > DOMInputSource. However...  Wrapper4InputSource doesn't appear to be
> > available in XML::Xerces. I didn't see any tests with a
> > DOMInputSource used, either... Have I missed something, or...?
> 
> Aha! I wondered what that class was for (blush) ...

So did I. It was mind-numbing looking at the API docs for DOMInputSource,
InputSource, and Wrapper4* classes for both... :-\ I'm no OO expert,
but... hmmm...

> The file Xerces.i controls which Xerces-C API classes I
> wrap. Currently, this is done manually - I just update the file on a
> per-need basis. So if a class' header file is not there, XML-Xerces
> won't know anything about it.
> 
> I was just thinking that it would be convenient to know what public
> header files are available and which ones I'm not currently wrapping.

Good to know, and great idea -- I'll view that file in the future.

> I'll add it and commit it to the SVN repository. You can either grab
> it from there or I can make a dev snapshot tarball.

Installed and like svn, but don't seem to have the right configuration yet
-- can't snarf the repository. If you would post a tarball at your
convenience, that would be super. 

Thanks a bunch!
-brian


> Cheers,
> jas.
> 




Re: DOMInputSource vs InputSource (was Re: Memory Leak?)

Posted by "Jason E. Stewart" <ja...@openinformatics.com>.
Brian Faull <bf...@mitre.org> writes:

> Trying to implement the solution you list -- but I need to take
> input from a local variable. My previous attempt used a
> XercesDOMParser rather than DOMBuilder... and XercesDOMParser::parse
> was AbstractDOMParser::parse, which took an InputSource (can be
> StdIn*, MemBuf*, LocalFile*, ...).  DOMBuilder::parse rather takes a
> DOMInputSource, which seems that it may only be a file or URL
> natively.
> 
> There is a handy class called Wrapper4InputSource which isa
> DOMInputSource which can allow an InputSource to be used as a
> DOMInputSource. However...  Wrapper4InputSource doesn't appear to be
> available in XML::Xerces. I didn't see any tests with a
> DOMInputSource used, either... Have I missed something, or...?

Aha! I wondered what that class was for (blush) ...

The file Xerces.i controls which Xerces-C API classes I
wrap. Currently, this is done manually - I just update the file on a
per-need basis. So if a class' header file is not there, XML-Xerces
won't know anything about it. 

I was just thinking that it would be convenient to know what public
header files are available and which ones I'm not currently wrapping.
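
A rough sketch of how such a list could be produced: collect the .hpp files
under the Xerces-C include directory and report the ones Xerces.i never
names. This assumes Xerces.i pulls headers in with %include or #include
lines. The function name unwrapped_headers and the command-line arguments
are hypothetical, and headers are compared by base name only, which is a
simplification.

```perl
use strict;
use warnings;
use File::Find;
use File::Basename qw(basename);

# Return the sorted base names of headers that the interface text never
# mentions in a %include or #include line.
sub unwrapped_headers {
    my ($interface_text, $headers) = @_;

    # Record every header named by a %include/#include directive.
    my %wrapped;
    while ($interface_text =~ /^\s*[%#]include\s+["<]([^">]+)[">]/mg) {
        $wrapped{ basename($1) } = 1;
    }

    # Keep each unmentioned header once.
    my %seen;
    my @missing = grep { !$wrapped{$_} && !$seen{$_}++ }
                  map  { basename($_) } @$headers;
    my @sorted = sort @missing;
    return @sorted;
}

# Usage (hypothetical paths): perl unwrapped.pl Xerces.i /usr/include/xercesc
if (@ARGV == 2) {
    my ($interface_file, $include_dir) = @ARGV;

    open my $fh, '<', $interface_file
        or die "Cannot read $interface_file: $!";
    my $text = do { local $/; <$fh> };
    close $fh;

    # Gather every .hpp file below the include directory.
    my @headers;
    find(sub { push @headers, $File::Find::name if /\.hpp\z/ }, $include_dir);

    print "$_\n" for unwrapped_headers($text, \@headers);
}
```

Run against a real tree, the output would also include internal headers that
were never meant to be wrapped, so the list is a starting point rather than a
to-do list.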

I'll add it and commit it to the SVN repository. You can either grab
it from there or I can make a dev snapshot tarball.

Cheers,
jas.
