Posted to p-dev@xerces.apache.org by "Lars Preben S. Arnesen" <l....@usit.uio.no> on 2001/06/29 10:12:38 UTC

Parsing dom document

How do I parse a DOM Document? The document is read from file and then
altered interactively by the user. Before I write it to file I'd like to
verify that it's valid according to the DTD.



$parser->parse(XML::Xerces::LocalFileInputSource->new("foobar.xml"));
$doc = $parser->getDocument();

# Do some modifications to $doc

# Now I need to parse $doc again, but how...?

-- 
Lars Preben

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-p-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-p-dev-help@xml.apache.org


Re: Parsing dom document

Posted by "Jason E. Stewart" <ja...@openinformatics.com>.
"Lars Preben S. Arnesen" <l....@usit.uio.no> writes:

> [ Jason E. Stewart ]
> 
> > I really think you're making a mistake. If you don't check the user's
> > input what are you supposed to do when they make a mistake? Are you
> > going to output the error from the parser saying: 'Invalid document'
> > --> not very helpful. 
> 
> Generally I agree, but there are only a handful of people that are
> going to use this script. They are expert users and should be able to
> give valid data.

OK. That sounds a lot better...

> A nicer solution would be to use the DTD for manual validation of
> each attribute. What's the most elegant way to implement this? I don't
> want any data rules in my script, so when I check an attribute I have
> to do this based on the current DTD.

Hmmmm....

DTDs are a real pain. The weakest point of either of the two XML
APIs (SAX or DOM) is DTD handling.

Xerces has a lot of internal stuff for parsing and keeping the DTDs
memory resident, but I can't give you a lot of help there. You can
always try asking on the Xerces-C list (since the Xerces.pm API is a
nearly 1-to-1 mapping of the Xerces-C API).

You may also want to look into using schemas. They will be much better
supported than DTDs. Basically DTDs are an anomaly that will go away
as soon as the schema standard is agreed upon.

my $0.05,
jas.



Re: Parsing dom document

Posted by "Lars Preben S. Arnesen" <l....@usit.uio.no>.
[ Jason E. Stewart ]

> I really think you're making a mistake. If you don't check the user's
> input what are you supposed to do when they make a mistake? Are you
> going to output the error from the parser saying: 'Invalid document'
> --> not very helpful. 

Generally I agree, but there are only a handful of people that are
going to use this script. They are expert users and should be able to
give valid data.

A nicer solution would be to use the DTD for manual validation of
each attribute. What's the most elegant way to implement this? I don't
want any data rules in my script, so when I check an attribute I have
to do this based on the current DTD.
 
-- 
Lars Preben S. Arnesen
USIT, University of Oslo



Re: Parsing dom document

Posted by "Jason E. Stewart" <ja...@openinformatics.com>.
"Lars Preben S. Arnesen" <l....@usit.uio.no> writes:

> [ Jason E. Stewart ]
> 
> > > How do I parse a DOM Document? The document is read from file and
> > > then altered interactively by the user. Before I write it to file I'd
> > > like to verify that it's valid according to the DTD.
> > 
> > I would suggest only making valid changes to the file, and then you
> > don't have to validate it ;-)
> > 
> > Otherwise, you'll have to write it out as a file, and re-parse it. 
> 
> 
> Hrrrmmmm. The users are going to alter some attributes of
> DOM elements. I don't want to check every attribute myself. That's why
> I use an XML parser... Writing the DOM tree to file and then reading it
> is going to be a nasty hack since I cannot write the XML file (to its
> final destination) unless it's valid. (I'm writing a script that
> generates a configuration file and it's critical that the
> configuration file is valid at all times.)

I really think you're making a mistake. If you don't check the user's
input what are you supposed to do when they make a mistake? Are you
going to output the error from the parser saying: 'Invalid document'
--> not very helpful. 

But if you must:

Don't bother with a file, just use serialize() and write it to a
string, and then create a new DOMParser and use the MemBufInputSource
class to turn the string into an input source. See
t/MemBufInputSource.t for an example.
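
A minimal sketch of that approach follows. It is untested here and
rests on several assumptions about the Xerces.pm release in use: the
helper name revalidate() and the buffer ID 'in-memory-config' are
made up for illustration, and the location of the Val_Always constant
and the two-argument MemBufInputSource constructor may differ between
versions, so check them against t/MemBufInputSource.t. It also
assumes the serialized string still carries the DOCTYPE declaration
and that the DTD can be located from it; otherwise there is nothing
to validate against.

```perl
use strict;
use warnings;
use XML::Xerces;

# Hypothetical helper: serialize a DOM document to a string and
# re-parse that string with validation switched on.
# Returns 1 if the re-parse succeeded, 0 if the parser raised an error.
sub revalidate {
    my ($doc) = @_;

    # serialize() is the method named in the message above.
    my $xml = $doc->serialize();

    # Use a fresh parser so the original document is left alone.
    my $checker = XML::Xerces::DOMParser->new();

    # Assumption: the constant lives in this package in your release.
    $checker->setValidationScheme($XML::Xerces::DOMParser::Val_Always);

    # Assumption: PerlErrorHandler dies on errors, so eval can trap them.
    $checker->setErrorHandler(XML::Xerces::PerlErrorHandler->new());

    # The second argument is an arbitrary ID for the in-memory buffer.
    my $source = XML::Xerces::MemBufInputSource->new($xml, 'in-memory-config');

    eval { $checker->parse($source) };
    return $@ ? 0 : 1;
}
```

Calling revalidate($doc) before writing the file would let the script
refuse to overwrite the configuration when the check fails, without
touching the final destination first.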

jas.



Re: Parsing dom document

Posted by "Lars Preben S. Arnesen" <l....@usit.uio.no>.
[ Jason E. Stewart ]

> > How do I parse a DOM Document? The document is read from file and
> > then altered interactively by the user. Before I write it to file I'd
> > like to verify that it's valid according to the DTD.
> 
> I would suggest only making valid changes to the file, and then you
> don't have to validate it ;-)
> 
> Otherwise, you'll have to write it out as a file, and re-parse it. 


Hrrrmmmm. The users are going to alter some attributes of
DOM elements. I don't want to check every attribute myself. That's why
I use an XML parser... Writing the DOM tree to file and then reading it
is going to be a nasty hack since I cannot write the XML file (to its
final destination) unless it's valid. (I'm writing a script that
generates a configuration file and it's critical that the
configuration file is valid at all times.)

-- 
Lars Preben S. Arnesen
USIT, University of Oslo



Re: Parsing dom document

Posted by "Jason E. Stewart" <ja...@openinformatics.com>.
"Lars Preben S. Arnesen" <l....@usit.uio.no> writes:

> How do I parse a DOM Document? The document is read from file and
> then altered interactively by the user. Before I write it to file I'd
> like to verify that it's valid according to the DTD.

I would suggest only making valid changes to the file, and then you
don't have to validate it ;-)

Otherwise, you'll have to write it out as a file, and re-parse it. 

jas.
