Posted to j-users@xerces.apache.org by Dennis Thrysoe - Netnord A/S <dt...@netnord.dk> on 2002/06/03 11:25:15 UTC

Validating without parsing

Hi,

I was wondering if the XNI component approach to building a parser would 
make it easy to use the XMLSchemaValidator or XMLDTDValidator to 
validate generated XML fragments, without parsing a document.

Oh, and by the way. Is anything included in Xerces 2 that can serialize 
SAX events into a document?

TIA,

-dennis


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Re: Validating without parsing

Posted by Dennis Thrysoe - Netnord A/S <dt...@netnord.dk>.
Andy Clark wrote:
> Dennis Thrysoe - Netnord A/S wrote:
> 
>>I was wondering if the XNI component approach to building a parser would
>>make it easy to use the XMLSchemaValidator or XMLDTDValidator to
>>validate generated XML fragments, without parsing a document.
> 
> 
> Sure, that's possible but there's currently no code to
> make this doable directly.

Great. What would then be the best way to specify a filename, Reader or 
similar with the schema data in it?
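For readers arriving at this thread later: a minimal sketch of validating SAX events against a schema loaded from a Reader, with no document parse involved. It does not use the XNI components discussed here; it assumes the javax.xml.validation API (JAXP 1.3), which was added after this thread. The class name EventValidator, the tiny inline schema and the <count> element are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.ValidatorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class EventValidator {
    // Hypothetical schema: a single <count> element holding an integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='count' type='xs:int'/>"
      + "</xs:schema>";

    /** Sends a generated <count> element to the validator and returns any error messages. */
    static List<String> validate(String text) {
        List<String> errors = new ArrayList<>();
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // The schema comes from a Reader; a file or URL based StreamSource works too.
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            ValidatorHandler handler = schema.newValidatorHandler();
            handler.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // collect instead of aborting
                }
            });
            // Generated events are fed directly; nothing is parsed.
            handler.startDocument();
            handler.startElement("", "count", "count", new AttributesImpl());
            handler.characters(text.toCharArray(), 0, text.length());
            handler.endElement("", "count", "count");
            handler.endDocument();
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate("42"));   // valid content
        System.out.println(validate("abc"));  // not an xs:int
    }
}
```

Calling validate("42") should yield an empty list, while validate("abc") should report at least one datatype error.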

-dennis




Re: Validating without parsing

Posted by Andy Clark <an...@apache.org>.
Dennis Thrysoe - Netnord A/S wrote:
> I was wondering if the XNI component approach to building a parser would
> make it easy to use the XMLSchemaValidator or XMLDTDValidator to
> validate generated XML fragments, without parsing a document.

Sure, that's possible but there's currently no code to
make this doable directly.

> Oh, and by the way. Is anything included in Xerces 2 that can serialize
> SAX events into a document?

Check out the org.apache.xml.serialize package. You could
also use a SAXSource as an input to Xalan (via TrAX) and
the identity transform (i.e. no stylesheet) to do the same
thing.
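A minimal sketch of the TrAX route, assuming a JAXP implementation whose TransformerFactory supports SAX (the cast to SAXTransformerFactory). A TransformerHandler created without a stylesheet is the identity transform, so the SAX events sent to it are written out as a document. The class name SaxToXml and the <greeting> element are made up for illustration.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

class SaxToXml {
    /** Serializes a handful of hand-made SAX events and returns the resulting XML text. */
    static String serialize() {
        try {
            SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
            // No stylesheet argument: this handler performs the identity transform.
            TransformerHandler handler = factory.newTransformerHandler();
            handler.getTransformer()
                   .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            handler.setResult(new StreamResult(out));

            // TransformerHandler is a ContentHandler, so events can be sent to it directly.
            char[] text = "hello".toCharArray();
            handler.startDocument();
            handler.startElement("", "greeting", "greeting", new AttributesImpl());
            handler.characters(text, 0, text.length);
            handler.endElement("", "greeting", "greeting");
            handler.endDocument();
            return out.toString();
        } catch (TransformerConfigurationException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize()); // prints <greeting>hello</greeting>
    }
}
```

The same handler can be registered as the ContentHandler of any SAX event source, for example an XMLReader or a filter chain.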

-- 
Andy Clark * andyc@apache.org



RE: How can I parse HTML not XTML?

Posted by Marcial Atienzar <ma...@servicom2000.com>.
Thanks,

	I'm going to try it.
-----Original Message-----
From: SB [mailto:step1b@cyberspace.org]
Sent: Thursday, June 6, 2002 10:56
To: xerces-j-user@xml.apache.org
Subject: Re: How can I parse HTML not XTML?


Quoting Marcial Atienzar (matienzar@servicom2000.com):
>
> 	And with this code I can't parse HTML, only XHTML. I've an error when I try
> to do it. A lot of thanks,
>
You will have to XMLize your HTML before using the DOMParser - tools for these
are NekoHTML and JTidy.

http://www.apache.org/~andyc/nekohtml/doc/index.html
http://sourceforge.net/projects/jtidy

--st.



Re: How can I parse HTML not XTML?

Posted by SB <st...@cyberspace.org>.
Quoting Marcial Atienzar (matienzar@servicom2000.com):
> 
> 	And with this code I can't parse HTML, only XHTML. I've an error when I try
> to do it. A lot of thanks,
> 
You will have to XMLize your HTML before using the DOMParser - tools for these
are NekoHTML and JTidy.

http://www.apache.org/~andyc/nekohtml/doc/index.html
http://sourceforge.net/projects/jtidy

--st.



Re: Validating without parsing

Posted by Elena Litani <el...@ca.ibm.com>.
Dennis Thrysoe - Netnord A/S wrote:
> > I expect more discussion on revalidation design on xerces-j-dev list.
> > If you are interested, please subscribe to this list.
> 
> Actually I'm only searching for the best way to instantiate an
> XMLSchemaValidator or XMLDTDValidator, specify some named resource to
> load the grammar from, and the rest *should* be a breeze...
> 
> Does anybody have any good suggestions or pointers for relevant
> documentation?

Not yet, this is a new feature.

Here are a couple of suggestions on how to validate against XMLSchema:

1) Create your own parser configuration that includes all the
features/properties needed by XMLSchemaValidator. 
The configuration is needed, so you can easily reset()
XMLSchemaValidator. For examples, see dom.DOMRevalidationConfiguration. 

2) To specify location of the schema, you could:
   a) Use properties:
      http://apache.org/xml/properties/schema/external-schemaLocation and
      http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation
   b) specify schemaLocation attributes at the root element

3) XMLSchemaValidator also implements impl.RevalidationHandler that
allows you to set baseURI relative to which the resolution of schema
location should occur. The handler also has an additional method to pass
characters as strings (instead of XMLString).

Note: this is experimental code: the implementation may be changed,
removed or modified, so you might need to change your code in the
future.
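A minimal sketch of option 2a, under stated assumptions: it sets the external-noNamespaceSchemaLocation property on an ordinary SAX parser obtained through JAXP (so it does a normal parse rather than using the revalidation configuration described above), and it assumes the JAXP parser is Xerces or Xerces-derived, since the feature and property URIs are Xerces-specific. The class name SchemaLocationDemo, the temporary schema file and the <count> element are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class SchemaLocationDemo {
    static final String VALIDATION = "http://xml.org/sax/features/validation";
    static final String SCHEMA_VALIDATION =
        "http://apache.org/xml/features/validation/schema";
    static final String NO_NS_LOCATION =
        "http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation";

    // Hypothetical schema: a single <count> element holding an integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='count' type='xs:int'/>"
      + "</xs:schema>";

    /** Parses xml with schema validation on and returns the validation error messages. */
    static List<String> validate(String xml) {
        List<String> errors = new ArrayList<>();
        try {
            // The property takes a URI, so the schema is written to a temporary file.
            Path schemaFile = Files.createTempFile("demo", ".xsd");
            schemaFile.toFile().deleteOnExit();
            Files.write(schemaFile, XSD.getBytes(StandardCharsets.UTF_8));

            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setFeature(VALIDATION, true);
            reader.setFeature(SCHEMA_VALIDATION, true);
            // Schema for elements in no namespace, supplied from outside the document.
            reader.setProperty(NO_NS_LOCATION, schemaFile.toUri().toString());
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // collect instead of aborting
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (IOException | SAXException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate("<count>42</count>"));
        System.out.println(validate("<count>abc</count>"));
    }
}
```

For documents that use namespaces, the external-schemaLocation property takes pairs of namespace URI and schema location, in the same form as the xsi:schemaLocation attribute.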


-- 
Elena Litani / IBM Toronto

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-dev-help@xml.apache.org


How can I parse HTML not XTML?

Posted by Marcial Atienzar <ma...@servicom2000.com>.
Hello,

	I have this code:

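    // Needs: org.apache.xerces.parsers.DOMParser, org.w3c.dom.html.HTMLDocument
    DOMParser parser = new DOMParser();
    HTMLDocument document = null;
    try {
          // Ask Xerces to build the tree with the HTML DOM document class
          parser.setProperty("http://apache.org/xml/properties/dom/document-class-name",
                          "org.apache.html.dom.HTMLDocumentImpl");
          parser.parse(archivo);
          document  = (HTMLDocument)parser.getDocument();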
	...

	With this code I can only parse XHTML, not HTML: I get an error when I
try. Many thanks,

		Marcial Atienzar Navarro




Re: Validating without parsing

Posted by Dennis Thrysoe - Netnord A/S <dt...@netnord.dk>.
Elena Litani wrote:
> Dennis hi,
> 
> Dennis Thrysoe - Netnord A/S wrote:
> 
>>I was wondering if the XNI component approach to building a parser would
>>make it easy to use the XMLSchemaValidator or XMLDTDValidator to
>>validate generated XML fragments, without parsing a document.
> 
> Yes, this is the approach we used to implement DOM revalidation in
> Xerces.

Great. It looks like I can do something similar then.

> I expect more discussion on revalidation design on xerces-j-dev list. 
> If you are interested, please subscribe to this list.

Actually I'm only searching for the best way to instantiate an 
XMLSchemaValidator or XMLDTDValidator, specify some named resource to 
load the grammar from, and the rest *should* be a breeze...

Does anybody have any good suggestions or pointers for relevant 
documentation?

-dennis




Re: Validating without parsing

Posted by Elena Litani <el...@ca.ibm.com>.
Dennis hi,

Dennis Thrysoe - Netnord A/S wrote:
> I was wondering if the XNI component approach to building a parser would
> make it easy to use the XMLSchemaValidator or XMLDTDValidator to
> validate generated XML fragments, without parsing a document.

Yes, this is the approach we used to implement DOM revalidation in
Xerces.

I expect more discussion on revalidation design on xerces-j-dev list. 
If you are interested, please subscribe to this list.

Thank you,
-- 
Elena Litani / IBM Toronto
