Posted to users@cocoon.apache.org by Donald Ball <ba...@webslingerZ.com> on 2000/06/02 18:35:01 UTC

Re: XInclude Processor should be fixed now

On Fri, 2 Jun 2000, Daniel Schneider wrote:

> I don't know if anybody cares listening talking to myself, but including files
> within files has become an obsession of mine in the last days, that is maybe
> shared by other folks :)
> Anyhow, I have a correction to make, should have thought of it before :(
> 
> OK       no DTD in included file
> OK       <!DOCTYPE hello SYSTEM
> "http://tecfa2.unige.ch/guides/xml/cocoon/includes/hello.dtd">
> BAD:     <!DOCTYPE hello SYSTEM "hello.dtd">

I suppose the proper and simple solution would be to remove the DOCTYPE
declaration from the included document. Do you know how to strip that from
a DOM?

- donald


Re: XInclude Processor should be fixed now

Posted by Donald Ball <ba...@webslingerZ.com>.
On Mon, 5 Jun 2000, Daniel Schneider wrote:

> > On Sat, 3 Jun 2000, Daniel Schneider wrote:
> > 
> > > > On Fri, 2 Jun 2000, Daniel Schneider wrote:
> > > >
>   .......
> > >     // (3) then catch the special node
> > >     //  loop
> > >         node = children.item(i)
> > >         int type = node.getNodeType();
> > >         switch ( type ) {
> > >         case Node.DOCUMENT_TYPE_NODE: {
> > >         // NOW KILL THE THING :)
> > >         doc.removeChild(node);
> > >     // exit loop
> > 
> > You sure got the right idea. I just checked in a patch based on this,
> > check it out and see if it helps your problem.
> > 
> Hi thanx a lot for looking into this, ..... but no, it does not help :(
> 
> In my uneducated guess there is a problem with the parser. I don't
> have any clue what that might be. I looked at
> org.apache.cocoon.parser.XercesParser and its code does not seem to
> ask for any validation that would require the external DTD to be read.
> (Anyhow, the question of WHY Xerces cannot find a simple relative DTD
> declaration is another thing to look at ....)
> 
> I diversified your error catching a bit and sent messages to the log file.
> So basically the thing breaks BEFORE your stripDocumentTypeNodes fix:
> 
> [05/06/2000 20:37:52:046 CEST] +++ processXIncludeElement +++ SAX exception found
> [05/06/2000 20:37:52:047 CEST] File "file:/hello.dtd" not found. [FATAL ERROR] [File:
> "null" Line: -1 Column: -1]
> [05/06/2000 20:37:52:047 CEST] *** Other exception found:
> java.lang.NullPointerException
> 
> Code snippet:
> 
> try {
> 	    included_document = parser.parse(input,false);
> 	    stripDocumentTypeNodes(included_document.getDocumentElement());
> 	    // stripDocumentTypeNodes(included_document);
> 	}
> 	catch (SAXParseException e) { 
> 	    logger.log("+++ processXIncludeElement +++ The File is not well formed.", 1);
> 	    logger.log(e.getMessage()
> 		       + " at line " + e.getLineNumber() 
> 		       + ", column " + e.getColumnNumber(), 1);
> 	}
> 	catch (SAXException e) { 
> 	    logger.log("+++ processXIncludeElement +++ SAX exception found", 1);
> 	    logger.log(e.getMessage(), 1);
> 	    // e.printStackTrace(logger);
> 	}
>         .... and so  forth
> 
> If I am right you can't do anything about this and I don't know whose problem
> this might be. Just in case Stefano is listening in: My main
> motivation for wanting xincluded files to have DTD declarations is that I believe
> that even "sloppy" Web sites have to make sure that XML is somewhat valid .. but
> unfortunately it is difficult for some people to remember to remove the DTD,
> especially if the WWW tree is mounted on all local file systems
> (got empirical proof for that)... and finally the tools we are using can't
> read DTDs over HTTP (so absolute URLs are out too).

Actually, I reckon that if I called setSystemId on the InputSource I'm
passing to Xerces, it should be able to resolve the relative link to the
DTD with no trouble. Question is, what parameter do I use when calling
setSystemId...? Anyone have a clue for me?

- donald
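
A minimal sketch of the setSystemId idea above, not the actual
XIncludeProcessor patch. It assumes the JDK's built-in JAXP parser rather
than Cocoon's XercesParser wrapper, and it assumes that the value to pass
is the absolute URI of the document being included, so that a relative
system id such as "hello.dtd" is resolved next to that document. The class
and method names (SystemIdSketch, parseWithBase) are hypothetical.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
final class SystemIdSketch {

    /**
     * Parses the given XML text and tells the parser where the document
     * "lives" (baseUri), so relative system ids are resolved against it.
     */
    static Document parseWithBase(String xml, String baseUri) throws Exception {
        InputSource input = new InputSource(new StringReader(xml));
        input.setSystemId(baseUri); // assumed answer: URI of the included document
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(input);
    }

    public static void main(String[] args) throws Exception {
        // Put a DTD in a scratch directory; it also declares an entity so we
        // can see whether the parser really read it.
        Path dir = Files.createTempDirectory("xinclude-demo");
        String dtd = "<!ELEMENT hello (#PCDATA)><!ENTITY greeting \"world\">";
        Files.write(dir.resolve("hello.dtd"), dtd.getBytes(StandardCharsets.UTF_8));

        // The document text comes from memory; only its base URI points at the directory.
        String xml = "<!DOCTYPE hello SYSTEM \"hello.dtd\"><hello>&greeting;</hello>";
        String baseUri = dir.resolve("hello.xml").toUri().toString();

        Document doc = parseWithBase(xml, baseUri);
        // The entity can only be expanded if hello.dtd was found via the base URI.
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Without a system id the parser has nothing to resolve "hello.dtd" against,
which would be consistent with the "file:/hello.dtd" lookup in the log
quoted above.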


Re: XInclude Processor should be fixed now

Posted by Daniel Schneider <Da...@tecfa.unige.ch>.
Donald Ball wrote:
> 
> On Sat, 3 Jun 2000, Daniel Schneider wrote:
> 
> > > On Fri, 2 Jun 2000, Daniel Schneider wrote:
> > >
  .......
> >     // (3) then catch the special node
> >     //  loop
> >         node = children.item(i)
> >         int type = node.getNodeType();
> >         switch ( type ) {
> >         case Node.DOCUMENT_TYPE_NODE: {
> >         // NOW KILL THE THING :)
> >         doc.removeChild(node);
> >     // exit loop
> 
> You sure got the right idea. I just checked in a patch based on this,
> check it out and see if it helps your problem.
> 
Hi thanx a lot for looking into this, ..... but no, it does not help :(

In my uneducated guess there is a problem with the parser. I don't have
any clue what that might be. I looked at org.apache.cocoon.parser.XercesParser
and its code does not seem to ask for any validation that would require the
external DTD to be read. (Anyhow, the question of WHY Xerces cannot find
a simple relative DTD declaration is another thing to look at ....)
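
A non-validating parser may still try to load the external DTD, which
would explain a failure before any DOM exists. One possible workaround,
sketched here under the assumption that the JDK's built-in JAXP parser is
used (not Cocoon's XercesParser wrapper), is a SAX EntityResolver that
answers every external entity request with empty content. The class and
method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
final class SkipExternalDtdSketch {

    /** Parses XML text while treating every external entity, DTD included, as empty. */
    static Document parseIgnoringExternalDtd(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Whatever system id the parser asks for, hand back an empty stream
        // instead of letting it open a file or URL.
        builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // The DTD named here does not exist anywhere.
        String xml = "<!DOCTYPE hello SYSTEM \"no-such-file.dtd\"><hello>world</hello>";
        Document doc = parseIgnoringExternalDtd(xml);
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

The trade-off is that nothing declared in the DTD (entities, default
attribute values) is available, so a document that relies on those would
still fail or lose information.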

I diversified your error catching a bit and sent messages to the log file.
So basically the thing breaks BEFORE your stripDocumentTypeNodes fix:

[05/06/2000 20:37:52:046 CEST] +++ processXIncludeElement +++ SAX exception found
[05/06/2000 20:37:52:047 CEST] File "file:/hello.dtd" not found. [FATAL ERROR] [File:
"null" Line: -1 Column: -1]
[05/06/2000 20:37:52:047 CEST] *** Other exception found:
java.lang.NullPointerException

Code snippet:

try {
    included_document = parser.parse(input, false);
    stripDocumentTypeNodes(included_document.getDocumentElement());
    // stripDocumentTypeNodes(included_document);
}
catch (SAXParseException e) {
    logger.log("+++ processXIncludeElement +++ The File is not well formed.", 1);
    logger.log(e.getMessage()
               + " at line " + e.getLineNumber()
               + ", column " + e.getColumnNumber(), 1);
}
catch (SAXException e) {
    logger.log("+++ processXIncludeElement +++ SAX exception found", 1);
    logger.log(e.getMessage(), 1);
    // e.printStackTrace(logger);
}
// .... and so forth

If I am right you can't do anything about this and I don't know whose problem
this might be. Just in case Stefano is listening in: My main
motivation for wanting xincluded files to have DTD declarations is that I believe
that even "sloppy" Web sites have to make sure that XML is somewhat valid .. but
unfortunately it is difficult for some people to remember to remove the DTD,
especially if the WWW tree is mounted on all local file systems
(got empirical proof for that)... and finally the tools we are using can't
read DTDs over HTTP (so absolute URLs are out too).

- Daniel
PS: The examples on my site (previous postings) won't break now, but are empty because
of my error catching.

Re: XInclude Processor should be fixed now

Posted by Donald Ball <ba...@webslingerZ.com>.
On Sat, 3 Jun 2000, Daniel Schneider wrote:

> > On Fri, 2 Jun 2000, Daniel Schneider wrote:
> > 
> Since I can hardly be called a programmer and since I never did any
> serious Java & XML it took me a few hours to find an idea (don't know if
> it's the best way).
> 
>     // (1) When you have  
>             doc = parser.getDocument();
>     // (2) DO NOT doc.getDocumentElement();
>     //        but look at the children
>             children = node.getChildNodes();
>     // (3) then catch the special node
>     //  loop
>         node = children.item(i) 
>         int type = node.getNodeType();
>         switch ( type ) {
>         case Node.DOCUMENT_TYPE_NODE: {
>         // NOW KILL THE THING :)
>         doc.removeChild(node);
>     // exit loop

You sure got the right idea. I just checked in a patch based on this,
check it out and see if it helps your problem.

- donald


Re: XInclude Processor should be fixed now

Posted by Daniel Schneider <Da...@tecfa.unige.ch>.
Hi and thanks for your interest :)
-> answer below
Donald Ball wrote:
> 
> On Fri, 2 Jun 2000, Daniel Schneider wrote:
> 
> > I don't know if anybody cares listening talking to myself, but including files
> > within files has become an obsession of mine in the last days, that is maybe
> > shared by other folks :)
> > Anyhow, I have a correction to make, should have thought of it before :(
> >
> > OK       no DTD in included file
> > OK       <!DOCTYPE hello SYSTEM
> > "http://tecfa2.unige.ch/guides/xml/cocoon/includes/hello.dtd">
> > BAD:     <!DOCTYPE hello SYSTEM "hello.dtd">
> 
> I suppose the proper and simple solution would be to remove the DOCTYPE
> declaration from the included document. Do you know how to strip that from
> a DOM?
> 
> - donald
Since I can hardly be called a programmer and since I never did any
serious Java & XML it took me a few hours to find an idea (don't know if
it's the best way).

    // (1) When you have the parsed document
            Document doc = parser.getDocument();
    // (2) DO NOT call doc.getDocumentElement();
    //        look at the children of the Document node itself
            NodeList children = doc.getChildNodes();
    // (3) then catch the special node
            for (int i = 0; i < children.getLength(); i++) {
                Node node = children.item(i);
                if (node.getNodeType() == Node.DOCUMENT_TYPE_NODE) {
                    // NOW KILL THE THING :)
                    doc.removeChild(node);
                    break; // exit loop
                }
            }

To demo this I wrote a servlet (adapted from some code in the xerces samples)
It prints the tree of an XML file ... plus tells what special nodes
it finds (like document_type or comments).

Directory with source: http://tecfa2.unige.ch/guides/java/staf2x/ex/xml/dom-tree/

Servlet running:       http://tecfa2.unige.ch/servlet/DomTreeWriteServlet2?url=.....

e.g.
http://tecfa2.unige.ch/servlet/DomTreeWriteServlet2?url=http://tecfa.unige.ch/guides/xml/cocoon/includes/hello-with-url-dtd.xml

I don't know if this helps you fix something and I certainly don't
know enough to discuss whether ripping this node away is a good solution.
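
For anyone who wants to try steps (1) to (3) outside a servlet, here is a
minimal standalone sketch. It assumes the JDK's built-in JAXP parser
instead of the Xerces classes used in the thread, and the class name
StripDoctypeSketch is hypothetical. It is one way to do it, not
necessarily what the checked-in patch does.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
final class StripDoctypeSketch {

    /** Removes every DOCTYPE child of the Document node and returns how many were removed. */
    static int stripDocumentTypeNodes(Document doc) {
        int removed = 0;
        // The DOCTYPE is a child of the Document itself, not of the root element.
        NodeList children = doc.getChildNodes();
        // Walk backwards: the NodeList is live, so removals shift later indexes.
        for (int i = children.getLength() - 1; i >= 0; i--) {
            Node node = children.item(i);
            if (node.getNodeType() == Node.DOCUMENT_TYPE_NODE) {
                doc.removeChild(node);
                removed++;
            }
        }
        return removed;
    }

    /** Parses XML text with the default JAXP parser (an assumption, see above). */
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Internal subset only, so the parser has nothing external to fetch.
        String xml = "<!DOCTYPE hello [<!ELEMENT hello (#PCDATA)>]><hello>world</hello>";
        Document doc = parse(xml);
        System.out.println("has doctype before: " + (doc.getDoctype() != null));
        stripDocumentTypeNodes(doc);
        System.out.println("has doctype after:  " + (doc.getDoctype() != null));
    }
}
```

Note that this only helps once parsing has succeeded; it cannot prevent
the parser from trying to read the DTD in the first place.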

- Greetings ! - Daniel

PS: I looked at XIncludeProcessor and don't understand where you read
this special node and therefore why your code should choke on "my
problem", but then again I am not good at programming and in addition
it's after mid-night ... and I need some sleep (else I would have
tried a few hours on XIncludeProcessor). Back on Monday :)

apache->cocoon not tomcat->cocoon

Posted by Kenwrick Chan <kc...@hawaii.edu>.
Folks,
I'm running Apache 1.3.12, tomcat 3.1, and cocoon 1.7.4 on solaris 2.6. I've added
the following to my httpd.conf file.

AddType text/xml .xml
AddHandler jserv-servlet .xml

Apache forwards any .xml extension files off to tomcat/cocoon if I have a symbolic
link from the .xml files in my apache htdocs directory to .xml files in my tomcat
directory (or vice versa).  Is there a way to set this up such that the .xml files
only need to exist in the apache htdocs directory?  I'm working on a project with a
short time frame so I may not be able to implement the whole project with cocoon.
I'd like to be able to mix and match cocoon and php pages, but not have users
access my files via the tomcat directory tree.

Thanks,
Kenwrick