Posted to j-users@xerces.apache.org by Mike O'Leary <tm...@comcast.net> on 2007/02/25 17:33:15 UTC

Specifying a DTD file when parsing XML files that don't contain !DOCTYPE lines

I have a project in which I am supposed to write a parser in Java for a set
of files that don't contain <!DOCTYPE ...> lines. A DTD file was defined for
these files, and the files contain entity references whose definitions are
in the DTD file. I looked at the JAXP documentation, and it looks like I
have several options: There appear to be a variety of ways I can provide a
system id to a factory object, to a parser object, or as an argument to the
parse function. Or I can specify an entityResolver function for the parser,
or I can tell the parser not to expand entity references. I have tried
several of these, and I haven't gotten any of them to work yet. In each
case, it appears that the DTD file is not read and the parse crashes with an
error saying that an entity was referenced that was not declared. Can anyone
provide me with a sort of recipe for constructing a parser in which the DTD
file is specified without the use of a <!DOCTYPE ... > line in the file to
be parsed, and the DTD file is read and is available for resolving entity
references that are encountered during the parse? Thanks.

Mike O'Leary


Re: Specifying a DTD file when parsing XML files that don't contain !DOCTYPE lines

Posted by Will Holcomb <wh...@gmail.com>.
On 2/26/07, Mike O'Leary <tm...@comcast.net> wrote:
>
>  This file is in JDK 1.5 in the package
> com.sun.org.apache.xerces.internal.jaxp, but it is defined in a jar file
> called rt.jar. I downloaded a copy of Xerces-J-tools.2.9.0 and added
> xercesImpl.jar to my classpath. Now when I get to the setFeature function
> call, I get an error that says
>
> Exception in thread "main" java.lang.AbstractMethodError:
> javax.xml.parsers.DocumentBuilderFactory.setFeature(Ljava/lang/String;Z)V
>
> It does look like the version of DocumentBuilderFactoryImpl does not
> contain a definition for setFeature and DocumentBuilderFactory is an
> abstract class, so I guess this all makes sense. Did you encounter things
> like this when you were setting things up?
>
I didn't. I didn't realize at the time that I was getting lucky. =)

For me, the following code:

final static String DOCUMENT_CLASS_ID =
    "http://apache.org/xml/properties/dom/document-class-name";
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
log.debug("Document Class: " + dbFactory.getAttribute(DOCUMENT_CLASS_ID));

Prints:

Document Class: org.apache.xerces.dom.DocumentImpl

The system property javax.xml.parsers.DocumentBuilderFactory is not set.
Here is what I have. It's very alpha, but it does compile and run for me:

http://odin.himinbi.org/jars/src/org/himinbi/templ/
http://odin.himinbi.org/templ/build.xml

If you're really stuck and want to try and run it, the hierarchy should look
like:

/build.xml
/src/org/himinbi/templ/
/lib/

Where /lib/ contains the jars from the most recent releases of log4j and
xerces.

Will

RE: Specifying a DTD file when parsing XML files that don't contain !DOCTYPE lines

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Hi Mike,

"Mike O'Leary" <tm...@comcast.net> wrote on 02/26/2007 01:57:57 PM:

> When I step into the
> dbFactory.setFeature(ENTITY_RESOLVER_2_ID, true);
> function call, I go to a function in DocumentBuilderFactoryImpl.java
> that looks like this:
> 
>     public  void setFeature(String name, boolean value)
>     throws ParserConfigurationException{
> 
>         //Revisit::
>         //for now use attributes itself. we just support on feature.
>         //If we need to use setFeature in full fledge we should
>         //document what is supported by setAttribute
>         //and what is by setFeature.
>         //user should not use setAttribute("xyz",Boolean.TRUE)
>         //instead of setFeature("xyz",true);
>         if(attributes == null)
>             attributes = new Hashtable();
>         if(name.equals(Constants.FEATURE_SECURE_PROCESSING)){
>             attributes.put(Constants.FEATURE_SECURE_PROCESSING,
> Boolean.valueOf(value));
>         } else throw new ParserConfigurationException(
> DOMMessageFormatter.formatMessage(DOMMessageFormatter.DOM_DOMAIN,
>         "jaxp_feature_not_supported",
>         new Object[] {name})); 
>     }
> 
> This file is in JDK 1.5 in the package
> com.sun.org.apache.xerces.internal.jaxp, but it is defined in a jar
> file called rt.jar.

The feature is obviously not supported by the JAXP implementation that Sun
included in their release of Java 5. This should work if you use Apache
Xerces-J 2.7.0 and above.

> I downloaded a copy of Xerces-J-tools.2.9.0 and added xercesImpl.jar 
> to my classpath. Now when I get to the setFeature function call, I 
> get an error that says
> Exception in thread "main" java.lang.AbstractMethodError:
> javax.xml.parsers.DocumentBuilderFactory.setFeature(Ljava/lang/String;Z)V

The "Xerces-J-tools.2.9.0" package contains build tools for compiling the 
Xerces-J 2.9.0 source. The xercesImpl.jar included in it is a very old 
release which is there for bootstrapping Ant on JDK 1.3 and lower (since 
it requires an XML parser to function). What you really meant to download 
was the binary distribution: "Xerces-J-bin.2.9.0".
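
A quick way to confirm which JAXP implementation is actually being picked up, and from which jar, is to print the concrete factory class and its code source. The following is only a minimal sketch and not something from the original messages: the class name WhichParser and the helper describe are made up for illustration, and a missing code source is assumed to mean the class was loaded from the JDK itself.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical diagnostic helper; the names here are illustrative only.
final class WhichParser {

    // Returns "<class name> loaded from <location>" for the given class.
    static String describe(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        // Assumption: classes that ship with the JDK report no code source.
        String where = (src == null || src.getLocation() == null)
                ? "the JDK itself (no code source)"
                : src.getLocation().toString();
        return c.getName() + " loaded from " + where;
    }

    public static void main(String[] args) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        System.out.println(describe(factory.getClass()));
    }
}
```

If the output names a class under com.sun.org.apache.xerces.internal, the JDK's bundled parser is still in use. If it names org.apache.xerces.jaxp.DocumentBuilderFactoryImpl, the printed location shows which xercesImpl.jar won, which helps spot an old copy such as the one in the tools package.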

> It does look like the version of DocumentBuilderFactoryImpl does not
> contain a definition for setFeature and DocumentBuilderFactory is an
> abstract class, so I guess this all makes sense. Did you encounter 
> things like this when you were setting things up?
> Mike

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org



RE: Specifying a DTD file when parsing XML files that don't contain !DOCTYPE lines

Posted by Mike O'Leary <tm...@comcast.net>.
When I step into the

dbFactory.setFeature(ENTITY_RESOLVER_2_ID, true);

function call, I go to a function in DocumentBuilderFactoryImpl.java that
looks like this:

 

    public void setFeature(String name, boolean value)
    throws ParserConfigurationException{

        //Revisit::
        //for now use attributes itself. we just support on feature.
        //If we need to use setFeature in full fledge we should
        //document what is supported by setAttribute
        //and what is by setFeature.
        //user should not use setAttribute("xyz",Boolean.TRUE)
        //instead of setFeature("xyz",true);
        if(attributes == null)
            attributes = new Hashtable();
        if(name.equals(Constants.FEATURE_SECURE_PROCESSING)){
            attributes.put(Constants.FEATURE_SECURE_PROCESSING,
                           Boolean.valueOf(value));
        } else throw new ParserConfigurationException(
        DOMMessageFormatter.formatMessage(DOMMessageFormatter.DOM_DOMAIN,
        "jaxp_feature_not_supported",
        new Object[] {name}));
    }

 

This file is in JDK 1.5 in the package
com.sun.org.apache.xerces.internal.jaxp, but it is defined in a jar file
called rt.jar. I downloaded a copy of Xerces-J-tools.2.9.0 and added
xercesImpl.jar to my classpath. Now when I get to the setFeature function
call, I get an error that says

Exception in thread "main" java.lang.AbstractMethodError:
javax.xml.parsers.DocumentBuilderFactory.setFeature(Ljava/lang/String;Z)V

It does look like the version of DocumentBuilderFactoryImpl does not contain
a definition for setFeature and DocumentBuilderFactory is an abstract class,
so I guess this all makes sense. Did you encounter things like this when you
were setting things up?

Mike

 

  _____  

From: Will Holcomb [mailto:wholcomb@gmail.com] 
Sent: Monday, February 26, 2007 10:06 AM
To: j-users@xerces.apache.org
Subject: Re: Specifying a DTD file when parsing XML files that don't contain
!DOCTYPE lines

 

Are you using the default jdk parser? I don't really know if there is some
commonly accepted method for doing it, but I have this in my ant build.xml:

<jar destfile="dist/templ-0.1.jar" basedir="build"> 
  <zipfileset src="lib/log4j-1.2.13.jar"/>
  <zipfileset src="lib/xercesImpl.jar"/>
  <manifest>
    <attribute name="Built-By" value="${user.name}"/>
    <attribute name="Main-Class" value="org.himinbi.templ.HooksProcessor"/>
  </manifest>
</jar>

This sticks the xerces classes in the jar with my classes. I think though
that you can just include xercesImpl.jar in the classpath. (I started this a
week ago and am still learning myself.)

On 2/25/07, Mike O'Leary <tm...@comcast.net> wrote:

When I try this, I get an error that says:

javax.xml.parsers.ParserConfigurationException: jaxp_feature_not_supported:
Feature "http://xml.org/sax/features/use-entity-resolver2" is not
supported.

What am I doing wrong?

Mike


Re: Specifying a DTD file when parsing XML files that don't contain !DOCTYPE lines

Posted by Will Holcomb <wh...@gmail.com>.
Are you using the default jdk parser? I don't really know if there is some
commonly accepted method for doing it, but I have this in my ant build.xml:

<jar destfile="dist/templ-0.1.jar" basedir="build">
  <zipfileset src="lib/log4j-1.2.13.jar"/>
  <zipfileset src="lib/xercesImpl.jar"/>
  <manifest>
    <attribute name="Built-By" value="${user.name}"/>
    <attribute name="Main-Class" value="org.himinbi.templ.HooksProcessor"/>
  </manifest>
</jar>

This sticks the xerces classes in the jar with my classes. I think though
that you can just include xercesImpl.jar in the classpath. (I started this a
week ago and am still learning myself.)

On 2/25/07, Mike O'Leary <tm...@comcast.net> wrote:
>
>  When I try this, I get an error that says:
>
> javax.xml.parsers.ParserConfigurationException:
> jaxp_feature_not_supported: Feature
> "http://xml.org/sax/features/use-entity-resolver2" is not supported.
>
> What am I doing wrong?
>
> Mike
>

RE: Specifying a DTD file when parsing XML files that don't contain !DOCTYPE lines

Posted by Mike O'Leary <tm...@comcast.net>.
Will,

When I try this, I get an error that says:

 

javax.xml.parsers.ParserConfigurationException: jaxp_feature_not_supported:
Feature "http://xml.org/sax/features/use-entity-resolver2" is not supported.

What am I doing wrong?

Mike

 

  _____  

From: Will Holcomb [mailto:wholcomb@gmail.com] 
Sent: Sunday, February 25, 2007 9:47 AM
To: j-users@xerces.apache.org; tm-oleary@comcast.net
Subject: Re: Specifying a DTD file when parsing XML files that don't contain
!DOCTYPE lines

 

One more note about a bug I found in this code. It only works once. When you
parse a second document, you get an exception because the FileReader for the
DTD has already been closed.

Will Holcomb

On 2/25/07, Will Holcomb <wh...@gmail.com> wrote:

I happened to send a message to this list asking this exact question earlier
this week. I figure I'll pass on the kindness someone showed me and answer
it this time. (Perhaps this should go in the FAQ because it isn't really
intuitive.)

final static String ENTITY_RESOLVER_2_ID =
    "http://xml.org/sax/features/use-entity-resolver2";
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setFeature(ENTITY_RESOLVER_2_ID, true);
DocumentBuilder builder = dbFactory.newDocumentBuilder();
builder.setEntityResolver(
    new DTDEntityResolver(new FileReader(dtdFilename)));

public class DTDEntityResolver implements EntityResolver2 {
  InputSource input;
  public DTDEntityResolver(Reader dtd) { input = new InputSource(dtd); }
  public InputSource getExternalSubset(String name, String baseURI)
    throws SAXException { return input; }
  public InputSource resolveEntity(String publicId, String systemId) {
    return null; }
  public InputSource resolveEntity(String name, String publicId,
                                   String baseURI, String systemId)
    throws SAXException { return null; }
}

 


Re: Specifying a DTD file when parsing XML files that don't contain !DOCTYPE lines

Posted by Will Holcomb <wh...@gmail.com>.
One more note about a bug I found in this code. It only works once. When you
parse a second document, you get an exception because the FileReader for the
DTD has already been closed.
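
One possible way around this, offered only as a sketch and not as the fix actually used in the thread, is to have the resolver open a fresh reader on every call instead of holding a single InputSource. The class name ReopeningDTDEntityResolver is hypothetical, and the sketch assumes the DTD is a local file encoded in UTF-8.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import org.xml.sax.InputSource;
import org.xml.sax.ext.EntityResolver2;

// Hypothetical variant of DTDEntityResolver that can be reused across parses.
final class ReopeningDTDEntityResolver implements EntityResolver2 {
    private final Path dtdPath;

    ReopeningDTDEntityResolver(Path dtdPath) {
        this.dtdPath = dtdPath;
    }

    // Called by the parser for documents that have no DOCTYPE declaration.
    @Override
    public InputSource getExternalSubset(String name, String baseURI)
            throws IOException {
        // Open a new reader each time, since the previous one is closed
        // once a parse finishes. Assumption: the DTD file is UTF-8.
        InputSource source = new InputSource(
                Files.newBufferedReader(dtdPath, StandardCharsets.UTF_8));
        // A system id lets relative references inside the DTD be resolved.
        source.setSystemId(dtdPath.toUri().toString());
        return source;
    }

    // Returning null leaves all other entities to the parser's defaults.
    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        return null;
    }

    @Override
    public InputSource resolveEntity(String name, String publicId,
                                     String baseURI, String systemId) {
        return null;
    }
}
```

It would be installed the same way as before, for example builder.setEntityResolver(new ReopeningDTDEntityResolver(Paths.get(dtdFilename))), and the same builder could then be used for more than one document.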

Will Holcomb

On 2/25/07, Will Holcomb <wh...@gmail.com> wrote:
>
> I happened to send a message to this list asking this exact question
> earlier this week. I figure I'll pass on the kindness someone showed me and
> answer it this time. (Perhaps this should go in the FAQ because it isn't
> really intuitive.)
>
> final static String ENTITY_RESOLVER_2_ID = "http://xml.org/sax/features/use-entity-resolver2";
>
> DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
> dbFactory.setFeature(ENTITY_RESOLVER_2_ID, true);
> DocumentBuilder builder = dbFactory.newDocumentBuilder();
> builder.setEntityResolver(new DTDEntityResolver(new
> FileReader(dtdFilename)));
>
> public class DTDEntityResolver implements EntityResolver2 {
>   InputSource input;
>   public DTDEntityResolver(Reader dtd) { input = new InputSource(dtd); }
>   public InputSource getExternalSubset(String name, String baseURI)
>     throws SAXException { return input; }
>   public InputSource resolveEntity(String publicId, String systemId) {return null;}
>   public InputSource resolveEntity(String name, String publicId, String
> baseURI, String systemId)
>         throws SAXException { return null; }
> }
>

Re: Specifying a DTD file when parsing XML files that don't contain !DOCTYPE lines

Posted by Will Holcomb <wh...@gmail.com>.
Didn't actually mean to send that just yet, the code is right, just ignore
the log statements...

Will Holcomb

On 2/25/07, Will Holcomb <wh...@gmail.com> wrote:
>
> I happened to send a message to this list asking this exact question
> earlier this week. I figure I'll pass on the kindness someone showed me and
> answer it this time. (Perhaps this should go in the FAQ because it isn't
> really intuitive.)
>
> final static String ENTITY_RESOLVER_2_ID =
>     "http://xml.org/sax/features/use-entity-resolver2";
> DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
> dbFactory.setFeature(ENTITY_RESOLVER_2_ID, true);
> DocumentBuilder builder = dbFactory.newDocumentBuilder();
> builder.setEntityResolver(new DTDEntityResolver(new
> FileReader(dtdFilename)));
>
> public class DTDEntityResolver implements EntityResolver2 {
>   InputSource input;
>   public DTDEntityResolver(Reader dtd) { input = new InputSource(dtd); }
>   public InputSource getExternalSubset(String name, String baseURI)
>     throws SAXException { return input; }
>   public InputSource resolveEntity(String publicId, String systemId) {return null;}
>   public InputSource resolveEntity(String name, String publicId, String
> baseURI, String systemId)
>         throws SAXException { return null; }
> }
>

Re: Specifying a DTD file when parsing XML files that don't contain !DOCTYPE lines

Posted by Will Holcomb <wh...@gmail.com>.
I happened to send a message to this list asking this exact question earlier
this week. I figure I'll pass on the kindness someone showed me and answer
it this time. (Perhaps this should go in the FAQ because it isn't really
intuitive.)

final static String ENTITY_RESOLVER_2_ID =
    "http://xml.org/sax/features/use-entity-resolver2";
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setFeature(ENTITY_RESOLVER_2_ID, true);
DocumentBuilder builder = dbFactory.newDocumentBuilder();
builder.setEntityResolver(
    new DTDEntityResolver(new FileReader(dtdFilename)));

public class DTDEntityResolver implements EntityResolver2 {
    InputSource input;

    public DTDEntityResolver(Reader dtd) { input = new InputSource(dtd); }

    public InputSource getExternalSubset(String name, String baseURI)
        throws SAXException {
        return input;
    }

    public InputSource resolveEntity(String publicId, String systemId) {
        log.debug("Entity: " + publicId + " : " + systemId);
        return null;
    }

    public InputSource resolveEntity(String name, String publicId,
                                     String baseURI, String systemId)
        throws SAXException {
        log.debug("Entity: " + name + ": " + baseURI
                  + " [" + publicId + " : " + systemId + "]");
        return null;
    }
}


On 2/25/07, Mike O'Leary <tm...@comcast.net> wrote:
>
>  I have a project in which I am supposed to write a parser in Java for a
> set of files that don't contain <!DOCTYPE ...> lines. A DTD file was defined
> for these files, and the files contain entity references whose definitions
> are in the DTD file. I looked at the JAXP documentation, and it looks like I
> have several options: There appear to be a variety of ways I can provide a
> system id to a factory object, to a parser object, or as an argument to the
> parse function. Or I can specify an entityResolver function for the parser,
> or I can tell the parser not to expand entity references. I have tried
> several of these, and I haven't gotten any of them to work yet. In each
> case, it appears that the DTD file is not read and the parse crashes with an
> error saying that an entity was referenced that was not declared. Can anyone
> provide me with a sort of recipe for constructing a parser in which the DTD
> file is specified without the use of a <!DOCTYPE ... > line in the file to
> be parsed, and the DTD file is read and is available for resolving entity
> references that are encountered during the parse? Thanks.
>
> Mike O'Leary
>