Posted to j-dev@xerces.apache.org by Andy Clark <an...@apache.org> on 2001/09/10 08:09:25 UTC

[Xerces2] Proposed XNI Changes

Lately I've been playing around with XNI to build various
kinds of configurations. While the framework is very
cool, I have seen a few "warts", if you will, and
would like to make some changes to the interfaces. So I'm
posting my wish list here to see what other people think.

(+) Define Filter Interfaces

Currently, we have xxxHandler and xxxSource interfaces
but no equivalent xxxFilter interface, which makes it
rather hard to construct parser pipelines generically.
Therefore, I would like to add the following interfaces:

  public interface XMLDocumentFilter 
    extends XMLDocumentHandler, XMLDocumentSource { }
  public interface XMLDTDFilter
    extends XMLDTDHandler, XMLDTDSource { }
  public interface XMLDTDContentModelFilter
    extends XMLDTDContentModelHandler, XMLDTDContentModelSource { }

These interfaces have no body but would simplify the 
construction of parser pipelines in a generic fashion.
If we had a document filter interface then we could
define a document scanning pipeline as the following:

  XMLDocumentScanner
  XMLDocumentFilter*
  XMLDocumentHandler

Without the filter interface, it is harder to build a
dynamic parser configuration.
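
As a rough illustration of the pipeline above, here is a minimal
sketch of generic wiring. The interfaces are simplified stand-ins,
not the real XNI ones: the single characters(String) callback, the
ToyScanner class, the two example filters, and the connect helper
are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class FilterPipelineSketch {

    // Simplified stand-ins for the XNI interfaces (hypothetical, one callback only).
    interface DocumentHandler { void characters(String text); }
    interface DocumentSource { void setDocumentHandler(DocumentHandler handler); }
    interface DocumentFilter extends DocumentHandler, DocumentSource { }

    // Example filter: trims whitespace, then forwards the text.
    static final class TrimFilter implements DocumentFilter {
        private DocumentHandler next;
        public void setDocumentHandler(DocumentHandler handler) { next = handler; }
        public void characters(String text) {
            if (next != null) next.characters(text.trim());
        }
    }

    // Example filter: upper-cases the text, then forwards it.
    static final class UpperCaseFilter implements DocumentFilter {
        private DocumentHandler next;
        public void setDocumentHandler(DocumentHandler handler) { next = handler; }
        public void characters(String text) {
            if (next != null) next.characters(text.toUpperCase(Locale.ROOT));
        }
    }

    // Toy scanner: emits each chunk to whatever handler is installed.
    static final class ToyScanner implements DocumentSource {
        private DocumentHandler handler;
        public void setDocumentHandler(DocumentHandler h) { handler = h; }
        void scan(String... chunks) {
            for (String chunk : chunks) {
                if (handler != null) handler.characters(chunk);
            }
        }
    }

    // Wires scanner -> filter* -> handler without knowing any concrete type.
    static void connect(DocumentSource scanner,
                        List<? extends DocumentFilter> filters,
                        DocumentHandler handler) {
        DocumentSource previous = scanner;
        for (DocumentFilter filter : filters) {
            previous.setDocumentHandler(filter);
            previous = filter;
        }
        previous.setDocumentHandler(handler);
    }

    // Runs the chunks through trim and upper-case filters and collects the output.
    static List<String> run(String... chunks) {
        ToyScanner scanner = new ToyScanner();
        List<DocumentFilter> filters = new ArrayList<>();
        filters.add(new TrimFilter());
        filters.add(new UpperCaseFilter());
        List<String> seen = new ArrayList<>();
        connect(scanner, filters, seen::add);
        scanner.scan(chunks);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run("  hello ", "world"));
    }
}
```

The point of the sketch is the connect method: because every
filter is both a handler and a source, one loop can chain any
number of them.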

(+) Add Getters for Setters in Parser Configurations

We have setters for handlers, etc. in the parser config
interface but no way to query them once they're set. I
think this is a deficiency and we should add the following
methods to the XMLParserConfiguration interface:

  public XMLDocumentHandler getDocumentHandler();
  public XMLDTDHandler getDTDHandler();
  public XMLDTDContentModelHandler getDTDContentModelHandler();
  public XMLErrorHandler getErrorHandler();
  public XMLEntityResolver getEntityResolver();
  public Locale getLocale();
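
One case where a getter matters is decorating a handler that
some other code installed. The following is only a sketch with
simplified stand-in types: ErrorHandler, Configuration, and
installLogging are hypothetical names, not the XNI interfaces.

```java
import java.util.ArrayList;
import java.util.List;

final class HandlerGetterSketch {

    // Simplified stand-in for an error handler (hypothetical, one callback only).
    interface ErrorHandler { void error(String message); }

    // Simplified stand-in for a parser configuration with a setter and a getter.
    static final class Configuration {
        private ErrorHandler errorHandler;
        void setErrorHandler(ErrorHandler handler) { errorHandler = handler; }
        ErrorHandler getErrorHandler() { return errorHandler; }
    }

    // Wraps whatever handler is already installed so that every error is
    // also recorded in the given log. Without the getter, the existing
    // handler could not be preserved.
    static void installLogging(Configuration config, List<String> log) {
        final ErrorHandler existing = config.getErrorHandler();
        config.setErrorHandler(message -> {
            log.add(message);
            if (existing != null) existing.error(message);
        });
    }

    public static void main(String[] args) {
        Configuration config = new Configuration();
        List<String> original = new ArrayList<>();
        List<String> log = new ArrayList<>();
        config.setErrorHandler(original::add);
        installLogging(config, log);
        config.getErrorHandler().error("bad markup");
        System.out.println(original + " " + log);
    }
}
```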

(+) Add Non-Normalized Value to Internal Entity Decl

James Clark recently released a really cool tool that can
read in a DTD and by analyzing the way parameter entities are
nested within other parameter entities (and then used in 
content models), can output a RelaxNG grammar that separates
the pieces appropriately. So I started wondering whether
we could do the same thing in XNI.

Earlier we re-arranged the XMLDTDContentModelHandler methods
to make it easier to detect where parameter entities are used
within a content model. However, the internalEntityDecl
method on the XMLDTDHandler interface only provides you the
normalized value of the entity. Therefore, I think we need
to add another parameter to this method to pass the non-
normalized value. For example:

  public void internalEntityDecl(String name, XMLString value,
                                 XMLString nonNormalizedValue)
    throws XNIException;
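
To show what a handler could do with the extra parameter, here
is a small sketch that lists the parameter entities referenced
in a non-normalized entity value. It works on a plain String
rather than XMLString, the class and method names are
hypothetical, and the regular expression covers only ASCII names,
which is narrower than the XML Name production.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ParamEntityRefSketch {

    // Matches %name; for ASCII-only names (an assumption; real XML names are broader).
    private static final Pattern PE_REF =
        Pattern.compile("%([A-Za-z_:][A-Za-z0-9._:-]*);");

    // Returns the names of the parameter entities referenced in the value,
    // in order of appearance. Only meaningful on the non-normalized value,
    // because these references are already expanded in the normalized one.
    static List<String> referencedParameterEntities(String nonNormalizedValue) {
        List<String> names = new ArrayList<>();
        Matcher matcher = PE_REF.matcher(nonNormalizedValue);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(referencedParameterEntities("%inline; | %block; | p"));
    }
}
```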

(-) Remove the Attribute Entity Information

I would like to remove the methods for adding entity
reference information from the XMLAttributes interface.
I think that this information is redundant considering
that we already store the non-normalized value of the
attribute value. And we could provide a utility class
that helps users break the attribute's value into its
component parts.
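
Such a utility could look roughly like the sketch below, which
splits a non-normalized attribute value into literal text and
reference tokens. The class and method names are hypothetical,
the input is a plain String, and the name pattern is ASCII-only,
so treat it as an illustration rather than a proposed API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AttrValueSplitSketch {

    // Matches character references (&#10; or &#xA;) and entity references
    // (&name;) with ASCII-only names (an assumption; real XML names are broader).
    private static final Pattern REFERENCE = Pattern.compile(
        "&(#[0-9]+|#x[0-9A-Fa-f]+|[A-Za-z_:][A-Za-z0-9._:-]*);");

    // Splits the raw value into alternating text and reference parts.
    // Reference parts keep their leading '&' and trailing ';'.
    static List<String> split(String nonNormalizedValue) {
        List<String> parts = new ArrayList<>();
        Matcher matcher = REFERENCE.matcher(nonNormalizedValue);
        int last = 0;
        while (matcher.find()) {
            if (matcher.start() > last) {
                parts.add(nonNormalizedValue.substring(last, matcher.start()));
            }
            parts.add(matcher.group());
            last = matcher.end();
        }
        if (last < nonNormalizedValue.length()) {
            parts.add(nonNormalizedValue.substring(last));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("Copyright &copy; 2001 &company;"));
    }
}
```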

The inclusion of this data unnecessarily complicates
the XMLAttributes interface and its use is limited. (At
the moment, we don't use it at all. The idea was to be
able to support the DOM's attribute nodes containing
nested entity reference nodes. However, an attribute in
the DOM doesn't take into account the non-normalized
value; it only stores the normalized value and the
entities only apply to the non-normalized value.)

(?) Should Handler/Source/Filter Be in Same Package

Does it make sense to have the XMLDocumentHandler,
XMLDocumentSource, and XMLDocumentFilter in separate
packages? Or should they live in the same package?

Comments and additional wish-list items would be
appreciated.

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-dev-help@xml.apache.org


Re: [Xerces2] Proposed XNI Changes

Posted by Andy Clark <an...@apache.org>.
Andy Clark wrote:
> (+) Add Getters for Setters in Parser Configurations

I've seen no arguments against this proposed change so I have
updated the interface and implementation and checked in the
changes.

Next, I'll be adding filter interfaces unless someone speaks
up against it.

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org



Re: [Xerces2] Proposed XNI Changes

Posted by Andy Clark <an...@apache.org>.
PING! Does anyone have any thoughts about my proposed changes
to XNI? I'll give it some more time before I accept silence as
agreement.

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org



Re: [Xerces2] Proposed XNI Changes

Posted by Andy Clark <an...@apache.org>.
Andy Clark wrote:
> For example, the following config file would construct a
> configuration with just a scanner -- no namespace binder or
> validation:
> 
> <config>
>  <property id='doc-scanner'
>            class='org.apache.xerces.impl.XMLDocumentScannerImpl'/>
>  <property id='dtd-scanner'
>            class='org.apache.xerces.impl.XMLDTDScannerImpl'/>

Sheesh! I'm not following my own XNI documentation! Of
course the properties that these classes depend on are
required as well. PLUS... I should add a propertyId
attribute to the <property> element so that the dynamic
configuration would know the property id to use for the
setting. So I will make some more modifications to the
proposed DTD.

<!ELEMENT config (property*,pipeline?)>
<!ELEMENT property EMPTY>
<!ATTLIST property id ID #IMPLIED>
<!ATTLIST property propertyId CDATA #IMPLIED>
<!ATTLIST property class NMTOKEN #REQUIRED>
<!ELEMENT pipeline ((doc,dtd?)|(doc?,dtd))>
<!ELEMENT doc (scanner,filter*)>
<!ELEMENT dtd (scanner,filter*)>
<!ELEMENT scanner EMPTY>
<!ATTLIST scanner idref IDREF #REQUIRED>
<!ELEMENT filter EMPTY>
<!ATTLIST filter idref IDREF #REQUIRED>

And the sample would be rewritten as:

<!DOCTYPE config SYSTEM 'config.dtd' [
<!ENTITY apache 'http://apache.org/xml/properties/internal'>
<!ENTITY xerces 'org.apache.xerces'>
]>
<config>
 <property propertyId='&apache;/symbol-table'
           class='&xerces;.util.SymbolTable'/>
 <property propertyId='&apache;/error-reporter'
           class='&xerces;.impl.XMLErrorReporter'/>
 <property id='doc-scanner' propertyId='&apache;/document-scanner'
           class='&xerces;.impl.XMLDocumentScannerImpl'/>
 <property id='dtd-scanner' propertyId='&apache;/dtd-scanner'
           class='&xerces;.impl.XMLDTDScannerImpl'/>
 <pipeline>
  <doc> <scanner idref='doc-scanner'/> </doc>
  <dtd> <scanner idref='dtd-scanner'/> </dtd>
 </pipeline>
</config>
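
A dynamic configuration would need to turn each <property>
element into an object. The sketch below shows one way that
step might look, using the JDK's DOM parser and reflection. It
is not part of Xerces: the ConfigLoaderSketch class is
hypothetical, the sample uses JDK classes and made-up urn:example
property ids so that it runs without Xerces on the classpath,
and it ignores the <pipeline> element entirely.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ConfigLoaderSketch {

    // Instantiates the class named by each <property> element through its
    // no-argument constructor and returns the instances keyed by propertyId.
    static Map<String, Object> loadProperties(String configXml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(configXml)));
            Map<String, Object> byPropertyId = new LinkedHashMap<>();
            NodeList properties = doc.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                Object instance = Class.forName(property.getAttribute("class"))
                                       .getDeclaredConstructor()
                                       .newInstance();
                byPropertyId.put(property.getAttribute("propertyId"), instance);
            }
            return byPropertyId;
        } catch (Exception e) {
            // Sketch-level error handling: report any failure as unchecked.
            throw new IllegalStateException("cannot load configuration", e);
        }
    }

    public static void main(String[] args) {
        // JDK classes stand in for the Xerces implementation classes here.
        String xml =
            "<config>"
          + " <property propertyId='urn:example:list' class='java.util.ArrayList'/>"
          + " <property id='buf' propertyId='urn:example:buffer'"
          + "           class='java.lang.StringBuilder'/>"
          + "</config>";
        for (Map.Entry<String, Object> entry : loadProperties(xml).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue().getClass().getName());
        }
    }
}
```

A fuller version would also resolve the idref attributes under
<pipeline> and wire the scanner and filters together.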

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org



Re: [Xerces2] Proposed XNI Changes

Posted by Andy Clark <an...@apache.org>.
Andy Clark wrote:
> (+) Define Filter Interfaces

Done.

This should make pipeline construction easier. Now if someone
would write a dynamic configuration or tool to generate the
code for a configuration based on an XML file (for example),
that would be super cool.

Here's a proposal for a super-simple configuration DTD:

<!ELEMENT config (property*,pipeline?)>
<!ELEMENT property EMPTY>
<!ATTLIST property id ID #REQUIRED>
<!ATTLIST property class NMTOKEN #REQUIRED>
<!ELEMENT pipeline (doc?,dtd?,dtdcm?)>
<!ELEMENT doc (scanner,filter*)>
<!ELEMENT dtd (scanner,filter*)>
<!ELEMENT dtdcm (scanner,filter*)>
<!ELEMENT scanner EMPTY>
<!ATTLIST scanner idref IDREF #REQUIRED>
<!ELEMENT filter EMPTY>
<!ATTLIST filter idref IDREF #REQUIRED>

For example, the following config file would construct a
configuration with just a scanner -- no namespace binder or
validation:

<config>
 <property id='doc-scanner' 
           class='org.apache.xerces.impl.XMLDocumentScannerImpl'/>
 <property id='dtd-scanner'
           class='org.apache.xerces.impl.XMLDTDScannerImpl'/>
 <pipeline>
  <doc> <scanner idref='doc-scanner'/> </doc>
  <dtd> <scanner idref='dtd-scanner'/> </dtd>
  <dtdcm> <scanner idref='dtd-scanner'/> </dtdcm>
 </pipeline>
</config>

Who wants to have some fun implementing it as a sample?

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org
