Posted to j-dev@xerces.apache.org by "K.Kawaguchi" <kk...@kohsuke.org> on 2003/10/28 19:22:20 UTC

XNI question (component configuration)

I'm hacking Xerces source code and got a few questions. If someone could
shed some light on it, I'd appreciate it very much.


The first question is regarding how a component should configure itself.

Apparently, there are two ways to do this. One is to query the manager
from the XMLComponent.reset method, and the other is to wait for the
XMLComponent.setProperty() invocation.
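
For concreteness, the two routes can be sketched as follows. This is only
an illustration under assumptions: ComponentManager, Component,
MapComponentManager and TwoRouteComponent are hypothetical stand-ins
trimmed down from the XNI interfaces, and the property id string is an
assumed value for the full error-reporter name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the XNI interfaces, reduced to what the sketch needs.
interface ComponentManager {
    Object getProperty(String propertyId);
}

interface Component {
    void reset(ComponentManager manager);
    void setProperty(String propertyId, Object value);
}

// Toy manager backed by a map (for illustration only).
class MapComponentManager implements ComponentManager {
    final Map<String, Object> properties = new HashMap<>();

    @Override
    public Object getProperty(String propertyId) {
        return properties.get(propertyId);
    }
}

class TwoRouteComponent implements Component {
    // Assumed value of the full error-reporter property id.
    static final String ERROR_REPORTER =
        "http://apache.org/xml/properties/internal/error-reporter";

    Object errorReporter;

    // Route 1: pull the current value from the manager.
    @Override
    public void reset(ComponentManager manager) {
        errorReporter = manager.getProperty(ERROR_REPORTER);
    }

    // Route 2: accept a value pushed by the manager; other ids are ignored.
    @Override
    public void setProperty(String propertyId, Object value) {
        if (ERROR_REPORTER.equals(propertyId)) {
            errorReporter = value;
        }
    }
}
```

A component written this way works whichever route the manager uses, at
the cost of handling every property in two places.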

I'm having a hard time figuring out which route I should use for any given
property/feature.

Take the error-reporter property as an example. Looking at the javadoc
of XMLErrorReporter, my guess was that there would be one instance of
this object per parser configuration. If so, querying this object from
the reset method seems like the natural choice. On the other hand, this
object might be recreated every time a different ErrorHandler is set,
in which case it would be better obtained from the setProperty method.

I searched for how other components obtain a reference to
XMLErrorReporter, and found that some indeed use the setProperty
method (such as XMLEntityManager, CMNodeFactory), while others
use the reset method (such as XMLDTDValidator), and still others
use both (such as XMLScanner).

So it seems to me that it's not just my lack of competence.

I had to do this much research for just one property. If you think about
doing this for every property that I need, I have to say it's definitely
not fun.


Now my questions are:

- how are developers expected to find out how a property should be
  queried? How is it documented? Or is it not?


- To me, the difference between the reset method and the setProperty
  method doesn't really make sense. Why do I have two routes for
  configuration? Is there any problem in just asking the manager to feed
  every property through the XMLComponent.setProperty method?


- There are a bunch of properties defined in the Constants class, but
  they are almost always used as
  
    Constants.XERCES_PROPERTY_PREFIX + Constants.XXX_YYY_ZZZ_PROPERTY
  
  If so, why not define the full name as a constant? I find it tedious to
  define all the properties again and again.
  
  As a data-point, Constants.ENTITY_RESOLVER_PROPERTY is referenced
  14 times and 12 of them are used in this form, and the remaining two
  were more or less used in the same way.
  
  Is there any reason why constants are defined in the way they are
  today?



regards,
----------------------
Kohsuke Kawaguchi
E-Mail: kk@kohsuke.org


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-dev-help@xml.apache.org


Re: XNI question (component configuration)

Posted by Andy Clark <an...@apache.org>.
K.Kawaguchi wrote:
>>   Components are managed by a component manager. The
>>   component manager keeps track of the parser state
>>   for features and properties. The component manager
>>   is responsible for notifying each component when
>>   the value of those features and properties change.
> 
> I get an impression that I can effectively ignore the reset method and
> just rely on the setProperty method, since for any feature/property,
> it is saying that the manager is responsible for notifying changes to me.
> (assuming that "notifying each component" means "invoking setProperty")
> 
> But you are saying that the setProperty method won't be called for
> changes between parsing.

That's not what I meant to say. Sorry for the
misunderstanding. Perhaps the last part of that
paragraph should be changed to read:

                                 The component manager
    is responsible for notifying each component when
    the value of those features and properties change
    DURING PARSING.

It should also state that the component manager
initializes each component before parsing, if it
doesn't say so already.
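
The revised contract can be sketched with a toy manager. This illustrates
the wording above and is not how Xerces implements its parser
configurations; ToyManager, ManagedComponent, RecordingComponent, the
parse(Runnable) hook and the "demo-property" id are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an XNI component.
interface ManagedComponent {
    void reset(ToyManager manager);
    void setProperty(String propertyId, Object value);
}

class ToyManager {
    private final Map<String, Object> properties = new HashMap<>();
    private final List<ManagedComponent> components = new ArrayList<>();
    private boolean parsing;

    void addComponent(ManagedComponent component) {
        components.add(component);
    }

    Object getProperty(String propertyId) {
        return properties.get(propertyId);
    }

    // The value is always stored; components are notified only DURING PARSING.
    void setProperty(String propertyId, Object value) {
        properties.put(propertyId, value);
        if (parsing) {
            for (ManagedComponent c : components) {
                c.setProperty(propertyId, value);
            }
        }
    }

    // Every component is reset before parsing starts.
    void parse(Runnable document) {
        for (ManagedComponent c : components) {
            c.reset(this);
        }
        parsing = true;
        try {
            document.run();
        } finally {
            parsing = false;
        }
    }
}

// Records which callbacks it receives, in order.
class RecordingComponent implements ManagedComponent {
    static final String WATCHED = "demo-property"; // hypothetical id
    final List<String> log = new ArrayList<>();

    @Override
    public void reset(ToyManager manager) {
        log.add("reset sees " + manager.getProperty(WATCHED));
    }

    @Override
    public void setProperty(String propertyId, Object value) {
        log.add("set " + propertyId + "=" + value);
    }
}
```

Under this contract a value set between parses reaches a component only
through the next reset, which is why a component cannot rely on
setProperty alone.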

Speaking of which, did you read the XNI Manual
in the documentation? Do you have suggestions
to improve the content?

> So I think my question still stands. For any given property (be it
> SymbolTable, ErrorHandler or what not), is there any way to know whether
> I should be prepared for it to be changed during parsing or not?

Each component makes up its own mind about what
settings are allowed to be changed during parsing.
In other words, the component does not have to
listen to changes if the changes do not affect the
way that the component operates.

Beyond that, however, we do have a lot of Xerces
components that do NOT change during parsing. But
I don't think we've properly stated that within
the documentation.


> I see. Then I think I'm going to ask if it makes sense for the full form
> of the property name (PREFIX+PROPERTY) to be defined as well as the
> short form (just PROPERTY).
> 
> As I wrote, 12 different classes are defining 12 different constants of
> the same string. (And I'm just picking up this property randomly and I'm
> not trying to show the worst example here.) It seems like a lot of waste.
> 
> Or is there any reason why the full form should not be defined as
> a constant?

The Java compiler will automatically copy static
final strings from other classes/interfaces into
the class file being compiled. This also applies
to strings that are constructed by using the "+"
operator. But this does take up more space.
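
A small sketch of that folding, assuming these prefix and identifier
values. PropertyIds and ResolverUser are hypothetical classes, not
Xerces code.

```java
// Hypothetical central holder of the constants (values are assumptions).
class PropertyIds {
    static final String XERCES_PROPERTY_PREFIX = "http://apache.org/xml/properties/";
    static final String ENTITY_RESOLVER_PROPERTY = "internal/entity-resolver";

    // Still a compile-time constant, because both operands are constants.
    static final String ENTITY_RESOLVER_FULL =
        XERCES_PROPERTY_PREFIX + ENTITY_RESOLVER_PROPERTY;
}

// A class that rebuilds the full name locally, as described in the thread.
class ResolverUser {
    // The compiler folds this into a single string literal stored in
    // ResolverUser's own class file.
    static final String ENTITY_RESOLVER =
        PropertyIds.XERCES_PROPERTY_PREFIX + PropertyIds.ENTITY_RESOLVER_PROPERTY;

    // Reference comparison is intentional: constant strings are interned,
    // so both names refer to the same String instance at run time.
    static boolean sharesFoldedLiteral() {
        return ENTITY_RESOLVER == PropertyIds.ENTITY_RESOLVER_FULL;
    }
}
```

Both forms cost the same at run time. The difference is only that each
class file using the string carries its own copy of the literal.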

I guess making Constants an interface and having
components implement the interface would be more
efficient in that regard. Oh well... I really
don't feel like modifying all of the Xerces code
to change it now.

-- 
Andy Clark * andyc@apache.org




Re: XNI question (component configuration)

Posted by "K.Kawaguchi" <kk...@kohsuke.org>.
> However, that being said, if the component does not
> care about changes to settings during parsing, then
> it does not need to "listen" to changes sent to the
> setFeature/Property methods. In other words, the
> component may query the value of the settings before
> parsing (in the reset method) and then ignore any
> changes that may occur afterwards (as communicated
> by the setFeature/Property methods). These would be
> settings that are not allowed (by the component) to
> change.

Thank you for your clarification. But I'm still a bit puzzled.
If I read the following literally,

>    Components are managed by a component manager. The
>    component manager keeps track of the parser state
>    for features and properties. The component manager
>    is responsible for notifying each component when
>    the value of those features and properties change.

I get the impression that I can effectively ignore the reset method and
just rely on the setProperty method, since it says that for any
feature/property the manager is responsible for notifying me of changes
(assuming that "notifying each component" means "invoking setProperty").

But you are saying that the setProperty method won't be called for
changes between parses.

So I think my question still stands. For any given property (be it
SymbolTable, ErrorHandler or what not), is there any way to know whether
I should be prepared for it to be changed during parsing or not?

You wrote:

> However, that being said, if the component does not care
> about changes to settings during parsing,

But that doesn't seem like a decision for each component to make.
Whether a property can be changed during parsing is determined by the
definition of the property. Therefore, to write a well-behaved component,
I think I need to find out this behavior for each property (hence my
original question).

By the way, why doesn't the component manager notify components of
property changes between parses? If it did, I could just implement
setProperty and ignore the reset method, which would be easier to write.


This is just feedback from an external component developer, so please
take it as such.




> >   Is there any reason why constants are defined in the way they are
> >   today?
> 
> Yes.
> 
> By defining the prefix separate from the feature or
> property identifier, you can write code such as:
> 
>    if (featureId.startsWith(Constants.XERCES_FEATURE_PREFIX)) {
>      String feature =
>          featureId.substring(Constants.XERCES_FEATURE_PREFIX.length());
>      if (feature.equals(XXX_YYY_ZZZ_FEATURE)) {
>        // do something
>      }
>    }
> 
> If you don't like that style, then you don't have
> to use it.

I see. Then I think I'm going to ask if it makes sense for the full form
of the property name (PREFIX+PROPERTY) to be defined as well as the
short form (just PROPERTY).

As I wrote, 12 different classes define 12 different constants for
the same string. (I picked this property at random; I'm not trying
to show the worst example here.) It seems like a lot of waste.

Or is there any reason why the full form should not be defined as
a constant?


regards,
----------------------
Kohsuke Kawaguchi
E-Mail: kk@kohsuke.org




Re: XNI question (component configuration)

Posted by Andy Clark <an...@apache.org>.
K.Kawaguchi wrote:
> - how are developers expected to find out how a property should be
>   queried? How is it documented? Or is it not?

Here is some appropriate text from the documentation[1].

   Components are managed by a component manager. The
   component manager keeps track of the parser state
   for features and properties. The component manager
   is responsible for notifying each component when
   the value of those features and properties change.

   Before parsing a document, a parser configuration
   must use the component manager to reset all of the
   parser components. Then, during parsing, each time
   a feature or property value is modified, all of the
   components must be informed of the change.

The reset() method is used to initialize a component.
The setFeature/Property() methods are used *during*
parsing to report that a setting has changed.

You can't just rely on setFeature/Property because
you will not be notified of the values of settings
unless they explicitly change during the course of
parsing. If there are components in Xerces that
rely on this style, then they should be fixed.

However, that being said, if the component does not
care about changes to settings during parsing, then
it does not need to "listen" to changes sent to the
setFeature/Property methods. In other words, the
component may query the value of the settings before
parsing (in the reset method) and then ignore any
changes that may occur afterwards (as communicated
by the setFeature/Property methods). These would be
settings that are not allowed (by the component) to
change.
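
As a minimal sketch of such a setting, assuming a hypothetical
FeatureSource stand-in for the component manager and a hypothetical
SnapshotComponent: the value is read once in reset, and later
notifications are deliberately ignored until the next reset.

```java
// Hypothetical stand-in for the feature-lookup side of a component manager.
interface FeatureSource {
    boolean getFeature(String featureId);
}

class SnapshotComponent {
    // The standard SAX validation feature id.
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    private boolean validating;

    // Snapshot the setting before parsing.
    void reset(FeatureSource manager) {
        validating = manager.getFeature(VALIDATION);
    }

    // This component does not allow the setting to change mid-parse,
    // so the notification is accepted and ignored.
    void setFeature(String featureId, boolean state) {
        // intentionally empty
    }

    boolean isValidating() {
        return validating;
    }
}
```

The new value still takes effect, but only when the manager resets the
component before the next parse.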

> - To me, the difference between the reset method and the setProperty
>   method doesn't really make sense. Why do I have two routes for
>   configuration? Is there any problem in just asking manager to feed
>   every property through the XMLComponent.setProperty method?

I hope that my previous explanation clarifies this
point.

> - There are bunch of properties defined in the Constants class, but it
>   is almost always used as
>   
>     Constants.XERCES_PROPERTY_PREFIX + Constants.XXX_YYY_ZZZ_PROPERTY
>   
>   If so, why not define this as the constant? I find it tedious to
>   define all the properties again and again.
>   
>   As a data-point, Constants.ENTITY_RESOLVER_PROPERTY is referenced
>   14 times and 12 of them are used in this form, and the remaining two
>   were more or less used in the same way.
>   
>   Is there any reason why constants are defined in the way they are
>   today?

Yes.

By defining the prefix separate from the feature or
property identifier, you can write code such as:

   if (featureId.startsWith(Constants.XERCES_FEATURE_PREFIX)) {
     String feature =
         featureId.substring(Constants.XERCES_FEATURE_PREFIX.length());
     if (feature.equals(XXX_YYY_ZZZ_FEATURE)) {
       // do something
     }
   }

If you don't like that style, then you don't have
to use it.
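
A self-contained version of that style might look like the sketch below.
FeatureIds and its two helper methods are hypothetical, and the constant
values are assumptions filled in so that the example runs without Xerces
on the classpath.

```java
// Hypothetical holder mirroring the prefix/suffix split of Constants.
class FeatureIds {
    static final String XERCES_FEATURE_PREFIX = "http://apache.org/xml/features/";
    static final String SCHEMA_VALIDATION_FEATURE = "validation/schema";

    // Returns the part after the Xerces prefix, or null for non-Xerces ids.
    static String xercesSuffix(String featureId) {
        if (featureId.startsWith(XERCES_FEATURE_PREFIX)) {
            return featureId.substring(XERCES_FEATURE_PREFIX.length());
        }
        return null;
    }

    // Compares only the short form, as in the snippet above.
    static boolean isSchemaValidation(String featureId) {
        return SCHEMA_VALIDATION_FEATURE.equals(xercesSuffix(featureId));
    }
}
```

With the prefix checked once, a component that recognizes several
features can compare the remaining suffix against each short constant
in turn.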

I hope this helps. Let me know if you have any more
questions or comments on how we can make the parser
configuration easier to understand.

[1] http://xml.apache.org/xerces2-j/xni-config.html

-- 
Andy Clark * andyc@apache.org

