
Posted to dev@httpd.apache.org by Ben Laurie <be...@algroup.co.uk> on 2000/01/17 22:04:00 UTC

Re: XML config

Stefano Mazzocchi wrote:
> 
> Ehmmm,
> 
> pardon me, you guys. But having dealt with XML configurations for about
> a year now, I think I have something to say in this area.
> 
> Ben had a problem: how to validate an XML configuration. The problem is
> resolved: you don't. Period. XSchema will not be powerful enough to
> allow an external tool to validate, say, apache configurations before
> this is applied to the web server. The web server will validate it and
> this is how it works for configurations.

That wasn't my only problem, but I have to ask: if you can't validate
it, how can you define it? If you can't define it, how can you expect
anyone to use it?

> - the worst case is where configurations overlap between modules.
> 
> The question is: can we topologically redefine the DTD so that
> overlapping configurations don't overlap anymore? can something like
> inside Xlinks help in this area?
> 
> I admit I don't know the answer to this last question, but I'm more than
> willing to help to find one since it would solve both Apache and Tomcat
> problems as well as establish some design patterns useful for other
> projects (cocoon, for example).
> 
> What do you say?

I say "let's go". Now, there are some other problems that need
addressing. In no particular order, here they are:

a) Namespaces. I think this is a no-brainer if you are prepared to have
long names, but is that reasonable?

b) Conditions: a vital point, I think, is that some Apache modules deal
with what configuration is currently applicable, and the rest only worry
about the current applicable configuration. This is currently handled by
mucho magic inside Apache, and it'd be nice if it weren't.

It occurs to me that what I was really trying to get at with my proposal
was the idea that as each element is processed, it should determine how
the contained elements are processed (or not) by the module that
processes the containing element, even if the contained elements are
unknown to it. So long as something has the ability to make this happen,
and it is possible to cache the results, then the <Module...> notation
is overkill (the relevant module can be determined by the element name).
I'm beginning to think this is actually trivial but I still can't quite
put my finger on how it's done.

As for DTD-like stuff, well, the per-module DTD-like-thing needs to say
whether each top-level element can be contained within another module's
stuff, or not (i.e. is the configuration element server-wide, or not),
and, on the other side of the coin, whether it can contain other
module's stuff. I'm not sure there's any need for more subtlety than
that, is there? Or am I missing something?
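
A minimal sketch of the two yes/no rules described above, in Python. The
element names, the module names, and the ElementRule fields are all
hypothetical; this is not how Apache processes its configuration, only an
illustration of how little the per-module description would need to say.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementRule:
    module: str       # hypothetical name of the module that owns the element
    nestable: bool    # may sit inside another module's element
    container: bool   # may contain other modules' elements


# Hypothetical declarations, one per top-level element.
RULES = {
    "vhost": ElementRule("core", nestable=False, container=True),
    "alias": ElementRule("mod_alias", nestable=True, container=False),
    "listen": ElementRule("core", nestable=False, container=False),
}


def check(config_xml, rules=RULES):
    """Return a list of containment violations found in the config."""
    errors = []

    def walk(parent, parent_rule):
        for child in parent:
            rule = rules.get(child.tag)
            if rule is None:
                errors.append(f"<{child.tag}>: no module claims this element")
                continue
            # The rules only matter when two different modules meet.
            if parent_rule is not None and parent_rule.module != rule.module:
                if not rule.nestable:
                    errors.append(
                        f"<{child.tag}> is server-wide; "
                        f"it cannot sit inside <{parent.tag}>")
                if not parent_rule.container:
                    errors.append(
                        f"<{parent.tag}> cannot contain "
                        f"other modules' elements")
            walk(child, rule)

    # The document element is treated as a neutral wrapper.
    walk(ET.fromstring(config_xml), None)
    return errors
```

Under these assumptions, check('<config><vhost><alias/></vhost></config>')
reports nothing, while nesting <vhost> inside <alias> trips both rules.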

Cheers,

Ben.

--
SECURE HOSTING AT THE BUNKER! http://www.thebunker.net/hosting.htm

http://www.apache-ssl.org/ben.html

"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
     - Indira Gandhi

RE: XML config

Posted by Ricardo Rocha <ri...@apache.org>.

> > Stefano Mazzocchi wrote:

> > >     I believe XPath is too complex for what we need... But I'm 
> > >     flexible (if you need XPATH, why not use XSLT? It might
> > >     be more complex, but....)

> > XSLT is not self-referencing, this is the key point.

> Pier Paolo Fumagalli wrote:
> W.H.A.T? I really don't understand what self-referencing means...

I wonder if this has to do with the ability to repeatedly apply
XSLT templates to an XSLT result tree in a recursive fashion.

If that's the case, such "recursion" can be achieved by storing
the result of a previous <xsl:apply-templates/> in an XSLT
variable and then using the resulting node variable value as the
select expression for a subsequent <xsl:apply-templates/>

Another radically dynamic, lisp-like  possibility is to have an
XSLT stylesheet generate another stylesheet by using
namespace aliasing, but that sounds way overkill. I guess it
should always be possible to achieve the same effect using
the above mentioned technique.

On the other hand, I might well be missing Stefano's point
here... my apologies in advance.

Ricardo

Re: XML config

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

Stefano Mazzocchi wrote:
> 
> > I believe XPath is too complex for what we need... But I'm flexible (if
> > you need XPATH, why not use XSLT? It might be more complex, but....)
> 
> XSLT is not self-referencing, this is the key point.

W.H.A.T? I really don't understand what self-referencing means...

> > > But this should not be done at this level, but in more XML-oriented
> > > forums.
> >
> > general@xml.apache.org?
> 
> yes, or xml-dev, or create a new xml-config@xml.apache.org just for that
> and move over the people from rdf-www-config@w3c.org (i already asked
> eric@w3c.org about it)

If you want to host it on xml.apache.org, ask the PMC :)

	Pier

-- 
--------------------------------------------------------------------
-          P              I              E              R          -
stable structure erected over water to allow the docking of seacraft
<ma...@betaversion.org>    <http://www.betaversion.org/~pier/>

Re: XML config

Posted by Stefano Mazzocchi <st...@apache.org>.

Pierpaolo Fumagalli wrote:
> 
> Stefano Mazzocchi wrote:
> >
> > Seriously, I made a proposal a couple of weeks ago on the
> > general@xml.apache.org mail list about what I called "XML Inheritance",
> > a way to allow your xml file to "inherit" parts of other documents or
> > parts of the document itself.
> > [...]
> > nice and simple.
> 
> I still have some doubts on how complex documents are merged... We need
> to have a very strict set of rules to do that....

Yep, this is what specs are for :)
 
> > Other people (especially Donald Ball) proposed some internal
> > hard-linking capabilities using XPaths... something along the lines of
> > what you proposed.
> 
> I believe XPath is too complex for what we need... But I'm flexible (if
> you need XPATH, why not use XSLT? It might be more complex, but....)

XSLT is not self-referencing, this is the key point.
 
> > In conclusion: the XML model is not ready to handle configuration in a
> > good way. DTD should not be used. Thus alternatives to external entity
> > inclusion should be evaluated with XInclude and XInherit proposals. This
> > is a very general thing and might apply to all possible XML
> > configurations and to most XML datapages and documents.
> 
> Agreed...
> 
> > But this should not be done at this level, but in more XML-oriented
> > forums.
> 
> general@xml.apache.org?

yes, or xml-dev, or create a new xml-config@xml.apache.org just for that
and move over the people from rdf-www-config@w3c.org (i already asked
eric@w3c.org about it)

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
 Come to the first official Apache Software Foundation Conference!  
------------------------- http://ApacheCon.Com ---------------------

Re: XML config

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

Stefano Mazzocchi wrote:
> 
> Seriously, I made a proposal a couple of weeks ago on the
> general@xml.apache.org mail list about what I called "XML Inheritance",
> a way to allow your xml file to "inherit" parts of other documents or
> parts of the document itself.
> [...] 
> nice and simple.

I still have some doubts on how complex documents are merged... We need
to have a very strict set of rules to do that....

> Other people (especially Donald Ball) proposed some internal
> hard-linking capabilities using XPaths... something along the lines of
> what you proposed.

I believe XPath is too complex for what we need... But I'm flexible (if
you need XPATH, why not use XSLT? It might be more complex, but....)

> In conclusion: the XML model is not ready to handle configuration in a
> good way. DTD should not be used. Thus alternatives to external entity
> inclusion should be evaluated with XInclude and XInherit proposals. This
> is a very general thing and might apply to all possible XML
> configurations and to most XML datapages and documents.

Agreed...

> But this should not be done at this level, but in more XML-oriented
> forums.

general@xml.apache.org?

	Pier

Re: XML config

Posted by Stefano Mazzocchi <st...@apache.org>.

Dan Kearns wrote:

> again you would have to set some sort of standard for how to write the
> internal references - specifically which directions the pointers need to
> point.

Dan,

welcome to the "XML Inheritance Lobby" :)

Seriously, I made a proposal a couple of weeks ago on the
general@xml.apache.org mail list about what I called "XML Inheritance",
a way to allow your xml file to "inherit" parts of other documents or
parts of the document itself.

Both inclusion and inheritance are done at the XSLT level with
xsl:import and xsl:include but some people (myself included) think these
are general enough patterns that they deserve a special addition to the
global XML model.

The XInclude proposal recently submitted to the W3C goes in this direction.
We try to complete the thing by adding XInherit, which should allow us
to introduce OO inheritance capabilities to XML documents.

Example: say you have something like

<page>
 <author>Stefano</author>
 <body/>
 <legal>Copyright by me</legal>
</page>

you write something like this

<page xml:extend="template.xml">
 <body>
  <p>Hi, this is my page</p>
 </body>
</page>

and you end up with

<page>
 <author>Stefano</author>
 <body>
  <p>Hi, this is my page</p>
 </body>
 <legal>Copyright by me</legal>
</page>

nice and simple.
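
The merge in the example above can be sketched in a few lines of Python. The
merge rule used here is an assumption, since the proposal does not spell one
out: an element in the extending document replaces the first same-named child
of the template, and anything the template lacks is appended. The inherit()
function and the load_template callback are hypothetical names.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is predefined, so xml:extend parses without a declaration.
XML_NS = "http://www.w3.org/XML/1998/namespace"
EXTEND = f"{{{XML_NS}}}extend"


def inherit(child_xml, load_template):
    """Merge a document into the template named by its xml:extend attribute.

    load_template maps the attribute value to the template's XML text.
    """
    child = ET.fromstring(child_xml)
    href = child.attrib.get(EXTEND)
    if href is None:
        return child                      # nothing to inherit from
    merged = ET.fromstring(load_template(href))
    for override in list(child):
        existing = merged.find(override.tag)
        if existing is None:
            merged.append(override)       # assumed rule: append new children
        else:
            # Assumed rule: replace in place, keeping the template's order.
            index = list(merged).index(existing)
            merged.remove(existing)
            merged.insert(index, override)
    return merged


# The template and the extending page from the example above.
TEMPLATES = {
    "template.xml": (
        "<page><author>Stefano</author><body/>"
        "<legal>Copyright by me</legal></page>"
    ),
}

result = inherit(
    '<page xml:extend="template.xml">'
    "<body><p>Hi, this is my page</p></body></page>",
    TEMPLATES.__getitem__,
)
```

This handles only one level of children; merging deeper or repeated elements
needs exactly the strict rules a spec would have to define.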

Other people (especially Donald Ball) proposed some internal
hard-linking capabilities using XPaths... something along the lines of
what you proposed.

In conclusion: the XML model is not ready to handle configuration in a
good way. DTD should not be used. Thus alternatives to external entity
inclusion should be evaluated with XInclude and XInherit proposals. This
is a very general thing and might apply to all possible XML
configurations and to most XML datapages and documents.

But this should not be done at this level, but in more XML-oriented
forums.

Stefano.

Re: XML config

Posted by Dan Kearns <Da...@motorola.com>.

Ben Laurie <be...@algroup.co.uk> wrote:
> Stefano Mazzocchi wrote:
> > Ben had a problem: how to validate an XML configuration. The problem is
> > resolved: you don't. Period. XSchema will not be powerful enough to
> > allow an external tool to validate, say, apache configurations before
> > this is applied to the web server. The web server will validate it and
> > this is how it works for configurations.

> That wasn't my only problem, but I have to ask: if you can't validate
> it, how can you define it? If you can't define it, how can you expect
> anyone to use it?

Validation is pretty much overrated, and like Stefano says, XSchema really
doesn't fit yet as a way to describe behavioral traits - it's much better
for datatype-like constraints.

Instead of outputting to the DTD spec, think of it as the output determines
the DTD spec. You could use namespaces if you want, but it might be simpler
to follow a tag naming standard like <moduleName_tagName>.

For example, we tend to write stuff that spits out xml by calling up along
an object inheritance stack. Each layer outputs a self-contained block which
by itself would represent whatever that supertype was. Unrelated objects can
be aggregated into a container just by appending them and slapping the
container tags on the front and back.

Say B extends A and C is unrelated. You might end up with a config like:

<config>
  <b><dataForB/>
    <a><dataForA/></a>
  </b>
  <c><dataForC/></c>
</config>

If you know what to do with all the <a/>, you can either process them as you
come to them or apply a stylesheet that pulls them out into however you want
them. If you are responsible for processing <a/> you can always validate
against it by applying a copy-through stylesheet which just looks like:

<xsl:template match="/">
  <xsl:apply-templates select="//a"/>
</xsl:template>

If you own <b/> you have to either track changes in the DTD for <a/> or make
yourself immune to them. (btw if you write the DTD for <a/> it is helpful to
actually surround it in something like a <setOfA> to make the above simpler)
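
The same "pull out every <a/>" step can be sketched without a stylesheet. This
is a Python illustration only, using the standard library's ElementTree; the
collect() helper is a hypothetical name, and the <setOfA> wrapper follows the
suggestion above.

```python
import xml.etree.ElementTree as ET

CONFIG = """
<config>
  <b><dataForB/>
    <a><dataForA/></a>
  </b>
  <c><dataForC/></c>
</config>
"""


def collect(config_xml, tag, wrapper):
    """Gather every <tag> element, at any depth, under a new <wrapper>."""
    root = ET.fromstring(config_xml)
    out = ET.Element(wrapper)
    # iter() walks the whole tree, much like select="//a" does in XPath.
    for element in root.iter(tag):
        out.append(element)
    return out


set_of_a = collect(CONFIG, "a", "setOfA")
```

The code that owns <a/> can then validate or process set_of_a on its own,
without knowing that the block was originally nested inside <b/>.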

We also often end up needing internal references to "things we can't prove
exist". For example, say one module lets you configure virtual hosts, and
another lets you add aliases to a specific virtual host. You can keep the
two orthogonal as far as the XML goes with ID tags (although I wouldn't
count on IDREF tags yet):

<config>
  <vhost id="x"> ... </vhost>
  <alias id="y" vhost="x">...</alias>
</config>

again stylesheets can rewrite this in such a way that the code which
processes the <vhost/> block gets everything it needs out of the <alias/>
block and vice-versa. The long-winded method looks sort of like:
<xsl:variable name="me" select="./@id"/> followed by something like
select="//alias[@vhost=$me]"

again you would have to set some sort of standard for how to write the
internal references - specifically which directions the pointers need to
point.
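
A rough Python sketch of resolving those internal references, assuming the
pointers run from <alias/> to <vhost/> as in the example. The sample ids and
the link_aliases() helper are made up for illustration; the point is that
dangling references ("things we can't prove exist") surface at processing
time rather than through the DTD.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up sample: one alias points at a vhost that is not defined.
CONFIG = """
<config>
  <vhost id="x"/>
  <vhost id="z"/>
  <alias id="y" vhost="x"/>
  <alias id="w" vhost="missing"/>
</config>
"""


def link_aliases(config_xml):
    """Group alias ids by the vhost they point at; report dangling ones."""
    root = ET.fromstring(config_xml)
    vhost_ids = {vhost.get("id") for vhost in root.iter("vhost")}
    by_vhost = defaultdict(list)
    dangling = []
    for alias in root.iter("alias"):
        target = alias.get("vhost")
        if target in vhost_ids:
            by_vhost[target].append(alias.get("id"))
        else:
            dangling.append(alias.get("id"))
    return dict(by_vhost), dangling


aliases, dangling = link_aliases(CONFIG)
```

With the pointers in this direction the vhost module never needs to know that
aliases exist; reversing them would put that knowledge in the vhost block.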

help any?

-d
---
Dan Kearns <Da...@motorola.com>  +1-602-383-5011

Re: XML config

Posted by James Todd <jw...@pacbell.net>.

my .02 ... i don't see a problem with storing an xml file, if only
in memory, from one system to another.

to the point, tomcat today has local server.xml as a persistent and
readily configurable format (no need for any other tools) ... as
i see it, tomcat could be modified so that it obtained a derived
config file, in xml format, from any source in addition to a local
file read. the code is not all that difficult (eg one url read) and
furthers the administration capabilities of tomcat tremendously
imo. the implementation details as to how the xml formatted data
was derived should be hidden behind the service that generated the
data.
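
a small sketch of the "local file today, url pull tomorrow" idea, in python
rather than tomcat's java. load_config() is a hypothetical name and this is
not tomcat code; it only shows that the caller sees the same parsed tree
whichever way the xml arrived.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse
from urllib.request import urlopen


def load_config(source):
    """Parse an XML config from a local path or from a URL.

    Anything with an http, https, or file scheme is fetched with urlopen;
    everything else is treated as a local file path.
    """
    if urlparse(source).scheme in ("http", "https", "file"):
        with urlopen(source) as response:
            return ET.parse(response).getroot()
    return ET.parse(source).getroot()
```

whatever service sits behind the url (a database, a directory, a generator)
stays hidden from the code that consumes the returned element tree.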

this was my intention of bringing server.xml on the scene. a local
file read today, a url pull tomorrow, we'll see what next the day
after. being a practical guy i chose simple steps along the path to
this objective. it is quite possible that others have interpreted this
planned and staged roll-out differently than it was intended.

hope this helps,

- james

costin@costin.dnt.ro wrote:

> > > It's like taking the data out of a relational database and save it in XML,
> > > and then parsing the XML to do a search or manipulation.
> >
> > Hmm. Oracle 8i allows you to do a query and get the results as an
> > XML document, instead of as a JDBC result set.
> >
> > Making XML the wrapper for relational (or LDAP, or whatever) data does
> > not
> > mean losing the abilities of the underlying store; it just means that
> > queries return a single format parseable in a generic way.
>
> > It's another question whether baking in XML knowledge is the right thing
> > to do, but using XML as the data representation does not imply using it
> > as the storage mechanism.
>
> If you read carefully all the configuration proposals based on XML, that's
> exactly what they want - XML is used as the storage for the configuration
> info, and the software is based around reading this file.
>
> It's not XML used as data representation, it's XML used as storage and
> manipulation and query.
>
> In your example, Oracle 8i doesn't store its data in XML, and the
> software will ask oracle to do a query and return the result, it will not
> get the full database and parse it and compute the response.
>
> For configuration - the software should be able to connect to a
> configuration service and query it, instead of exporting the data from
> LDAP to XML and then parsing it and query.
>
> When you save it in XML you lose the ability of LDAP - you can't get
> notification ( triggers) when data changes, you can't have ACL to control
> access to individual portions, etc, etc.
>
> Costin
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: tomcat-dev-help@jakarta.apache.org

Re: XML config

Posted by James Todd <jw...@pacbell.net>.


costin@costin.dnt.ro wrote:

> > > If you read carefully all the configuration proposals based on XML, that's
> > > exactly what they want - XML is used as the storage for the configuration
> > > info, and the software is based around reading this file.
> >
> > I guess I missed that -- I thought the discussion was about whether
> > Tomcat should read its configs as an XML node vs. Ant-style
> > introspection or a wrapper like XmlTree.
>
> > The only point I'm making is that having Tomcat know about an
> > XML tree does not require storing XML in a file somewhere, and that
> > adaptors from various storage mechanisms to XML are popping up all
> > over to enable such things.
>
> Yes, you can create a DOM adapter for any repository ( including directory
> service), but a directory service (or database ) is not only a data
> storage, and DOM is not equivalent with JNDI.
>
> Also, some people think you can support LDAP by reading the data in a
> file, and using this file for configuration. That misses the whole idea of
> directory services.

?really? news to me.

i don't mind, and in fact highly encourage, interchangeable format
interchanges but to chain two things together unnecessarily is a bit
out there in my book ... interesting.

>
>
> Costin
>

Re: XML config

Posted by co...@costin.dnt.ro.

> > If you read carefully all the configuration proposals based on XML, that's
> > exactly what they want - XML is used as the storage for the configuration
> > info, and the software is based around reading this file.
> 
> I guess I missed that -- I thought the discussion was about whether
> Tomcat should read its configs as an XML node vs. Ant-style 
> introspection or a wrapper like XmlTree. 

> The only point I'm making is that having Tomcat know about an
> XML tree does not require storing XML in a file somewhere, and that
> adaptors from various storage mechanisms to XML are popping up all
> over to enable such things.
 
Yes, you can create a DOM adapter for any repository ( including directory
service), but a directory service (or database ) is not only a data
storage, and DOM is not equivalent with JNDI. 

Also, some people think you can support LDAP by reading the data in a
file, and using this file for configuration. That misses the whole idea of
directory services.

Costin

Re: XML config

Posted by Rod McChesney <ro...@korobra.com>.

> If you read carefully all the configuration proposals based on XML, that's
> exactly what they want - XML is used as the storage for the configuration
> info, and the software is based around reading this file.

I guess I missed that -- I thought the discussion was about whether
Tomcat should read its configs as an XML node vs. Ant-style 
introspection or a wrapper like XmlTree. 

The only point I'm making is that having Tomcat know about an
XML tree does not require storing XML in a file somewhere, and that
adaptors from various storage mechanisms to XML are popping up all
over to enable such things.

Rod


costin@costin.dnt.ro wrote:
> [snip]
>
> If you read carefully all the configuration proposals based on XML, that's
> exactly what they want - XML is used as the storage for the configuration
> info, and the software is based around reading this file.
> 
> It's not XML used as data representation, it's XML used as storage and
> manipulation and query.
> 
> In your example, Oracle 8i doesn't store its data in XML, and the
> software will ask oracle to do a query and return the result, it will not
> get the full database and parse it and compute the response.
> 
> For configuration - the software should be able to connect to a
> configuration service and query it, instead of exporting the data from
> LDAP to XML and then parsing it and query.
> 
> When you save it in XML you lose the ability of LDAP - you can't get
> notification ( triggers) when data changes, you can't have ACL to control
> access to individual portions, etc, etc.
> 
> Costin

Re: XML config

Posted by co...@costin.dnt.ro.

> > It's like taking the data out of a relational database and save it in XML,
> > and then parsing the XML to do a search or manipulation. 
> 
> Hmm. Oracle 8i allows you to do a query and get the results as an
> XML document, instead of as a JDBC result set.
> 
> Making XML the wrapper for relational (or LDAP, or whatever) data does
> not 
> mean losing the abilities of the underlying store; it just means that 
> queries return a single format parseable in a generic way.

> It's another question whether baking in XML knowledge is the right thing
> to do, but using XML as the data representation does not imply using it
> as the storage mechanism.


If you read carefully all the configuration proposals based on XML, that's
exactly what they want - XML is used as the storage for the configuration
info, and the software is based around reading this file.

It's not XML used as data representation, it's XML used as storage and
manipulation and query.

In your example, Oracle 8i doesn't store its data in XML, and the
software will ask oracle to do a query and return the result, it will not
get the full database and parse it and compute the response.

For configuration - the software should be able to connect to a
configuration service and query it, instead of exporting the data from
LDAP to XML and then parsing it and query. 

When you save it in XML you lose the ability of LDAP - you can't get
notification ( triggers) when data changes, you can't have ACL to control
access to individual portions, etc, etc. 

Costin

Re: XML config

Posted by Rod McChesney <ro...@korobra.com>.

> It's like taking the data out of a relational database and save it in XML,
> and then parsing the XML to do a search or manipulation. 

Hmm. Oracle 8i allows you to do a query and get the results as an
XML document, instead of as a JDBC result set.

Making XML the wrapper for relational (or LDAP, or whatever) data does
not 
mean losing the abilities of the underlying store; it just means that 
queries return a single format parseable in a generic way.

It's another question whether baking in XML knowledge is the right thing
to do, but using XML as the data representation does not imply using it
as the storage mechanism.

Rod McChesney


costin@costin.dnt.ro wrote:
> 
> > But look, we have an LDAPPRocessor that generated XML out of an LDAP
> > server.
> 
> Instead of going directly to the LDAP server, you generate and parse an
> XML file? So you've lost most of the LDAP advantages, and reduced it to
> some data.  Both LDAP and XML allow you to have a hierarchical
> representation and to validate it, but LDAP adds much more.
> 
> It's like taking the data out of a relational database and save it in XML,
> and then parsing the XML to do a search or manipulation. Imagine an
> enterprise doing that with the payroll. ( sure, I know that relational
> theory is ancient history and will soon be replaced by big XML files
> :-)
> 
> > This, alone, makes your point totally irrelevant. XML is a metasyntax,
> > not a container. How you come up with that XML to parse, it's an
> > implementation detail.
> 
> That's my point: XML is a metasyntax, it is great for data exchange
> (or to import/export data ), but it was never designed for replication,
> security ( including ACLs), queries, notifications, etc. Yes, you can
> manipulate an XML file and extract informations using XSL or DOM, but
> that's far away from what an LDAP server ( or database ) can do.
> What about concurrent changes, transactions, concurrent queries,
> optimization and all those nice things we are just throwing away, and
> replace with just a syntax ?
> 
> I think a configuration system should be designed so it can scale up, you
> have some requirements which are very different from documents.  At the
> low end it can use XML directly, and XML is the best way to exchange
> config information, but the system should be able to deal with
> configuration that changes ( i.e. notification), should be able to modify
> it and should expect the config system to deal with access control.
> 
> Costin
> 

Re: XML config

Posted by Assaf Arkin <ar...@exoffice.com>.

JDBC and LDAP are very well formatted, in fact better than XML by
avoiding the flexibility syndrome.

Give me a JDBC result set generated from Oracle or Sybase, an LDAP
result from OpenLDAP or NDS and I can work with it without worrying
where it came from.

Printing the result set, sending it to another server, that's where I
resort to XML.

I totally agree with Costin, an LDAPProducer is a way to work with XML
that might happen to come from an LDAP server. It's the worst way to
work with LDAP as LDAP. All the features and benefits of LDAP are lost
in the document world.

XML over-use == XML abuse

arkin

costin@costin.dnt.ro wrote:
> 
> > Anyway, I see your point, but what I was proposing is something in
> > between: all data repository servers have a problem: you don't know the
> > format of their output. Even the biggest SQL spec doesn't say anything
> > about that. But JDBC and JNDI provide you with something neutral, we
> > work over them.
> 
> Wait, wait. What do you mean by "the format of the output" ???
> You access the repository via either JDBC or JNDI ( or another API). Why
> would someone use an XML-based intermediate representation, when you can
> have already a clear and simple representation ( rows/columns or
> tree/attributes)?  The SQL spec and all directories I know have nothing to
> do with what the user will do with the data - save it in a file, print it,
> or use it in any way.
> 
> If you are talking about the protocol between the database and the client
> - most databases use some optimized binary protocols that allow multiple
> concurrent transactions, efficient transfer, etc, etc. When you process 1M
> rows you don't want to send it back as XML.
> 
> > We are _NOT_, as many think, using an XML file as a repository; that is
> > the biggest mistake ever.
> 
> Total agreement here.
> 
> > Also, if you think of XMLProducers rather than XMLParsers, you solve all
> > your LDAP problems: nobody ever said that XML must come from a disk
> > file.
> 
> Sorry, I don't understand. For example, how can my application register
> interest in a certain "node" and be notified when the node changes? Or how
> can I modify some attributes? How can I specify that I want only a small
> fragment from a large configuration ?
> 
> And why should I think in terms of XMLProducers, instead of
> Context/Attributes? XML defines ( AFAIK ) a file syntax, and that's all.
> There are APIs that allow you to read/manipulate this file.
> It is not a data representation, nor a generic entity that can be used
> instead of everything.
> 
> Even in the "pull" model ( module asking for config info), the natural
> interface is not XMLProducer - I hope. I'm ok with a "Configuration API"
>  based on what modules need to self-configure, but I can't see why we
> need to mix XML into this ?
> 
> Costin
> 
> 
> 

Re: XML config

Posted by co...@costin.dnt.ro.

> Anyway, I see your point, but what I was proposing is something in
> between: all data repository servers have a problem: you don't know the
> format of their output. Even the biggest SQL spec doesn't say anything
> about that. But JDBC and JNDI provide you with something neutral, we
> work over them.

Wait, wait. What do you mean by "the format of the output" ???
You access the repository via either JDBC or JNDI ( or another API). Why
would someone use an XML-based intermediate representation, when you can
have already a clear and simple representation ( rows/columns or
tree/attributes)?  The SQL spec and all directories I know have nothing to
do with what the user will do with the data - save it in a file, print it,
or use it in any way. 

If you are talking about the protocol between the database and the client
- most databases use some optimized binary protocols that allow multiple 
concurrent transactions, efficient transfer, etc, etc. When you process 1M
rows you don't want to send it back as XML.


> We are _NOT_, as many think, using an XML file as a repository; that is
> the biggest mistake ever.

Total agreement here.

> Also, if you think of XMLProducers rather than XMLParsers, you solve all
> your LDAP problems: nobody ever said that XML must come from a disk
> file.

Sorry, I don't understand. For example, how can my application register
interest in a certain "node" and be notified when the node changes? Or how
can I modify some attributes? How can I specify that I want only a small
fragment from a large configuration ? 

And why should I think in terms of XMLProducers, instead of
Context/Attributes? XML defines ( AFAIK ) a file syntax, and that's all.
There are APIs that allow you to read/manipulate this file.
It is not a data representation, nor a generic entity that can be used
instead of everything.

Even in the "pull" model ( module asking for config info), the natural
interface is not XMLProducer - I hope. I'm ok with a "Configuration API"
 based on what modules need to self-configure, but I can't see why we
need to mix XML into this ? 
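
To make the questions above concrete, here is a minimal Python sketch of what
a "Configuration API" with notification and partial reads might look like.
Everything in it is hypothetical (the ConfigService class, its method names,
the slash-separated paths); it is an in-memory stand-in, not JNDI, LDAP, or
any existing Tomcat interface.

```python
from collections import defaultdict


class ConfigService:
    """In-memory stand-in for a configuration service (hypothetical API)."""

    def __init__(self, initial=None):
        self._values = dict(initial or {})
        self._listeners = defaultdict(list)

    def get(self, path, default=None):
        """Read a single setting."""
        return self._values.get(path, default)

    def subtree(self, prefix):
        """Return only the entries under one node, not the whole config."""
        return {
            key: value
            for key, value in self._values.items()
            if key == prefix or key.startswith(prefix + "/")
        }

    def subscribe(self, path, callback):
        """Register interest in one node; callback(path, old, new)."""
        self._listeners[path].append(callback)

    def set(self, path, value):
        """Modify a setting and notify listeners if it actually changed."""
        old = self._values.get(path)
        self._values[path] = value
        if old != value:
            for callback in self._listeners[path]:
                callback(path, old, value)
```

A module would call subscribe("server/port", handler) once and then be told
about changes, which a one-shot parse of an exported XML file cannot offer.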

Costin

Re: XML config

Posted by Stefano Mazzocchi <st...@apache.org>.

costin@costin.dnt.ro wrote:
> 
> > But look, we have an LDAPPRocessor that generated XML out of an LDAP
> > server.
> 
> Instead of going directly to the LDAP server, you generate and parse an
> XML file? So you've lost most of the LDAP advantages, and reduced it to
> some data.  Both LDAP and XML allow you to have a hierarchical
> representation and to validate it, but LDAP adds much more.

> It's like taking the data out of a relational database and save it in XML,
> and then parsing the XML to do a search or manipulation. Imagine an
> enterprise doing that with the payroll. ( sure, I know that relational
> theory is ancient history and will soon be replaced by big XML files
> :-)

No, it will never happen, at least as far as my words count something...

Anyway, I see your point, but what I was proposing is something in
between: all data repository servers have a problem: you don't know the
format of their output. Even the biggest SQL spec doesn't say anything
about that. But JDBC and JNDI provide you with something neutral, we
work over them.

XML solves that problem. In this sense: Cocoon provides you with tools
to run queries against such data repository and translate the results
into XML.

We are _NOT_, as many think, using an XML file as a repository; that is
the biggest mistake ever.

> > This, alone, makes your point totally irrelevant. XML is a metasyntax,
> > not a container. How you come up with that XML to parse, it's an
> > implementation detail.
> 
> That's my point: XML is a metasyntax, it is great for data exchange
> (or to import/export data ), but it was never designed for replication,
> security ( including ACLs), queries, notifications, etc. Yes, you can
> manipulate an XML file and extract informations using XSL or DOM, but
> that's far away from what an LDAP server ( or database ) can do.

Sure. Never said XML would replace databases...

> What about concurent changes, transactions, concurent queries,
> optimization and all those nice things we are just throwing away, and
> replace with just a syntax ?

We're what? This is not what we are doing.

> I think a configuration system should be desinged so it can scale up, you
> have some requirements which are very different from documents.  At the
> low end it can use XML directly, and XML is the best way to exchange
> config information, but the system should be able to deal with
> configuration that change ( i.e. notification), should be able to modify
> it and should exect the config system to deal with access control.

Look: this software business is about layering. Parallelization.

Your software cannot be built directly on top of XML configurations for
a simple reason: you don't want every component to write or use a parser
every time. So, in Avalon, we push your Configurations into your method
and you access that object as your local repository of configurations.

Next: how does the application come up with that repository?

That's an open question, but it could:

1) use Java properties (some XML is equivalent to Java properties and
even less verbose)

2) use a static XML file

3) use another syntax.

So you can either make the configuration parser pluggable, or accept
only XML configurations and transform whatever you want into XML, which
simplifies the parsing stage.

Also, if you think in terms of XMLProducers rather than XMLParsers, you
solve all your LDAP problems: nobody ever said that XML must come from a
disk file.
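The layering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Configuration, from_properties, from_xml and Logger are hypothetical and are not Avalon's or Cocoon's actual API. The point is that the component receives a ready-made configuration object and never sees a parser, while the source of that object (properties text, XML text, or anything else) stays pluggable.

```python
import xml.etree.ElementTree as ET


class Configuration:
    """Hypothetical read-only view handed to a component (not Avalon's API)."""

    def __init__(self, values):
        self._values = dict(values)

    def get(self, key, default=None):
        return self._values.get(key, default)


def from_properties(text):
    """Producer 1: Java-properties-style 'key = value' lines."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return Configuration(values)


def from_xml(text):
    """Producer 2: a flat XML document, one child element per setting."""
    root = ET.fromstring(text)
    return Configuration({child.tag: (child.text or "").strip() for child in root})


class Logger:
    """A component that is configured without ever touching a parser."""

    def configure(self, config):
        self.level = config.get("level", "notice")
        self.file = config.get("file")


props = "level = debug\nfile = logs/app.log\n"
xml_text = "<logger><level>debug</level><file>logs/app.log</file></logger>"

a, b = Logger(), Logger()
a.configure(from_properties(props))
b.configure(from_xml(xml_text))
print(a.level == b.level and a.file == b.file)
```

A producer that read the same keys from a directory server would plug in the same way, as long as it returned the same kind of object.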

Hope this helps to clarify my ideas.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
 Come to the first official Apache Software Foundation Conference!  
------------------------- http://ApacheCon.Com ---------------------

Re: XML config

Posted by co...@costin.dnt.ro.

> But look, we have an LDAPPRocessor that generated XML out of an LDAP
> server.

Instead of going directly to the LDAP server, you generate and parse an
XML file? So you've lost most of the LDAP advantages, and reduced it to
some data.  Both LDAP and XML allow you to have a hierarchical
representation and to validate it, but LDAP adds much more.

It's like taking the data out of a relational database and saving it in
XML, and then parsing the XML to do a search or manipulation. Imagine an
enterprise doing that with the payroll. (Sure, I know that relational
theory is ancient history and will soon be replaced by big XML files
:-)

> This, alone, makes your point totally irrilevant. XML is a metasyntax,
> not a container. Who you come up with that XML to parse, it's an
> implementation detail.

That's my point: XML is a metasyntax, it is great for data exchange
(or to import/export data), but it was never designed for replication,
security (including ACLs), queries, notifications, etc. Yes, you can
manipulate an XML file and extract information using XSL or DOM, but
that's far away from what an LDAP server (or database) can do.
What about concurrent changes, transactions, concurrent queries,
optimization and all those nice things we are just throwing away and
replacing with just a syntax?


I think a configuration system should be designed so it can scale up; it
has some requirements which are very different from those of documents.
At the low end it can use XML directly, and XML is the best way to
exchange config information, but the system should be able to deal with
configuration that changes (i.e. notification), you should be able to
modify it, and you should expect the config system to deal with access
control.


Costin

Re: XML config

Posted by Stefano Mazzocchi <st...@apache.org>.

costin@costin.dnt.ro wrote:
> 
> Since we are talking about XML config, and you asked
> for feedback, I can't resist...
> 
> I know on this list everyone believe XML solves all the
> world problems, but I want (again) to point that in
> the configuration case, XML is not the only solution
> ( and IMHO it's not the best solution for all cases! ),
> and a good design should let the components to be
> configured independent of the particular configuration
> subsystem - so later, maybe we can support a "directory"
> -based configuration.
> 
> As you know, directory servers are used for this
> for a while, long before XML, and they may provide
> some advantages ( build-in replication and update,
> change notification, existing tools, transactions).
> 
> ( AFAIK, LDAP schema is similar enough and allows
> many validations not possible with simple DTDs,
> and while XSchema is still changing, LDAP servers
> are up and running now )
> 
> Costin
> ( just my personal opinion, no need to convince
> me I'm wrong !)

I won't even try.

But look, we have an LDAPPRocessor that generates XML out of an LDAP
server.

This, alone, makes your point totally irrelevant. XML is a metasyntax,
not a container. How you come up with that XML to parse is an
implementation detail.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche

Re: XML config

Posted by co...@costin.dnt.ro.

Since we are talking about XML config, and you asked
for feedback, I can't resist...

I know on this list everyone believes XML solves all the
world's problems, but I want (again) to point out that in
the configuration case, XML is not the only solution
(and IMHO it's not the best solution for all cases!),
and a good design should let the components be
configured independently of the particular configuration
subsystem - so later, maybe we can support a
"directory"-based configuration.

As you know, directory servers have been used for this
for a while, long before XML, and they may provide
some advantages (built-in replication and update,
change notification, existing tools, transactions).

(AFAIK, LDAP schema is similar enough and allows
many validations not possible with simple DTDs,
and while XSchema is still changing, LDAP servers
are up and running now.)

Costin 
( just my personal opinion, no need to convince
me I'm wrong !)

> That's not my point. My point is that if you _can't_ validate it, then
> it must not be well-defined. I'm not suggesting that anything other than
> the server should validate it!

Re: XML config

Posted by Ben Laurie <be...@algroup.co.uk>.

Stefano Mazzocchi wrote:
> 
> Ben Laurie wrote:
> >
> > Stefano Mazzocchi wrote:
> > >
> > > Ehmmm,
> > >
> > > pardon me, you guys. But having dealt with XML configurations for about
> > > a year now, I think I have something to say in this area.
> > >
> > > Ben had a problem: how to validate an XML configutation. The problem is
> > > resolved: you don't. Period. XSchema will not be powerful enough to
> > > allow an external tool to validate, say, apache configurations before
> > > this is applied to the web server. The web server will validate it and
> > > this is how it works for configurations.
> >
> > That wasn't my only problem, but I have to ask: if you can't validate
> > it, how can you define it? If you can't define it, how can you expect
> > anyone to use it?
> 
> ??? do you HTTPD people felt the lack of a "validation" syntax that
> allowed other programs to tell if your apache httpd.conf file was valid
> or not?

That's not my point. My point is that if you _can't_ validate it, then
it must not be well-defined. I'm not suggesting that anything other than
the server should validate it!

> > > - the worst case is where configurations overlap between modules.
> > >
> > > The question is: can we topologically redefine the DTD so that
> > > overlapping configurations don't overlap anymore? can something like
> > > inside Xlinks help in this area?
> > >
> > > I admit I don't know the answer to this last question, but I'm more than
> > > willing to help to find one since it would solve both Apache and Tomcat
> > > problems as well as establish some design patterns useful for other
> > > projects (cocoon, for example).
> > >
> > > What do you say?
> >
> > I say "let's go".
> 
> Great.
> 
> > Now, there are some other problems that need
> > addressing. In no particular order, here they are:
> >
> > a) Namespaces. I think this is a no-brainer if you are prepared to have
> > long names, but is that reasonable?
> 
> IMO, XML without namespaces is just SGML--
> 
> In my "topological" dissertations on general@xml.apache.org, I showed
> how the XML 1.0 model is mostly monodimentional, exactly like SGML.
> Namespaces allow XML to achieve multi-dimensionality.
> 
> In modular documents types (like we need for Apache), the use of
> namespace is a must if you want to achieve validation using XSchema.
> Otherwise, you have to write your schema that covers all possible
> combinations and if your mod_whatever needs a new tag, you need to
> rewrite your main DTD and to reflect this all over the world.
> 
> A DTD that changes every other software revision is totally useless.
> Namespaces + XSchema allow you to make a schema for your particular
> namespace. So, mod_rewrite may need a complex schema but once it's
> defined it doesn't change that much over time. While, mod_my_own_module
> would need to be validated without having to change the global DTD.
> 
> Am I making any sense?

Yes. But this is exactly the point I've been trying to make all along.

> > b) Conditions: a vital point, I think, is that some Apache modules deal
> > with what configuration is currently applicable, and the rest only worry
> > about the current applicable configuration. This is currently handled by
> > mucho magic inside Apache, and it'd be nice if it weren't.
> 
> I think I lost you here. Consider that many people around here know very
> little about Apache internals. I, from my part, know exactly nothing
> about that :) Pier rewrote mod_jserv, I wrote only on the other side of
> the socket.
> 
> > It occurs to me that what I was really trying to get at with my proposal
> > was the idea that as each element is processed, it should determine how
> > the contained elements are processed (or not) by the module that
> > processes the containing element, even if the contained elements are
> > unknown to it.
> 
> This can be done with namespaces: we "push" a filtered tree of elements
> to the module and this tree was pruned by all the elements that do not
> belong to this namespace. I call this "projecting the document on a
> namespace axis".

Right. There's more to it than that, but not much. In particular, we
need to _not_ push stuff when it doesn't match the current context.
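One way to picture "not pushing stuff when it doesn't match the current context" is a walk that only descends into containers matching the request. The Python sketch below is an assumption-laden illustration, not how Apache resolves configuration: the in_directory and applicable helpers are hypothetical, the matching rule is a plain path-prefix test, and the element names are borrowed from the example configuration quoted in this message.

```python
import posixpath
import xml.etree.ElementTree as ET

CORE = "{http://apache.org/httpd/global}"
JSERV = "{http://apache.org/httpd/module/jserv}"

CONFIG = """
<host xmlns="http://apache.org/httpd/global"
      xmlns:jserv="http://apache.org/httpd/module/jserv"
      name="java.apache.org">
  <jserv:log file="logs/mod_jserv.log"/>
  <directory path="/home/web/java.apache.org/">
    <directory path="servlet">
      <jserv:map engine="ajpv12://127.0.0.1:8008/root"/>
    </directory>
  </directory>
</host>
"""


def in_directory(request_path, directory):
    """True if request_path is the directory itself or lies beneath it."""
    directory = directory.rstrip("/")
    return request_path == directory or request_path.startswith(directory + "/")


def applicable(container, request_path, base=""):
    """Yield jserv elements whose enclosing <directory> containers all match."""
    for child in container:
        if child.tag == CORE + "directory":
            # Nested directory paths are resolved relative to their parent.
            path = posixpath.join(base, child.get("path"))
            if in_directory(request_path, path):
                yield from applicable(child, request_path, path)
        elif child.tag.startswith(JSERV):
            yield child


host = ET.fromstring(CONFIG)


def names(request_path):
    """Local names of the jserv elements that apply to request_path."""
    return [e.tag[len(JSERV):] for e in applicable(host, request_path)]


print(names("/home/web/java.apache.org/index.html"))
print(names("/home/web/java.apache.org/servlet/Hello"))
```

In this sketch the module never learns what a directory container is; the walker decides what applies, and the module only receives the elements that survive.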

> > So long as something has the ability to make this happen,
> > and it is possible to cache the results, then the <Module...> notation
> > is overkill (the relevant module can be determined by the element name).
> 
> or by the namespace.

Yep.

> 
> > I'm beginning to think this is actually trivial but I still can't quite
> > put my finger on how its done.
> 
> Let's make an example
> 
> <server xml:base="/usr/local/apache/"
>         xmlns="http://apache.org/httpd/global"
>         xmlns:jserv="http://apache.org/httpd/module/jserv">
> 
>  <type>standalone</type>
>  <pid>/logs/httpd.pid</pid>
>  ...
> 
>  <modules>
>   <module status="on" name="jserv_module" object="modules/mod_jserv.so"/>
>   <module status="off" name="info_module" object="modules/mod_info.so"/>
>   ...
>  </modules>
> 
>  <hosts>
>   <host name="www.apache.org" port="80">
>    ...
>   </host>
> 
>   <host name="java.apache.org" port="80">
>    <jserv:automatic/>
>    <jserv:properties file="conf/java.apache.org/jserv.properties"/>
>    <jserv:log file="logs/mod_jserv.log" level="notice"/>
>    <directory path="/home/web/java.apache.org/" options="All">
>     <allow>all</allow>
>     <directory path="servlet">
>      <jserv:map engine="ajpv12://127.0.0.1:8008/root"/>
>     </directory>
>    </directory>
>   </host>
>  </hosts>
> </server>
> 
> NOTE: this is just out of my head... it may not make that much sense in
> an apache-wide sense and note that I'm familiar only with mod_jserv
> configurations which are not that standard in an Apache sense.
> 
> Anyway, look at a couple of things:
> 
> 1) the default namespace handles core things
> 2) each important module has its own namespace where it is free to
> define its own stuff
> 3) there are elements like <directory> that drive the module
> functionality and thus require the ability to include elements from
> other namespaces.

These are the ones I want to generalise, so as to avoid the magic I was
referring to above.

> 4) order is always enforced by the XML parser and it's important, like
> in current apache configurations
> 5) XML configurations are much more structured than plain flat
> configurations. These visually enforce the inclusion and layering,
> resulting in a better structured configuration file. IMO, Apache conf
> files "scream" to be ported in XML.
> 6) the .htaccess pattern of distributed configurations needs lots of
> reasoning since it cannot be ported as it is over the XML world. In the
> Cocoon project, we are trying to deal with the same things.

Yes, this is one thing that has been causing me brainstrain. I currently
suspect that the main configuration should say "include files called 'x'
here", but I have a feeling that gets messy.

> 7) servlets like Cocoon should be able to add its own configurations in
> their own namespace and behave like Apache modules in all senses. We
> should define a way to map Apache configurations to servlet.properties
> files but this is possible to do without breaking the servlet platform.
> 8) the use of XML requires people to have some XML knowledge but
> learning by example is much easier than with other syntaxes (how much it
> took you to write your first HTML file?)

XML is trivial to learn to write!

> > As for DTD-like stuff, well, the per-module DTD-like-thing neeeds to say
> > whether each top-level element can be contained within another module's
> > stuff, or not (i.e. is the configuration element server-wide, or not),
> > and, on the other side of the coin, whether it can contain other
> > module's stuff. I'm not sure there's any need for more subtlety than
> > that, is there? Or am I missing something?
> 
> This is almost _exactly_ what the XSchema Structure spec defines. I
> think you should take a look at it.

I will.

Cheers,

Ben.

--
SECURE HOSTING AT THE BUNKER! http://www.thebunker.net/hosting.htm

http://www.apache-ssl.org/ben.html

"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
     - Indira Gandhi

Re: XML config

Posted by Stefano Mazzocchi <st...@apache.org>.

Ben Laurie wrote:
> 
> Stefano Mazzocchi wrote:
> >
> > Ehmmm,
> >
> > pardon me, you guys. But having dealt with XML configurations for about
> > a year now, I think I have something to say in this area.
> >
> > Ben had a problem: how to validate an XML configutation. The problem is
> > resolved: you don't. Period. XSchema will not be powerful enough to
> > allow an external tool to validate, say, apache configurations before
> > this is applied to the web server. The web server will validate it and
> > this is how it works for configurations.
> 
> That wasn't my only problem, but I have to ask: if you can't validate
> it, how can you define it? If you can't define it, how can you expect
> anyone to use it?

??? Did you HTTPD people ever feel the lack of a "validation" syntax
that allowed other programs to tell whether your Apache httpd.conf file
was valid or not?

I mean: Apache is easy to install because it has a very nice template
set of *.conf files of which you modify only a very small part. It
follows the OO inheritance pattern: you modify what you need and inherit
what you don't care about or simply don't understand.

Despite the warning "don't just modify this file, read the docs first",
this is what 90% of the people that installed Apache did. Myself
included :) ...then, if you need more power, you read the manual.

Are we losing some of XML's power? You bet, but you can't validate a
modular DTD without namespace support in the validation language.
XSchema does that. We could come out with module XSchemas for fragmented
configurations and an overall schema on how module schemas work
together. But XSchema is a moving target at this point. Fortunately, the
ASF has the only open source multi-language XSchema-validating XML
parser, but I'd suggest waiting until the W3C finalizes the spec before
jumping on that train.

> > - the worst case is where configurations overlap between modules.
> >
> > The question is: can we topologically redefine the DTD so that
> > overlapping configurations don't overlap anymore? can something like
> > inside Xlinks help in this area?
> >
> > I admit I don't know the answer to this last question, but I'm more than
> > willing to help to find one since it would solve both Apache and Tomcat
> > problems as well as establish some design patterns useful for other
> > projects (cocoon, for example).
> >
> > What do you say?
> 
> I say "let's go". 

Great.

> Now, there are some other problems that need
> addressing. In no particular order, here they are:
> 
> a) Namespaces. I think this is a no-brainer if you are prepared to have
> long names, but is that reasonable?

IMO, XML without namespaces is just SGML--

In my "topological" dissertations on general@xml.apache.org, I showed
how the XML 1.0 model is mostly monodimensional, exactly like SGML.
Namespaces allow XML to achieve multi-dimensionality.

In modular document types (like we need for Apache), the use of
namespaces is a must if you want to achieve validation using XSchema.
Otherwise, you have to write a schema that covers all possible
combinations, and if your mod_whatever needs a new tag, you need to
rewrite your main DTD and propagate the change all over the world.

A DTD that changes every other software revision is totally useless.
Namespaces + XSchema allow you to make a schema for your particular
namespace. So mod_rewrite may need a complex schema, but once it's
defined it doesn't change that much over time, while mod_my_own_module
can be validated without having to change the global DTD.
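The idea of one schema per namespace can be sketched in Python as a registry keyed by namespace URI. This is a deliberately tiny stand-in for XSchema, not an implementation of it: the "schema" here is just a hypothetical mapping from element name to required attribute names, and register, split and validate are illustrative helpers. What it shows is that adding a module means registering one more entry, with no change to the core entry.

```python
import xml.etree.ElementTree as ET

# namespace URI -> {element name: set of required attribute names}
SCHEMAS = {}


def register(namespace, elements):
    """Each module registers the (toy) schema for its own namespace."""
    SCHEMAS[namespace] = elements


def split(tag):
    """Split ElementTree's '{uri}local' tag into (uri, local)."""
    if tag.startswith("{"):
        uri, _, local = tag[1:].partition("}")
        return uri, local
    return "", tag


def validate(root):
    """Check every element against the schema of its own namespace."""
    errors = []
    for element in root.iter():
        uri, local = split(element.tag)
        schema = SCHEMAS.get(uri)
        if schema is None:
            errors.append(f"no schema registered for namespace {uri!r}")
        elif local not in schema:
            errors.append(f"unknown element <{local}> in {uri!r}")
        else:
            missing = schema[local] - set(element.attrib)
            errors.extend(
                f"<{local}> is missing attribute {name!r}" for name in sorted(missing)
            )
    return errors


GLOBAL = "http://apache.org/httpd/global"
JSERV = "http://apache.org/httpd/module/jserv"

register(GLOBAL, {"server": set(), "host": {"name"}})
register(JSERV, {"map": {"engine"}})  # added later, core entry untouched

doc = ET.fromstring(
    '<server xmlns="http://apache.org/httpd/global" '
    'xmlns:jserv="http://apache.org/httpd/module/jserv">'
    '<host name="java.apache.org"><jserv:map/></host></server>'
)
print(validate(doc))
```

A real per-namespace schema would also constrain nesting and content, which this toy check does not attempt.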

Am I making any sense?

> b) Conditions: a vital point, I think, is that some Apache modules deal
> with what configuration is currently applicable, and the rest only worry
> about the current applicable configuration. This is currently handled by
> mucho magic inside Apache, and it'd be nice if it weren't.

I think I lost you here. Consider that many people around here know very
little about Apache internals. I, for my part, know exactly nothing
about that :) Pier rewrote mod_jserv; I only wrote code on the other
side of the socket.

> It occurs to me that what I was really trying to get at with my proposal
> was the idea that as each element is processed, it should determine how
> the contained elements are processed (or not) by the module that
> processes the containing element, even if the contained elements are
> unknown to it. 

This can be done with namespaces: we "push" a filtered tree of elements
to the module, a tree from which all the elements that do not belong to
its namespace have been pruned. I call this "projecting the document on
a namespace axis".
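A minimal Python sketch of such a projection, using the standard xml.etree.ElementTree module. The project function is a hypothetical helper, and one design choice in it is an assumption rather than something stated in this thread: ancestors from other namespaces are kept when they contain a relevant element, so the module still sees the nesting it lives in. The element names mirror the jserv example in this message.

```python
import xml.etree.ElementTree as ET

JSERV = "http://apache.org/httpd/module/jserv"

CONFIG = """
<server xmlns="http://apache.org/httpd/global"
        xmlns:jserv="http://apache.org/httpd/module/jserv">
  <host name="java.apache.org" port="80">
    <jserv:log file="logs/mod_jserv.log" level="notice"/>
    <directory path="servlet">
      <allow>all</allow>
      <jserv:map engine="ajpv12://127.0.0.1:8008/root"/>
    </directory>
  </host>
  <host name="www.apache.org" port="80">
    <directory path="/"><allow>all</allow></directory>
  </host>
</server>
"""


def project(element, namespace):
    """Return a pruned copy of `element`, or None if nothing in it is relevant.

    An element survives if it belongs to `namespace` or if any descendant
    does; everything else is dropped from the copy.
    """
    own = element.tag.startswith("{" + namespace + "}")
    kept = [c for c in (project(child, namespace) for child in element) if c is not None]
    if not own and not kept:
        return None
    clone = ET.Element(element.tag, dict(element.attrib))
    clone.text = element.text
    clone.extend(kept)
    return clone


tree = project(ET.fromstring(CONFIG), JSERV)
print([host.get("name") for host in tree])
```

The second host and the allow element contain nothing from the jserv namespace, so they do not appear in the tree handed to the module.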

> So long as something has the ability to make this happen,
> and it is possible to cache the results, then the <Module...> notation
> is overkill (the relevant module can be determined by the element name).

or by the namespace.

> I'm beginning to think this is actually trivial but I still can't quite
> put my finger on how its done.

Let's take an example:

<server xml:base="/usr/local/apache/"
        xmlns="http://apache.org/httpd/global"
        xmlns:jserv="http://apache.org/httpd/module/jserv">

 <type>standalone</type>
 <pid>/logs/httpd.pid</pid>
 ...

 <modules>
  <module status="on" name="jserv_module" object="modules/mod_jserv.so"/>
  <module status="off" name="info_module" object="modules/mod_info.so"/>
  ...
 </modules>

 <hosts>
  <host name="www.apache.org" port="80">
   ...
  </host>

  <host name="java.apache.org" port="80">
   <jserv:automatic/>
   <jserv:properties file="conf/java.apache.org/jserv.properties"/>
   <jserv:log file="logs/mod_jserv.log" level="notice"/>
   <directory path="/home/web/java.apache.org/" options="All">
    <allow>all</allow>
    <directory path="servlet">
     <jserv:map engine="ajpv12://127.0.0.1:8008/root"/>
    </directory>
   </directory>
  </host>
 </hosts>
</server>

NOTE: this is just off the top of my head... it may not make that much
sense Apache-wide, and note that I'm familiar only with mod_jserv
configurations, which are not that standard in an Apache sense.

Anyway, look at a couple of things:

1) the default namespace handles core things
2) each important module has its own namespace where it is free to
define its own stuff
3) there are elements like <directory> that drive the module
functionality and thus require the ability to include elements from
other namespaces.
4) order is always enforced by the XML parser and it's important, like
in current Apache configurations
5) XML configurations are much more structured than plain flat
configurations. They visually enforce inclusion and layering,
resulting in a better structured configuration file. IMO, Apache conf
files "scream" to be ported to XML.
6) the .htaccess pattern of distributed configurations needs lots of
thought, since it cannot be ported as-is to the XML world. In the
Cocoon project, we are trying to deal with the same things.
7) servlets like Cocoon should be able to add their own configurations
in their own namespace and behave like Apache modules in all senses. We
should define a way to map Apache configurations to servlet.properties
files, but this is possible to do without breaking the servlet platform.
8) the use of XML requires people to have some XML knowledge, but
learning by example is much easier than with other syntaxes (how long
did it take you to write your first HTML file?)

> As for DTD-like stuff, well, the per-module DTD-like-thing neeeds to say
> whether each top-level element can be contained within another module's
> stuff, or not (i.e. is the configuration element server-wide, or not),
> and, on the other side of the coin, whether it can contain other
> module's stuff. I'm not sure there's any need for more subtlety than
> that, is there? Or am I missing something?

This is almost _exactly_ what the XSchema Structure spec defines. I
think you should take a look at it.

Stefano.

Re: XML config

Posted by co...@eng.sun.com.

> That wasn't my only problem, but I have to ask: if you can't validate
> it, how can you define it? If you can't define it, how can you expect
> anyone to use it?

The server will validate the config anyway; a DTD isn't the only
solution (the current config is validated well enough and defined
well enough - without any DTD).


> a) Namespaces. I think this is a no-brainer if you are prepared to have
> long names, but is that reasonable?

I think namespaces are good, even if the DTD-namespace relationship is
bad and some parsers might get confused.


> b) Conditions: a vital point, I think, is that some Apache modules deal
> with what configuration is currently applicable, and the rest only worry
> about the current applicable configuration. This is currently handled by
> mucho magic inside Apache, and it'd be nice if it weren't.
> 
> It occurs to me that what I was really trying to get at with my proposal
> was the idea that as each element is processed, it should determine how
> the contained elements are processed (or not) by the module that

In Java and Tomcat it's easy: the "module" is a Java class that
has enough information inside (i.e. the names and types of the
configurable properties are known).

In C, an equivalent way is to set properties on a module
using a "module method"; probably either the module setter
will validate the content, or you can have an additional "method"
to give you introspection info.

> and it is possible to cache the results, then the <Module...> notation

I was thinking of <Module> as another way to define the module
information - attributes can be "name", "module_so_file", "java_class",
and it may have information about the module properties.

Instead of getting the "validation" data for the module using
introspection (or callbacks) or a DTD, you can use <module>
as a replacement for the DTD.
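As a rough Python sketch of that idea: the module declaration itself lists the properties, and a setter validates each value against the declared type. Everything beyond the "name" and "module_so_file" attributes mentioned above is an assumption made for illustration, including the <property> child elements, the type names, and the Module class with its describe and set_property methods.

```python
import xml.etree.ElementTree as ET

# Hypothetical declaration format: <property> children are an assumption.
DECLARATIONS = """
<modules>
  <module name="jserv" module_so_file="modules/mod_jserv.so">
    <property name="port" type="int"/>
    <property name="logfile" type="string"/>
  </module>
</modules>
"""

# Declared type name -> function that converts and thereby validates.
CONVERTERS = {"int": int, "string": str}


class Module:
    def __init__(self, declaration):
        self.name = declaration.get("name")
        self._types = {
            p.get("name"): p.get("type") for p in declaration.findall("property")
        }
        self.values = {}

    def describe(self):
        """Introspection: property name -> declared type."""
        return dict(self._types)

    def set_property(self, name, raw):
        """Setter that rejects unknown properties and badly typed values."""
        if name not in self._types:
            raise KeyError(f"{self.name}: unknown property {name!r}")
        # int("eighty") raises ValueError, which is the validation step here.
        self.values[name] = CONVERTERS[self._types[name]](raw)


jserv = Module(ET.fromstring(DECLARATIONS).find("module"))
jserv.set_property("port", "8007")
jserv.set_property("logfile", "logs/mod_jserv.log")
print(jserv.describe())
print(jserv.values)
```

The same describe data could come from Java introspection or from a C callback; the declaration element is just one more way to supply it.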

Costin
