Posted to dev@cocoon.apache.org by Donald Ball <ba...@webslingerZ.com> on 2000/05/31 05:50:44 UTC

Small note on sitemap configuration

First, some terminology to help me reduce clutter: a resolver consists of
one generator, any number of filters, and one serializer. The function
that maps requests to resolvers is called the map function.

In the sitemap file, a process element can contain one generator, any
number of filters, and one serializer (a resolver). The process element
may also contain match elements, which specify request-time conditions
that must be met. The match element may contain alternate resolver
components. Assuming the match elements provide us with true boolean
expression evaluation, we can build a sitemap that accomplishes the design
goals delineated above, at least on a per-URI basis.

The sitemap for a complex set of criteria may prove to be overly
cumbersome, though. For instance, if the map function depends on uri,
preferred language, and user agent, the sitemap might look like this:

<process uri="/foo/**">
 <match type="language" value="en">
  <generator type="parser" src="./foo/en/**"/>
 </match>
 <match type="language" value="fr">
  <generator type="parser" src="./foo/fr/**"/>
 </match>
 <match type="format" value="html">
  <filter type="xslt" src="./style/html.xslt"/>
  <serializer type="html"/>
 </match>
 <match type="format" value="wml">
  <filter type="xslt" src="./style/wml.xslt"/>
  <serializer type="xml"/>
 </match>
</process>

Problems become evident at this point. How do we specify a default
language? We could say that a resolver component specified by a match
overrides a top-level resolver component. But suppose we have more
languages than we want to enumerate in a sitemap repeatedly, and suppose
the configuration of a resolver component could depend on a request-time
variable? We could write it like this:

<process uri="/foo/**">
 <match type="language" value="*">
  <generator type="parser" src="./foo/{language}/**"/>
 </match>
 <match type="format" value="*">
  <filter type="xslt" src="./style/{format}.xslt"/>
 </match>
 ...
</process>
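
A minimal sketch of how the {language} and {format} placeholders might be expanded at request time. The SrcTemplate class and its expand method are hypothetical names, not existing Cocoon code, and the variable map stands in for whatever request-time lookup the sitemap would really use:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Cocoon): expands {name} placeholders
// in a sitemap attribute value using request-time variables.
final class SrcTemplate {
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\{([A-Za-z][A-Za-z0-9_-]*)\\}");

    static String expand(String template, Map<String, String> vars) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = vars.get(m.group(1));
            if (value == null) {
                // An unbound variable is treated as an error in this sketch.
                throw new IllegalArgumentException("unbound variable: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in values literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With language bound to "fr", expand("./foo/{language}/**", vars) yields "./foo/fr/**". Failing on an unbound variable is only one possible policy; the default-language question raised next is exactly about what to do in that case.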

That's better, but still not perfect. We still don't have a default
language. For that, and for other reasons, I think we'd do better to add
conditional processing:

<process uri="/foo/**">
 <choose>
  <when test="language">
   <generator type="parser" src="./foo/{language}/**"/>
  </when>
  <otherwise>
   <generator type="parser" src="./foo/en/**"/>
  </otherwise>
 </choose>
 ...
</process>
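
In code, the choose/when/otherwise above amounts to something like the following sketch. It assumes that test="language" is true when the variable is bound and non-empty, which is my reading of the XSLT-style test rather than anything settled; LanguageChoice is a hypothetical name:

```java
import java.util.Map;

// Hypothetical illustration of the <choose> block: pick the generator
// source from the request language, falling back to English.
final class LanguageChoice {
    static String generatorSrc(Map<String, String> vars) {
        String language = vars.get("language");
        // <when test="language">: assumed true when bound and non-empty
        if (language != null && !language.isEmpty()) {
            return "./foo/" + language + "/**";
        }
        // <otherwise>: the default language
        return "./foo/en/**";
    }
}
```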

Of course, this requires that we make request-time information available
via named variables in the sitemap (e.g. language). Virtually speaking, we
could construct an XML fragment for the request:

<request>
 <uri value="/foo/bar"/>
 <parameter name="foo" value="bar" method="get"/>
 <header type="http" name="Referer"/>
</request>

And allow the sitemap author to construct simple XPath expressions to
reference that information:

header[@name='Referer']

We could provide parameters for frequently accessed information or
information that might be composed from multiple pieces of information
(namely, the preferred language which might depend on a header or a
cookie). e.g. $referer == header[@name='Referer'].
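
To make the XPath idea concrete, here is a rough sketch that evaluates such expressions against the request fragment. It uses the JDK's standard javax.xml.xpath API rather than any particular XPath module, and it assumes the header element carries a value attribute (the fragment above does not show one). RequestXPath is a hypothetical name:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical sketch: evaluate XPath expressions relative to <request>.
final class RequestXPath {
    // The header's value attribute is an assumption made for this example.
    static final String REQUEST_XML =
        "<request>"
        + "<uri value='/foo/bar'/>"
        + "<parameter name='foo' value='bar' method='get'/>"
        + "<header type='http' name='Referer' value='http://example.org/'/>"
        + "</request>";

    static String evaluate(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(REQUEST_XML)));
            Element request = doc.getDocumentElement();
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The <request> element is the context node, so paths are relative.
            return xpath.evaluate(expression, request);
        } catch (Exception e) {
            throw new IllegalStateException("cannot evaluate: " + expression, e);
        }
    }
}
```

Here evaluate("header[@name='Referer']/@value") returns the Referer string, and an expression that selects nothing returns the empty string.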

Note we don't actually have to construct the request document - generally,
I don't think we'll want to since much of the request time information is
irrelevant. We can write our own simple XPath expression resolver, or if
we use Xalan's XPath module to do the resolution we could write a small
DOM implementation that acted as a decorator for the HttpServletRequest
methods.
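
A sketch of the "own simple resolver" option, which never builds the request document. RequestInfo is a stand-in interface I made up so the example has no servlet dependency; its three methods mirror the real HttpServletRequest methods of the same names. LazyRequestResolver and the tiny expression grammar it accepts are likewise assumptions:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in for the few HttpServletRequest methods used here.
interface RequestInfo {
    String getRequestURI();
    String getHeader(String name);
    String getParameter(String name);
}

// Hypothetical resolver: maps a tiny subset of XPath-like expressions
// straight onto request method calls, with no DOM in between.
final class LazyRequestResolver {
    private static final Pattern STEP =
        Pattern.compile("(header|parameter)\\[@name='([^']*)'\\]");

    private final RequestInfo request;

    LazyRequestResolver(RequestInfo request) {
        this.request = request;
    }

    String resolve(String expression) {
        if (expression.equals("uri")) {
            return request.getRequestURI();
        }
        Matcher m = STEP.matcher(expression);
        if (!m.matches()) {
            throw new IllegalArgumentException("unsupported expression: " + expression);
        }
        String name = m.group(2);
        return m.group(1).equals("header")
            ? request.getHeader(name)
            : request.getParameter(name);
    }
}
```

Only the values a sitemap actually asks for are ever read from the request, which is the point of not constructing the fragment up front.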

So the upshot of my random thoughts here - I propose that we replace the
match element with conditionals à la XSLT and allow sitemap authors to
reference request-time information using XPath expressions into the
request data represented as an XML fragment. Any takers?

- donald


Re: Small note on sitemap configuration

Posted by Stefano Mazzocchi <st...@apache.org>.
"Pier P. Fumagalli" wrote:
> 
> Stefano Mazzocchi wrote:
> >
> > [...] grrrr, damn broken CVS update mails [...]
> 
> ??? that's kinda weird... let me doublecheck...
> 
> awwww.... why nobody told me???? gotcha... cocoon-cvs became a new EZMLM
> mailing list (a couple of weeks ago), a "moderated" one (and I'm the
> moderator!)... the problem was that nobody was subscribed to the new
> list... i thought brian copied the accounts when he created this new
> list, but apparently he didn't.
> 
> i'm subscribing everyone from <co...@xml.apache.org> to
> <co...@xml.apache.org>, but then, if you don't want to see CVS
> updates messages just write to <co...@xml.apache.org>
> 
> have fun :)

Thanks Mr. Wolf, it feels so good to have you back :)

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
 Missed us in Orlando? Make it up with ApacheCON Europe in London!
------------------------- http://ApacheCon.Com ---------------------



Re: Small note on sitemap configuration

Posted by Giacomo Pati <Gi...@pwr.ch>.
"Pier P. Fumagalli" wrote:
> 
> Stefano Mazzocchi wrote:
> >
> > [...] grrrr, damn broken CVS update mails [...]
> 
> ??? that's kinda weird... let me doublecheck...
> 
> awwww.... why nobody told me???? gotcha... cocoon-cvs became a new EZMLM
> mailing list (a couple of weeks ago), a "moderated" one (and I'm the
> moderator!)... the problem was that nobody was subscribed to the new
> list... i thought brian copied the accounts when he created this new
> list, but apparently he didn't.
> 
> i'm subscribing everyone from <co...@xml.apache.org> to
> <co...@xml.apache.org>, but then, if you don't want to see CVS
> updates messages just write to <co...@xml.apache.org>
> 
> have fun :)
> 
>         pier
> 

Thanks a lot. This makes life easier.

Giacomo

-- 
PWR GmbH, Organisation & Entwicklung      Tel:   +41 (0)1 856 2202
Giacomo Pati, CTO/CEO                     Fax:   +41 (0)1 856 2201
Hintereichenstrasse 7                     Mailto:Giacomo.Pati@pwr.ch
CH-8166 Niederweningen                    Web:   http://www.pwr.ch

Re: Small note on sitemap configuration

Posted by "Pier P. Fumagalli" <pi...@apache.org>.
Stefano Mazzocchi wrote:
> 
> [...] grrrr, damn broken CVS update mails [...]

??? that's kinda weird... let me doublecheck...

awwww.... why nobody told me???? gotcha... cocoon-cvs became a new EZMLM
mailing list (a couple of weeks ago), a "moderated" one (and I'm the
moderator!)... the problem was that nobody was subscribed to the new
list... i thought brian copied the accounts when he created this new
list, but apparently he didn't.

i'm subscribing everyone from <co...@xml.apache.org> to
<co...@xml.apache.org>, but then, if you don't want to see CVS
updates messages just write to <co...@xml.apache.org>

have fun :)

	pier

-- 
----------------------------------------------------------------------
pier: stable structure erected over water to allow docking of seacraft
<ma...@betaversion.org>      <http://www.betaversion.org/~pier/>
----------------------------------------------------------------------

Re: Small note on sitemap configuration

Posted by Donald Ball <ba...@webslingerZ.com>.
On Thu, 1 Jun 2000, Giacomo Pati wrote:

> > Go back to my language example:
> > 
> > <process uri="/foo/**">
> >  <choose>
> >   <when test="language">
> >    <generator type="parser" src="./foo/{language}/**"/>
> >   </when>
> >   <otherwise>
> >    <generator type="parser" src="./foo/en/**"/>
> >   </otherwise>
> >  </choose>
> >  ...
> > </process>
> > 
> > How would you propose to rewrite this example using your matcher elements?
> > 
> > > -1 on the use of XPath as testing logic (for the problems outlined
> > > above).
> > >
> > > You might want to look into the "Pipeline Conditional Model" thread and
> > > comment on what I proposed there.
> > 
> > I don't see anything in that proposal about accessing the value of
> > request-time variables in the configuration of the resolver components -
> > which I think is key. I don't see any problems with XPath expressions that
> > don't apply equally well to anything else that accomplishes the same goal.
> > Maybe I have missed something though?
> 
> Just a quick thought about this. What if we have a
> "SitemapPropertyManager" class (I don't want to call it a Component)
> that is responsible for resolving {variable}? A default version could
> implement request information. If someone has special needs he could
> subclass, extend or make his own to have such properties available.

Re: Small note on sitemap configuration

Posted by Giacomo Pati <Gi...@pwr.ch>.
Donald Ball wrote:
> 
> On Wed, 31 May 2000, Stefano Mazzocchi wrote:
> 
> >
> > Careful: this is a very dangerous path.
> >
> > True, we could virtually embed request information into a schema, but
> > what about _any_ state information? server name, machine load, time of
> > the day in Australia, density of population in Antarctica, local
> > temperature, number of subscribed people to the cocoon-dev mail list, and
> > so on...
> >
> > Do you _really_ want to write a schema (or RDFSchema, for that matter)
> > for all that? I don't think so :)
> 
> I strongly disagree here. We're going to have to present the sitemap
> author with some way to access request-time information in their
> conditionals, right? Whatever scheme we use, we're going to have to limit
> it to the request-information that is likely to be relevant (e.g. all
> information about the request itself and a little bit about local state -
> current time, etc.) So saying that allowing access to request-time
> information via XPath expressions is a bad idea because we could overload
> it with too much information is fatuous. The same criticism can be
> levelled at any method of allowing access to request-time information.
> 
> Go back to my language example:
> 
> <process uri="/foo/**">
>  <choose>
>   <when test="language">
>    <generator type="parser" src="./foo/{language}/**"/>
>   </when>
>   <otherwise>
>    <generator type="parser" src="./foo/en/**"/>
>   </otherwise>
>  </choose>
>  ...
> </process>
> 
> How would you propose to rewrite this example using your matcher elements?
> 
> > -1 on the use of XPath as testing logic (for the problems outlined
> > above).
> >
> > You might want to look into the "Pipeline Conditional Model" thread and
> > comment on what I proposed there.
> 
> I don't see anything in that proposal about accessing the value of
> request-time variables in the configuration of the resolver components -
> which I think is key. I don't see any problems with XPath expressions that
> don't apply equally well to anything else that accomplishes the same goal.
> Maybe I have missed something though?

Just a quick thought about this. What if we have a
"SitemapPropertyManager" class (I don't want to call it a Component)
that is responsible for resolving {variable}? A default version could
implement request information. If someone has special needs he could
subclass, extend or make his own to have such properties available.
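
Roughly what I have in mind, as a sketch only. The class name comes from the paragraph above, but the constructor, the resolve method and the DefaultingPropertyManager subclass are invented here for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical default version: resolves {variable} from request information.
class SitemapPropertyManager {
    private final Map<String, String> requestProperties;

    SitemapPropertyManager(Map<String, String> requestProperties) {
        this.requestProperties = new HashMap<>(requestProperties);
    }

    // Returns null when the variable is unknown.
    String resolve(String name) {
        return requestProperties.get(name);
    }
}

// Hypothetical subclass for someone with special needs: here, a
// default language when the request does not supply one.
class DefaultingPropertyManager extends SitemapPropertyManager {
    DefaultingPropertyManager(Map<String, String> requestProperties) {
        super(requestProperties);
    }

    @Override
    String resolve(String name) {
        String value = super.resolve(name);
        return (value == null && "language".equals(name)) ? "en" : value;
    }
}
```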

Giacomo

> 
> - donald


Re: Small note on sitemap configuration

Posted by Stefano Mazzocchi <st...@apache.org>.
Ok, look, you won't take me for lack of energy in the sitemap, ok? so
don't even try :)

> >  <process uri="**">
> >   <generator test="lang-browser-file" src="**"/>
> >  </process>
> >
> > is much easier to understand and write :)
> 
> Yeah, but what's going to process the test="lang-browser-file" attribute?

Shit, sorry, that's wrong:

  <process uri="**">
   <generator type="lang-browser-file" src="**"/>
  </process>

you simply reference a generator (a javabytecode class written in
whatever fashion you like). All logic is contained at the programming
language level.
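
A sketch of what the logic inside such a component could look like. The class and method are hypothetical, not an existing Cocoon interface, and the naming scheme simply follows the explorer.xml / lynx.it.xml file names used elsewhere in this thread:

```java
// Hypothetical component behind type="lang-browser-file": the name
// mangling lives in Java, not in the sitemap.
final class LangBrowserFileGenerator {
    private static final String DEFAULT_LANGUAGE = "en";

    // Maps (browser, language) to a source file name.
    static String sourceFor(String browser, String language) {
        if (language == null || language.equals(DEFAULT_LANGUAGE)) {
            return browser + ".xml";
        }
        return browser + "." + language + ".xml";
    }
}
```

An author who wants a different scheme would swap in a different component rather than edit the sitemap.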

> What name mangling scheme is it going to use to get to the proper file,
> and what if an author wants to use a different scheme? 

Modify the code of the existing component or write your own. Ask
yourself, how often are you changing this? Shouldn't there be _common_
components that do those things?

It would definitely help enforce proper use of URI spaces. (I'm not a
big fan of the Perl-like reasoning that "there are always more ways to
do things".)

> What if they also
> want the scheme to depend on the HTTP header for some stupid reason?

Piece of cake for a java component.
 
> > I have two main concerns:
> >
> > 1) once you start this programmatic road, pretty sure people will ask
> > for <for> loops, procedures and all that stuff. I've seen this happening
> > for Ant. I think there is an evident sign already, we _don't_ need such
> > complexity, not even the variables.
> 
> I disagree. There's absolutely no need for looping, subroutines,
> declarative variables, etc. Just the need for conditionals and expression
> evaluation.

That's not my point. I just see the layout of FS emerging thru this.
 
> > 2) this requires you to think! Sure, I can't force you to think, but
> > it's good to try.
> 
> Ha! I am thinking, and I think you're wrong. :)

Prrrr :-P
 
> > Example?
> >
> > Java avoiding multiple inheritance. That's the best ever.
> >
> > Have you ever wanted multiple inheritance? I did. Badly. It forced me to
> > go to my whiteboard and layout my OO strategy. Now I don't anymore
> > because I saw the problem and my design improved.
> 
> That's smokescreen, it isn't exactly relevant to this discussion. Trying
> to sidetrack me, you are... :)

No, dude, I don't need to do tricks to prove my points and I think you
got caught into FS.
 
> > Careful, a sitemap with no variables is, to me, the exact equivalent.
> > You should _not_ place logic into the sitemap... by creating variables
> > and run-time parameters.... it would soon make the sitemap too complex
> > to handle.
> 
> But you _already_ have logic in the sitemap. 

This is unfortunately correct. But it is only the -minimal- logic
necessary (though I admit you've got a big point on this).

> What else is the matcher element but a boolean conditional? 

No, that's not logic. To me logic is

 <process uri="/**">
  <generator ... src="./home/**"/>

where "**" is a variable. This is what I'm seriously concerned about,
also because it's hard to visually program.

> I do _not_ think we need to _create_
> variables in the sitemap, just reference them.

Point well taken.

Question: how do you know what variable you have access to? 

> > On the other hand, I do see the need for such a thing.... it's another
> > of those things I need more feedback to evaluate....
> >
> > The problem is that once we add it and people start using it, we can't
> > remove it later on. So we must decide if this is FS or not _before_
> > placing it in.
> 
> Well, I dunno. Cocoon2's in alpha, so we should still be able to make
> these types of experiments without affecting anyone seriously. I see your
> point, though, which is why I find this discussion quite valuable.

Same here.




Re: Small note on sitemap configuration

Posted by Donald Ball <ba...@webslingerZ.com>.
On Thu, 1 Jun 2000, Stefano Mazzocchi wrote:

> > i'm simply saying that a sitemap that allows conditionals based
> > on something like this:
> > 
> > <process uri="whatever">
> >  <matcher type="browser" value="explorer">
> >   <matcher type="language" value="en">
> >    <generator type="file" src="explorer.xml"/>
> >   </matcher>
> >   <matcher type="language" value="it">
> >    <generator type="file" src="explorer.it.xml"/>
> >   </matcher>
> >  </matcher>
> >  <matcher type="browser" value="lynx">
> >   <matcher type="language" value="en">
> >    <generator type="file" src="lynx.xml"/>
> >   </matcher>
> >   <matcher type="language" value="it">
> >    <generator type="file" src="lynx.it.xml"/>
> >   </matcher>
> >  </matcher>
> > </process>
> >
> > That's hella-cumbersome, and doesn't factor out the conditionals very
> > well. My strategy would let you rewrite it like this:
> > 
> > <process>
> >  <choose>
> >   <when test="$language='en'">
> >    <generator type="file" src="{$browser}.xml"/>
> >   </when>
> >   <when test="$language='it'">
> >    <generator type="file" src="{$browser}.{$language}.xml"/>
> >   </when>
> >  </choose>
> > </process>
> >
> > I think that's much easier to understand and write. 
> 
> I think
> 
>  <process uri="**">
>   <generator test="lang-browser-file" src="**"/>
>  </process>
> 
> is much easier to understand and write :)

Yeah, but what's going to process the test="lang-browser-file" attribute?
What name mangling scheme is it going to use to get to the proper file,
and what if an author wants to use a different scheme? What if they also
want the scheme to depend on the HTTP header for some stupid reason?

> I have two main concerns:
> 
> 1) once you start this programmatic road, pretty sure people will ask
> for <for> loops, procedures and all that stuff. I've seen this happening
> for Ant. I think there is an evident sign already, we _don't_ need such
> complexity, not even the variables.

I disagree. There's absolutely no need for looping, subroutines,
declarative variables, etc. Just the need for conditionals and expression
evaluation.

> 2) this requires you to think! Sure, I can't force you to think, but
> it's good to try.

Ha! I am thinking, and I think you're wrong. :)

> Example?
> 
> Java avoiding multiple inheritance. That's the best ever.
> 
> Have you ever wanted multiple inheritance? I did. Badly. It forced me to
> go to my whiteboard and layout my OO strategy. Now I don't anymore
> because I saw the problem and my design improved.

That's smokescreen, it isn't exactly relevant to this discussion. Trying
to sidetrack me, you are... :)

> Careful, a sitemap with no variables is, to me, the exact equivalent.
> You should _not_ place logic into the sitemap... by creating variables
> and run-time parameters.... it would soon make the sitemap too complex
> to handle.

But you _already_ have logic in the sitemap. What else is the matcher
element but a boolean conditional? I do _not_ think we need to _create_
variables in the sitemap, just reference them.

> On the other hand, I do see the need for such a thing.... it's another
> of those things I need more feedback to evaluate....
> 
> The problem is that once we add it and people start using it, we can't
> remove it later on. So we must decide if this is FS or not _before_
> placing it in.

Well, I dunno. Cocoon2's in alpha, so we should still be able to make
these types of experiments without affecting anyone seriously. I see your
point, though, which is why I find this discussion quite valuable.

- donald


Re: Small note on sitemap configuration

Posted by Stefano Mazzocchi <st...@apache.org>.
Donald Ball wrote:
> 
> On Thu, 1 Jun 2000, Stefano Mazzocchi wrote:
> 
> >    <when test="machine-load gt 1.0">
> >     <filter type="parser"
> > src="./foo/styles/style-{round-to-integer(machine-load)}.xsl"/>
> >    </when>
> >    <otherwise>
> >     <filter type="parser" src="./foo/styles/normal-style.xsl"/>
> >    </otherwise>
> >   </choose>
> >   ...
> >  </process>
> >
> > I don't think you could make such thing pass the (now infamous)
> > "Stefano's girlfriend test".
> >
> > But I admit I'm not 100% sure about all this....
> 
> I'm certainly not either. Let's step back for a second and see what we
> agree on. The sitemap author should be able to write the sitemap in such a
> way as to...
> 
> 1. choose different resolver (pipeline? is that your name for my object?)
> components depending on request-time information

Yes, we've always been calling it "pipeline".

Yes, pipelines should be chosen at request time, given URI + state,
where state is a function of the request parameters and everything
else in this world.
 
> 2. configure different resolver components with request-time information

No, I don't think we need this, since the state is passed to the
components anyway (all generator/filter/serializer components have
access to the CocoonRequest and CocoonResponse, and all Composers have
access to Cocoon itself).
 
> do you agree with these assertions? if so, then we're pretty close in our
> visions. 

The problem is that I don't agree with point 2. I think it's -not- a
sitemap concern.

> i'm simply saying that a sitemap that allows conditionals based
> on something like this:
> 
> <process uri="whatever">
>  <matcher type="browser" value="explorer">
>   <matcher type="language" value="en">
>    <generator type="file" src="explorer.xml"/>
>   </matcher>
>   <matcher type="language" value="it">
>    <generator type="file" src="explorer.it.xml"/>
>   </matcher>
>  </matcher>
>  <matcher type="browser" value="lynx">
>   <matcher type="language" value="en">
>    <generator type="file" src="lynx.xml"/>
>   </matcher>
>   <matcher type="language" value="it">
>    <generator type="file" src="lynx.it.xml"/>
>   </matcher>
>  </matcher>
> </process>
>
> That's hella-cumbersome, and doesn't factor out the conditionals very
> well. My strategy would let you rewrite it like this:
> 
> <process>
>  <choose>
>   <when test="$language='en'">
>    <generator type="file" src="{$browser}.xml"/>
>   </when>
>   <when test="$language='it'">
>    <generator type="file" src="{$browser}.{$language}.xml"/>
>   </when>
>  </choose>
> </process>
>
> I think that's much easier to understand and write. 

I think

 <process uri="**">
  <generator test="lang-browser-file" src="**"/>
 </process>

is much easier to understand and write :)

> Sure, you can get
> yourself into trouble if you move too much logic inside the sitemap and
> out of your XSP pages or whatever. So what? I prefer to have the extra
> rope, even if I might accidentally hang myself.

I have two main concerns:

1) once you start this programmatic road, pretty sure people will ask
for <for> loops, procedures and all that stuff. I've seen this happening
for Ant. I think there is an evident sign already, we _don't_ need such
complexity, not even the variables.

2) this requires you to think! Sure, I can't force you to think, but
it's good to try.

Example?

Java avoiding multiple inheritance. That's the best ever.

Have you ever wanted multiple inheritance? I did. Badly. It forced me to
go to my whiteboard and lay out my OO strategy. Now I don't anymore
because I saw the problem and my design improved.

Careful, a sitemap with no variables is, to me, the exact equivalent.
You should _not_ place logic into the sitemap... by creating variables
and run-time parameters.... it would soon make the sitemap too complex
to handle.

On the other hand, I do see the need for such a thing.... it's another
of those things I need more feedback to evaluate....

The problem is that once we add it and people start using it, we can't
remove it later on. So we must decide if this is FS or not _before_
placing it in.

Send your comments, folks.





Re: Small note on sitemap configuration

Posted by Donald Ball <ba...@webslingerZ.com>.
On Thu, 1 Jun 2000, Stefano Mazzocchi wrote:

>    <when test="machine-load gt 1.0">
>     <filter type="parser"
> src="./foo/styles/style-{round-to-integer(machine-load)}.xsl"/>
>    </when>
>    <otherwise>
>     <filter type="parser" src="./foo/styles/normal-style.xsl"/>
>    </otherwise>
>   </choose>
>   ...
>  </process>
> 
> I don't think you could make such thing pass the (now infamous)
> "Stefano's girlfriend test".
> 
> But I admit I'm not 100% sure about all this....

I'm certainly not either. Let's step back for a second and see what we
agree on. The sitemap author should be able to write the sitemap in such a
way as to...

1. choose different resolver (pipeline? is that your name for my object?)
components depending on request-time information

2. configure different resolver components with request-time information

do you agree with these assertions? if so, then we're pretty close in our
visions. i'm simply saying that a sitemap that allows conditionals based
on something like this:

<process uri="whatever">
 <matcher type="browser" value="explorer">
  <matcher type="language" value="en">
   <generator type="file" src="explorer.xml"/>
  </matcher>
  <matcher type="language" value="it">
   <generator type="file" src="explorer.it.xml"/>
  </matcher>
 </matcher>
 <matcher type="browser" value="lynx">
  <matcher type="language" value="en">
   <generator type="file" src="lynx.xml"/>
  </matcher>
  <matcher type="language" value="it">
   <generator type="file" src="lynx.it.xml"/>
  </matcher>
 </matcher>
</process>
  
That's hella-cumbersome, and doesn't factor out the conditionals very
well. My strategy would let you rewrite it like this:

<process>
 <choose>
  <when test="$language='en'">
   <generator type="file" src="{$browser}.xml"/>
  </when>
  <when test="$language='it'">
   <generator type="file" src="{$browser}.{$language}.xml"/>
  </when>
 </choose>
</process>
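
For what it's worth, evaluating tests like the ones above needs very little machinery. A minimal sketch, assuming only the $name='literal' form is supported; WhenTest is a hypothetical name and the variable map stands in for the request-time lookup:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical evaluator for tests of the form $name='literal'.
final class WhenTest {
    private static final Pattern EQUALS =
        Pattern.compile("\\$([A-Za-z][A-Za-z0-9_-]*)\\s*=\\s*'([^']*)'");

    static boolean evaluate(String test, Map<String, String> vars) {
        Matcher m = EQUALS.matcher(test.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("unsupported test: " + test);
        }
        // An unbound variable compares unequal to every literal.
        return m.group(2).equals(vars.get(m.group(1)));
    }
}
```

No loops, subroutines or variable declarations are involved: the sitemap only references values, it never creates them.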

I think that's much easier to understand and write. Sure, you can get
yourself into trouble if you move too much logic inside the sitemap and
out of your XSP pages or whatever. So what? I prefer to have the extra
rope, even if I might accidentally hang myself.

- donald


Re: Small note on sitemap configuration

Posted by Stefano Mazzocchi <st...@apache.org>.
Donald Ball wrote:
> 
> On Wed, 31 May 2000, Stefano Mazzocchi wrote:
> 
> > > Of course, this requires that we make request-time information available
> > > via named variables in the sitemap (e.g. language). Virtually speaking, we
> > > could construct an XML fragment for the request:
> > >
> > > <request>
> > >  <uri value="/foo/bar"/>
> > >  <parameter name="foo" value="bar" method="get"/>
> > >  <header type="http" name="Referer"/>
> > > </request>
> > >
> > > And allow the sitemap author to construct simple XPath expressions to
> > > reference that information:
> > >
> > > header[@name='Referer']
> > >
> > > We could provide parameters for frequently accessed information or
> > > information that might be composed from multiple pieces of information
> > > (namely, the preferred language which might depend on a header or a
> > > cookie). e.g. $referer == header[@name='Referer'].
> >
> > Careful: this is a very dangerous path.
> >
> > True, we could virtually embed request information into a schema, but
> > what about _any_ state information? server name, machine load, time of
> > > the day in Australia, density of population in Antarctica, local
> > > temperature, number of subscribed people to the cocoon-dev mail list, and
> > > so on...
> >
> > Do you _really_ want to write a schema (or RDFSchema, for that matter)
> > for all that? I don't think so :)
> 
> I strongly disagree here. We're going to have to present the sitemap
> author with some way to access request-time information in their
> conditionals, right? 

Oh, well, no. Not in my proposals.... I assumed we could skip that
complexity altogether.

> Whatever scheme we use, we're going to have to limit
> it to the request-information that is likely to be relevant (e.g. all
> information about the request itself and a little bit about local state -
> current time, etc.) So saying that allowing access to request-time
> information via XPath expressions is a bad idea because we could overload
> it with too much information is fatuous. The same criticism can be
> levelled at any method of allowing access to request-time information.

I'm sorry, I didn't understand your points. I thought you were
asking to create a schema to translate the request into a virtual XML
document and then apply the XPath on that "as a condition".

Sort of like

 <process uri="/foo/**">
  <choose>
   <when test="/request/language[language='it']">
    ...
   </when>
   <otherwise>
    ...
   </otherwise>
  </choose>

I was indicating -1 on that, nothing else.
 
> Go back to my language example:

Ok.
 
> <process uri="/foo/**">
>  <choose>
>   <when test="language">
>    <generator type="parser" src="./foo/{language}/**"/>
>   </when>
>   <otherwise>
>    <generator type="parser" src="./foo/en/**"/>
>   </otherwise>
>  </choose>
>  ...
> </process>
> 
> How would you propose to rewrite this example using your matcher elements?

 <process uri="/foo/**">
  <generator type="lang-dependent-parser" src="./foo/**"/>
 </process>

the "lang-dependent-parser" will _know_ how to handle the request
parameters it needs.

> > -1 on the use of XPath as testing logic (for the problems outlined
> > above).
> >
> > You might want to look into the "Pipeline Conditional Model" thread and
> > comment on what I proposed there.
> 
> I don't see anything in that proposal about accessing the value of
> request-time variables in the configuration of the resolver components -
> which I think is key. 

Right

> I don't see any problems with XPath expressions that
> don't apply equally well to anything else that accomplishes the same goal.

> Maybe I have missed something though?

No, I did.

Anyway, I think we are starting to get into the FS path and the sitemap
complexity is starting to grow much higher than I want.

We should _NOT_ forget that programming is done using programming
languages. Sam is right to advocate this all the time.

By going into the XPath or XSLT param/variable design you are, in fact,
making the sitemap dangerously programmable, but you are losing power
instead of gaining it! And, in fact, you are breaking the
administration-logic contract by giving more Turing-power to
administrators (who should _not_ have it, by design).

Let's make an example to show you how dangerous things can get:

 <process uri="/foo/**">
  <generator type="parser" src="./foo/**"/>
  <choose>
   <when test="machine-load gt 1.0">
    <filter type="parser"
src="./foo/styles/style-{round-to-integer(machine-load)}.xsl"/>
   </when>
   <otherwise>
    <filter type="parser" src="./foo/styles/normal-style.xsl"/>
   </otherwise>
  </choose>
  ...
 </process>

I don't think you could make such thing pass the (now infamous)
"Stefano's girlfriend test".

But I admit I'm not 100% sure about all this....




Re: Small note on sitemap configuration

Posted by Donald Ball <ba...@webslingerZ.com>.
On Wed, 31 May 2000, Stefano Mazzocchi wrote:

> > Of course, this requires that we make request-time information available
> > via named variables in the sitemap (e.g. language). Virtually speaking, we
> > could construct an XML fragment for the request:
> > 
> > <request>
> >  <uri value="/foo/bar"/>
> >  <parameter name="foo" value="bar" method="get"/>
> >  <header type="http" name="Referer"/>
> > </request>
> > 
> > And allow the sitemap author to construct simple XPath expressions to
> > reference that information:
> > 
> > header[@name='Referer']
> > 
> > We could provide parameters for frequently accessed information or
> > information that might be composed from multiple pieces of information
> > (namely, the preferred language which might depend on a header or a
> > cookie). e.g. $referer == header[@name='Referer'].
> 
> Careful: this is a very dangerous path.
> 
> True, we could virtually embed request information into a schema, but
> what about _any_ state information? server name, machine load, time of
> the day in Australia, density of population in Antarctica, local
> temperature, number of subscribed people to the cocoon-dev mail list, and
> so on...
> 
> Do you _really_ want to write a schema (or RDFSchema, for that matter)
> for all that? I don't think so :)

I strongly disagree here. We're going to have to present the sitemap
author with some way to access request-time information in their
conditionals, right? Whatever scheme we use, we're going to have to limit
it to the request-information that is likely to be relevant (e.g. all
information about the request itself and a little bit about local state -
current time, etc.) So saying that allowing access to request-time
information via XPath expressions is a bad idea because we could overload
it with too much information is fatuous. The same criticism can be
levelled at any method of allowing access to request-time information.

Go back to my language example:

<process uri="/foo/**">
 <choose>
  <when test="language">
   <generator type="parser" src="./foo/{language}/**"/>
  </when>
  <otherwise>
   <generator type="parser" src="./foo/en/**"/>
  </otherwise>
 </choose>
 ...
</process>

How would you propose to rewrite this example using your matcher elements?
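
For what it's worth, the behaviour I have in mind for that example could
be sketched roughly as below. This is only an illustration under my own
assumptions: PlaceholderExpander, expand and resolve are hypothetical
names, not anything in Cocoon, and the request-time variables are assumed
to arrive as a plain map of strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cocoon: expands {name} placeholders in a
// sitemap src attribute from a map of request-time variables.
final class PlaceholderExpander {
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\{([A-Za-z][A-Za-z0-9_-]*)\\}");

    // Returns the template with every {name} replaced by its value, or null
    // if any referenced variable is unset (i.e. the "when" test fails).
    static String expand(String template, Map<String, String> vars) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = vars.get(m.group(1));
            if (value == null) {
                return null;
            }
            // quoteReplacement keeps '$' and '\' in values literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Mirrors choose/when/otherwise: use the expanded template when all of
    // its variables are set, otherwise fall back to the default source.
    static String resolve(String template, String fallback,
                          Map<String, String> vars) {
        String expanded = expand(template, vars);
        return expanded != null ? expanded : fallback;
    }

    public static void main(String[] args) {
        Map<String, String> vars = new LinkedHashMap<>();
        // No language set: the "otherwise" branch applies.
        System.out.println(resolve("./foo/{language}/**", "./foo/en/**", vars));
        // Language set: the "when" branch applies.
        vars.put("language", "fr");
        System.out.println(resolve("./foo/{language}/**", "./foo/en/**", vars));
    }
}
```

The point is only that a single template plus a default covers every
language, without enumerating them in the sitemap.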

> -1 on the use of XPath as testing logic (for the problems outlined
> above).
> 
> You might want to look into the "Pipeline Conditional Model" thread and
> comment on what I proposed there. 

I don't see anything in that proposal about accessing the value of
request-time variables in the configuration of the resolver components -
which I think is key. I don't see any problems with XPath expressions that
don't apply equally well to anything else that accomplishes the same goal.
Maybe I have missed something though?
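
To make the XPath idea concrete, here is a minimal sketch of resolving
expressions against the request fragment quoted above. It uses the
standard javax.xml.xpath API rather than Xalan's XPath module directly,
and it builds the request document eagerly, which my proposal would avoid.
RequestXPathDemo is a hypothetical name, and the value attribute on the
header element is an assumption added so there is something to read back.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Cocoon.
final class RequestXPathDemo {
    // The request fragment from this thread. The value attribute on
    // <header> is an assumption added for illustration.
    static final String REQUEST_XML =
        "<request>"
        + "<uri value=\"/foo/bar\"/>"
        + "<parameter name=\"foo\" value=\"bar\" method=\"get\"/>"
        + "<header type=\"http\" name=\"Referer\""
        + " value=\"http://example.org/\"/>"
        + "</request>";

    // Evaluates an XPath expression relative to the <request> element and
    // returns its string value.
    static String evaluate(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(REQUEST_XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException("cannot evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Value of the Referer header.
        System.out.println(evaluate("header[@name='Referer']/@value"));
        // Existence test, as a "when" condition might use it.
        System.out.println(evaluate("boolean(header[@name='Cookie'])"));
    }
}
```

A real implementation would presumably put a thin DOM facade over
HttpServletRequest instead of parsing a string, but the expressions the
sitemap author writes would look the same.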

- donald


Re: Small note on sitemap configuration

Posted by Stefano Mazzocchi <st...@apache.org>.
Donald Ball wrote:
> 
> First, some terminology to help me reduce clutter: a resolver consists of
> one generator, any number of filters, and one serializer. The function
> that maps requests to resolvers is called the map function.
> 
> In the sitemap file, a process element can contain one generator, any
> number of filters, and one serializer (a resolver). The process element
> may also contain match elements, which specify request-time conditions
> that must be met. The match element may contain alternate resolver
> components. Assuming the match elements provide us with true boolean
> expression evaluation, we can build a sitemap that accomplishes the design
> goals delineated above, at least on a per-URI basis.
> 
> The sitemap for a complex set of criteria may prove to be overly
> cumbersome, though. For instance, if the map function depends on uri,
> preferred language, and user agent, the sitemap might look like this:
> 
> <process uri="/foo/**">
>  <match type="language" value="en">
>   <generator type="parser" src="./foo/en/**"/>
>  </match>
>  <match type="language" value="fr">
>   <generator type="parser" src="./foo/fr/**"/>
>  </match>
>  <match type="format" value="html">
>   <filter type="xslt" src="./style/html.xslt"/>
>   <serializer type="html"/>
>  </match>
>  <match type="format" value="wml">
>   <filter type="xslt" src="./style/wml.xslt"/>
>   <serializer type="xml"/>
>  </match>
> 
> Problems become evident at this point. How do we specify a default
> language? We could say that a resolver component specified by a match
> overrides a top-level resolver component. But suppose we have more
> languages than we want to enumerate in a sitemap repeatedly, and suppose
> the configuration of a resolver component could depend on a request-time
> variable? We could write it like this:
> 
> <process uri="/foo/**">
>  <match type="language" value="*">
>   <generator type="parser" src="./foo/{language}/**"/>
>  </match>
>  <match type="format" value="*">
>   <filter type="xslt" src="./style/{format}.xslt"/>
>  </match>
>  ...
> </process>
> 
> That's better, but still not perfect. We still don't have a default
> language. For that, and for other reasons, I think we'd do better to add
> conditional processing:
> 
> <process uri="/foo/**">
>  <choose>
>   <when test="language">
>    <generator type="parser" src="./foo/{language}/**"/>
>   </when>
>   <otherwise>
>    <generator type="parser" src="./foo/en/**"/>
>   </otherwise>
>  </choose>
>  ...
> </process>

I agree with the idea of a pipeline conditional model. In fact, I already
changed the matching architecture to use it (look into the Cocoon2
/xdocs/sitemap-working-draft.xml.... grrrr, damn broken CVS update
mails)
 
> Of course, this requires that we make request-time information available
> via named variables in the sitemap (e.g. language). Virtually speaking, we
> could construct an XML fragment for the request:
> 
> <request>
>  <uri value="/foo/bar"/>
>  <parameter name="foo" value="bar" method="get"/>
>  <header type="http" name="Referer"/>
> </request>
> 
> And allow the sitemap author to construct simple XPath expressions to
> reference that information:
> 
> header[@name='Referer']
> 
> We could provide parameters for frequently accessed information or
> information that might be composed from multiple pieces of information
> (namely, the preferred language which might depend on a header or a
> cookie). e.g. $referer == header[@name='Referer'].

Careful: this is a very dangerous path.

True, we could virtually embed request information into a schema, but
what about _any_ state information? server name, machine load, time of
the day in Australia, density of population in Antarctica, local
temperature, number of subscribed people to the cocoon-dev mail list, and
so on...

Do you _really_ want to write a schema (or RDFSchema, for that matter)
for all that? I don't think so :)
 
> Note we don't actually have to construct the request document - generally,
> I don't think we'll want to since much of the request time information is
> irrelevant. We can write our own simple XPath expression resolver, or if
> we use Xalan's XPath module to do the resolution we could write a small
> DOM implementation that acted as a decorator for the HttpServletRequest
> methods.
> 
> So the upshot of my random thoughts here - I propose that we replace the
> match element with conditionals ala XSLT and allow sitemap authors to
> reference request time information using XPath expressions into the
> request data represented as an XML fragment. Any takers?

+1 on the conditional model

-1 on the use of XPath as testing logic (for the problems outlined
above).

You might want to look into the "Pipeline Conditional Model" thread and
comment on what I proposed there. 

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
 Missed us in Orlando? Make it up with ApacheCON Europe in London!
------------------------- http://ApacheCon.Com ---------------------