Posted to dev@forrest.apache.org by Bernhard Huber <be...@a1.net> on 2003/01/18 07:33:06 UTC

Re: [RT] Linking revisited: A general linking system

hi,

maybe i'm a bit late...

but i checked the samples/linkrewriter-demo today,
and noticed that linkmap.xml looks something like:
<site href="">
   <index href="index.html"/>
   <dreams href="dreams.html"/>
   <faq href="faq.html">
     <how_can_I_help href="#how_can_I_help"/>
     <building_own_website href="#own_website"/>

now from an xml validation point of view it is bad to have
that many different node-names,
do you have some special reason for that node-name naming convention?

may i suggest using instead
<site href="">
   <link name="index" href="index.html"/>
   <link name="dreams" href="dreams.html"/>
   <link name="faq" href="faq.html">
     <link name="how_can_I_help" href="#how_can_I_help"/>
     <link name="building_own_website" href="#own_website"/>
.....

maybe instead of a name attribute, an attribute named key or id fits better;
anyway, this way we could have a DTD, and formal validation would be easier.
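
As a rough illustration of the structural check that a DTD for the <link> form would make possible, here is a minimal sketch in Python. The function name check_linkmap is hypothetical and not part of Forrest, and it assumes every child of <site> must be a <link> carrying both a name and an href attribute. It only walks the tree by hand; it is not a real DTD validator.

```python
import xml.etree.ElementTree as ET

def check_linkmap(root):
    """Return a list of problems; an empty list means the structure looks fine.

    Hypothetical helper: it mimics by hand what a DTD along the lines of
    <!ELEMENT link (link*)> with required name/href attributes would enforce.
    """
    problems = []
    if root.tag != "site":
        problems.append(f"root element is <{root.tag}>, expected <site>")

    def visit(parent):
        for child in parent:
            if child.tag != "link":
                problems.append(f"unexpected element <{child.tag}>")
            else:
                for attr in ("name", "href"):
                    if attr not in child.attrib:
                        problems.append(f"<link> is missing the {attr} attribute")
            visit(child)

    visit(root)
    return problems

# The suggested format: only <link> elements below <site>.
proposed = ET.fromstring(
    '<site href="">'
    '<link name="faq" href="faq.html">'
    '<link name="how_can_I_help" href="#how_can_I_help"/>'
    '</link>'
    '</site>'
)

# The format from the demo: one element name per logical target.
current = ET.fromstring(
    '<site href="">'
    '<index href="index.html"/>'
    '<faq href="faq.html"><how_can_I_help href="#how_can_I_help"/></faq>'
    '</site>'
)

print(check_linkmap(proposed))
print(check_linkmap(current))
```

With the <link> form the set of allowed element names is fixed and can be written down once; with arbitrary element names a check like this has nothing fixed to compare against.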

another option:
<site href="">
   <link href="index.html">index</link>
....


summarizing: my main concern is validating the linkmap,
using any of DTD, schema, or Schematron.
i think that's important in the long run.

regards bernhard


Re: [RT] Linking revisited: A general linking system

Posted by Jeff Turner <je...@apache.org>.
On Sat, Jan 18, 2003 at 08:09:58PM +0100, J.Pietschmann wrote:
> Jeff Turner wrote:
> >    <link name="building_own_website" href="#own_website"/>
> >That buys us the ability to validate:
> >
> ><!ELEMENT link (link*)>
> >
> >At the cost of 12 extra characters per line.  Seems hardly worth doing :)
> 
> This buys the possibility to use spaces and other Unicode
> characters in logical target names which are invalid for
> QNames.

It certainly prohibits spaces, which I see as an advantage.  The
production for 'Letter' has lots of non-ASCII unicode ranges, so I don't
see that hurting i18n:

http://www.xml.com/axml/target.html#sec-common-syn

> It also buys using character references for writing
> down the link target name.

Yes, though, any non-English user is going to have an editor that can
enter non-ASCII characters directly.

> Both may be important features
> for non-technicians, in particular for non-English
> non-technicians.
> 
> It also avoids problems in case someone has the great
> idea to define a link target name containing a colon.

Preventing stupid things like this saves US problems :)  We don't have to
worry about disambiguating href="foo:bar:baz".


--Jeff

> J.Pietschmann
> 

Re: [RT] Linking revisited: A general linking system

Posted by Bernhard Huber <be...@a1.net>.

J.Pietschmann wrote:
> Jeff Turner wrote:
>  >    <link name="building_own_website" href="#own_website"/>
> 
>> That buys us the ability to validate:
>>
>> <!ELEMENT link (link*)>
>>
>> At the cost of 12 extra characters per line.  Seems hardly worth doing :)
> 
> 
> This buys the possibility to use spaces and other Unicode
> characters in logical target names which are invalid for
> QNames. It also buys using character references for writing
> down the link target name. Both may be important features
> for non-technicians, in particular for non-English
> non-technicians.
> 
> It also avoids problems in case someone has the great
> idea to define a link target name containing a colon.
> 

ohhh, i didn't think about that,
you read the xml spec more carefully than i did.
thx for the clarification

bernhard


Re: [RT] Linking revisited: A general linking system

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Jeff Turner wrote:
> >    <link name="building_own_website" href="#own_website"/>
> That buys us the ability to validate:
> 
> <!ELEMENT link (link*)>
> 
> At the cost of 12 extra characters per line.  Seems hardly worth doing :)

This buys the possibility to use spaces and other Unicode
characters in logical target names which are invalid for
QNames. It also buys using character references for writing
down the link target name. Both may be important features
for non-technicians, in particular for non-English
non-technicians.

It also avoids problems in case someone has the great
idea to define a link target name containing a colon.
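
The restriction is easy to see with any namespace-aware XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree. The helper is_well_formed and the sample target names are made up for illustration; only the element and attribute layout comes from the linkmap examples in this thread.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text parses with a namespace-aware parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Target names used as element names must be valid XML names.
plain_name = is_well_formed(
    '<site href=""><own_website href="#own_website"/></site>')
name_with_space = is_well_formed(
    '<site href=""><own website href="#own_website"/></site>')
# A colon makes the name a prefixed QName, and the prefix is not declared.
name_with_colon = is_well_formed(
    '<site href=""><faq:help href="#help"/></site>')

# The same names carried in a name attribute are ordinary attribute values,
# so spaces, colons and character references are all accepted.
site = ET.fromstring(
    '<site href="">'
    '<link name="own website" href="#own_website"/>'
    '<link name="faq:help" href="#help"/>'
    '<link name="caf&#233;" href="cafe.html"/>'
    '</site>'
)
names = [link.get("name") for link in site]

print(plain_name, name_with_space, name_with_colon)
print(names)
```

Only the first document with element-name targets parses; the attribute form accepts all three names, with the character reference resolved to the character it stands for.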

J.Pietschmann


Re: [RT] Linking revisited: A general linking system

Posted by Jeff Turner <je...@apache.org>.
On Sat, Jan 18, 2003 at 07:33:06AM +0100, Bernhard Huber wrote:
> hi,
> 
> maybe i'm a bit late...
> 
> but i checked the samples/linkrewriter-demo today,
> i noticed that linkmap.xml is something like:
> <site href="">
>   <index href="index.html"/>
>   <dreams href="dreams.html"/>
>   <faq href="faq.html">
>     <how_can_I_help href="#how_can_I_help"/>
>     <building_own_website href="#own_website"/>
> 
> now from a xml validation point of view this is bad having that much
> different node-names,

Having arbitrary element names certainly rules out DTDs, but not RELAX
NG.  I don't know about W3C Schema.

> do you have some special reasons for that node-name naming conventions?

One person's minimalist tendencies :)

> may i suggest using instead
> <site href="">
>   <link name="index" href="index.html"/>
>   <link name="dreams" href="dreams.html"/>
>   <link name="faq" href="faq.html">
>     <link name="how_can_I_help" href="#how_can_I_help"/>
>     <link name="building_own_website" href="#own_website"/>
> .....

That buys us the ability to validate:

<!ELEMENT link (link*)>

At the cost of 12 extra characters per line.  Seems hardly worth doing :)
And anyway, in Forrest's site.xml, not all elements are links.  One could
have an <about> category, or <external-refs> to store 'external' URLs.

> maybe instead of name attribute, attribute named key, or id fits
> better, anyway this way we could have a DTD and a formal validation
> would be easier,
> 
> another option:
> <site href="">
>   <link href="index.html">index</link>
> ....

Yep, there are lots of things we could do. In the long term, it would be
good to migrate to either RDF or Topic Maps:

http://marc.theaimsgroup.com/?l=forrest-dev&m=104199302214292&w=2

> summarizing my main concern is about validating the linkmap,
> using any of DTD, schema, or Schematron,
> i think that's important in the long run,

You're not alone :)  Pretty much everyone gasps in horror when they first
see element names so gratuitously abused.  But site.xml has to be the
simplest linkbase format in existence, and I think we should preserve
that simplicity for now, and aim for a standardized format (eg RDF/TMs)
in the future.


--Jeff


> regards bernhard
>