Posted to user@lenya.apache.org by Joern Nettingsmeier <ne...@folkwang-hochschule.de> on 2006/10/31 00:19:46 UTC

[ANNOUNCE] proposal for lenya document reference syntax

here's a proposal for a new lenya document reference syntax, to be used 
for internal links, assets and image inclusions:

lenya-document:<uuid>?lang=<language>&area=<area>&rev=<revision>&pub=<pub-id>

where any of the components (<uuid>, <language>, <area>, <revision>, 
<pub-id>) are optional, i.e. everything that is not specified is derived 
from the current page envelope information.
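
To make the optional components concrete, here is a minimal Python sketch of how such a reference could be taken apart. It is an illustration only, not Lenya code: the function name parse_reference, the sample uuid, and the convention of returning None for "derive this from the page envelope" are all assumptions.

```python
from urllib.parse import urlsplit, parse_qs

SCHEME = "lenya-document"
# Query parameter names as written in the proposal.
PARAMS = ("lang", "area", "rev", "pub")


def parse_reference(ref):
    """Split a reference into its named components.

    Components that are not spelled out come back as None, meaning
    "derive this from the current page envelope".
    """
    parts = urlsplit(ref)
    if parts.scheme != SCHEME:
        raise ValueError("not a %s reference: %r" % (SCHEME, ref))
    query = parse_qs(parts.query)
    result = {"uuid": parts.path or None}
    for name in PARAMS:
        values = query.get(name)
        # If a name is repeated, this sketch simply keeps the first value.
        result[name] = values[0] if values else None
    return result


print(parse_reference("lenya-document:12erg313-3fww46?lang=de&rev=12"))
```

A reference such as "lenya-document:?lang=en" leaves the uuid unset in this sketch, so the caller would fall back to the current document.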

please comment - your insight is greatly appreciated.

for those who have not followed the previous posts on this topic, a 
short summary:

[the problem]

at the moment, lenya uses a nice, uuid-based storage backend that allows 
site restructuring without having to shuffle files around, because 
content files and structure (i.e. sitetree.xml) are decoupled.

but we currently have no way of defining links to lenya documents in 
this fashion - for linking, we still use the sitetree-dependent document 
path, which will break when users restructure their sites (i.e. move 
pages around).

[discussion]

the primary identifier of a document is the UUID, so this has to become 
the main part of the link syntax. but to further pinpoint the desired 
resource, we also need a way to optionally specify the language version, 
the area and the revision.

if, in the future, cross-publication linking were to be supported, it 
might also be helpful to spell out the publication id for faster 
retrieval, even though in theory a UUID lookup mechanism could work 
without explicit knowledge of the publication id due to the global 
uniqueness of the uuids.

using the traditional approach of putting a number of slash-separated 
"parameters" into a standard URL has a number of disadvantages:
* it is very hard to read and easy to get wrong
* we might want different types of "relative" links, for example:
    * specify only the language -> i want another language version of the
      current document
    * specify only the area (= same doc, same lang)
    * specify only the uuid (= same lang, same area)
    * specify only a revision (= same doc, same lang, same area, other
      version)
    generally, you could specify as few or many attributes as you like,
    and the others would be gleaned from the page envelope. with the
    current positional parameter scheme, we have already seen a
    rather hackish quick fix for relative linking that leads to
    constructs such as "lenyadoc://{1}/{2}/{4}/{3}".
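
The kinds of "relative" links listed above can be sketched in a few lines of Python. This is only an illustration under assumptions: the page envelope is modelled as a plain dict, the function name resolve and all values are made up, and the question of which revision an unspecified rev should mean is left out.

```python
def resolve(specified, envelope):
    """Fill every attribute the link leaves out from the page envelope."""
    resolved = dict(envelope)
    resolved.update({k: v for k, v in specified.items() if v is not None})
    return resolved


# Hypothetical page envelope of the document that contains the link.
envelope = {"uuid": "12erg313-3fww46", "lang": "de",
            "area": "authoring", "pub": "default"}

other_language = resolve({"lang": "en"}, envelope)   # same doc, other language
other_area = resolve({"area": "live"}, envelope)     # same doc, same lang
other_document = resolve({"uuid": "98abc765-1xyz23"}, envelope)
other_revision = resolve({"rev": "7"}, envelope)     # same doc, other version

for link in (other_language, other_area, other_document, other_revision):
    print(link)
```

Because each attribute is named, any subset can be given without a placeholder for the others, which is what the positional scheme cannot express cleanly.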

by borrowing the GET parameter syntax from http requests, we gain a 
truly multi-dimensional parameter model where arbitrary parts can be 
left unspecified, and we get a self-documenting syntax with named 
parameters and no magic position-based semantics as in 
lenyadoc://something/somethingelse/foo/bar
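
As a rough illustration of the named-parameter idea, the following Python sketch serializes only those components that were actually given. The function name build_reference and the fixed parameter order are assumptions made for the example, not part of the proposal.

```python
from urllib.parse import urlencode

ALLOWED = ("lang", "area", "rev", "pub")


def build_reference(uuid=None, **params):
    """Serialize only the components that were actually given."""
    unknown = set(params) - set(ALLOWED)
    if unknown:
        raise ValueError("unknown parameter(s): %s" % ", ".join(sorted(unknown)))
    # Keep a fixed order so that equal references serialize identically.
    pairs = [(k, params[k]) for k in ALLOWED if params.get(k) is not None]
    query = urlencode(pairs)
    ref = "lenya-document:" + (uuid or "")
    return ref + "?" + query if query else ref


print(build_reference("12erg313-3fww46", lang="de", rev=12))
print(build_reference(lang="en"))
```

Leaving out an argument simply leaves out the corresponding part of the reference; no position has to be filled with a dummy value.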

it had been proposed to re-use the lenyadoc semantics to specify 
internal links, but we have decided not to, for two reasons: lenyadoc 
uses positional parameters with all disadvantages mentioned before, and 
the link references are never actually fed to a source factory, but 
instead handled by a link rewriter mechanism. so it was considered 
misleading to use the semantics of an existing source factory for a 
mechanism that is totally independent of said source factory.

[future improvements]

however, it might be interesting to consider implementing a real 
lenya-document: source factory in the near future and use it instead of 
the current lenyadoc:// scheme, so as to benefit from the multiple ways 
of relative linking and the self-documenting named parameters. the old 
lenyadoc: factory could then be deprecated and gradually replaced, to 
avoid disruptive changes while 1.4 is stabilizing.

[musings]

before implementing, we should probably get a clear idea of whether the 
proposed scheme is a Uniform Resource Name, a Uniform Resource Locator, 
or a Uniform Resource Identifier, and check the current proposal against 
relevant rfcs pertaining to those uniform resource gizmos, in order to 
build on existing, proven concepts and to avoid introducing something 
that might become overly idiosyncratic in the near future.
your comments on this are especially appreciated.


regards,



jörn


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@lenya.apache.org
For additional commands, e-mail: user-help@lenya.apache.org


Re: [ANNOUNCE] proposal for lenya document reference syntax

Posted by Andreas Hartmann <an...@apache.org>.
Joern Nettingsmeier wrote:

[...]

> [musings]
> 
> before implementing, we should probably get a clear idea of whether the
> proposed scheme is a Uniform Resource Name, a Uniform Resource Locator,
> or a Uniform Resource Identifier, and check the current proposal against
> relevant rfcs pertaining to those uniform resource gizmos, in order to
> build on existing, proven concepts and to avoid introducing something
> that might become overly idiosyncratic in the near future.
> your comments on this are especially appreciated.

Your proposal complies with the URI syntax [1]:


lenya-document:<uuid>?lang=<language>&area=<area>&rev=<revision>&pub=<pub-id>
\____________/ \____/ \______________________________________________...
       |         |            |
    scheme      path        query


[1]
http://www.gbiv.com/protocols/uri/rfc/rfc3986.html#components
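
The decomposition above can be reproduced with the regular expression given in Appendix B of RFC 3986. The Python sketch below is only an illustration: the function name components and the sample uuid are assumptions, and the expression merely splits a reference into parts without checking that every character is permitted.

```python
import re

# Regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def components(uri):
    """Return the five generic URI components (None where absent)."""
    m = URI_RE.match(uri)  # every part is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


print(components("lenya-document:12erg313-3fww46?lang=de&area=live"))
```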

-- Andreas




Re: [ANNOUNCE] proposal for lenya document reference syntax

Posted by Andreas Hartmann <an...@apache.org>.
Jann Forrer wrote:

[...]

>> 2 solutions come to mind:
>> reserve a prefix namespace like "lenya-..." and use
>> "?lenya-area=..." and so on, or do nothing and let people fix their
>> custom query params.
>> for obvious reasons, i prefer the former solution, even if it makes the
>> queries more verbose.
>>
> I would also prefer the lenya prefix for the lenya-document parameters.
> That seems the most obvious solution to me.
> BTW, would it even be possible for the people to fix all query params
> without touching core files? Anyway the second solution is not a very
> user friendly one :-(


I found an interesting paragraph on this issue in [1]:

"Aside from dot-segments in hierarchical paths, a path segment is
considered opaque by the generic syntax. URI producing applications
often use the reserved characters allowed in a segment to delimit
scheme-specific or dereference-handler-specific subcomponents. For
example, the semicolon (";") and equals ("=") reserved characters are
often used to delimit parameters and parameter values applicable to that
segment. The comma (",") reserved character is often used for similar
purposes. For example, one URI producer might use a segment such as
"name;v=1.1" to indicate a reference to version 1.1 of "name", whereas
another might use a segment such as "name,1.1" to indicate the same.
Parameter types may be defined by scheme-specific semantics, but in most
cases the syntax of a parameter is specific to the implementation of the
URI's dereferencing algorithm."


In short, a path segment can be followed by a set of parameters.
In our case, we could use for instance

  lenya-document:12erg313-3fww46,lang=de,rev=12?foo=bar#section2

  scheme        |path segment   |params        |query  |fragment

This would conform to the URI syntax and follow a common practice.
I like it :)

If you agree to this syntax - do you prefer a comma or a semicolon
as delimiter? IMO the comma makes it easier to distinguish the
parts because it is smaller, but both are fine with me.

[1] http://www.gbiv.com/protocols/uri/rfc/rfc3986.html#path
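
A minimal Python sketch of how a reference in this form could be taken apart, assuming the delimiter never occurs inside a uuid or a parameter value. The function name parse_with_path_params and the shape of the returned dict are made up for the illustration.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_with_path_params(ref, delimiter=","):
    """Separate document parameters (in the path) from request parameters."""
    parts = urlsplit(ref)
    uuid, *raw_params = parts.path.split(delimiter)
    # Each document parameter is expected to look like name=value.
    doc_params = dict(p.split("=", 1) for p in raw_params)
    return {
        "uuid": uuid,
        "document": doc_params,
        "request": parse_qsl(parts.query),
        "fragment": parts.fragment,
    }


print(parse_with_path_params(
    "lenya-document:12erg313-3fww46,lang=de,rev=12?foo=bar#section2"))
```

With this split, a reference such as "lenya-document:12erg313-3fww46,area=authoring?area=live" keeps the two area values apart: one ends up among the document parameters, the other among the request parameters. Passing delimiter=";" handles the semicolon variant in the same way.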

-- Andreas




Re: [ANNOUNCE] proposal for lenya document reference syntax

Posted by Jann Forrer <ja...@id.unizh.ch>.
Joern Nettingsmeier wrote:
> 
> 
> Andreas Hartmann wrote:
>> Hi Jörn,
>>
>> Joern Nettingsmeier wrote:
>>> here's a proposal for a new lenya document reference syntax, to be used
>>> for internal links, assets and image inclusions:
>>>
>>> lenya-document:<uuid>?lang=<language>&area=<area>&rev=<revision>&pub=<pub-id>
>>>
>>
>> unfortunately, there is a problem with this syntax:
>> It is not possible to distinguish lenya-document parameters from
>> request parameters.
>>
>> For instance, consider an internal link to a usecase-document
>> page. This would lead to a conflict if the usecase also requires
>> an "area" parameter (e.g., a "show area version" usecase could be
>> invoked in the authoring area on a live document).
>>
>> Possible solutions:
>>
>>
>> - use a different delimiter for lenya-document parameters
>>
>>   lenya-document:uuid=...:lang=...:area=authoring?area=live
>>
>>
>> - prefix the lenya-document parameters
>>   (would be quite verbose)
>>
>>
>> - add another delimiter after the lenya-document parameters
>>
>>   lenya-document:<uuid>?area=authoring?area=live
>>
>>   I guess this doesn't conform to the URI syntax, though.
>>
>>
>> - use a different syntax for lenya-document parameters, e.g.
>>
>>   lenya-document:<uuid>[@area='authoring']?area=live
>>
>>   This probably doesn't conform to the URI syntax either.
> 
> 
> very important point!
> 
> i don't like the non-conformant syntaxes, if only because it means we
> must write a custom parser (which is definitely non-trivial for unicode
> - recall Microsoft's cherished "Code Red" fuck-up), and we should really
> abstain from ad-hoc string munging in this implementation.
> 
> 2 solutions come to mind:
> reserve a prefix namespace like "lenya-..." and use
> "?lenya-area=..." and so on, or do nothing and let people fix their
> custom query params.
> for obvious reasons, i prefer the former solution, even if it makes the
> queries more verbose.
>
I would also prefer the lenya prefix for the lenya-document parameters.
That seems the most obvious solution to me.
BTW, would it even be possible for the people to fix all query params
without touching core files? Anyway the second solution is not a very
user friendly one :-(

[ ... ]

Jann



Re: [ANNOUNCE] proposal for lenya document reference syntax

Posted by Joern Nettingsmeier <ne...@folkwang-hochschule.de>.

Andreas Hartmann wrote:
> Hi Jörn,
> 
> Joern Nettingsmeier wrote:
>> here's a proposal for a new lenya document reference syntax, to be used
>> for internal links, assets and image inclusions:
>>
>> lenya-document:<uuid>?lang=<language>&area=<area>&rev=<revision>&pub=<pub-id>
> 
> unfortunately, there is a problem with this syntax:
> It is not possible to distinguish lenya-document parameters from
> request parameters.
> 
> For instance, consider an internal link to a usecase-document
> page. This would lead to a conflict if the usecase also requires
> an "area" parameter (e.g., a "show area version" usecase could be
> invoked in the authoring area on a live document).
> 
> Possible solutions:
> 
> 
> - use a different delimiter for lenya-document parameters
> 
>   lenya-document:uuid=...:lang=...:area=authoring?area=live
> 
> 
> - prefix the lenya-document parameters
>   (would be quite verbose)
> 
> 
> - add another delimiter after the lenya-document parameters
> 
>   lenya-document:<uuid>?area=authoring?area=live
> 
>   I guess this doesn't conform to the URI syntax, though.
> 
> 
> - use a different syntax for lenya-document parameters, e.g.
> 
>   lenya-document:<uuid>[@area='authoring']?area=live
> 
>   This probably doesn't conform to the URI syntax either.


very important point!

i don't like the non-conformant syntaxes, if only because it means we 
must write a custom parser (which is definitely non-trivial for unicode 
- recall Microsoft's cherished "Code Red" fuck-up), and we should really 
abstain from ad-hoc string munging in this implementation.

2 solutions come to mind:
reserve a prefix namespace like "lenya-..." and use
"?lenya-area=..." and so on, or do nothing and let people fix their 
custom query params.
for obvious reasons, i prefer the former solution, even if it makes the 
queries more verbose.
andreas, since you seem familiar with the URI rfc - how set-in-stone is 
the 255 chars length limitation? because we are likely to run into it 
the moment we have lots of multibyte unicode char refs...
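
To illustrate the prefix idea, here is a minimal Python sketch that separates "lenya-" parameters from ordinary request parameters using the standard query parser. The function name split_query and the decision to strip the prefix from the returned names are assumptions made for the example.

```python
from urllib.parse import parse_qsl

PREFIX = "lenya-"


def split_query(query):
    """Return (document parameters, remaining request parameters)."""
    document, request = {}, []
    for name, value in parse_qsl(query, keep_blank_values=True):
        if name.startswith(PREFIX):
            document[name[len(PREFIX):]] = value
        else:
            request.append((name, value))
    return document, request


print(split_query("lenya-area=authoring&lenya-lang=de&area=live"))
```

No custom parser is needed here; the price is the longer parameter names.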





Re: [ANNOUNCE] proposal for lenya document reference syntax

Posted by Andreas Hartmann <an...@apache.org>.
Hi Jörn,

Joern Nettingsmeier wrote:
> here's a proposal for a new lenya document reference syntax, to be used
> for internal links, assets and image inclusions:
> 
> lenya-document:<uuid>?lang=<language>&area=<area>&rev=<revision>&pub=<pub-id>

unfortunately, there is a problem with this syntax:
It is not possible to distinguish lenya-document parameters from
request parameters.

For instance, consider an internal link to a usecase-document
page. This would lead to a conflict if the usecase also requires
an "area" parameter (e.g., a "show area version" usecase could be
invoked in the authoring area on a live document).

Possible solutions:


- use a different delimiter for lenya-document parameters

  lenya-document:uuid=...:lang=...:area=authoring?area=live


- prefix the lenya-document parameters
  (would be quite verbose)


- add another delimiter after the lenya-document parameters

  lenya-document:<uuid>?area=authoring?area=live

  I guess this doesn't conform to the URI syntax, though.


- use a different syntax for lenya-document parameters, e.g.

  lenya-document:<uuid>[@area='authoring']?area=live

  This probably doesn't conform to the URI syntax either.
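
The conflict itself is easy to reproduce with a standard query parser. In the following Python sketch (the uuid is a made-up placeholder), both "area" values end up in the same list once the document parameter and the usecase parameter share the query string:

```python
from urllib.parse import urlsplit, parse_qs

ref = "lenya-document:12erg313-3fww46?area=authoring&area=live"
areas = parse_qs(urlsplit(ref).query)["area"]

# Both values arrive in one list; nothing marks which one addresses the
# document and which one is meant for the usecase.
print(areas)
```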


Maybe there are some better suggestions?

-- Andreas

