Posted to dev@lenya.apache.org by Andreas Hartmann <an...@apache.org> on 2005/06/03 15:58:34 UTC

[1.4] Component-specific meta data

Hi Lenya devs,

I came across the following issue:

Currently, Lenya supports 3 types of meta data:

- LenyaMetaData
- DublinCore
- CustomMetaData

LenyaMetaData should be used for "internal" meta data.
The problem here is:

Which class is responsible for knowing the attribute names?

IMO only the component which actually uses the meta
data should know the attribute names, and it should encapsulate
them. Otherwise we open the door to misuse:

    String state = metaData.getFirstValue(LenyaMetaData.STATE);

This is bad code, because the syntax of the state storage
is known only by the workflow components. It should rather read

    Workflowable workflowable = new DocumentWorkflowable(document...);
    String state = workflowable.getLatestVersion().getState();

IMO the best way to deal with that is to allow component-specific
meta data, which requires a customizable set of meta data.
Or should we use the CustomMetaData for that purpose? Here I see
a danger of clashes due to the lack of further namespacing:

    customMetaData.getFirstValue("state");

instead of

    customMetaData.getFirstValue(Workflow.NAMESPACE, "state");
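
A minimal sketch of what the namespaced variant could look like.
NamespacedMetaData, addValue() and the namespace URIs are assumptions
made up for illustration, not existing Lenya API. The point is only that
two components can both store a "state" attribute without clashing:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a namespaced CustomMetaData (not the actual Lenya class).
final class NamespacedMetaData {
    // Values are keyed by "{namespace}name", so equal local names
    // in different namespaces never overwrite each other.
    private final Map<String, List<String>> values = new HashMap<>();

    private static String key(String namespace, String name) {
        return "{" + namespace + "}" + name;
    }

    void addValue(String namespace, String name, String value) {
        values.computeIfAbsent(key(namespace, name), k -> new ArrayList<>()).add(value);
    }

    // Returns the first stored value, or null if the attribute is not set.
    String getFirstValue(String namespace, String name) {
        List<String> list = values.get(key(namespace, name));
        return (list == null || list.isEmpty()) ? null : list.get(0);
    }
}

final class NamespaceDemo {
    // Hypothetical namespace URIs, one per component.
    static final String WORKFLOW_NS = "http://example.org/lenya/workflow";
    static final String SHOP_NS = "http://example.org/lenya/shop";

    public static void main(String[] args) {
        NamespacedMetaData meta = new NamespacedMetaData();
        meta.addValue(WORKFLOW_NS, "state", "review");
        meta.addValue(SHOP_NS, "state", "Bavaria");
        System.out.println(meta.getFirstValue(WORKFLOW_NS, "state"));
        System.out.println(meta.getFirstValue(SHOP_NS, "state"));
    }
}
```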

WDYT?

-- Andreas


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lenya.apache.org
For additional commands, e-mail: dev-help@lenya.apache.org


Re: [1.4] Component-specific meta data

Posted by Andreas Hartmann <an...@apache.org>.
J. Wolfgang Kaltz wrote:
> Andreas Hartmann schrieb:
> 
>> Hi Lenya devs,
>>
>> I came across the following issue:
>>
>> Currently, Lenya supports 3 types of meta data:
>>
>> - LenyaMetaData
>> - DublinCore
>> - CustomMetaData
>>
>> LenyaMetaData should be used for "internal" meta data.
>> The problem here is:
>>
>> Which class is responsible to know the attribute names?
>>
>> Actually IMO only the component which actually uses the meta
>> data should know the attribute name, and it should encapsulate
>> them. Otherwise we present an option for misuse:
>>
>>    String state = metaData.getFirstValue(LenyaMetaData.STATE);
>>
>> This is bad code, because the syntax of the state storage
>> is known only by the workflow components. 
> 
> 
> IIUC by syntax you mean the way workflow state knowledge has presently 
> been represented in XML ?

No, actually I mean the syntax of the string value. Just some
possibilities for the wf state:

     authoring
     workflow://pubs/default/config/workflow/workflow.xml/states/authoring

or for variable values:

     is_published:true
     is_published = true
     is_published=1

I mean the workflow component has arbitrary options to encode the
meta data values. This knowledge has to be restricted to a single place,
which is the component itself. IMO this implies that the attribute names
should not be exposed by the API. People would examine the storage syntax,
parse the values and run into trouble when updating to another version
of the component which uses another storage syntax. IMO exposing the
attribute names as public fields suggests that they're meant to be
accessed by client code.
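
To make this concrete, here is a rough sketch, not actual Lenya code.
MetaDataStore, WorkflowVariables and the "workflowVariable." attribute
prefix are hypothetical names. The storage service only sees plain
strings, while the attribute name and the "name=1" value syntax stay
private to the workflow side:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal storage interface: plain string attributes only.
interface MetaDataStore {
    String getFirstValue(String name);
    void setValue(String name, String value);
}

// Hypothetical workflow-side wrapper. Clients see booleans;
// the attribute name and the value encoding are implementation details.
final class WorkflowVariables {
    private static final String ATTRIBUTE_PREFIX = "workflowVariable.";
    private final MetaDataStore store;

    WorkflowVariables(MetaDataStore store) {
        this.store = store;
    }

    void setVariable(String name, boolean value) {
        // Encoded as e.g. "is_published=1"; this could change without affecting clients.
        store.setValue(ATTRIBUTE_PREFIX + name, name + "=" + (value ? "1" : "0"));
    }

    boolean getVariable(String name) {
        String raw = store.getFirstValue(ATTRIBUTE_PREFIX + name);
        if (raw == null) {
            throw new IllegalArgumentException("Unknown variable: " + name);
        }
        return raw.substring(raw.lastIndexOf('=') + 1).equals("1");
    }
}

// Trivial in-memory store, for the example only.
final class InMemoryStore implements MetaDataStore {
    private final Map<String, String> values = new HashMap<>();

    public String getFirstValue(String name) {
        return values.get(name);
    }

    public void setValue(String name, String value) {
        values.put(name, value);
    }
}

final class WorkflowVariablesDemo {
    public static void main(String[] args) {
        WorkflowVariables variables = new WorkflowVariables(new InMemoryStore());
        variables.setVariable("is_published", true);
        System.out.println(variables.getVariable("is_published"));
    }
}
```

Switching the encoding to, say, "is_published:true" would then only
touch WorkflowVariables.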

[...]

> First, I think the workflow history should be separated from the 
> metadata, IMO these are 2 separate concerns, which we should deal with 
> separately.

+1, I'm working on that.

> If we want to consider the current workflow state as metadata of a 
> document (which I agree makes sense), this is a separate concern than a 
> state storage syntax. In this particular case, IMO we should "flatten" 
> the current state knowledge in the Lenya metadata, so something like
>    <lenya:internal>
>     ...
>       <lenya:workflowState>review</lenya:workflowState>
>    </lenya:internal>
> 
> So, the code which accesses the current state does call 
> metaData.getFirstValue(LenyaMetaData.STATE), but needs no further 
> knowledge about state storage.

I don't really understand that. Maybe it's clearer to look at

   LenyaMetaData.WORKFLOW_VARIABLE_VALUE

That is not a plain string and would need a special syntax (see above).

> Whereas the component which presents 
> workflow history uses different code.
> 
> 
> 
>> It should rather read
>>
>>    Workflowable workflowable = new DocumentWorkflowable(document...);
>>    String state = workflowable.getLatestVersion().getState();
> 
> 
> Could that not be "workflowable.getState()", which would in turn call 
> the document's metaData.getFirstValue(LenyaMetaData.STATE) ?

Yes, workflowable.getState() is better. But I disagree with
LenyaMetaData.STATE for the reasons stated above and below.


> I hope I am not completely misunderstanding something; I noticed there 
> were changes this week in trunk regarding workflow & metadata handling, 
> but I have not yet been able to keep up with them.
> 
> As a general design strategy, my opinion is:
> 
> LenyaMetaData is Lenya's specific metadata about a document - meaning, 
> all metadata not clearly standardized in Dublin Core. IMO it makes sense 
> to put all metadata pertaining to the document in this container. This 
> does not say which component "uses" it; in fact we cannot predict future 
> uses. Any component wanting to use this metadata must know its name and 
> access it via the LenyaMetaData. This generic "metadata = simple 
> attributes" approach does, I agree, imply that arbitrary, nested XML is 
> not possible within the metadata. So the decisive question is, do we 
> really need such XML within the metadata ? (IMO no)

I agree with plain strings, but I disagree with centralized attribute
name management, for these reasons:

   - no encapsulation of storage details (attribute names and value syntax
     are maintained by different components)

   - extending or changing the meta data set needed by a component
     requires changing the core

   - it's not possible to add further components which use meta data
     storage without changing the core API

Actually I see components as additions to the core. IMO the core
is everything in o.a.lenya.cms.publication. Access control, workflow,
site management etc. are components which use the core functionality.
It should be possible to extend Lenya with additional components
which use the same mechanisms as the ac, workflow etc. components.

My point is: We should not provide special fields for workflow, ac
etc. if we don't provide them for additional components. A component
should be able to use the meta data in a convenient, namespaced way
without changes to the core API.
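
Roughly what I have in mind, as a sketch only. The getMetaData(namespace)
method, NotificationComponent and the namespace URI are hypothetical and
not the current API. The core hands out one attribute set per namespace
and knows no attribute names, so a new component needs no core change:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical core service: one plain-string attribute set per namespace.
// It contains no attribute names of any component.
final class MetaDataManager {
    private final Map<String, Map<String, String>> sets = new HashMap<>();

    Map<String, String> getMetaData(String namespace) {
        return sets.computeIfAbsent(namespace, ns -> new HashMap<>());
    }
}

// Hypothetical add-on component. Namespace and attribute name are private,
// so clients can only go through the component's own methods.
final class NotificationComponent {
    private static final String NAMESPACE = "http://example.org/lenya/notification";
    private static final String SUBSCRIBER = "subscriber";

    private final Map<String, String> meta;

    NotificationComponent(MetaDataManager manager) {
        this.meta = manager.getMetaData(NAMESPACE);
    }

    void setSubscriber(String email) {
        meta.put(SUBSCRIBER, email);
    }

    String getSubscriber() {
        return meta.get(SUBSCRIBER);
    }
}

final class ComponentMetaDataDemo {
    public static void main(String[] args) {
        MetaDataManager manager = new MetaDataManager();
        NotificationComponent notification = new NotificationComponent(manager);
        notification.setSubscriber("editor@example.org");
        System.out.println(notification.getSubscriber());
    }
}
```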

We'll run into trouble if we treat the built-in components as special cases. We're building a
framework, which needs a powerful infrastructure for implementing
components using and extending the provided functionality. Centralized
meta data attribute names lead to a lock-in.


> To summarize, IMO it is in fact good code to encapsulate (non-DC) 
> metadata within LenyaMetaData, including the name of the attributes, and 
> any component needing access does something like
>    metaData.getFirstValue(LenyaMetaData.WHATEVER_ATTRIBUTE)

IMO that's not the way encapsulation should be applied. You don't
encapsulate data sets used by a variety of components, you encapsulate
components with specific responsibilities. Meta data handling makes
a good component, but meta data attribute names do not, unless they
serve a very specific, generally applicable purpose and are used for
data exchange (like DublinCore for publishing issues).

-- Andreas




Re: [1.4] Component-specific meta data

Posted by "Gregor J. Rothfuss" <gr...@apache.org>.
J. Wolfgang Kaltz wrote:

> Do you think we should distinguish metadata attributes from different 
> components according to namespace, or should we support a hierarchy 
> instead (each component getting its own subelement in the metadata) ?

the ns way has the advantage of being unambiguous



Re: [1.4] Component-specific meta data

Posted by "J. Wolfgang Kaltz" <ka...@interactivesystems.info>.
Andreas Hartmann schrieb:
> (...)
> I see the MetaDataManager as a service which offers the functionality
> to store meta data which are needed by other components. If the 
> MetaDataManager
> exposes a fixed attribute set, we'll have to change the core API whenever
> we change a component's meta data set.
> 
> Assume you're implementing a component needing several meta data attributes
> in a publication. If you're using the CustomMetaData, you're running into
> several problems:
> 
>   - risk of name clashes, especially if you're using the pub as a template
>     and add further components
> 
>   - you have to implement your own namespacing syntax to avoid name clashes
> 
>   - making the component available to the public can also lead to name 
> clashes
> 
> IMO we should not separate between LenyaMetaData and CustomMetaData, but
> we should offer each component to use its own meta data set.

The type "CustomMetaData" was just a first proposal, as someone on the 
list mentioned a custom need for metadata, which he suggested to store 
in the site tree, but I thought a metadata mechanism was better for 
this. Since the CustomMetaData is not yet actually used, we can remove 
it and generalize LenyaMetaData instead. Or, we could rename 
CustomMetaData to ComponentMetaData, add support for parameterizing the 
namespace, and say that this is to be the container for all non-core 
metadata.

Do you think we should distinguish metadata attributes from different 
components according to namespace, or should we support a hierarchy 
instead (each component getting its own subelement in the metadata) ?


--
Wolfgang



Re: [1.4] Component-specific meta data

Posted by Andreas Hartmann <an...@apache.org>.
Gregor J. Rothfuss wrote:
> Andreas Hartmann wrote:
> 
>> IMO we should not separate between LenyaMetaData and CustomMetaData, but
>> we should offer each component to use its own meta data set.
> 
> 
> your cogent analysis swayed me :)

Let's hope the resulting code is equally convincing :)

-- Andreas




Re: [1.4] Component-specific meta data

Posted by "Gregor J. Rothfuss" <gr...@apache.org>.
Andreas Hartmann wrote:

> IMO we should not separate between LenyaMetaData and CustomMetaData, but
> we should offer each component to use its own meta data set.

your cogent analysis swayed me :)

+1



Re: [1.4] Component-specific meta data

Posted by Andreas Hartmann <an...@apache.org>.
Gregor J. Rothfuss wrote:
> J. Wolfgang Kaltz wrote:
> 
>> access it via the LenyaMetaData. This generic "metadata = simple 
>> attributes" approach does, I agree, imply that arbitrary, nested XML 
>> is not possible within the metadata. So the decisive question is, do 
>> we really need such XML within the metadata ? (IMO no)
> 
> 
> i also think that the answer tends to be no, which in turn means it is a 
> piece of cake to make these attributes into jcr properties

+1

>> To summarize, IMO it is in fact good code to encapsulate (non-DC) 
>> metadata within LenyaMetaData, including the name of the attributes, 
>> and any component needing access does something like
>>    metaData.getFirstValue(LenyaMetaData.WHATEVER_ATTRIBUTE)
> 
> 
> seems to me to be a difference whether the knowledge about attributes 
> should reside centrally in LenyaMetaData, or decentrally with the 
> various components. each has their strengths, but until we have very 
> loose coupling between lenya components, i don't see much harm in having 
> this knowledge centralized.

I think we're well on our way to achieving loose coupling, and IMO that's
one of the most important issues re. maintenance and extensibility. And
de-centralizing meta data attributes is another step in this direction.

I see the MetaDataManager as a service which offers the functionality
to store meta data which are needed by other components. If the MetaDataManager
exposes a fixed attribute set, we'll have to change the core API whenever
we change a component's meta data set.

Assume you're implementing a component needing several meta data attributes
in a publication. If you're using the CustomMetaData, you'll run into
several problems:

   - risk of name clashes, especially if you're using the pub as a template
     and add further components

   - you have to implement your own namespacing syntax to avoid name clashes

   - making the component available to the public can also lead to name clashes

IMO we should not distinguish between LenyaMetaData and CustomMetaData;
instead, each component should be able to use its own meta data set.

-- Andreas




Re: [1.4] Component-specific meta data

Posted by "J. Wolfgang Kaltz" <ka...@interactivesystems.info>.
Gregor J. Rothfuss schrieb:
> J. Wolfgang Kaltz wrote:
> 
>> access it via the LenyaMetaData. This generic "metadata = simple 
>> attributes" approach does, I agree, imply that arbitrary, nested XML 
>> is not possible within the metadata. So the decisive question is, do 
>> we really need such XML within the metadata ? (IMO no)
> 
> 
> i also think that the answer tends to be no, which in turn means it is a 
> piece of cake to make these attributes into jcr properties

If we distinguish metadata attributes to be handled by different 
components according to a namespace, do you know if that could be a 
problem for JCR mapping ?

IIUC a JCR Node has a list of properties via the Property type; these 
don't have an explicit concept of namespace, but according to the example
   N.getProperties("jcr:* | myapp:name | my doc")
it looks like one can use the namespace in the parameter "namePattern" ?

If that works, it would definitely be a +1 for the solution "distinguish 
metadata from different components by using different namespaces"
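
To illustrate the mapping I have in mind, a small sketch, assuming
property names of the form prefix:localName as in the pattern above.
PropertyNameMapper is hypothetical and deliberately does not touch the
JCR API; whether a real repository accepts such names would still have
to be checked:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: turns (namespace URI, local name) pairs into
// "prefix:localName" strings. It does not use the JCR API itself.
final class PropertyNameMapper {
    private final Map<String, String> prefixByUri = new HashMap<>();

    void register(String prefix, String namespaceUri) {
        prefixByUri.put(namespaceUri, prefix);
    }

    String toPropertyName(String namespaceUri, String localName) {
        String prefix = prefixByUri.get(namespaceUri);
        if (prefix == null) {
            throw new IllegalArgumentException("Unregistered namespace: " + namespaceUri);
        }
        return prefix + ":" + localName;
    }

    // Builds a pattern such as "wf:*" that matches all attributes of one component.
    String toNamePattern(String namespaceUri) {
        return toPropertyName(namespaceUri, "*");
    }
}

final class PropertyNameMapperDemo {
    public static void main(String[] args) {
        PropertyNameMapper mapper = new PropertyNameMapper();
        // Hypothetical prefix and namespace URI for the workflow component.
        mapper.register("wf", "http://example.org/lenya/workflow");
        System.out.println(mapper.toPropertyName("http://example.org/lenya/workflow", "state"));
        System.out.println(mapper.toNamePattern("http://example.org/lenya/workflow"));
    }
}
```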


> (...)
>> Any issues other than getting current meta attributes about a document 
>> should not be handled via the MetaData interfaces.
> 
> 
> explain?

I meant issues such as workflow history, but we agree on that, so no 
problem there.


--
Wolfgang



Re: [1.4] Component-specific meta data

Posted by "Gregor J. Rothfuss" <gr...@apache.org>.
J. Wolfgang Kaltz wrote:

> access it via the LenyaMetaData. This generic "metadata = simple 
> attributes" approach does, I agree, imply that arbitrary, nested XML is 
> not possible within the metadata. So the decisive question is, do we 
> really need such XML within the metadata ? (IMO no)

i also think that the answer tends to be no, which in turn means it is a 
piece of cake to make these attributes into jcr properties

> To summarize, IMO it is in fact good code to encapsulate (non-DC) 
> metadata within LenyaMetaData, including the name of the attributes, and 
> any component needing access does something like
>    metaData.getFirstValue(LenyaMetaData.WHATEVER_ATTRIBUTE)

seems to me to be a difference whether the knowledge about attributes 
should reside centrally in LenyaMetaData, or decentrally with the 
various components. each has their strengths, but until we have very 
loose coupling between lenya components, i don't see much harm in having 
this knowledge centralized.

> Any issues other than getting current meta attributes about a document 
> should not be handled via the MetaData interfaces.

explain?



Re[2]: [1.4] Component-specific meta data

Posted by qMax <qm...@mediasoft.ru>.
Just my 3 cents.

Friday, June 3, 2005, 11:48:51 PM, J. Wolfgang Kaltz wrote:

JWK> As a general design strategy, my opinion is:
JWK> LenyaMetaData is Lenya's specific metadata about a document - meaning,
JWK> all metadata not clearly standardized in Dublin Core. IMO it makes sense
JWK> to put all metadata pertaining to the document in this container. This
JWK> does not say which component "uses" it; in fact we cannot predict future
JWK> uses. Any component wanting to use this metadata must know its name and
JWK> access it via the LenyaMetaData. This generic "metadata = simple
JWK> attributes" approach does, I agree, imply that arbitrary, nested XML is
JWK> not possible within the metadata. So the decisive question is, do we
JWK> really need such XML within the metadata ? (IMO no)

It was me who proposed custom XML metadata,
which I had stored in the sitetree when I used pure Cocoon.
I called it "metadata" just because it is associated with documents.
Some of it has been replaced by DC in Lenya.

The purpose of the rest was to define URL parameters affecting rendering;
they form additional navigation elements (menus of these
parameters).
Each parameter was represented as:
list of (parameter name + list of (value + title))

Another example of custom "metadata" I'm looking into is the set of
subscribers to notify on page changes: list of (email + attributes).

Neither example fits the attribute=value scheme.

IMO storing custom XML data associated with documents MAY be useful.
But this definitely does not fit the common concept of "metadata" (attributes+values),
and is probably something else, which requires more complex handling
(accessing with XPath, editing in CForms).

I still hope there will be a way to implement such data without breaking the core.

EndOfCents.
-- 
 qMax




Re: [1.4] Component-specific meta data

Posted by "J. Wolfgang Kaltz" <ka...@interactivesystems.info>.
Andreas Hartmann schrieb:
> Hi Lenya devs,
> 
> I came across the following issue:
> 
> Currently, Lenya supports 3 types of meta data:
> 
> - LenyaMetaData
> - DublinCore
> - CustomMetaData
> 
> LenyaMetaData should be used for "internal" meta data.
> The problem here is:
> 
> Which class is responsible to know the attribute names?
> 
> Actually IMO only the component which actually uses the meta
> data should know the attribute name, and it should encapsulate
> them. Otherwise we present an option for misuse:
> 
>    String state = metaData.getFirstValue(LenyaMetaData.STATE);
> 
> This is bad code, because the syntax of the state storage
> is known only by the workflow components. 

IIUC by syntax you mean the way workflow state knowledge has presently 
been represented in XML ? For example
   <wf:history xmlns:wf="http://apache.org/cocoon/lenya/workflow/1.0">
     <wf:version date="2005-06-03 18:25:58" event="submit" state="review">
     <wf:variable name="is_live" value="false"/>
     </wf:version>
   </wf:history>

First, I think the workflow history should be separated from the 
metadata, IMO these are 2 separate concerns, which we should deal with 
separately.

If we want to consider the current workflow state as metadata of a 
document (which I agree makes sense), this is a separate concern from a 
state storage syntax. In this particular case, IMO we should "flatten" 
the current state knowledge in the Lenya metadata, so something like
    <lenya:internal>
     ...
       <lenya:workflowState>review</lenya:workflowState>
    </lenya:internal>

So, the code which accesses the current state does call 
metaData.getFirstValue(LenyaMetaData.STATE), but needs no further 
knowledge about state storage. Whereas the component which presents 
workflow history uses different code.



> It should rather read
> 
>    Workflowable workflowable = new DocumentWorkflowable(document...);
>    String state = workflowable.getLatestVersion().getState();

Could that not be "workflowable.getState()", which would in turn call 
the document's metaData.getFirstValue(LenyaMetaData.STATE) ?

I hope I am not completely misunderstanding something; I noticed there 
were changes this week in trunk regarding workflow & metadata handling, 
but I have not yet been able to keep up with them.

As a general design strategy, my opinion is:

LenyaMetaData is Lenya's specific metadata about a document - meaning, 
all metadata not clearly standardized in Dublin Core. IMO it makes sense 
to put all metadata pertaining to the document in this container. This 
does not say which component "uses" it; in fact we cannot predict future 
uses. Any component wanting to use this metadata must know its name and 
access it via the LenyaMetaData. This generic "metadata = simple 
attributes" approach does, I agree, imply that arbitrary, nested XML is 
not possible within the metadata. So the decisive question is, do we 
really need such XML within the metadata ? (IMO no)

To summarize, IMO it is in fact good code to encapsulate (non-DC) 
metadata within LenyaMetaData, including the name of the attributes, and 
any component needing access does something like
    metaData.getFirstValue(LenyaMetaData.WHATEVER_ATTRIBUTE)

Any issues other than getting current meta attributes about a document 
should not be handled via the MetaData interfaces.


--
Wolfgang
