Posted to dev@sis.apache.org by Martin Desruisseaux <ma...@geomatys.fr> on 2012/10/16 14:36:55 UTC

XLink and other identifiers

Hello all

A key interface has been recently committed: IdentifiedObject in the XML
package (note: there is also a closely related IdentifiedObject interface
in GeoAPI, but the latter is not yet the subject of this email):

https://builds.apache.org/job/sis-trunk/site/apidocs/org/apache/sis/xml/package-summary.html


The problem
-----------
We will need to identify our object instances, sometimes using the
primary key in a database, sometimes using a URL to an online resource,
sometimes in other ways. Identifiers are used a lot in OGC/ISO
specifications, and they are defined in many ways. Actually (this is a
funny coincidence!) at the OGC meeting last week, one of the talks
gave an overview of the many identifiers available in the
OGC/ISO specifications and how to choose one. We have at least the
following kinds of identifiers:

  * XML elements can have a "gml:id" attribute, which is valid only
    inside the XML document that defines it.
  * XML elements can have a "gco:uuid" attribute, which is valid outside
    the document. E.g. the primary key in a database managed by an agency.
  * XML elements can have an "xlink:href" attribute, which can refer to
    the definition provided in another XML document.
  * Many Java objects derived from OGC/ISO have a getIdentifiers()
    method, which returns MD_Identifier objects. Those identifiers are
    basically (authority, code) pairs, where "authority" (a CI_Citation)
    is contact information for an agency, and "code" (a String) is any
    code allocated by that agency.
  * CI_Citation additionally provides ISBN and ISSN codes, which could
    be seen as special cases of MD_Identifier with fixed "authorities".


And I'm sure there are other kinds of identifiers that I missed. When
looking at the OGC/ISO specifications, I'm not aware of any central
place where all kinds of identifiers are listed; the above list is an
Apache SIS effort. It is easy to get lost.

The GeoAPI interfaces offer no programmatic way to access the "gml:id", 
"gco:uuid" and "xlink:href" identifiers, because they are specific to 
XML documents while GeoAPI is about Java development. Indeed, the 
above-cited identifiers do not exist in the UML of OGC/ISO abstract 
specifications. GeoAPI is derived from UML, not from XML. Nevertheless,
we sometimes need programmatic access to those XML identifiers.


Proposed approach
-----------------
We provide a specialization of CI_Citation: IdentifierSpace. All
CI_Citation instances used in the "authority" part of an MD_Identifier
will implement this interface. In addition, we provide in the
IdentifierSpace interface some constants identifying the XML identifiers
(non-XML identifiers are listed in the Citations class, not yet committed).
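
To make the idea concrete, here is a rough sketch of what an
IdentifierSpace carrying constants could look like. This is not the SIS
code: the Citation interface below is a one-method stand-in for the
GeoAPI one, and the NamedSpace helper, the getName() method and the
constant names are assumptions made only for illustration.

```java
/** Minimal stand-in for the GeoAPI Citation (CI_Citation) interface. */
interface Citation {
    String getTitle();
}

/** Sketch of the proposed specialization: a citation naming an identifier space. */
interface IdentifierSpace extends Citation {
    /** Name of the space, here simply the XML attribute name (assumption). */
    String getName();

    // Constants for the XML identifiers discussed above (hypothetical names).
    IdentifierSpace ID   = new NamedSpace("gml:id");
    IdentifierSpace UUID = new NamedSpace("gco:uuid");
    IdentifierSpace HREF = new NamedSpace("xlink:href");
}

/** Hypothetical helper class backing the constants of this sketch. */
final class NamedSpace implements IdentifierSpace {
    private final String name;

    NamedSpace(String name) {
        this.name = name;
    }

    @Override public String getName()  { return name; }
    @Override public String getTitle() { return name; }
    @Override public String toString() { return "IdentifierSpace[" + name + ']'; }
}

final class IdentifierSpaceDemo {
    public static void main(String[] args) {
        // An IdentifierSpace can be used wherever a Citation is expected.
        Citation authority = IdentifierSpace.ID;
        System.out.println(authority.getTitle());
        System.out.println(IdentifierSpace.HREF);
    }
}
```

The point of the sketch is only that the "authority" of an identifier
stays a Citation, so existing code that expects a Citation keeps working.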

We provide an IdentifiedObject interface, which will be implemented by
all "object with identity" classes. There will be hundreds of such
classes. All IdentifiedObjects provide two methods, which are basically
the same thing from two different perspectives:

Collection<Identifier> getIdentifiers()
---------------------------------------
This method actually appears in various GeoAPI interfaces, so the method 
declared in IdentifiedObject has to be compatible. But instead of having 
this method in only a couple of types, we have it for every "object with 
identity" type. The Collection<Identifier> shall contain "gml:id",
"gco:uuid" and "xlink:href" attributes, if any. Users can add and remove 
elements in this collection.

Map<Citation,Identifier> getIdentifierMap()
-------------------------------------------
Basically the same information as 'getIdentifiers()', but with
components of the (authority, code) pairs separated. It is usually
easier for fetching or modifying a particular identifier (e.g. only the
"gml:id").

Implementations are free to add more identifiers to the mix. For example
DefaultCitation (not yet committed) will include ISBN and ISSN codes in
the map of identifiers. Consequently getIdentifierMap() can be used as a
central place to view and edit all the various identifiers
associated with an object.
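
A rough sketch of how the two views could share one backing store is
given below. It is not the SIS implementation: the Identifier and
IdentifiedThing classes are hypothetical, the authority is reduced to a
plain String instead of a Citation, and the map values are String codes
(as the follow-up in this thread points out). All sample values are
placeholders.

```java
import java.util.AbstractCollection;
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical (authority, code) pair; the authority is a plain String here. */
final class Identifier {
    final String authority;
    final String code;

    Identifier(String authority, String code) {
        this.authority = authority;
        this.code = code;
    }
}

/** Sketch of an "object with identity" exposing both views over the same data. */
final class IdentifiedThing {
    private final Map<String,String> codes = new LinkedHashMap<>();

    /** Map view: authority -> code. */
    Map<String,String> getIdentifierMap() {
        return codes;
    }

    /** Collection view backed by the same map, so edits show up in both. */
    Collection<Identifier> getIdentifiers() {
        return new AbstractCollection<Identifier>() {
            @Override public int size() {
                return codes.size();
            }

            @Override public boolean add(Identifier id) {
                // Returns true if the stored code changed.
                return !id.code.equals(codes.put(id.authority, id.code));
            }

            @Override public Iterator<Identifier> iterator() {
                final Iterator<Map.Entry<String,String>> it = codes.entrySet().iterator();
                return new Iterator<Identifier>() {
                    @Override public boolean hasNext() { return it.hasNext(); }
                    @Override public Identifier next() {
                        Map.Entry<String,String> e = it.next();
                        return new Identifier(e.getKey(), e.getValue());
                    }
                    @Override public void remove() { it.remove(); }
                };
            }
        };
    }
}

final class IdentifierViewsDemo {
    public static void main(String[] args) {
        IdentifiedThing thing = new IdentifiedThing();
        thing.getIdentifiers().add(new Identifier("gml:id", "extent-1"));
        thing.getIdentifierMap().put("ISBN", "0-00-000000-0");  // placeholder value
        System.out.println(thing.getIdentifierMap().get("gml:id"));
        System.out.println(thing.getIdentifiers().size());
    }
}
```

With this layout, fetching only the "gml:id" is a single map lookup,
while code written against getIdentifiers() still sees every identifier.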

     Martin


Re: XLink and other identifiers

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
+1 to this proposal from me.

Thanks Martin!

Cheers,
Chris



Re: XLink and other identifiers

Posted by Martin Desruisseaux <ma...@geomatys.fr>.
On 16/10/12 21:36, Martin Desruisseaux wrote:
> Map<Citation,Identifier> getIdentifierMap()
Typo: this is a Map<Citation,String> (actually an IdentifierMap
specialization, but we can ignore this detail).