Posted to dev@sis.apache.org by Martin Desruisseaux <ma...@geomatys.fr> on 2012/07/31 01:14:17 UTC

Grouping modules in categories

Hello Chris

On 30/07/12 19:50, Mattmann, Chris A (388J) wrote:
> OK to clarify, OSGeo will discuss this issue at their board meeting on 8/9?
Yes, I will keep this list informed.


> Sure feel free to start threads here on sis-dev@ to discuss your thoughts.
I saw the 5 modules in the root directory of the SIS project. The first 
thing that came to my mind was that if the project grows to 1 million 
lines of code (which could happen relatively fast), we are going to have 
a lot of modules. What about a slightly deeper directory tree, which 
groups modules by category? We could have the following categories 
among others:

   * utilities
   * metadata
   * referencing
   * geometry
   * feature
   * coverage
   * processing
   * index
   * display
   * client

Each category is a directory, which contains one or many modules. For 
example the "coverage" category could contain the following modules:

   * sis-coverage
   * sis-coverageio
   * sis-coverageio-netcdf

I noticed that the current SIS code contains a "sis-core" module. What 
about trying to put the content of "sis-core" in some more specific 
modules? Otherwise I think that core may become very big if 
"referencing", "geometry", "feature" etc. are considered as core...

     Martin

Re: Grouping modules in categories

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.

Hey Martin,

On Jul 31, 2012, at 10:21 AM, Martin Desruisseaux wrote:

> On 31/07/12 18:36, Mattmann, Chris A (388J) wrote:
>> My personal 2c is that if folks are doing any of the following enough:
>> 
>> * adding patches
>> * creating JIRA issues
>> * having thoughtful discussion on list
>> * writing documentation
>> * answering user questions
>> 
>> or being generally "interested" in the project, then they are a candidate, at least for me,
>> for committership and PPMC membership on the project. PPMC membership and
>> committership is not just for code. Contributions come in many forms.
> 
> Understood. I would bring the point of view that a Distributed Versioning System, through "push" and "pull" actions, can be seen as a more powerful way to submit patches than email.

Yeah to me honestly it's just a discussion of tooling. I've seen SVN + Review Board fulfill a "Github like" capability pretty well, and then the other basic tenets, e.g., Mailing lists, Wiki, CI systems (Jenkins, Buildbot, etc.) work combined with either SVN or Git. Regardless of your preferred tool and/or workflow, the ASF has the ability to support that.

> 
> One thing I could do is to create a Git repository synchronized with the SIS Subversion repository (I'm already doing that for GeoAPI). Using the Git repository for creating patches would be easier for me, even if the patch is sent by email for application on SVN, and may be an opportunity for those who wish to get familiar with Distributed Versioning Systems. My past experience with a very large and complex project on SVN convinced me of the importance of DVS flexibility...

Sure, the ASF maintains read-only Git mirrors, here:

http://git.apache.org/

Currently I don't believe we have a Git mirror set up for SIS, but it's as simple as filing an issue here:

https://issues.apache.org/jira/browse/INFRA

And requesting one from infra.

ASF Git mirrors are mirrored out to Github and can be used to work on changes, and then post back diffs
through git-svn etc etc.

The danger though is that we have to be mindful not to split the development communities of ASF repositories.
If dev occurs elsewhere, and then is "flushed" back to the list in the form of huge patches representing changes
made somewhere else (your laptop, Github, etc.) besides the ASF, with a tool like Git, it's harder to integrate 
the patches unless they are incremental, and we also lose out on not just the revision history, but also the 
principal tenets of the Apache way, which is that all conversations must occur on list, including our tooling and
infrastructure that sends mails to our mail archives, that can be referenced later.

Anyways I'm not really against Git, so if we want to start out with a Git mirror, and then you could use that
to develop and submit patches back to SVN and JIRA I'm willing to work with you on those issues and 
to start incrementally improving SIS and making you a contributor with your vast spatial knowledge and
existing GeoTk experience!

> 
> While I agree on the importance of welcoming new members, isn't it important to list what we think developers should know before getting commit rights for a particular module? For example I would like a page explaining that before committing in the "referencing" module, we ask the developer to become familiar with java.awt.geom.AffineTransform. Even if the committer rarely touches any affine transform directly, the "spirit" of affine transform is widely spread across the referencing module and developers often take a while to get used to it. When patches are sent by email or through a DVS, the maintainer having commit rights can ensure that the "spirit" of affine transform is followed.

Well that's the thing. There is no single maintainer here of modules at the ASF. We have Project Management
Committees who themselves are responsible for the collective "whole" of code, including potentially multiple
modules and projects. Now, that being said, the way it turns out is most of the time like you mention above, 
that a particular PPMC member/committer has itches that they like to scratch and thus modules and code that
they really know, whilst other code that they don't. In these cases, rather than having explicit pedagogy to draw 
a wall between who is good at what, and what others have the "right" to maintain and work on, in my experience,
if you trust someone enough to give them the commit bit and add them to the PPMC you trust them implicitly enough
to *not* mess with code and/or things that they aren't familiar with. This isn't a 100% always effective solution, and
sure there are corner cases, but more often than not, the social and intellectual freedom given, as well as the "good
will" earned by trusting people far outweighs people that are malicious and others "messing up" code.

Plus, we use a revision control system; things are easy to revert; CI is available to us in all forms, and we have
some of the most diverse use cases and people in the world working on code here. In other words, *we care*.
Our metrics at Apache that we use to measure this include:

* are you adding new committers to a project on a regular basis?
* are you releasing the code on a regular basis?
* is the code compatible with the Apache license?

Those are some loose metrics that, if we are following them, and also following in the social spirit of the ASF, I've
found to be very effective in ensuring the quality of projects here. We have some of the most high quality projects
in the world, that operate in this manner. 

That being said, if someone on the PPMC is constantly breaking things, etc., we also have the ability to 
discuss that on list and to talk through why/where/how to get better/etc. The Apache Board, and the chairs
of the PMCs (and PMCs themselves) have broad ability to *fix* problematic situations as well. That's part
of what the foundation ensures through its active triage and monitoring of projects. We don't just dump code here,
or commit things that don't work (at least for long).

Also, we can employ things like branches, and trunk and other techniques at the CM level here to help 
ensure quality of code and so forth.

> But once a developer gets direct commit rights, I would feel reassured if I could be confident that AffineTransform has been well understood by him... Of course it is our responsibility to write a page explaining what we think the committer needs to know. It would be different for each module (e.g. AffineTransform is irrelevant to metadata).

Here are the links that I regularly send new committers when they are made at the ASF:

http://community.apache.org/
http://www.apache.org/dev/new-committers-guide.html
http://www.apache.org/dev/pmc.html

I think they are worth a read by all of us in general.

HTH explain some things Martin and thanks for the dialogue!

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Re: Grouping modules in categories

Posted by Martin Desruisseaux <ma...@geomatys.fr>.

On 31/07/12 18:36, Mattmann, Chris A (388J) wrote:
> My personal 2c is that if folks are doing any of the following enough:
>
> * adding patches
> * creating JIRA issues
> * having thoughtful discussion on list
> * writing documentation
> * answering user questions
>
> or being generally "interested" in the project, then they are a candidate, at least for me,
> for committership and PPMC membership on the project. PPMC membership and
> committership is not just for code. Contributions come in many forms.

Understood. I would bring the point of view that a Distributed Versioning 
System, through "push" and "pull" actions, can be seen as a more 
powerful way to submit patches than email.

One thing I could do is to create a Git repository synchronized with the 
SIS Subversion repository (I'm already doing that for GeoAPI). Using the 
Git repository for creating patches would be easier for me, even if the 
patch is sent by email for application on SVN, and may be an opportunity 
for those who wish to get familiar with Distributed Versioning Systems. 
My past experience with a very large and complex project on SVN 
convinced me of the importance of DVS flexibility...

While I agree on the importance of welcoming new members, isn't it 
important to list what we think developers should know before getting 
commit rights for a particular module? For example I would like a page 
explaining that before committing in the "referencing" module, we ask 
the developer to become familiar with java.awt.geom.AffineTransform. 
Even if the committer rarely touches any affine transform directly, the 
"spirit" of affine transform is widely spread across the referencing 
module and developers often take a while to get used to it. When 
patches are sent by email or through a DVS, the maintainer having 
commit rights can ensure that the "spirit" of affine transform is 
followed. But once a developer gets direct commit rights, I would feel 
reassured if I could be confident that AffineTransform has been well 
understood by him... Of course it is our responsibility to write a 
page explaining what we think the committer needs to know. It would be 
different for each module (e.g. AffineTransform is irrelevant to metadata).
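
As a rough illustration of what "becoming familiar with AffineTransform" 
could mean in practice, here is a minimal sketch using only the standard 
java.awt.geom API. The class name GridToCrsSketch, the gridToCRS method 
and the numbers are assumptions made up for this example; they are not 
taken from SIS or Geotk.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.NoninvertibleTransformException;
import java.awt.geom.Point2D;

/** Hypothetical illustration; not part of SIS or Geotk. */
class GridToCrsSketch {

    /**
     * Builds a transform from grid cell indices (column, row) to
     * geographic coordinates (longitude, latitude), assuming a north-up
     * grid with square cells.
     */
    static AffineTransform gridToCRS(double cellSize, double westLongitude, double northLatitude) {
        // Constructor arguments are m00, m10, m01, m11, m02, m12.
        // The negative Y scale flips the axis: row indices grow downward
        // while latitudes grow upward.
        return new AffineTransform(cellSize, 0, 0, -cellSize, westLongitude, northLatitude);
    }

    public static void main(String[] args) throws NoninvertibleTransformException {
        AffineTransform gridToCRS = gridToCRS(0.5, -180, 90);

        // Forward: cell (10, 20) to longitude/latitude.
        Point2D geo = gridToCRS.transform(new Point2D.Double(10, 20), null);
        System.out.println(geo.getX() + ", " + geo.getY());   // prints "-175.0, 80.0"

        // Inverse: from longitude/latitude back to cell indices.
        Point2D cell = gridToCRS.inverseTransform(geo, null);
        System.out.println(cell.getX() + ", " + cell.getY());
    }
}
```

The point of the sketch is that scale, axis flip and translation are all 
carried by a single object that can be concatenated or inverted, rather 
than by ad-hoc arithmetic scattered through the code.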

     Martin

Re: Grouping modules in categories

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.

Hi Martin,

On Jul 31, 2012, at 8:19 AM, Martin Desruisseaux wrote:

> Just a note: a few mails ago, there was a mention about granting commit rights. Actually I would suggest a slightly different approach. We don't need commit rights, at least not directly. If the SIS project goes ahead with a Git repository, then we only need to clone that Git repository on our own public server. We can commit whatever we want on that server and propose that to the SIS project. If it looks good, someone with commit rights on the Apache server can "pull" from our server and "push" to the Apache one. If the proposal doesn't look good, then we can delete our clone and re-clone with an alternative proposal. Commit rights would be granted on the Apache server only if the SIS maintainer gets tired of doing the "pull" and "push" :-)

Well Git or SVN aside, the way Apache works is based on meritocracy. The Apache SIS PPMC members decide
when someone's time has come to grant them commit access, and then discuss this (privately) before nominating
a person to obtain committership and PPMC membership (2 different things) at Apache.

Commit rights allow someone to modify the code, and PPMC membership gives the person a binding VOTE on
adding new PPMC members and committers, and on releases of the software.

Apache's VOTE'ing process is documented here:

http://www.apache.org/foundation/voting.html

FWIW, these are only "loose" guidelines and projects are free to form their own
interpretation of these guidelines, but more or less should map to them in a
discernable fashion.

The Apache SIS project decided long ago that PPMC member == committer, like many
ASF projects have done. There is no need to introduce artificial barriers into contributors
who *only* have commit access versus a binding VOTE on adding new personnel and
releasing our software.

So, that being said, if you contribute "enough" around here where "enough" is subjective
to the PPMC member who would like to nominate you or anyone else for committership,
that's basically the process. That's why *trust* is really important here and having a general
understanding of the Apache way (e.g., not being a troll; being a person who is friendly on
list; who values others' contributions and recognizes the contributions, etc.) is a key
quality.

My personal 2c is that if folks are doing any of the following enough:

* adding patches
* creating JIRA issues
* having thoughtful discussion on list
* writing documentation
* answering user questions

or being generally "interested" in the project, then they are a candidate, at least for me,
for committership and PPMC membership on the project. PPMC membership and
committership is not just for code. Contributions come in many forms.

>
> (actually, in the context of a Git repository these are not "commit" rights - everyone has commit rights on his own clone - but "push" rights)

Yeah the ASF is supporting Git, though like I said, SIS has traditionally used SVN. I don't think
folks are uber opposed to it, but the transition could happen gradually, which is always a nice
thing at the ASF, since it allows incremental, easily reversible, change.

Cheers,
Chris

Re: Grouping modules in categories

Posted by Martin Desruisseaux <ma...@geomatys.fr>.

Just a note: a few mails ago, there was a mention about granting commit 
rights. Actually I would suggest a slightly different approach. We don't 
need commit rights, at least not directly. If the SIS project goes ahead 
with a Git repository, then we only need to clone that Git repository on 
our own public server. We can commit whatever we want on that server and 
propose that to the SIS project. If it looks good, someone with commit 
rights on the Apache server can "pull" from our server and "push" to the 
Apache one. If the proposal doesn't look good, then we can delete our 
clone and re-clone with an alternative proposal. Commit rights would be 
granted on the Apache server only if the SIS maintainer gets tired of 
doing the "pull" and "push" :-)

(actually, in the context of a Git repository these are not "commit" 
rights - everyone has commit rights on his own clone - but "push" rights)

I'm proposing this approach because I think it would be nice if the SIS 
maintainer takes his time regarding commit right grants. This applies to 
us like anyone else.

     Martin

Re: Grouping modules in categories

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.

Hi Martin,

On Jul 31, 2012, at 4:09 AM, Martin Desruisseaux wrote:

> Hello Chris
> 
> On 31/07/12 03:55, Mattmann, Chris A (388J) wrote:
>> Before we create a utils package, I like to find classes that don't fit in any of the other
>> categories. Do you have some ideas as to what would fit into here now?
> Examples of things in the utilities modules are:
> 
> * Specialized collections implementations. Apache has a special
>   project for that, but sometimes it is useful to control the details
>   of the implementation like exception handling inside the iterator
>   (especially when the iterator is backed by I/O operations), thread
>   management when there is some "cleaner" background thread at work, etc. 

+1, so a spatial oriented collections package? Sure.

> * Localized resources for internationalisation. Years ago my first
>   approach was to let each module manage its own resources. But I
>   realized that the amount of duplicated resources (e.g. "Argument foo
>   can not be null") is so high that it was impractical. We don't
>   need to put all resources in a utility module, but at least the most
>   frequently used ones.

+1.

> * A few mathematical functions, utility methods dealing with units of
>   measurement, an AngleFormat for parsing/formatting angles in a
>   DD°MM'SS format, utility methods for formatting tables or trees from
>   TreeModel, a TreeTableNode interface as a complement to the standard
>   TreeNode interface.
> * A small framework which allows redirecting logging messages to Log4J
>   or other frameworks. I know there is commons-logging and SLF4J for
>   that, but those frameworks define their own API. I don't know why
>   they are doing that, since it is possible to do the same on top of
>   the standard JDK API, which is what we are doing.
> * For those who are fine with the standard JDK logging framework, a
>   log formatter for producing logging messages on one line (like Log4J)
>   instead of two lines. Additionally this formatter uses colors on
>   terminals that support them (warning messages appear in red, info in
>   green, etc.).

+1.
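
For reference, a one-line formatter of the kind described in the quoted 
list can be written on top of the standard java.util.logging API in a 
few lines. This is only a sketch under assumptions: the class name 
OneLineFormatterSketch, the output layout and the choice of ANSI escape 
codes are made up for illustration and do not reproduce the Geotk code.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Formatter;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical sketch of a single-line, optionally colored log formatter. */
class OneLineFormatterSketch extends Formatter {
    // ANSI escape sequences; only meaningful on terminals that support them.
    private static final String RED = "\u001B[31m", GREEN = "\u001B[32m", RESET = "\u001B[0m";

    private final boolean colors;

    OneLineFormatterSketch(boolean colors) {
        this.colors = colors;
    }

    @Override
    public String format(LogRecord record) {
        Level level = record.getLevel();
        // formatMessage(...) applies localization and {0}-style parameters.
        String text = level.getName() + " [" + record.getLoggerName() + "] " + formatMessage(record);
        if (colors) {
            if (level.intValue() >= Level.WARNING.intValue()) {
                text = RED + text + RESET;
            } else if (level.intValue() >= Level.INFO.intValue()) {
                text = GREEN + text + RESET;
            }
        }
        return text + System.lineSeparator();
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("sketch.formatter");
        logger.setUseParentHandlers(false);            // avoid the default two-line output
        ConsoleHandler handler = new ConsoleHandler();
        // Crude heuristic (an assumption): use colors only when a console is attached.
        handler.setFormatter(new OneLineFormatterSketch(System.console() != null));
        logger.addHandler(handler);
        logger.info("Reading metadata");
        logger.warning("Missing bounding box");
    }
}
```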

> * Logging levels for operations that take a long time ("slow",
>   "slower", "slowest") in order to help us detect potential bottlenecks
>   in production (without a profiler). The logging level is determined by
>   the time that the operation took to execute. If it is above some
>   threshold, a message is logged using one of the above-cited levels.

This would seem useful for commons-logging or some commons project,
but +1 to have it here then try to push it upstream as time permits.
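
To make the idea concrete, here is a minimal sketch of duration-based 
logging levels built on java.util.logging.Level. Everything specific in 
it is an assumption for illustration: the class name 
PerformanceLevelSketch, the numeric level values (placed between FINE 
and CONFIG) and the thresholds of 0.1, 1 and 5 seconds.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch: logging levels selected from the duration of an operation. */
class PerformanceLevelSketch extends Level {
    // Assumed values, between Level.FINE (500) and Level.CONFIG (700).
    static final Level SLOW    = new PerformanceLevelSketch("SLOW",    620);
    static final Level SLOWER  = new PerformanceLevelSketch("SLOWER",  630);
    static final Level SLOWEST = new PerformanceLevelSketch("SLOWEST", 640);

    private PerformanceLevelSketch(String name, int value) {
        super(name, value);
    }

    /**
     * Returns the level to use for an operation that took the given time,
     * or null if the operation was fast enough that nothing should be logged.
     * Thresholds are arbitrary assumptions.
     */
    static Level forDuration(long nanoseconds) {
        if (nanoseconds >= 5_000_000_000L) return SLOWEST;
        if (nanoseconds >= 1_000_000_000L) return SLOWER;
        if (nanoseconds >=   100_000_000L) return SLOW;
        return null;
    }

    public static void main(String[] args) throws InterruptedException {
        Logger logger = Logger.getLogger("sketch.performance");
        long start = System.nanoTime();
        Thread.sleep(150);                              // stands in for a real operation
        long elapsed = System.nanoTime() - start;
        Level level = forDuration(elapsed);
        if (level != null) {
            // Not shown with the default configuration, whose handlers
            // only publish records at INFO and above.
            logger.log(level, "Operation took {0} ms", elapsed / 1_000_000);
        }
    }
}
```

With such levels, slow operations can be surfaced in production simply 
by lowering the handler threshold, without attaching a profiler.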

> * Some basic JAXB adapters for XML marshalling/unmarshalling, not yet
>   specific to another module (rather the basic elements that we find
>   in all ISO standards).

+1.

> 
> 
> 
>> I would imagine metadata here would refer to arbitrary metadata support and/or
>> specific Geo metadata models?
> This is mostly about supporting the ISO 19115 standard (http://www.iso.org/iso/catalogue_detail?csnumber=26020) and its ISO 19115-2 extension. Those standards are mandatory in European governmental agencies as per the INSPIRE legislation. I think that NOAA participated in this standard, or at least the -2 extension. This module would provide "plain old Java objects" (POJO) for the various elements defined by the standard (about 100). For example there is a class GeographicBoundingBox with getEastBoundLongitude(), getNorthBoundLatitude(), etc. methods. This module also contains the JAXB adapters for marshalling/unmarshalling to ISO 19139 conformant XML (ISO 19115 defines the metadata model, ISO 19139 defines how to express it in XML).

Got it. +1 to support ISO 19115.

> 
> A general framework for supporting arbitrary metadata would be another topic. Actually we already use Lucene for that, but this is at a higher level than providing POJO specifically for the ISO 19115 standard.

Well Lucene is a search framework and yes it can support arbitrary metadata, but it brings with it
the rest of the things needed for a search framework. I'd like to push as much metadata support
as possible into Tika since it's a lower level library.

> 
> We also provide bridges from the NetCDF CF conventions to ISO 19115 metadata and conversely.

+1.

> 
> 
>>>  * referencing
>> Is this like geo-location and geo-referencing for coordinate system support?
> Yes. This is a very big topic on its own. It also includes parsing/formatting Well Known Text (WKT) format, creating Coordinate Reference System objects from the definitions provided in the EPSG database, coordinate transformations, etc. This topic is one of the strongest points of Geotk.

+1.

> 
> 
>> If so, I filed an issue already for this:
>> 
>> https://issues.apache.org/jira/browse/SIS-9
> Thanks for the tip. If this issue means "create a referencing" module, I would see it more as a whole component on the JIRA tracker rather than a single issue :).

Yep agreed, just using this email as a guide to reference relevant reported issues that we can
track progress on (or add more issues to).

> 
> 
>>>  * geometry
>> Seems similar to:
>> 
>> https://issues.apache.org/jira/browse/SIS-51
>> 
>> (distance being just one of the many possible geometric functions)
> Actually, on our side we put distance calculations in the referencing module instead. I see geometry more as an implementation of the ISO 19107 standard. Two Ph.D. students have worked on this subject so far and we are not yet there, so I think it is still a very long-term task...

+1 yep.

> 
> 
>>>  * coverage
>> WCS?
> I would consider web services (WCS, WFS, WMS...) only as protocols for transferring data between whatever model the library uses, and external servers or clients. Here by "coverage" I mean an implementation of the ISO 19123 standard, which defines a Coverage class with attributes like "domain", "range" and various sub-classes like ContinuousQuadrilateralGridCoverage, HexagonalGridCoverage, SegmentedCurveCoverage, ThiessenPolygonCoverage, TinCoverage, etc. Once we have implemented at least the GridCoverage class, WCS is simply a way to export that class on the web.

Agreed. I would like to focus on using JAX-RS and Apache CXF (which provides an implementation of it).

> 
> The coverage module also includes I/O services for reading/writing various formats like NetCDF, some operations like raster reprojections, etc.

+1.

> 
> 
>>>  * processing
>> What type of processing would this be? Like raster processing? Tiling?
> The idea is to define an abstract framework for launching arbitrary processes. The goal is to support the "Web Processing Service". But actually I'm not sure if "processing" deserves its own module. We could define a basic framework in the "utility" module, and let each module like "coverage" define its processes. I think we can defer this group to a later stage.

Yeah I would love to implement WPS using Apache OODT and its workflow manager. We can maintain the 
spatial specific parts of it here in SIS, but leverage OODT for the workflow part.

> 
> 
>>>  * index
>> I think I called this "storage" -- this would be the persistence layer for the
>> spatial data structures, e.g.,
>> 
>> https://issues.apache.org/jira/browse/SIS-8
>> https://issues.apache.org/jira/browse/SIS-33
> Yes. In this proposal "index" was a separate group for RTree, QuadTree and the like, but like "processing" I'm not sure it deserves its own module. Furthermore some indexes are closely related to the storage format. So I guess we can also omit this group for a while.

Got it. So we could move our existing QuadTree implementation there, and then add an RTree, etc.

> 
> 
>>>  * display
>> For us, this would be:
>> https://issues.apache.org/jira/browse/SIS-43
> Yes, however the "renderer engine" probably needs a separate topic.

Yep agree.

> 
> 
>>>  * client
>> I think for us this would be:
>> 
>> https://issues.apache.org/jira/browse/SIS-11
>> https://issues.apache.org/jira/browse/SIS-46
> In this case the proposal is more for clients of web services. For example a WCS client would be a source of data connected to a web server. The WCS client is responsible for constructing Coverage objects from the data received from a server. The same applies to WFS, sensor data, processing...
> 
> If we say that SIS supports WCS, it can be both ways: as a server or as a client. This group of modules would take care of the "client" side of things. Maybe "web-client" would be a better name.

Agreed.

> 
> 
>> Think about the above in the context of the JIRA issues I sent you and I would be happy
>> to discuss smallish, intermediate next steps to get there.
> If we forget "utilities" for now, the first step would be ISO-19115 metadata. Everything else depends on it. But if we agree to start with metadata, before discussing the details of this module, maybe it would be worth making some plan about the overall architecture? For example if we apply the proposal to define some groups of modules, it would imply moving around the existing SIS classes. Would it be done on the Subversion repository, or would it be easier to start an initially empty Git repository and move classes there (possibly in different directories) as modules are created?

I think we should start moving around the existing SVN repository. At the ASF, small incremental change
is always preferred compared to huge, hard to revert things. I'll also help build merit and bring more existing
GeoTK folks on board and get them on the path to committership and PPMC membership here at the ASF.

Are there others in GeoTK that can join up here on the Apache SIS lists and start to contribute? That would
be great!

Regardless, if you want to take the lead in proposing some issues relevant to ISO-19115 and then start
coding or creating patches, that would be awesome and I'll be happy to work with you to shepherd the
work in.

Cheers,
Chris

Re: Grouping modules in categories

Posted by Martin Desruisseaux <ma...@geomatys.fr>.

Hello Chris

On 31/07/12 03:55, Mattmann, Chris A (388J) wrote:
> Before we create a utils package, I like to find classes that don't fit in any of the other
> categories. Do you have some ideas as to what would fit into here now?
Examples of things in the utilities modules are:

  * Specialized collections implementations. Apache has a special
    project for that, but sometimes it is useful to control the details
    of the implementation like exception handling inside the iterator
    (especially when the iterator is backed by I/O operations), thread
    management when there is some "cleaner" background thread at work, etc.
  * Localized resources for internationalisation. Years ago my first
    approach was to let each module manage its own resources. But I
    realized that the amount of duplicated resources (e.g. "Argument foo
    can not be null") is so high that it was impractical. We don't
    need to put all resources in a utility module, but at least the most
    frequently used ones.
  * A few mathematical functions, utility methods dealing with units of
    measurement, an AngleFormat for parsing/formatting angles in a
    DD°MM'SS format, utility methods for formatting tables or trees from
    TreeModel, a TreeTableNode interface as a complement to the standard
    TreeNode interface.
  * A small framework which allows redirecting logging messages to Log4J
    or other frameworks. I know there is commons-logging and SLF4J for
    that, but those frameworks define their own API. I don't know why
    they are doing that, since it is possible to do the same on top of
    the standard JDK API, which is what we are doing.
  * For those who are fine with the standard JDK logging framework, a
    log formatter for producing logging messages on one line (like Log4J)
    instead of two lines. Additionally this formatter uses colors on
    terminals that support them (warning messages appear in red, info in
    green, etc.).
  * Logging levels for operations that take a long time ("slow",
    "slower", "slowest") in order to help us detect potential bottlenecks
    in production (without a profiler). The logging level is determined by
    the time that the operation took to execute. If it is above some
    threshold, a message is logged using one of the above-cited levels.
  * Some basic JAXB adapters for XML marshalling/unmarshalling, not yet
    specific to another module (rather the basic elements that we find
    in all ISO standards).
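
To give an idea of the AngleFormat item in the list above, here is a 
minimal sketch that formats and parses angles in the DD°MM'SS pattern. 
It is not the Geotk class: the name AngleFormatSketch, the two static 
methods and the rounding to whole seconds are assumptions chosen to keep 
the example short.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of degrees-minutes-seconds formatting and parsing. */
class AngleFormatSketch {
    // Optional sign, then degrees, minutes and seconds; \u00B0 is the degree sign.
    private static final Pattern DMS = Pattern.compile("(-?)(\\d+)\u00B0(\\d+)'(\\d+)\"?");

    /** Formats decimal degrees, rounded to the nearest second. */
    static String format(double degrees) {
        // Round the total number of seconds first, so that a value such as
        // 12.9999999 carries over to 13 degrees instead of showing 60 seconds.
        long totalSeconds = Math.round(Math.abs(degrees) * 3600);
        long d = totalSeconds / 3600;
        long m = (totalSeconds % 3600) / 60;
        long s = totalSeconds % 60;
        String sign = (degrees < 0 && totalSeconds != 0) ? "-" : "";
        return String.format(Locale.ROOT, "%s%02d\u00B0%02d'%02d\"", sign, d, m, s);
    }

    /** Parses text produced by format(double) back to decimal degrees. */
    static double parse(String text) {
        Matcher matcher = DMS.matcher(text.trim());
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Not a degrees-minutes-seconds angle: " + text);
        }
        double value = Integer.parseInt(matcher.group(2))
                     + Integer.parseInt(matcher.group(3)) / 60.0
                     + Integer.parseInt(matcher.group(4)) / 3600.0;
        return matcher.group(1).isEmpty() ? value : -value;
    }

    public static void main(String[] args) {
        String text = format(48.5);
        System.out.println(text);          // 48 degrees, 30 minutes, 00 seconds
        System.out.println(parse(text));   // prints "48.5"
    }
}
```

A real implementation would also need fractional seconds, hemisphere 
suffixes (N/S/E/W) and locale handling, which are left out here.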



> I would imagine metadata here would refer to arbitrary metadata support and/or
> specific Geo metadata models?
This is mostly about supporting the ISO 19115 standard 
(http://www.iso.org/iso/catalogue_detail?csnumber=26020) and its ISO 
19115-2 extension. Those standards are mandatory in European 
governmental agencies as per the INSPIRE legislation. I think that NOAA 
participated in this standard, or at least the -2 extension. This module 
would provide "plain old Java objects" (POJO) for the various elements 
defined by the standard (about 100). For example there is a class 
GeographicBoundingBox with getEastBoundLongitude(), 
getNorthBoundLatitude(), etc. methods. This module also contains the 
JAXB adapters for marshalling/unmarshalling to ISO 19139 conformant XML 
(ISO 19115 defines the metadata model, ISO 19139 defines how to express 
it in XML).
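
As an illustration of what such a POJO could look like, here is a 
minimal sketch of the bounding box element. The getter names follow the 
ISO 19115 property names; the class name GeographicBoundingBoxSketch, 
the constructor and the latitude check are assumptions for this example, 
and the JAXB annotations needed for ISO 19139 XML are deliberately 
omitted.

```java
/** Hypothetical sketch of an ISO 19115 metadata POJO; not the Geotk or SIS class. */
class GeographicBoundingBoxSketch {
    private final double westBoundLongitude;
    private final double eastBoundLongitude;
    private final double southBoundLatitude;
    private final double northBoundLatitude;

    GeographicBoundingBoxSketch(double westBoundLongitude, double eastBoundLongitude,
                                double southBoundLatitude, double northBoundLatitude) {
        // West may legitimately be greater than east for a box that
        // crosses the antimeridian, so only latitudes are checked.
        if (southBoundLatitude > northBoundLatitude) {
            throw new IllegalArgumentException("South latitude is greater than north latitude.");
        }
        this.westBoundLongitude = westBoundLongitude;
        this.eastBoundLongitude = eastBoundLongitude;
        this.southBoundLatitude = southBoundLatitude;
        this.northBoundLatitude = northBoundLatitude;
    }

    double getWestBoundLongitude() { return westBoundLongitude; }
    double getEastBoundLongitude() { return eastBoundLongitude; }
    double getSouthBoundLatitude() { return southBoundLatitude; }
    double getNorthBoundLatitude() { return northBoundLatitude; }

    public static void main(String[] args) {
        GeographicBoundingBoxSketch box = new GeographicBoundingBoxSketch(-10, 20, 35, 60);
        System.out.println("East bound: " + box.getEastBoundLongitude());
        System.out.println("North bound: " + box.getNorthBoundLatitude());
    }
}
```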

A general framework for supporting arbitrary metadata would be another 
topic. Actually we already use Lucene for that, but this is at a higher 
level than providing POJO specifically for the ISO 19115 standard.

We also provide bridges from the NetCDF CF conventions to ISO 19115 
metadata and vice versa.


>>   * referencing
> Is this like geo-location and geo-referencing for coordinate system support?
Yes. This is a very big topic on its own. It also includes 
parsing/formatting Well Known Text (WKT) format, creating Coordinate 
Reference System objects from the definitions provided in the EPSG 
database, coordinate transformations, etc. This topic is one of the 
strongest points of Geotk.


> If so, I filed an issue already for this:
>
> https://issues.apache.org/jira/browse/SIS-9
Thanks for the tip. If this issue means "create a referencing" module, I 
would see it more as a whole component on the JIRA tracker rather than 
a single issue :).


>>   * geometry
> Seems similar to:
>
> https://issues.apache.org/jira/browse/SIS-51
>
> (distance being just one of the many possible geometric functions)
Actually, on our side we put distance calculations in the referencing 
module instead. I see geometry more as an implementation of the ISO 19107 
standard. Two Ph.D. students have worked on this subject so far and we 
are not yet there, so I think it is still a very long-term task...


>>   * coverage
> WCS?
I would consider web services (WCS, WFS, WMS...) only as protocols for 
transferring data between whatever model the library uses and external 
servers or clients. Here by "coverage" I mean an implementation of the 
ISO 19123 standard, which defines a Coverage class with attributes like 
"domain" and "range", and various sub-classes like 
ContinuousQuadrilateralGridCoverage, HexagonalGridCoverage, 
SegmentedCurveCoverage, ThiessenPolygonCoverage, TinCoverage, etc. Once 
we have implemented at least the GridCoverage class, WCS is simply a way 
to export that class on the web.
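
To make the "domain" and "range" idea concrete, here is a minimal sketch 
in plain Java of a coverage as a function from a position to sample 
values. The Coverage interface and the SimpleGridCoverage class below 
are hypothetical and much simpler than what ISO 19123 defines (a single 
band, square cells, no coordinate reference system); they are not the 
Geotk or SIS API.

```java
/** Hypothetical, much simplified stand-in for the ISO 19123 Coverage concept. */
interface Coverage {
    /** Returns the range values (one per band) at a position of the domain. */
    double[] evaluate(double x, double y);
}

/** Single-band grid of square cells; row 0 starts at originY and rows go upward. */
final class SimpleGridCoverage implements Coverage {
    private final double originX, originY, cellSize;
    private final int width, height;
    private final double[] samples; // row-major sample values

    SimpleGridCoverage(double originX, double originY, double cellSize,
                       int width, int height, double[] samples) {
        if (cellSize <= 0 || width <= 0 || height <= 0
                || samples.length != width * height) {
            throw new IllegalArgumentException("Inconsistent grid definition");
        }
        this.originX = originX;
        this.originY = originY;
        this.cellSize = cellSize;
        this.width = width;
        this.height = height;
        this.samples = samples.clone();
    }

    @Override
    public double[] evaluate(double x, double y) {
        // Convert the position to grid indices (nearest-cell lookup, no interpolation).
        int col = (int) Math.floor((x - originX) / cellSize);
        int row = (int) Math.floor((y - originY) / cellSize);
        if (col < 0 || col >= width || row < 0 || row >= height) {
            throw new IllegalArgumentException("Position outside the coverage domain");
        }
        return new double[] { samples[row * width + col] };
    }
}
```

With such a model in place, a web service like WCS only needs to 
serialize what evaluate(...) and the grid definition already expose.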

The coverage module also includes I/O services for reading and writing 
various formats like NetCDF, some operations like raster 
reprojections, etc.


>>   * processing
> What type of processing would this be? Like raster processing? Tiling?
The idea is to define an abstract framework for launching arbitrary 
processes. The goal is to support the "Web Processing Service". But 
actually I'm not sure if "processing" deserves its own module. We could 
define a basic framework in the "utilities" module, and let each module 
like "coverage" define its own processes. I think we can defer this group 
to a later stage.
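
A minimal sketch of what such a basic framework could look like, in 
plain Java: a small process interface plus a registry in which each 
module registers its own processes by name. All names here (GeoProcess, 
ProcessRegistry, the "coverage:scale" identifier used in the usage note) 
are hypothetical and only illustrate the idea; nothing of this exists in 
SIS or Geotk in this form.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical unit of work: named inputs in, named outputs out. */
interface GeoProcess {
    Map<String, Object> execute(Map<String, Object> inputs);
}

/** Hypothetical registry that would live in a shared "utilities" module. */
final class ProcessRegistry {
    private final Map<String, GeoProcess> processes = new TreeMap<>();

    /** Called by each module (coverage, feature...) to publish a process. */
    void register(String name, GeoProcess process) {
        if (processes.putIfAbsent(name, process) != null) {
            throw new IllegalArgumentException("Duplicate process: " + name);
        }
    }

    /** Names of all registered processes, in alphabetical order. */
    Set<String> names() {
        return Collections.unmodifiableSet(processes.keySet());
    }

    /** Looks up a process by name and runs it with the given inputs. */
    Map<String, Object> launch(String name, Map<String, Object> inputs) {
        GeoProcess process = processes.get(name);
        if (process == null) {
            throw new IllegalArgumentException("Unknown process: " + name);
        }
        return process.execute(inputs);
    }
}
```

A coverage module could then call registry.register("coverage:scale", 
...) at start-up, and a Web Processing Service front-end would only need 
names() and launch(...).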


>>   * index
> I think I called this "storage" -- this would be the persistence layer for the
> spatial data structures, e.g.,
>
> https://issues.apache.org/jira/browse/SIS-8
> https://issues.apache.org/jira/browse/SIS-33
Yes. In this proposal "index" was a separate group for RTree, QuadTree 
and the like, but like "processing" I'm not sure it deserves its own 
module. Furthermore some indexes are closely related to the storage 
format. So I guess we can also omit this group for a while.
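
For readers unfamiliar with this kind of index, here is a minimal 
point quadtree sketch in plain Java. It is an illustration only: the 
class name, the node capacity and the depth limit are assumptions of 
this example, and it is not the QuadTree code currently in SIS.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical point quadtree over the half-open box [minX, maxX) x [minY, maxY). */
final class PointQuadTree {
    private static final int CAPACITY = 4;   // points per leaf before splitting
    private static final int MAX_DEPTH = 16; // stops endless splitting on duplicate points

    private final double minX, minY, maxX, maxY;
    private final int depth;
    private final List<double[]> points = new ArrayList<>();
    private PointQuadTree[] children; // null while this node is a leaf

    PointQuadTree(double minX, double minY, double maxX, double maxY) {
        this(minX, minY, maxX, maxY, 0);
    }

    private PointQuadTree(double minX, double minY, double maxX, double maxY, int depth) {
        this.minX = minX; this.minY = minY; this.maxX = maxX; this.maxY = maxY;
        this.depth = depth;
    }

    /** Adds a point; returns false if it lies outside this node's box. */
    boolean insert(double x, double y) {
        if (x < minX || x >= maxX || y < minY || y >= maxY) return false;
        if (children == null) {
            if (points.size() < CAPACITY || depth >= MAX_DEPTH) {
                points.add(new double[] {x, y});
                return true;
            }
            split();
        }
        for (PointQuadTree child : children) {
            if (child.insert(x, y)) return true;
        }
        return false; // not expected: the four children tile this node
    }

    /** Turns this leaf into an inner node and pushes its points down. */
    private void split() {
        double midX = (minX + maxX) / 2, midY = (minY + maxY) / 2;
        children = new PointQuadTree[] {
            new PointQuadTree(minX, minY, midX, midY, depth + 1),
            new PointQuadTree(midX, minY, maxX, midY, depth + 1),
            new PointQuadTree(minX, midY, midX, maxY, depth + 1),
            new PointQuadTree(midX, midY, maxX, maxY, depth + 1)
        };
        for (double[] p : points) {
            for (PointQuadTree child : children) {
                if (child.insert(p[0], p[1])) break;
            }
        }
        points.clear();
    }

    /** Returns all points inside the closed query rectangle. */
    List<double[]> query(double qMinX, double qMinY, double qMaxX, double qMaxY) {
        List<double[]> found = new ArrayList<>();
        collect(qMinX, qMinY, qMaxX, qMaxY, found);
        return found;
    }

    private void collect(double qMinX, double qMinY, double qMaxX, double qMaxY,
                         List<double[]> found) {
        // Skip subtrees whose box cannot contain any point of the query.
        if (qMaxX < minX || qMinX >= maxX || qMaxY < minY || qMinY >= maxY) return;
        for (double[] p : points) {
            if (p[0] >= qMinX && p[0] <= qMaxX && p[1] >= qMinY && p[1] <= qMaxY) {
                found.add(p);
            }
        }
        if (children != null) {
            for (PointQuadTree child : children) {
                child.collect(qMinX, qMinY, qMaxX, qMaxY, found);
            }
        }
    }
}
```

The sketch keeps everything in memory, which sidesteps the coupling 
with the storage format mentioned above.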


>>   * display
> For us, this would be:
> https://issues.apache.org/jira/browse/SIS-43
Yes, however the "renderer engine" probably needs a separate topic.


>>   * client
> I think for us this would be:
>
> https://issues.apache.org/jira/browse/SIS-11
> https://issues.apache.org/jira/browse/SIS-46
In this case the proposal is more for clients of web services. For 
example a WCS client would be a source of data connected to a web 
server. The WCS client is responsible for constructing Coverage objects 
from the data received from a server. The same applies to WFS, sensor 
data, processing...

If we say that SIS supports WCS, it can be both ways: as a server or as 
a client. This group of modules would take care of the "client" side of 
things. Maybe "web-client" would be a better name.


> Think about the above in the context of the JIRA issues I sent you and I would be happy
> to discuss smallish, intermediate next steps to get there.
If we forget "utilities" for now, the first step would be ISO 19115 
metadata. Everything else depends on it. But if we agree to start with 
metadata, before discussing the details of this module, maybe it would 
be worth making a plan for the overall architecture? For example, 
if we apply the proposal to define groups of modules, it would imply 
moving around the existing SIS classes. Would it be done on the 
Subversion repository, or would it be easier to start with an initially 
empty Git repository and move classes there (possibly in different 
directories) as modules are created?

     Martin

Re: Grouping modules in categories

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.

Hi Martin,

On Jul 30, 2012, at 4:14 PM, Martin Desruisseaux wrote:

> Hello Chris
> 
> Le 30/07/12 19:50, Mattmann, Chris A (388J) a écrit :
>> OK to clarify, OSGeo will discuss this issue at their board meeting on 8/9?
> Yes, I will keep this list informed.
> 
> 
>> Sure feel free to start threads here on sis-dev@ to discuss your thoughts.
> I saw the 5 modules in the root directory of the SIS project. The first thing that came to my mind was that if the project grows to 1 million lines of code (which could happen relatively fast), we are going to have a lot of modules. What about a slightly deeper directory tree, which groups modules by category? We could have the following categories among others:

The below looks pretty good in terms of organization. I like the idea of having top-level modules too.
A few comments:

> 
>  * utilities

Before we create a utils package, I'd like to find classes that don't fit in any of the other
categories. Do you have some ideas as to what would fit into here now?

>  * metadata

I would imagine metadata here would refer to arbitrary metadata support and/or
specific Geo metadata models? If so, then I think we should leverage Apache Tika
(http://tika.apache.org/) as much as possible for doing that. There are already a few
issues filed to help tie together Tika and SIS here:

https://issues.apache.org/jira/browse/SIS-32
https://issues.apache.org/jira/browse/TIKA-605

>  * referencing

Is this like geo-location and geo-referencing for coordinate system support?
If so, I filed an issue already for this:

https://issues.apache.org/jira/browse/SIS-9

>  * geometry

Seems similar to:

https://issues.apache.org/jira/browse/SIS-51

(distance being just one of the many possible
geometric functions)

>  * feature

We don't have support for this yet, if this is like WFS, etc.

>  * coverage

WCS?

We did discuss a WMS-like service here:

https://issues.apache.org/jira/browse/SIS-28

>  * processing

What type of processing would this be? Like raster processing? Tiling?

>  * index

I think I called this "storage" -- this would be the persistence layer for the
spatial data structures, e.g., 

https://issues.apache.org/jira/browse/SIS-8
https://issues.apache.org/jira/browse/SIS-33

>  * display

For us, this would be:

https://issues.apache.org/jira/browse/SIS-43

>  * client

I think for us this would be:

https://issues.apache.org/jira/browse/SIS-11
https://issues.apache.org/jira/browse/SIS-46

> 
> Each category is a directory, which contains one or more modules. For example the "coverage" category could contain the following modules:
> 
>  * sis-coverage
>  * sis-coverageio
>  * sis-coverageio-netcdf

Sure that makes sense.

Think about the above in the context of the JIRA issues I sent you and I would be happy
to discuss smallish, intermediate next steps to get there.

> 
> I noticed that the current SIS code contains a "sis-core" module. What about trying to put the content of "sis-core" in some more specific modules? Otherwise I think that core may become very big if "referencing", "geometry", "feature" etc. are considered as core...

Sure, that makes a lot of sense. Right now, core is a catch-all for functionality that sis-webapp and sis-app depend on.
But, I'd be happy to discuss a better refactoring based on the above.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Re: Grouping modules in categories

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.

Thanks for these links, Martin.

Cheers,
Chris

On Jul 30, 2012, at 4:43 PM, Martin Desruisseaux wrote:

> Hello Adam
> 
> Le 31/07/12 01:20, Adam Estrada a écrit :
>> Where in your code is the geometry support? I need a heads up on where to find it.
> 
> In the "pending" source code repository of Geotk, directory modules/geometry. However this particular topic is still a work in progress, both on our side and on the OGC side, and deserves a bit of context:
> 
> Geometries are defined by the ISO 19107 international standard, which is reputed to be complex. Indeed, I'm not aware of any open source library fully implementing this standard. Most projects use the simpler Java Topology Suite (JTS) library. However JTS is designed for two-dimensional geometries in a Cartesian space (JTS can store a 'z' ordinate value, but doesn't use it). By contrast, ISO 19107 is designed for 1D, 2D and 3D geometries in arbitrary coordinate systems.
> 
> In 2007, a Ph.D. student published his thesis on an implementation of ISO 19107 geometries (http://w1.cirrelt.ca/~jena/files/DiplThesisJena07.pdf). In our company, we also supported a Ph.D. student who finished his thesis last year. We are considering supporting yet another Ph.D. student on this topic - this is to say that this particular topic is difficult. The ISO 19107 editor is aware of these kinds of difficulties and is in the process of revising the ISO 19107 standard at OGC.
> 
> In the meantime, we are using a mixed approach: we use the ISO 19107 Java interfaces in some code, but the JTS library as the underlying implementation. So the "geometry" group of modules contains a module that provides a partial ISO 19107 implementation as wrappers around the JTS implementation. This has the JTS limits (2D, Cartesian), but may be the most functional implementation for now...
> 
>    Martin
> 



Re: Grouping modules in categories

Posted by Martin Desruisseaux <ma...@geomatys.fr>.

Hello Adam

Le 31/07/12 01:20, Adam Estrada a écrit :
> Where in your code is the geometry support? I need a heads up on where to find it.

In the "pending" source code repository of Geotk, directory 
modules/geometry. However this particular topic is still a work in 
progress, both on our side and on the OGC side, and deserves a bit of 
context:

Geometries are defined by the ISO 19107 international standard, which is 
reputed to be complex. Indeed, I'm not aware of any open source library 
fully implementing this standard. Most projects use the simpler Java 
Topology Suite (JTS) library. However JTS is designed for 
two-dimensional geometries in a Cartesian space (JTS can store a 'z' 
ordinate value, but doesn't use it). By contrast, ISO 19107 is designed 
for 1D, 2D and 3D geometries in arbitrary coordinate systems.

In 2007, a Ph.D. student published his thesis on an implementation of 
ISO 19107 geometries 
(http://w1.cirrelt.ca/~jena/files/DiplThesisJena07.pdf). In our company, 
we also supported a Ph.D. student who finished his thesis last year. We 
are considering supporting yet another Ph.D. student on this topic - 
this is to say that this particular topic is difficult. The ISO 19107 
editor is aware of these kinds of difficulties and is in the process of 
revising the ISO 19107 standard at OGC.

In the meantime, we are using a mixed approach: we use the ISO 19107 
Java interfaces in some code, but the JTS library as the underlying 
implementation. So the "geometry" group of modules contains a module 
that provides a partial ISO 19107 implementation as wrappers around the 
JTS implementation. This has the JTS limits (2D, Cartesian), but may be 
the most functional implementation for now...

     Martin

Re: Grouping modules in categories

Posted by Adam Estrada <es...@gmail.com>.

Hey Martin,

Where in your code is the geometry support? I need a heads up on where to find it.

Thanks!
Adam

On Jul 30, 2012, at 7:14 PM, Martin Desruisseaux wrote:

> Hello Chris
> 
> Le 30/07/12 19:50, Mattmann, Chris A (388J) a écrit :
>> OK to clarify, OSGeo will discuss this issue at their board meeting on 8/9?
> Yes, I will keep this list informed.
> 
> 
>> Sure feel free to start threads here on sis-dev@ to discuss your thoughts.
> I saw the 5 modules in the root directory of the SIS project. The first thing that came to my mind was that if the project grows to 1 million lines of code (which could happen relatively fast), we are going to have a lot of modules. What about a slightly deeper directory tree, which groups modules by category? We could have the following categories among others:
> 
>  * utilities
>  * metadata
>  * referencing
>  * geometry
>  * feature
>  * coverage
>  * processing
>  * index
>  * display
>  * client
> 
> Each category is a directory, which contains one or more modules. For example the "coverage" category could contain the following modules:
> 
>  * sis-coverage
>  * sis-coverageio
>  * sis-coverageio-netcdf
> 
> I noticed that the current SIS code contains a "sis-core" module. What about trying to put the content of "sis-core" in some more specific modules? Otherwise I think that core may become very big if "referencing", "geometry", "feature" etc. are considered as core...
> 
>    Martin
>