Posted to dev@clerezza.apache.org by Reto Gmür <re...@apache.org> on 2014/12/19 15:31:13 UTC

Re: Commons RDF

Hi Sergio

On Fri, Dec 19, 2014 at 12:49 PM, Sergio Fernández <wi...@apache.org>
wrote:
>
> Hi,
>
> On 18/12/14 13:43, Benedikt Ritter wrote:
>
>> We had a similar proposal a while ago [1]. Is the Clerezza RDF library
>> related to this proposal? In the end the people around
>> https://github.com/commons-rdf/commons-rdf decided not to bring their code
>> to Apache Commons, because they wanted to use github for development and
>> discussions. They already requested the commons-rdf git repository from
>> infra, which is now unused [2]. So if you want to bring your RDF library to
>> commons, we can use that repo, I guess... I can help you with bootstrapping
>> the component and bring up a website.
>>
>
> as part of that initiative, I'd like to assert my opinion here. That we
> did not accommodate as project in Apache Commons is just accessory and not
> so important.
>
> Andy already replied to COMMONSSITE-80, but I think this is a better place
> for such discussion.
>
> Because probably many of the folks here are not so familiar with RDF, I'd
> like to introduce (my version of) the story: RDF is a directed labeled
> graph proposed by W3C as base data model for the Semantic Web. In the Java
> world there are historically two major toolkits (Apache Jena and OpenRDF
> Sesame) that with different approaches provide an RDF stack. Therefore there
> are many wrappers trying to 'integrate' both implementations, Apache
> Clerezza is one, but there are others.


That's not true. Clerezza is not meant as a wrapper to integrate these
toolkits but since the beginning the purpose was to provide a generic API
solely based on the RDF standards, integrating well into the Java platform.
The APIs of Jena and Sesame are specifically designed for triple stores.
While triple stores are the standard databases for RDF, other data
can also be exposed as RDF. In the generic case a mapping to the Jena or
Sesame model is not possible without memory-expensive workarounds. Also,
because of these ties to triple stores, some of the RDF features are not
possible (at least not easily) with these APIs.

Another aspect the Clerezza model emphasizes is the identity criteria for
the types. While mutable graph objects are only equal when they are the
same instance or backed by the same graph in the triplestore (so that they
contain the same triples at any point in time), immutable graphs are equal
if they are isomorphic. This is important for example when adding graphs to
a set. I developed the first version of such an API in 2006 when working
with the Jena team at HP on a Graph Versioning System. The current Clerezza
API is however substantially different from this with the contributions of
many developers.
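
To make the identity criteria above concrete, here is a minimal sketch. It is
not the Clerezza API: the class names (Triple, MutableGraph, ImmutableGraph)
are hypothetical, and it assumes ground graphs only (no blank nodes), in which
case isomorphism reduces to equality of the triple sets. With blank nodes a
real isomorphism check would be needed instead of Set.equals.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class GraphIdentitySketch {

    /** Ground triple: terms are plain strings, no blank nodes (assumption). */
    static final class Triple {
        final String subject, predicate, object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Triple)) return false;
            Triple t = (Triple) other;
            return subject.equals(t.subject)
                    && predicate.equals(t.predicate)
                    && object.equals(t.object);
        }

        @Override
        public int hashCode() {
            return Objects.hash(subject, predicate, object);
        }
    }

    /** Mutable graph: keeps the identity-based equals/hashCode of Object. */
    static final class MutableGraph {
        private final Set<Triple> triples = new HashSet<>();

        void add(Triple t) { triples.add(t); }

        /** Copies the current triples into an immutable graph. */
        ImmutableGraph snapshot() { return new ImmutableGraph(triples); }
    }

    /** Immutable graph: equal when the (ground) triple sets are equal. */
    static final class ImmutableGraph {
        private final Set<Triple> triples;

        ImmutableGraph(Set<Triple> source) {
            this.triples = Collections.unmodifiableSet(new HashSet<>(source));
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof ImmutableGraph
                    && triples.equals(((ImmutableGraph) other).triples);
        }

        @Override
        public int hashCode() { return triples.hashCode(); }
    }

    public static void main(String[] args) {
        Triple t = new Triple("http://example.org/a",
                "http://example.org/knows", "http://example.org/b");
        MutableGraph g1 = new MutableGraph();
        MutableGraph g2 = new MutableGraph();
        g1.add(t);
        g2.add(t);

        Set<Object> graphs = new HashSet<>();
        graphs.add(g1);
        graphs.add(g2);             // different instance, so it is kept
        graphs.add(g1.snapshot());
        graphs.add(g2.snapshot());  // same triples as the first snapshot, dropped
        System.out.println(graphs.size()); // prints 3
    }
}
```

The two mutable graphs stay distinct in the set even though they currently
hold the same triple, because either could change later; the two snapshots
collapse into one entry.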


>
> The idea of Commons RDF as we conceived was to design together with those
> toolkits a generic API that new versions could implement and focus on
> implementing portable algorithms with a choice of storage. And that's
> basically the current scope of Commons RDF as I know it.
>

That's pretty much the goal of the Clerezza RDF library too. And I can show
you use cases where the current version of your API doesn't fulfill this
purpose.


>
> Therefore adopting Clerezza as a yet-another library does not really bring any
> value to Apache Commons.


What I wanted to push to git yesterday, and found out that I can't, is a
subset of Clerezza rdf-core adapted to match the commons-rdf project on GitHub
as closely as possible (that is, while staying with Java best practices and not
introducing artifacts that make it less general purpose).


> Besides there are many aspects (blank nodes, etc) why I would never
> consider Clerezza as a generic RDF API, but that's another story.
>

Bring forward your concerns. We had many discussions around the Clerezza
RDF API and it's likely that what makes you skeptical has been discussed and
answered before. As for the project on GitHub, I've never seen any
discussion about this on the various semantic web related mailing lists at
Apache.


>
> FMPOV the decision of Apache Commons to grant write access to all ASF
> committers should not be used as a way out of dying projects.
>

No, I don't think that's fair. Clerezza is an Apache project which has
never been sponsored by a large company or research institution, so Clerezza
is driven mainly by volunteers who use their spare time to write beautiful
code. I don't think you are in a position to diagnose Clerezza as a "dying
project" (for instance, we have an active mailing list, and the
proposal to publish the RDF API as commons found 4 spontaneous +1 within
one day).


> But that's my personal opinion. I hope it would be taken into
> consideration.
>

Rather than about accusations I would like to talk about APIs, not in some
private circles but at Apache, following the mantra "If it wasn't on the
mailing list it didn't happen". As a first step I wanted to push a draft
merging the core of Clerezza with your proposals on GitHub, not as
something final, but as a basis for discussion.

Reto


>
> Cheers,
>
> --
> Sergio Fernández
> Partner Technology Manager
> Redlink GmbH
> m: +43 660 2747 925
> e: sergio.fernandez@redlink.co
> w: http://redlink.co
>

Re: Commons RDF

Posted by Andy Seaborne <an...@apache.org>.
On 19/12/14 14:31, Reto Gmür wrote:
> Clerezza is not meant as a wrapper to integrate these
> toolkits but since the beginning the purpose was to provide a generic API
> solely based on the RDF standards, integrating well into the Java platform.

> The APIs of Jena and Sesame are specifically designed for triple stores.

Good - let's recognize that different APIs can serve different purposes.
That includes Clerezza.

> While triple stores are the standard databases for RDF, other data
> can also be exposed as RDF. In the generic case a mapping to the Jena or
> Sesame model is not possible without memory-expensive workarounds.

I don't understand that point - could you expand on it?  Let's get concrete.

> Also
> because of these ties to triple stores, some of the RDF features are not
> possible (at least not easily) with these APIs.

Interesting claim - could you explain it?  Links would be good.  Let's 
ground discussion in the spec text we are all claiming to adhere to. [*]

>
> I developed the first version of such an API in 2006 when working
> with the Jena team at HP on a Graph Versioning System. The current Clerezza
> API is however substantially different from this with the contributions of
> many developers.

You will be pleased to know that HP granted IP rights of GVS, and 
derived works, to ASF as part of the Jena software grant.  It had not 
been done before.

>> The idea of Commons RDF as we conceived was to design together with those
>> toolkits a generic API that new versions could implement and focus on
>> implementing portable algorithms with a choice of storage. And that's
>> basically the current scope of Commons RDF as I know it.
>>
>
> That's pretty much the goal of the Clerezza RDF library too. And I can show
> you use cases where the current version of your API doesn't fulfill this
> purpose.

Please show the use cases.

> As for the project on github I've never seen any
> discussion about this on the various semantic web related mailing lists at
> apache.

http://markmail.org/message/dtvy7mpm7gd7kvdw

http://mail-archives.apache.org/mod_mbox/clerezza-dev/201406.mbox/%3C5398B07C.5000507%40apache.org%3E

This is not an Apache-only endeavour.

	Andy

[*] Disclosure to the wider audience: I was active on the RDF 1.1 working
group, contributing to several of the W3C specs.