You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commonsrdf.apache.org by Alexandre Bertails <al...@bertails.org> on 2015/05/05 22:43:57 UTC

General questions

Hi all,

I haven't followed the development in a long time, especially after
the move to Apache. I just looked at it and I had a few remarks and
questions.

Just some background for those who don't already know me: I am part of
the banana-rdf project [1]. The main approach there relies on defining
RDF and its operations as a typeclass [2]. In that world, Jena and
Sesame are just two instances of that typeclass (see for example [4]
and [5]). So there is no wrapper involved. Still, commons-rdf is
really a good step in the right direction as we could obsolete a lot
of stuff.

Thanks to Scala.js, we also have instances of the typeclass for
jsonld.js and n3.js. And finally, we have Plantain, a pure Scala
implementation that targets both the JVM and JS.

So my hope was to be able to get Plaintain to implement the
commons-rdf interfaces too. We would be able to still compile to JS by
providing a Scala version of the interfaces (it's pretty easy).

Here are the things we care about:

# immutable graphs

Right now, there is no support in commons-rdf for immutable
operations. `Graph`s are mutable by default. Is there any plan to make
an API for immutable graphs? Graphs in banana-rdf are immutable by
default, and they are persistent in Plantain. We could always wrap an
immutable graph in a structure with a `var`, but, well, there are
better ways to do that.

# stateful factory

`RDFTermFactory` is stateful just to accommodate
`createBlankNode(String)`. It's stateless otherwise. This is really an
issue for banana-rdf as everything is defined as pure function (the
output only depends on the input).

Is `createBlankNode(String)` really needed? The internal map for
bnodes could be maintained _outside_ of the factory. Or at least, we
could pass it as an argument instead: `createBlankNode(Map<String,
BlankNode>, String)`.

# wrapped values

There are a lot of unnecessary objects because of the class hierarchy.
In banana-rdf, we can say that RDFTerm is a plain `String` while being
100% type-safe. That's already what's happening for the N3.js
implementation. And in Plantain, Literals are just `java.lang.Object`
[6] so that we can directly have String, Int, etc.

That means that there is no way currently to provide a
`RDFTermFactory` for Plantain. The only alternatives I see right now
are:

* have the methods return Object
* make the types abstract (Generics) so that one can substitute them.
The tests could be defined just for the interfaces

We might hit the limit of class-based design here...

# getTriples vs iterate

Not a big deal but I believe the naming could be improved. When I read
getTriples, I expect to have all the triples in my hand, but this is
not quite what Streams are about. On the other hand, when I read
iterate, I kinda expect the opposite. Of course the types clarify
everything but I believe it'd be easier to use getTriplesAsStream and
getTriplesAsIterable.

Cheers,

Alexandre

[1] https://github.com/w3c/banana-rdf
[2] https://github.com/w3c/banana-rdf/blob/series/0.8.x/rdf/common/src/main/scala/org/w3/banana/RDF.scala
[3] https://github.com/w3c/banana-rdf/blob/series/0.8.x/rdf/common/src/main/scala/org/w3/banana/RDFOps.scala
[4] https://github.com/w3c/banana-rdf/blob/series/0.8.x/jena/src/main/scala/org/w3/banana/jena/Jena.scala
[5] https://github.com/w3c/banana-rdf/blob/series/0.8.x/jena/src/main/scala/org/w3/banana/jena/JenaOps.scala
[6] https://github.com/w3c/banana-rdf/blob/series/0.8.x/plantain/jvm/src/main/scala/org/w3/banana/plantain/Plantain.scala#L16