Posted to dev@stanbol.apache.org by Alessandro Adamou <ad...@cs.unibo.it> on 2012/02/01 19:19:26 UTC
Discrepancies in RDF/XML vs. JSONLD signatures in EntityHub
Hi,
I tried to add a MusicBrainz Referenced Site (the one maintained by
DBTune) using the following configuration:
id = musicbrainz
name = musicbrainz
entity prefixes =
http://dbtune.org/musicbrainz/resource/
http://dbtune.org/musicbrainz/resource/artist/
http://dbtune.org/musicbrainz/resource/record/
http://dbtune.org/musicbrainz/resource/track/
http://dbtune.org/musicbrainz/resource/master/
access URI = http://dbtune.org/musicbrainz/data/
dereferencing = coolURI
query service = http://dbtune.org/musicbrainz/sparql
query strategy = SPARQL
caching strategy = used
cache name = mbcache
I then went on to query the site for the artist Metallica, first in json
curl -H "Accept: application/json" \
  "http://localhost:8080/entityhub/site/musicbrainz/entity?id=http://dbtune.org/musicbrainz/resource/artist/65f4f0c5-ef9e-490c-aee3-909e7ae6b2ab"
then in rdf+xml
curl -H "Accept: application/rdf+xml" \
  "http://localhost:8080/entityhub/site/musicbrainz/entity?id=http://dbtune.org/musicbrainz/resource/artist/65f4f0c5-ef9e-490c-aee3-909e7ae6b2ab"
I noticed that the rdf/xml graph has many, many more triples. More
precisely, it includes inverse predicates, namely triples like
?x foaf:maker http://dbtune.org/musicbrainz/resource/artist/65f4f0c5-ef9e-490c-aee3-909e7ae6b2ab
I haven't checked if there is anything else missing yet.
Is this some limitation of the json-ld renderer or something?
Also, with this cache strategy am I not supposed to get the same data
when querying the entity hub tout-court? If I issue:
curl -H "Accept: application/json" \
  "http://localhost:8080/entityhub/entity?id=http://dbtune.org/musicbrainz/resource/artist/65f4f0c5-ef9e-490c-aee3-909e7ae6b2ab"
I keep getting a 404
Thanks for your help
Alessandro
--
M.Sc. Alessandro Adamou
Alma Mater Studiorum - Università di Bologna
Department of Computer Science
Mura Anteo Zamboni 7, 40127 Bologna - Italy
Semantic Technology Laboratory (STLab)
Institute for Cognitive Science and Technology (ISTC)
National Research Council (CNR)
Via Nomentana 56, 00161 Rome - Italy
"As for the charges against me, I am unconcerned. I am beyond their timid, lying morality, and so I am beyond caring."
(Col. Walter E. Kurtz)
Not sent from my iSnobTechDevice
Re: Discrepancies in RDF/XML vs. JSONLD signatures in EntityHub
Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi
On Thu, Feb 2, 2012 at 11:53 AM, Alessandro Adamou <ad...@cs.unibo.it> wrote:
>> The Entityhub does not support incoming triples. Therefore it is expected
>> that those are missing in the Entityhub-specific "application/json"
>> serialization.
>> If you choose an RDF-backed serialization, such triples may be present if
>> the remote service returns them. This may be the case for the "coolURI"
>> dereferencer.
>
>
> Ok but I still don't understand why it is the acceptable MIME-type that
> makes the difference: does the entityhub json writer filter incoming triples
> or something?
>
If the dereferencer (such as the CoolURI dereferencer you configured)
returns the data as RDF, then the Entityhub streams those results
directly to serializers that also use an RDF graph as their source. The
"application/json" serializer, by contrast, operates directly on the
Entityhub Representation interface.
Because of that, you do not see incoming triples with
"application/json", but you do see all data returned by the CoolURI
dereferencer with RDF-based serializations.
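To make the distinction concrete, here is a minimal sketch of what "incoming" and "outgoing" mean for a given entity. It is not Entityhub code: the Triple class, the helper names, the example.org track URIs and the sample values are all made up for illustration, and the only point is that a view keyed on the entity as subject drops every triple in which the entity appears as object.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative model only; none of these types belong to the Stanbol API.
final class TripleDirections {

    /** Minimal stand-in for an RDF triple, with all terms kept as plain strings. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Outgoing triples: the entity is the subject. */
    static List<Triple> outgoing(List<Triple> graph, String entity) {
        return graph.stream()
                .filter(t -> t.subject.equals(entity))
                .collect(Collectors.toList());
    }

    /** Incoming triples: the entity is the object. */
    static List<Triple> incoming(List<Triple> graph, String entity) {
        return graph.stream()
                .filter(t -> t.object.equals(entity))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String artist =
            "http://dbtune.org/musicbrainz/resource/artist/65f4f0c5-ef9e-490c-aee3-909e7ae6b2ab";
        // Made-up sample data: one outgoing triple and two incoming ones.
        List<Triple> graph = List.of(
            new Triple(artist, "foaf:name", "Metallica"),
            new Triple("http://example.org/track/1", "foaf:maker", artist),
            new Triple("http://example.org/track/2", "foaf:maker", artist));

        System.out.println("outgoing: " + outgoing(graph, artist).size());
        System.out.println("incoming: " + incoming(graph, artist).size());
    }
}
```

In this model, a serializer that works from the outgoing view would never see the two foaf:maker triples, while one that streams the whole graph would.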
>
>> Note that the "application/json" returned by the Entityhub is NOT json-ld
>> but its own JSON serialization.
>
>
> I see, so it's not possible to serialize to json-ld at all?
>
The current "json-ld" serializer still has issues (completeness and
performance) that make its use questionable.
As soon as these issues are solved, the Entityhub will use json-ld for
"application/json" requests. The current JSON format will then use a
MIME type such as "application/entityhub+json".
> It might not be a problem for me as I am probably going to use the EntityHub
> Java API and handle Representation objects. I just use the REST API to try
> out the EntityHub itself.
>
At the Java API level you can make a check like
if (representation instanceof RdfRepresentation) {
    ((RdfRepresentation) representation).getGraph();
}
This would give you the graph as it is serialized as RDF.
If you really need to check incoming triples, then you can use LDPath for that,
e.g. go to http://dev.iks-project.eu:8081/entityhub/site/dbpedia/ldpath
and use:
Context: http://dbpedia.org/resource/Category:Host_cities_of_the_Summer_Olympic_Games
LD-Path:
schema:name = rdfs:label[@en];
members = ^dc:subject :: xsd:anyURI;
The '^{property}' syntax allows you to traverse inverse relations.
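As a rough illustration of what an inverse step does, the sketch below evaluates a forward step and a '^property'-style step over a tiny in-memory triple list. This is not the LDPath implementation; the class, method names and example.org data are assumptions made up for the example.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: a toy evaluator for one forward and one inverse path step.
final class InversePathSketch {

    /** Minimal stand-in for an RDF triple. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Forward step 'property': objects of triples (context, property, ?o). */
    static List<String> forward(List<Triple> graph, String context, String property) {
        return graph.stream()
                .filter(t -> t.subject.equals(context) && t.predicate.equals(property))
                .map(t -> t.object)
                .collect(Collectors.toList());
    }

    /** Inverse step '^property': subjects of triples (?s, property, context). */
    static List<String> inverse(List<Triple> graph, String context, String property) {
        return graph.stream()
                .filter(t -> t.object.equals(context) && t.predicate.equals(property))
                .map(t -> t.subject)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String category = "http://example.org/Category:Host_cities";
        // Made-up sample data: two resources point at the category via dc:subject.
        List<Triple> graph = List.of(
            new Triple(category, "rdfs:label", "Host cities"),
            new Triple("http://example.org/CityA", "dc:subject", category),
            new Triple("http://example.org/CityB", "dc:subject", category));

        System.out.println(forward(graph, category, "rdfs:label"));
        System.out.println(inverse(graph, category, "dc:subject"));
    }
}
```

With the category as context, the inverse dc:subject step yields the resources that point at it, which is the role "members = ^dc:subject" plays in the program above.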
I hope this makes things more clear
best
Rupert
--
| Rupert Westenthaler rupert.westenthaler@gmail.com
| Bodenlehenstraße 11 ++43-699-11108907
| A-5500 Bischofshofen
Re: Discrepancies in RDF/XML vs. JSONLD signatures in EntityHub
Posted by Alessandro Adamou <ad...@cs.unibo.it>.
Thanks Rupert! Further questions below:
> Specifying "http://dbtune.org/musicbrainz/resource/" would be enough, as
> it will be matched as "http://dbtune.org/musicbrainz/resource/*".
Ok will do. I just wasn't sure whether it was using wildcards like that.
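For reference, the matching behaviour described in the quote above can be pictured as a plain starts-with test. The sketch below is only a model of that idea under this assumption; EntityPrefixMatcher and its method are hypothetical names, not part of the Entityhub API, and the real implementation may differ.

```java
import java.util.List;

// Hypothetical helper, not a Stanbol class: models prefix-based entity matching.
final class EntityPrefixMatcher {
    private final List<String> prefixes;

    EntityPrefixMatcher(List<String> prefixes) {
        this.prefixes = List.copyOf(prefixes);
    }

    /** Returns true if the entity URI starts with any configured prefix. */
    boolean manages(String entityUri) {
        return prefixes.stream().anyMatch(entityUri::startsWith);
    }

    public static void main(String[] args) {
        // A single prefix already covers artist/, record/, track/ and master/.
        EntityPrefixMatcher matcher = new EntityPrefixMatcher(
            List.of("http://dbtune.org/musicbrainz/resource/"));

        System.out.println(matcher.manages(
            "http://dbtune.org/musicbrainz/resource/artist/65f4f0c5-ef9e-490c-aee3-909e7ae6b2ab"));
        System.out.println(matcher.manages("http://dbpedia.org/resource/Metallica"));
    }
}
```

Under this model the four more specific prefixes in the configuration are redundant, since every URI they match also starts with the shorter one.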
> The Entityhub does not support incoming triples. Therefore it is expected that those are missing in the Entityhub-specific "application/json" serialization.
> If you choose an RDF-backed serialization, such triples may be present if the remote service returns them. This may be the case for the "coolURI" dereferencer.
Ok but I still don't understand why it is the acceptable MIME-type that
makes the difference: does the entityhub json writer filter incoming
triples or something?
> Note that the "application/json" returned by the Entityhub is NOT json-ld but its own JSON serialization.
I see, so it's not possible to serialize to json-ld at all?
It might not be a problem for me as I am probably going to use the
EntityHub Java API and handle Representation objects. I just use the
REST API to try out the EntityHub itself.
> From the cache you will get no incoming triples regardless of the chosen Accept MIME type. This is simply because incoming triples are not stored in the cache.
Ok then I will be expecting that.
> Have you configured a Yard and a Cache with the name "mbcache" as noted in the above configuration?
Ouch, no I will try that now. So I should:
* create a Solr Yard "mbcache". Do I have to create a musicbrainz Solr
index manually beforehand? Can I use the existing Stanbol-managed Solr
server for that?
* create a Cache Configuration using "mbcache" as a Yard
best,
Alessandro
Re: Discrepancies in RDF/XML vs. JSONLD signatures in EntityHub
Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Alessandro
On 01.02.2012, at 19:19, Alessandro Adamou wrote:
>
> I tried to add a MusicBrainz Referenced Site (the one maintained by DBTune) using the following configuration:
>
> id = musicbrainz
> name = musicbrainz
> entity prefixes =
> http://dbtune.org/musicbrainz/resource/
> http://dbtune.org/musicbrainz/resource/artist/
> http://dbtune.org/musicbrainz/resource/record/
> http://dbtune.org/musicbrainz/resource/track/
> http://dbtune.org/musicbrainz/resource/master/
Specifying "http://dbtune.org/musicbrainz/resource/" would be enough, as it will be matched as "http://dbtune.org/musicbrainz/resource/*".
> access URI = http://dbtune.org/musicbrainz/data/
> dereferencing = coolURI
> query service = http://dbtune.org/musicbrainz/sparql
> query strategy = SPARQL
> caching strategy = used
> cache name = mbcache
>
> I then went on to query the site for the artist Metallica, first in json
>
> curl -H "Accept: application/json" "http://localhost:8080/entityhub/site/musicbrainz/entity?id=http://dbtune.org/musicbrainz/resource/artist/65f4f0c5-ef9e-490c-aee3-909e7ae6b2ab"
>
> then in rdf+xml
>
> curl -H "Accept: application/rdf+xml" "http://localhost:8080/entityhub/site/musicbrainz/entity?id=http://dbtune.org/musicbrainz/resource/artist/65f4f0c5-ef9e-490c-aee3-909e7ae6b2ab"
>
> I noticed that the rdf/xml graph has many, many more triples. More precisely, it includes inverse predicates, namely triples like
>
> ?x foaf:maker http://dbtune.org/musicbrainz/resource/artist/65f4f0c5-ef9e-490c-aee3-909e7ae6b2ab
>
The Entityhub does not support incoming triples. Therefore it is expected that those are missing in the Entityhub-specific "application/json" serialization.
If you choose an RDF-backed serialization, such triples may be present if the remote service returns them. This may be the case for the "coolURI" dereferencer.
> I haven't checked if there is anything else missing yet.
I would expect that all incoming triples are "missing". If you look only at outgoing triples, the two serializations should represent the same information.
>
> Is this some limitation of the json-ld renderer or something?
>
Note that the "application/json" returned by the Entityhub is NOT json-ld but its own JSON serialization.
> Also, with this cache strategy am I not supposed to get the same data when querying the entity hub tout-court? If I issue:
>
> curl -H "Accept: application/json" "http://localhost:8080/entityhub/entity?id=http://dbtune.org/musicbrainz/resource/artist/65f4f0c5-ef9e-490c-aee3-909e7ae6b2ab"
>
From the cache you will get no incoming triples regardless of the chosen Accept MIME type. This is simply because incoming triples are not stored in the cache.
> I keep getting a 404
Have you configured a Yard and a Cache with the name "mbcache" as noted in the above configuration?
best
Rupert