Posted to users@jena.apache.org by Olivier Rossel <ol...@gmail.com> on 2012/12/17 12:41:42 UTC
Matching literals of unknown langs
Hello.
The SPARQL spec says:
"Florence" is not the same RDF literal as "Florence"@fr
To illustrate that, I tried these queries on the French DBpedia:
##### This one returns no result! ######
SELECT ?Locality WHERE {
  BIND ("Florence" AS ?Label)
  SERVICE <http://fr.dbpedia.org/sparql> {
    ?Locality <http://www.w3.org/2000/01/rdf-schema#label> ?Label .
  }
}
##### This one returns 2 results! ######
SELECT ?Locality WHERE {
  BIND ("Florence"@fr AS ?Label)
  SERVICE <http://fr.dbpedia.org/sparql> {
    ?Locality <http://www.w3.org/2000/01/rdf-schema#label> ?Label .
  }
}
From this we can conclude that the French DBpedia tags its
literals with the @fr language tag.
Now I have to run a federated query over an Italian dataset and the French DBpedia.
As seen above, the French DBpedia contains this literal: "Florence"@fr
My Italian dataset contains this literal: "Florence" (with no lang tag).
Here is the federated query:
SELECT DISTINCT ?LocalityITA ?LocalityFR WHERE {
  SERVICE <http://91.121.14.47:6665/sparql/> {
    ?Address <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
             <http://www.w3.org/2006/vcard/ns#Address> .
    ?Address <http://www.w3.org/2006/vcard/ns#locality> ?LocalityITA .
    ?LocalityITA <http://www.w3.org/2000/01/rdf-schema#label> ?LabelITA .
  }
  BIND (strbefore(?LabelITA, "(") AS ?Label)
  SERVICE <http://fr.dbpedia.org/sparql> {
    ?LocalityFR <http://www.w3.org/2000/01/rdf-schema#label> ?Label
  }
}
How can I tune the query so the literal matching works across lang tags?
Thanks for your help.
Re: Matching literals of unknown langs
Posted by Olivier Rossel <ol...@gmail.com>.
On 18 Dec 2012 at 13:02, Andy Seaborne <an...@apache.org> wrote:
> ?locality is not passed from the first SERVICE to the second: evaluation is bottom-up and (logically) each SERVICE is evaluated and then the results are combined in the client.
>
> So I suggest making the SERVICE calls extract the lexical form and then
> equate them (via a join) in the client by using ?locality as the output of each SERVICE call.
>
> Andy
I use Jena's basic federation.
My query runs reasonably fast between a 4store and dbpedia.org (about 1 minute for about 900 results).
Given that the second SERVICE <...> block is unreasonable to evaluate on its own (fetch every label in DBpedia, then join them with the ones from block 1), I suppose some optimization is at work during its evaluation.
FYI, my final query has a BIND between the two SERVICE <...> blocks:
SERVICE <http://91.121.14.47:6665/sparql/> {
  SELECT DISTINCT ?LocalityITA ?LabelITA WHERE {
    ?Address vc:locality ?LocalityITA .
    ?LocalityITA rdfs:label ?LabelITA .
  }
}
BIND (strlang(str(?LabelITA), "en") AS ?LabelEN)
SERVICE <http://dbpedia.org/sparql> {
  ?Locality rdfs:label ?LabelEN
}
Could you explain how basic federation works in that case?
I was fairly sure that basic federation resolved the SERVICE <...> blocks first to last.
Re: Matching literals of unknown langs
Posted by Andy Seaborne <an...@apache.org>.
On 17/12/12 22:04, Olivier Rossel wrote:
> OK for the first SERVICE <...> block: str(...) binds a "raw string"
> into ?locality.
> Now the second SERVICE <...> block:
> Is a strlang(..., "fr") required to match ?locality against the @fr strings?
> Like this:
>
> SERVICE <myItalianData> {
> SELECT (str(?l) as ?locality)
> {
> ?Address vc:locality ?l .
> }
> }
> SERVICE <fr.dbpedia.org> {
> SELECT (?LocalityFR)
> {
> ?LocalityFR rdfs:label strlang(?locality,"fr") .
> }
> }
>
> Or is the ?locality variable bound in a way that says "I am a raw
> string, compare me without taking the lang tag into account"?
>
> As usual, thanks for your help, Andy.
>
?locality is not passed from the first SERVICE to the second:
evaluation is bottom-up and (logically) each SERVICE is evaluated and
then the results are combined in the client.
So I suggest making the SERVICE calls extract the lexical form and then
equate them (via a join) in the client by using ?locality as the output
of each SERVICE call.
Andy
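As an illustration of the approach described above, here is a minimal sketch of the whole query in which each SERVICE call projects the lexical form as ?locality and the client joins on that shared variable. The endpoint URLs and the vCard/RDFS properties come from the original question; the variable ?LabelFR and the FILTER on lang() are assumptions added for this example, and the strbefore() step from the original query is left out for brevity. It also assumes that both endpoints accept SPARQL 1.1 subqueries.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX vc:   <http://www.w3.org/2006/vcard/ns#>

SELECT DISTINCT ?LocalityITA ?LocalityFR
WHERE {
  # First endpoint: expose the label's lexical form (no lang tag) as ?locality.
  SERVICE <http://91.121.14.47:6665/sparql/> {
    SELECT ?LocalityITA (str(?LabelITA) AS ?locality)
    WHERE {
      ?Address a vc:Address ;
               vc:locality ?LocalityITA .
      ?LocalityITA rdfs:label ?LabelITA .
    }
  }
  # Second endpoint: same projection, so both sides share a plain ?locality.
  # ?LabelFR and the lang() filter are assumptions made for this sketch.
  SERVICE <http://fr.dbpedia.org/sparql> {
    SELECT ?LocalityFR (str(?LabelFR) AS ?locality)
    WHERE {
      ?LocalityFR rdfs:label ?LabelFR .
      FILTER (lang(?LabelFR) = "fr")
    }
  }
  # The two SERVICE results are joined on ?locality in the client.
}
```

Because nothing else restricts the second SERVICE block, the remote endpoint may have to return a very large number of labels; whether that is acceptable depends on the endpoint and on how the client evaluates the join.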
Re: Matching literals of unknown langs
Posted by Olivier Rossel <ol...@gmail.com>.
On Mon, Dec 17, 2012 at 10:03 PM, Andy Seaborne <an...@apache.org> wrote:
> You could canonicalise to the simple literal in each SERVICE
>
> SERVICE <....> {
> ....
> ?Address vc:locality ?l .
> BIND(str(?l) AS ?locality)
> }
>
>>
>
> Except fr.dbpedia.org/Virtuoso does not support BIND.
>
> You can get the same effect with a subquery:
>
> SERVICE <....> {
> SELECT (str(?l) AS ?locality)
> { ...
> ?Address vc:locality ?l .
> }
> }
>
> Now you have ?locality without a language tag and can use it as the
> canonical term (if your app thinks that's safe enough).
>
> Andy
>
OK for the first SERVICE <...> block: str(...) binds a "raw string"
into ?locality.
Now the second SERVICE <...> block:
Is a strlang(..., "fr") required to match ?locality against the @fr strings?
Like this:
SERVICE <myItalianData> {
SELECT (str(?l) as ?locality)
{
?Address vc:locality ?l .
}
}
SERVICE <fr.dbpedia.org> {
SELECT (?LocalityFR)
{
?LocalityFR rdfs:label strlang(?locality,"fr") .
}
}
Or is the ?locality variable bound in a way that says "I am a raw
string, compare me without taking the lang tag into account"?
As usual, thanks for your help, Andy.
Re: Matching literals of unknown langs
Posted by Andy Seaborne <an...@apache.org>.
On 17/12/12 11:41, Olivier Rossel wrote:
> Hello.
>
> The SPARQL spec says:
> "Florence" is not the same RDF literal as "Florence"@fr
and it's just repeating RDF.
...
> Now I have to run a federated query over an Italian dataset and the French DBpedia.
>
> As seen above, the French DBpedia contains this literal: "Florence"@fr
> My Italian dataset contains this literal: "Florence" (with no lang tag).
>
> Here is the federated query:
> SELECT DISTINCT ?LocalityITA ?LocalityFR WHERE {
> SERVICE <http://91.121.14.47:6665/sparql/> {
> ?Address <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> <http://www.w3.org/2006/vcard/ns#Address> .
> ?Address <http://www.w3.org/2006/vcard/ns#locality> ?LocalityITA .
> ?LocalityITA <http://www.w3.org/2000/01/rdf-schema#label> ?LabelITA .
> }
> BIND (strbefore(?LabelITA, "(") AS ?Label)
> SERVICE <http://fr.dbpedia.org/sparql>{
> ?LocalityFR <http://www.w3.org/2000/01/rdf-schema#label> ?Label
> }
> }
Reformatted:
SELECT DISTINCT ?LocalityITA ?LocalityFR
WHERE
  { SERVICE <http://91.121.14.47:6665/sparql/>
      { ?Address rdf:type vc:Address .
        ?Address vc:locality ?LocalityITA .
        ?LocalityITA rdfs:label ?LabelITA
      }
    BIND(strbefore(?LabelITA, "(") AS ?Label)
    SERVICE <http://fr.dbpedia.org/sparql>
      { ?LocalityFR rdfs:label ?Label }
  }
You can use str() to get just the lexical form:
>
> How can I tune the query so the literal matching works across lang tags?
> Thanks for your help.
You could canonicalise to the simple literal in each SERVICE
SERVICE <....> {
  ....
  ?Address vc:locality ?l .
  BIND(str(?l) AS ?locality)
}
>
Except fr.dbpedia.org/Virtuoso does not support BIND.
You can get the same effect with a subquery:
SERVICE <....> {
  SELECT (str(?l) AS ?locality)
  { ...
    ?Address vc:locality ?l .
  }
}
Now you have ?locality without a language tag and can use it as the
canonical term (if your app thinks that's safe enough).
Andy
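To see what str() and lang() return for each form of the literal, a small self-contained query can help. This is only a sketch: the VALUES data and the output names ?lexical, ?tag and ?isPlain are made up for illustration, and it assumes a SPARQL 1.1 engine.

```sparql
SELECT ?label
       (str(?label)  AS ?lexical)   # lexical form, language tag dropped
       (lang(?label) AS ?tag)       # "" when the literal has no tag
       (sameTerm(?label, "Florence") AS ?isPlain)
WHERE {
  # Illustrative data only: the same lexical form with and without tags.
  VALUES ?label { "Florence" "Florence"@fr "Florence"@it }
}
```

All three rows should share the same ?lexical value, while sameTerm() should hold only for the untagged literal, which is why matching on str() ignores the language tag.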