Posted to users@jena.apache.org by Paolo Castagna <ca...@googlemail.com> on 2011/12/15 16:37:53 UTC
On unit of measurement and custom datatypes...
Hi,
I tried to create a custom datatype with Jena and use it from a SPARQL query.
For example, I used temperatures in °C or °F. The complete example is here [1].
I found the documentation here quite useful:
- Typed literals how-to
http://incubator.apache.org/jena/documentation/notes/typed-literals.html
and I looked at the RomanNumeralDatatype [2] implementation in ARQ.
I created TemperatureCelsius and TemperatureFahrenheit classes, which extend BaseDatatype.
I'd like to be able to automatically compare or sort temperatures from my SPARQL
queries. For example, with a query like this:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT * WHERE {
?s rdf:value ?temperature .
}
ORDER BY ?temperature
Currently I get:
( ?temperature = "15"^^<http://jena.apache.org/datatypes/temperature/celsius> ) ( ?s = <x2> ) -> [Root]
( ?temperature = "25"^^<http://jena.apache.org/datatypes/temperature/celsius> ) ( ?s = <x1> ) -> [Root]
( ?temperature = "25"^^<http://jena.apache.org/datatypes/temperature/fahrenheit> ) ( ?s = <x3> ) -> [Root]
But, 25 °F = -3.89 °C, therefore 25 °F should be the first on the list.
Is there a way I can automatically apply the necessary conversions and ensure that
the ORDER BY sorts temperatures correctly?
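For reference, the conversion behind the 25 °F figure, written out with the standard Fahrenheit-to-Celsius formula (the symbols T_F and T_C are just labels introduced for this note):

```latex
\[
T_C = (T_F - 32) \times \frac{5}{9}
    = (25 - 32) \times \frac{5}{9}
    = -\frac{35}{9}
    \approx -3.89\,^{\circ}\mathrm{C}
\]
```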
An alternative is to not use custom datatypes and simply have something like this:
:x :temperature [
rdf:value "25.0" ;
:unit :Celsius ;
]
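A minimal Java sketch of what this second modelling style implies for whoever consumes the data: the number and the unit travel together, and a small conversion table keyed by unit normalises values before they are compared. The names `TemperatureUnit`, `Measurement`, and `MeasurementDemo` are hypothetical; none of them comes from Jena or from the vocabularies mentioned in this thread.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical unit table: kelvin = (value + offset) * scale.
enum TemperatureUnit {
    CELSIUS(273.15, 1.0),
    FAHRENHEIT(459.67, 5.0 / 9.0),
    KELVIN(0.0, 1.0);

    private final double offset;
    private final double scale;

    TemperatureUnit(double offset, double scale) {
        this.offset = offset;
        this.scale = scale;
    }

    double toKelvin(double value) {
        return (value + offset) * scale;
    }
}

// Mirrors the blank node above: a numeric value plus a unit.
final class Measurement {
    final double value;
    final TemperatureUnit unit;

    Measurement(double value, TemperatureUnit unit) {
        this.value = value;
        this.unit = unit;
    }

    double kelvin() {
        return unit.toKelvin(value);
    }
}

final class MeasurementDemo {
    // Returns a copy of the input sorted by the normalised Kelvin value.
    static List<Measurement> sortByKelvin(List<Measurement> input) {
        List<Measurement> out = new ArrayList<>(input);
        out.sort(Comparator.comparingDouble(Measurement::kelvin));
        return out;
    }

    public static void main(String[] args) {
        List<Measurement> data = new ArrayList<>();
        data.add(new Measurement(25.0, TemperatureUnit.CELSIUS));
        data.add(new Measurement(25.0, TemperatureUnit.FAHRENHEIT));
        data.add(new Measurement(15.0, TemperatureUnit.CELSIUS));
        for (Measurement m : sortByKelvin(data)) {
            System.out.println(m.value + " " + m.unit + " = " + m.kelvin() + " K");
        }
    }
}
```

With this shape the sort key is computed from plain data, so no change to the query engine's notion of datatypes is needed; the cost is that every consumer has to know about the unit table.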
A similar question is answered on http://answers.semanticweb.com/ [3], I wanted
to try the custom datatype approach anyway... and probably it's not a good idea.
What do you recommend in order to model units of measurement and conversions
between them, making sure people can use SPARQL and/or inference to work with
their data?
I found these vocabularies/ontologies:
- http://www.w3.org/2007/ont/unit
- http://qudt.org/vocab/unit#
- ...
Do you have other vocabularies/ontologies to suggest?
Thanks,
Paolo
[1] https://github.com/castagna/jena-examples/blob/master/src/main/java/org/apache/jena/examples/ExampleDT_01.java
[2] https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/main/java/com/hp/hpl/jena/sparql/util/RomanNumeralDatatype.java
[3] http://answers.semanticweb.com/questions/3572/xsd-or-vocabulary
Re: On unit of measurement and custom datatypes...
Posted by Paolo Castagna <ca...@googlemail.com>.
Hi Andy
Andy Seaborne wrote:
> On 15/12/11 17:25, Paolo Castagna wrote:
>> Hi Andy,
>> thanks for the quick reply.
>>
>> Andy Seaborne wrote:
>>> ARQ has its own evaluation engine. Currently, it needs code changes to
>>> extend it. It covers most of XSD already.
>>
>> I had a look into NodeValue.java [1], would it be possible to do
>> something
>> along these lines?
>
> See NodeValue.compareAlways (which has the SPARQL rules for ORDER BY
> which are more than just "<")
Indeed! :-)
>
>>
>> Index: src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java
>> ===================================================================
>> --- src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java
>> (revision 1214849)
>> +++ src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java
>> (working copy)
>> @@ -1036,6 +1036,14 @@
>> boolean b = ((Boolean)lit.getValue()).booleanValue() ;
>> return new NodeValueBoolean(b, node) ;
>> }
>> +
>> +
>> + Object clazz = lit.getDatatype().getJavaClass();
>> + if ( clazz.getClass().isInstance(Double.class) )
>> + {
>> + double d = ((Number)lit.getValue()).doubleValue() ;
>> + return new NodeValueDouble(d, node) ;
>> + }
>
> That makes it a double, a number, no units.
> (aside from the fact doubles are NOT numbers in XSD - use xsd:decimal -
> their value space is "m × 2^e" i.e. (m,e) pairs, for fixed length m and
> e extended with NaN, Inf, -Inf and a lot of machinery for comparison and
> addition etc for rounding and mapping ).
>
> Saying a temperature has a value which is a number is wrong. Different
> value spaces.
>
> Can you compare the height of a mountain with a temperature? One's in
> meters, the other in Kelvin.
>
>>
>> // If wired into the TypeMapper via
>> RomanNumeralDatatype.enableAsFirstClassDatatype
>> // if ( RomanNumeralDatatype.get().isValidLiteral(lit) )
>
> Note that RomanNumeralDatatype values are just another way to write
> integers. Same value space.
>
> Creating an ordering function that returns a number (no units) means
> comparison is defined.
I got an example of what I was trying to achieve (although I am not sure
the implementation is the best way to do it... in particular I think the
parse() method is still 'wrong' because it returns a Double).
Code here:
https://github.com/castagna/jena-examples/blob/master/src/main/java/org/apache/jena/examples/ExampleDT_01.java
https://github.com/castagna/jena-examples/blob/master/src/main/java/org/apache/jena/examples/temperature.java
Output:
---- Data ----
<x4> <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "25"^^<http://jena.apache.org/datatypes/temperature/kelvin> .
<x2> <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "15"^^<http://jena.apache.org/datatypes/temperature/celsius> .
<x5> <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "25"^^<http://jena.apache.org/datatypes/temperature/rankine> .
<x3> <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "25"^^<http://jena.apache.org/datatypes/temperature/fahrenheit> .
<x1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "25"^^<http://jena.apache.org/datatypes/temperature/celsius> .
---- Query ----
PREFIX java: <java:org.apache.jena.examples.>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT * WHERE {
?s rdf:value ?temperature .
}
ORDER BY java:temperature( ?temperature )
---- Results ----
--------------------------------------------------------------------------
| s | temperature |
==========================================================================
| <x5> | "25"^^<http://jena.apache.org/datatypes/temperature/rankine> |
| <x4> | "25"^^<http://jena.apache.org/datatypes/temperature/kelvin> |
| <x3> | "25"^^<http://jena.apache.org/datatypes/temperature/fahrenheit> |
| <x2> | "15"^^<http://jena.apache.org/datatypes/temperature/celsius> |
| <x1> | "25"^^<http://jena.apache.org/datatypes/temperature/celsius> |
--------------------------------------------------------------------------
Thanks.
Paolo
>
>> [1]
>> https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java
>>
Re: On unit of measurement and custom datatypes...
Posted by nat lu <na...@gmail.com>.
As an aside: units can change definition over time, or mean different
things in different localities (British miles, US gallons, metric tonnes,
imperial tons, etc.).
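To make this concrete with one pair of units: the US liquid gallon is defined as 3.785411784 litres and the imperial gallon as 4.54609 litres, so the bare word "gallon" is ambiguous by roughly 20%. A small Java sketch follows; the enum name `Gallon` and its method are illustrative only.

```java
import java.math.BigDecimal;

// Two units that share the name "gallon" but not a definition.
enum Gallon {
    US_LIQUID(new BigDecimal("3.785411784")),  // litres per US liquid gallon
    IMPERIAL(new BigDecimal("4.54609"));       // litres per imperial gallon

    private final BigDecimal litresPerGallon;

    Gallon(BigDecimal litresPerGallon) {
        this.litresPerGallon = litresPerGallon;
    }

    BigDecimal toLitres(BigDecimal gallons) {
        return gallons.multiply(litresPerGallon);
    }
}

final class GallonDemo {
    public static void main(String[] args) {
        BigDecimal ten = new BigDecimal("10");
        // Same number, same unit name, two different volumes.
        System.out.println("US liquid: " + Gallon.US_LIQUID.toLitres(ten) + " l");
        System.out.println("Imperial:  " + Gallon.IMPERIAL.toLitres(ten) + " l");
    }
}
```

Any datatype or vocabulary for units therefore has to identify which definition is meant, not just the unit's everyday name.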
Re: On unit of measurement and custom datatypes...
Posted by Andy Seaborne <an...@apache.org>.
On 15/12/11 17:25, Paolo Castagna wrote:
> Hi Andy,
> thanks for the quick reply.
>
> Andy Seaborne wrote:
>> ARQ has its own evaluation engine. Currently, it needs code changes to
>> extend it. It covers most of XSD already.
>
> I had a look into NodeValue.java [1], would it be possible to do something
> along these lines?
See NodeValue.compareAlways (which has the SPARQL rules for ORDER BY
which are more than just "<")
>
> Index: src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java
> ===================================================================
> --- src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java (revision 1214849)
> +++ src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java (working copy)
> @@ -1036,6 +1036,14 @@
> boolean b = ((Boolean)lit.getValue()).booleanValue() ;
> return new NodeValueBoolean(b, node) ;
> }
> +
> +
> + Object clazz = lit.getDatatype().getJavaClass();
> + if ( clazz.getClass().isInstance(Double.class) )
> + {
> + double d = ((Number)lit.getValue()).doubleValue() ;
> + return new NodeValueDouble(d, node) ;
> + }
That makes it a double, a number, no units.
(aside from the fact doubles are NOT numbers in XSD - use xsd:decimal -
their value space is "m × 2^e" i.e. (m,e) pairs, for fixed length m and
e extended with NaN, Inf, -Inf and a lot of machinery for comparison and
addition etc for rounding and mapping ).
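The difference between the two value spaces is easy to see from Java, where `double` is the same IEEE 754 binary floating-point type that xsd:double is based on, and `BigDecimal` is a reasonable stand-in for xsd:decimal. This is only an illustration of the aside above, not ARQ code, and the class name `ValueSpaceDemo` is made up.

```java
import java.math.BigDecimal;

final class ValueSpaceDemo {
    // Binary floating point: 0.1, 0.2 and 0.3 have no exact representation,
    // so the rounded sum differs from the rounded literal 0.3.
    static boolean doubleSumIsExact() {
        return 0.1 + 0.2 == 0.3;
    }

    // Decimal arithmetic: the same sum is exact.
    static boolean decimalSumIsExact() {
        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        return sum.compareTo(new BigDecimal("0.3")) == 0;
    }

    // NaN belongs to the double value space and is not equal to itself.
    static boolean nanEqualsItself() {
        double nan = Double.NaN;
        return nan == nan;
    }

    public static void main(String[] args) {
        System.out.println("double sum exact:  " + doubleSumIsExact());
        System.out.println("decimal sum exact: " + decimalSumIsExact());
        System.out.println("NaN == NaN:        " + nanEqualsItself());
    }
}
```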
Saying a temperature has a value which is a number is wrong. Different
value spaces.
Can you compare the height of a mountain with a temperature? One's in
meters, the other in Kelvin.
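One way to read this point in code: a comparison between quantities is only defined when both sides have the same dimension. The sketch below uses hypothetical `Dimension` and `Quantity` types (nothing here comes from Jena) and assumes each value is already expressed in a base unit for its dimension.

```java
// Hypothetical types for illustration only.
enum Dimension { LENGTH, TEMPERATURE }

final class Quantity {
    final double baseValue;     // metres for LENGTH, kelvin for TEMPERATURE
    final Dimension dimension;

    Quantity(double baseValue, Dimension dimension) {
        this.baseValue = baseValue;
        this.dimension = dimension;
    }

    // Compares two quantities, refusing to mix dimensions.
    static int compare(Quantity a, Quantity b) {
        if (a.dimension != b.dimension) {
            throw new IllegalArgumentException(
                "Cannot compare " + a.dimension + " with " + b.dimension);
        }
        return Double.compare(a.baseValue, b.baseValue);
    }
}
```

Treating every literal as a bare double, by contrast, would let a mountain height and a temperature be ordered against each other without complaint.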
>
> // If wired into the TypeMapper via RomanNumeralDatatype.enableAsFirstClassDatatype
> // if ( RomanNumeralDatatype.get().isValidLiteral(lit) )
Note that RomanNumeralDatatype values are just another way to write integers. Same
value space.
Creating an ordering function that returns a number (no units) means
comparison is defined.
> [1] https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java
Re: On unit of measurement and custom datatypes...
Posted by Paolo Castagna <ca...@googlemail.com>.
Hi Andy,
thanks for the quick reply.
Andy Seaborne wrote:
> ARQ has its own evaluation engine. Currently, it needs code changes to
> extend it. It covers most of XSD already.
I had a look into NodeValue.java [1], would it be possible to do something
along these lines?
Index: src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java
===================================================================
--- src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java (revision 1214849)
+++ src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java (working copy)
@@ -1036,6 +1036,14 @@
boolean b = ((Boolean)lit.getValue()).booleanValue() ;
return new NodeValueBoolean(b, node) ;
}
+
+
+ Class<?> clazz = lit.getDatatype().getJavaClass();
+ if ( Double.class.equals(clazz) )
+ {
+ double d = ((Number)lit.getValue()).doubleValue() ;
+ return new NodeValueDouble(d, node) ;
+ }
// If wired into the TypeMapper via RomanNumeralDatatype.enableAsFirstClassDatatype
// if ( RomanNumeralDatatype.get().isValidLiteral(lit) )
[1] https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java
>
> notes/typed-literals.html does not describe comparison, only
> value-equality.
>
> Use SPARQL custom functions.
>
> ORDER BY my:orderingFunction(?temperature)
>
> where my:orderingFunction returns the value as a number on, say, the
> Kelvin scale.
Ack, I'll follow this advice.
Thanks,
Paolo
>
> Andy
Re: On unit of measurement and custom datatypes...
Posted by Andy Seaborne <an...@apache.org>.
ARQ has its own evaluation engine. Currently, it needs code changes to
extend it. It covers most of XSD already.
notes/typed-literals.html does not describe comparison, only value-equality.
Use SPARQL custom functions.
ORDER BY my:orderingFunction(?temperature)
where my:orderingFunction returns the value as a number on, say, the
Kelvin scale.
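The conversion logic such a function would need can be sketched in plain Java. This is only the arithmetic: it is not wired into ARQ, the class and method names (`KelvinOrdering`, `toKelvin`, `sort`) are made up for this note, and the datatype URIs are the ones used in the example data in this thread.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class KelvinOrdering {
    static final String BASE = "http://jena.apache.org/datatypes/temperature/";

    // Returns the temperature as a plain number on the Kelvin scale.
    static double toKelvin(String lexical, String datatypeUri) {
        if (!datatypeUri.startsWith(BASE)) {
            throw new IllegalArgumentException("Not a temperature datatype: " + datatypeUri);
        }
        double v = Double.parseDouble(lexical);
        String scale = datatypeUri.substring(BASE.length());
        switch (scale) {
            case "kelvin":     return v;
            case "celsius":    return v + 273.15;
            case "fahrenheit": return (v + 459.67) * 5.0 / 9.0;
            case "rankine":    return v * 5.0 / 9.0;
            default:
                throw new IllegalArgumentException("Unknown temperature scale: " + scale);
        }
    }

    // Sorts {lexical form, datatype URI} pairs by their Kelvin value.
    static List<String[]> sort(List<String[]> literals) {
        List<String[]> out = new ArrayList<>(literals);
        out.sort(Comparator.comparingDouble((String[] l) -> toKelvin(l[0], l[1])));
        return out;
    }
}
```

Because the key is a unitless number, the ordinary numeric ordering applies and ORDER BY needs no knowledge of the custom datatypes themselves.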
Andy