Posted to users@jena.apache.org by Paolo Castagna <ca...@googlemail.com> on 2011/12/15 16:37:53 UTC

On unit of measurement and custom datatypes...

Hi,
I tried to create a custom datatype with Jena and use it from a SPARQL query.
For example, I used temperatures in °C or °F. The complete example is here [1].

I found the documentation here quite useful:

 - Typed literals how-to
   http://incubator.apache.org/jena/documentation/notes/typed-literals.html

and I looked at the RomanNumeralDatatype [2] implementation in ARQ.

I created a TemperatureCelsius and TemperatureFahrenheit which extend BaseDatatype.
I'd like to be able to automatically compare or sort temperatures from my SPARQL
queries. For example, with a query like this:

  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  SELECT * WHERE {
    ?s rdf:value ?temperature .
  }
  ORDER BY ?temperature

Currently I get:

( ?temperature = "15"^^<http://jena.apache.org/datatypes/temperature/celsius> ) ( ?s = <x2> ) -> [Root]
( ?temperature = "25"^^<http://jena.apache.org/datatypes/temperature/celsius> ) ( ?s = <x1> ) -> [Root]
( ?temperature = "25"^^<http://jena.apache.org/datatypes/temperature/fahrenheit> ) ( ?s = <x3> ) -> [Root]

But, 25 °F = -3.89 °C, therefore 25 °F should be the first on the list.
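
The arithmetic behind that figure can be checked with a few lines of plain Java. This is only a sketch: the class and method names are made up for illustration and do not come from Jena or from the linked example.

```java
// Illustrative only: FahrenheitToCelsius is a hypothetical helper, not a Jena class.
class FahrenheitToCelsius {
    // Standard conversion: C = (F - 32) * 5 / 9
    static double toCelsius(double fahrenheit) {
        return (fahrenheit - 32.0) * 5.0 / 9.0;
    }

    public static void main(String[] args) {
        // 25 F is below freezing, so it should sort before both 15 C and 25 C.
        System.out.println("25 F in Celsius: " + toCelsius(25.0));
    }
}
```

Since (25 - 32) * 5 / 9 is -35/9, the result is roughly -3.89 °C, which is lower than both Celsius values in the list.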

Is there a way I can automatically apply the necessary conversions and ensure that
the ORDER BY sorts temperatures correctly?

An alternative is to not use custom datatypes and simply have something like this:

  :x :temperature [
    rdf:value "25.0" ;
    :unit :Celsius ;
  ] .
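
On the application side, the value-plus-unit shape could be mirrored roughly as follows. This is a sketch under assumptions: ValueWithUnit and Unit are hypothetical names, not Jena classes, and the set of units is limited to what this example needs.

```java
// Illustrative sketch only: ValueWithUnit and Unit are hypothetical names,
// not Jena classes and not terms from any vocabulary mentioned in this thread.
class ValueWithUnit {
    enum Unit {
        CELSIUS {
            @Override double toKelvin(double v) { return v + 273.15; }
        },
        FAHRENHEIT {
            @Override double toKelvin(double v) { return (v - 32.0) * 5.0 / 9.0 + 273.15; }
        },
        KELVIN {
            @Override double toKelvin(double v) { return v; }
        };

        // Each unit knows how to express its values on one common scale.
        abstract double toKelvin(double v);
    }

    final double value;
    final Unit unit;

    ValueWithUnit(double value, Unit unit) {
        this.value = value;
        this.unit = unit;
    }

    double inKelvin() {
        return unit.toKelvin(value);
    }

    // Compare two measurements only after normalising both to kelvin.
    static int compare(ValueWithUnit a, ValueWithUnit b) {
        return Double.compare(a.inKelvin(), b.inKelvin());
    }

    public static void main(String[] args) {
        ValueWithUnit c = new ValueWithUnit(25.0, Unit.CELSIUS);
        ValueWithUnit f = new ValueWithUnit(25.0, Unit.FAHRENHEIT);
        System.out.println("25 F sorts before 25 C: " + (compare(f, c) < 0));
    }
}
```

Keeping the unit as separate data means the conversion lives in one place and the stored value stays a plain number.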


A similar question is answered on http://answers.semanticweb.com/ [3], I wanted
to try the custom datatype approach anyway... and probably it's not a good idea.

What do you recommend in order to model units of measurement and conversions
between those, making sure people can use SPARQL and/or inference to work with
their data?

I found these vocabularies/ontologies:

 - http://www.w3.org/2007/ont/unit
 - http://qudt.org/vocab/unit#
 - ...

Do you have other vocabularies/ontologies to suggest?

Thanks,
Paolo


 [1] https://github.com/castagna/jena-examples/blob/master/src/main/java/org/apache/jena/examples/ExampleDT_01.java
 [2] https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/main/java/com/hp/hpl/jena/sparql/util/RomanNumeralDatatype.java
 [3] http://answers.semanticweb.com/questions/3572/xsd-or-vocabulary

Re: On unit of measurement and custom datatypes...

Posted by Paolo Castagna <ca...@googlemail.com>.
Hi Andy

Andy Seaborne wrote:
> On 15/12/11 17:25, Paolo Castagna wrote:
>> Hi Andy,
>> thanks for the quick reply.
>>
>> Andy Seaborne wrote:
>>> ARQ has its own evaluation engine.  Currently, it needs code changes to
>>> extend it.  It covers most of XSD already.
>>
>> I had a look into NodeValue.java [1], would it be possible to do
>> something
>> along these lines?
> 
> See NodeValue.compareAlways (which has the SPARQL rules for ORDER BY
> which are more than just "<")

Indeed! :-)

> 
>>
>> Index: src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java
>> ===================================================================
>> --- src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java   
>> (revision 1214849)
>> +++ src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java   
>> (working copy)
>> @@ -1036,6 +1036,14 @@
>>                   boolean b = ((Boolean)lit.getValue()).booleanValue() ;
>>                   return new NodeValueBoolean(b, node) ;
>>               }
>> +
>> +
>> +            Object clazz = lit.getDatatype().getJavaClass();
>> +            if ( clazz.getClass().isInstance(Double.class) )
>> +            {
>> +                double d = ((Number)lit.getValue()).doubleValue() ;
>> +                return new NodeValueDouble(d, node) ;
>> +            }
> 
> That makes it a double, a number, no units.
> (aside from the fact doubles are NOT numbers in XSD - use xsd:decimal -
> their value space is "m × 2^e" i.e. (m,e) pairs, for fixed length m and
> e extended with NaN, Inf, -Inf and a lot of machinery for comparison and
> addition etc for rounding and mapping ).
> 
> Saying a temperature has a value which is a number is wrong.  Different
> value spaces.
> 
> Can you compare the height of a mountain with a temperature?  One is in
> meters, the other in Kelvin.
> 
>>
>>               // If wired into the TypeMapper via
>> RomanNumeralDatatype.enableAsFirstClassDatatype
>>   //            if ( RomanNumeralDatatype.get().isValidLiteral(lit) )
> 
> Note that RomanNumeralDatatype is another way to write integers.  Same
> value space.
> 
> Creating an ordering function that returns a number (no units) means
> comparison is defined.

I got an example of what I was trying to achieve (although I am not sure
the implementation is the best way to do it... in particular I think the
parse() method is still 'wrong' because it returns a Double).

Code here:
https://github.com/castagna/jena-examples/blob/master/src/main/java/org/apache/jena/examples/ExampleDT_01.java
https://github.com/castagna/jena-examples/blob/master/src/main/java/org/apache/jena/examples/temperature.java

Output:

---- Data ----
<x4> <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "25"^^<http://jena.apache.org/datatypes/temperature/kelvin> .
<x2> <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "15"^^<http://jena.apache.org/datatypes/temperature/celsius> .
<x5> <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "25"^^<http://jena.apache.org/datatypes/temperature/rankine> .
<x3> <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "25"^^<http://jena.apache.org/datatypes/temperature/fahrenheit> .
<x1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "25"^^<http://jena.apache.org/datatypes/temperature/celsius> .

---- Query ----
PREFIX java: <java:org.apache.jena.examples.>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT * WHERE {
    ?s rdf:value ?temperature .
}
ORDER BY java:temperature( ?temperature )

---- Results ----
--------------------------------------------------------------------------
| s    | temperature                                                     |
==========================================================================
| <x5> | "25"^^<http://jena.apache.org/datatypes/temperature/rankine>    |
| <x4> | "25"^^<http://jena.apache.org/datatypes/temperature/kelvin>     |
| <x3> | "25"^^<http://jena.apache.org/datatypes/temperature/fahrenheit> |
| <x2> | "15"^^<http://jena.apache.org/datatypes/temperature/celsius>    |
| <x1> | "25"^^<http://jena.apache.org/datatypes/temperature/celsius>    |
--------------------------------------------------------------------------
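
The ordering in the table is what you would expect from sorting on a kelvin key. The sketch below reproduces it in plain Java, without Jena on the classpath. It is not the code from the linked temperature.java: the class name, the toKelvin helper and the row layout are assumptions made for illustration. In ARQ the same conversion would sit inside a function class loaded through the java: prefix.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only: TemperatureOrdering and toKelvin are hypothetical names.
class TemperatureOrdering {
    static final String BASE = "http://jena.apache.org/datatypes/temperature/";

    // Map a lexical form plus datatype URI to a plain number on the kelvin scale.
    static double toKelvin(String lexicalForm, String datatypeUri) {
        if (!datatypeUri.startsWith(BASE)) {
            throw new IllegalArgumentException("Not a temperature datatype: " + datatypeUri);
        }
        double v = Double.parseDouble(lexicalForm);
        switch (datatypeUri.substring(BASE.length())) {
            case "kelvin":     return v;
            case "celsius":    return v + 273.15;
            case "fahrenheit": return (v - 32.0) * 5.0 / 9.0 + 273.15;
            case "rankine":    return v * 5.0 / 9.0;
            default:
                throw new IllegalArgumentException("Unknown temperature unit: " + datatypeUri);
        }
    }

    // Sort the five literals from the data above and return the subjects in order.
    static List<String> sortedSubjects() {
        List<String[]> rows = new ArrayList<>(); // { subject, lexical form, datatype URI }
        rows.add(new String[] {"x4", "25", BASE + "kelvin"});
        rows.add(new String[] {"x2", "15", BASE + "celsius"});
        rows.add(new String[] {"x5", "25", BASE + "rankine"});
        rows.add(new String[] {"x3", "25", BASE + "fahrenheit"});
        rows.add(new String[] {"x1", "25", BASE + "celsius"});
        rows.sort(Comparator.comparingDouble((String[] r) -> toKelvin(r[1], r[2])));

        List<String> subjects = new ArrayList<>();
        for (String[] r : rows) {
            subjects.add(r[0]);
        }
        return subjects;
    }

    public static void main(String[] args) {
        System.out.println(sortedSubjects());
    }
}
```

The key is a bare number, so comparison is well defined even though the literals themselves carry different datatypes.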


Thanks.

Paolo


> 
>>   [1]
>> https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java
>>

Re: On unit of measurement and custom datatypes...

Posted by nat lu <na...@gmail.com>.
On 15/12/11 18:10, Andy Seaborne wrote:
> On 15/12/11 17:25, Paolo Castagna wrote:
>> Hi Andy,
>> thanks for the quick reply.
>>
>> Andy Seaborne wrote:
>>> ARQ has its own evaluation engine.  Currently, it needs code 
>>> changes to
>>> extend it.  It covers most of XSD already.
>>
>> I had a look into NodeValue.java [1], would it be possible to do 
>> something
>> along these lines?
>
> See NodeValue.compareAlways (which has the SPARQL rules for ORDER BY 
> which are more than just "<")
>
>>
>> Index: src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java
>> ===================================================================
>> --- src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java    
>> (revision 1214849)
>> +++ src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java    
>> (working copy)
>> @@ -1036,6 +1036,14 @@
>>                   boolean b = ((Boolean)lit.getValue()).booleanValue() ;
>>                   return new NodeValueBoolean(b, node) ;
>>               }
>> +
>> +
>> +            Object clazz = lit.getDatatype().getJavaClass();
>> +            if ( clazz.getClass().isInstance(Double.class) )
>> +            {
>> +                double d = ((Number)lit.getValue()).doubleValue() ;
>> +                return new NodeValueDouble(d, node) ;
>> +            }
>
> That makes it a double, a number, no units.
> (aside from the fact doubles are NOT numbers in XSD - use xsd:decimal 
> - their value space is "m × 2^e" i.e. (m,e) pairs, for fixed length m 
> and e extended with NaN, Inf, -Inf and a lot of machinery for 
> comparison and addition etc for rounding and mapping ).
>
> Saying a temperature has a value which is a number is wrong.  
> Different value spaces.
>
> Can you compare the height of a mountain with a temperature?  One is in 
> meters, the other in Kelvin.
>
>>
>>               // If wired into the TypeMapper via 
>> RomanNumeralDatatype.enableAsFirstClassDatatype
>>   //            if ( RomanNumeralDatatype.get().isValidLiteral(lit) )
>
> Note that RomanNumeralDatatype is another way to write integers.  
> Same value space.
>
> Creating an ordering function that returns a number (no units) means 
> comparison is defined.
>
>>   [1] 
>> https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java

As an aside: units can change definition over time, or mean different 
things in different localities: British miles, US gallons, the metric 
tonne versus the imperial ton, and so on.

Re: On unit of measurement and custom datatypes...

Posted by Andy Seaborne <an...@apache.org>.
On 15/12/11 17:25, Paolo Castagna wrote:
> Hi Andy,
> thanks for the quick reply.
>
> Andy Seaborne wrote:
>> ARQ has its own evaluation engine.  Currently, it needs code changes to
>> extend it.  It covers most of XSD already.
>
> I had a look into NodeValue.java [1], would it be possible to do something
> along these lines?

See NodeValue.compareAlways (which has the SPARQL rules for ORDER BY 
which are more than just "<")

>
> Index: src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java
> ===================================================================
> --- src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java	(revision 1214849)
> +++ src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java	(working copy)
> @@ -1036,6 +1036,14 @@
>                   boolean b = ((Boolean)lit.getValue()).booleanValue() ;
>                   return new NodeValueBoolean(b, node) ;
>               }
> +
> +
> +            Object clazz = lit.getDatatype().getJavaClass();
> +            if ( clazz.getClass().isInstance(Double.class) )
> +            {
> +                double d = ((Number)lit.getValue()).doubleValue() ;
> +                return new NodeValueDouble(d, node) ;
> +            }

That makes it a double, a number, no units.
(aside from the fact doubles are NOT numbers in XSD - use xsd:decimal - 
their value space is "m × 2^e" i.e. (m,e) pairs, for fixed length m and 
e extended with NaN, Inf, -Inf and a lot of machinery for comparison and 
addition etc for rounding and mapping ).
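
To illustrate the point about the double value space, here is a small plain-Java sketch. The class and method names are made up for the example. It shows that a decimal lexical form such as "0.1" does not map to exactly that value as a double, and that NaN does not even equal itself.

```java
import java.math.BigDecimal;

// Illustrative only: DoubleVersusDecimal is a hypothetical helper class.
class DoubleVersusDecimal {
    // True when the double parsed from the lexical form has exactly the decimal value written.
    static boolean doubleIsExact(String lexicalForm) {
        BigDecimal viaDouble = new BigDecimal(Double.parseDouble(lexicalForm));
        BigDecimal viaDecimal = new BigDecimal(lexicalForm);
        return viaDouble.compareTo(viaDecimal) == 0;
    }

    // NaN is part of the double value space and compares unequal to everything, itself included.
    static boolean nanEqualsItself() {
        double nan = Double.NaN;
        return nan == nan;
    }

    public static void main(String[] args) {
        System.out.println("0.1 exact as double: " + doubleIsExact("0.1"));
        System.out.println("0.5 exact as double: " + doubleIsExact("0.5"));
        System.out.println("NaN == NaN: " + nanEqualsItself());
        System.out.println("0.1 + 0.2 == 0.3: " + (0.1 + 0.2 == 0.3));
    }
}
```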

Saying a temperature has a value which is a number is wrong.  Different 
value spaces.

Can you compare the height of a mountain with a temperature?  One is in 
meters, the other in Kelvin.
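
One way to picture "different value spaces" in code is to keep the dimension next to the number and refuse comparisons across dimensions. This is a sketch under assumptions: Quantity and Dimension are hypothetical names, not Jena or ARQ classes, and the numbers are arbitrary.

```java
// Illustrative sketch only: Quantity and Dimension are hypothetical names.
class Quantity {
    enum Dimension { LENGTH, TEMPERATURE }

    final Dimension dimension;
    final double baseValue; // metres for LENGTH, kelvin for TEMPERATURE

    Quantity(Dimension dimension, double baseValue) {
        this.dimension = dimension;
        this.baseValue = baseValue;
    }

    // Comparison is only defined inside one value space.
    static int compare(Quantity a, Quantity b) {
        if (a.dimension != b.dimension) {
            throw new IllegalArgumentException(
                "Cannot compare " + a.dimension + " with " + b.dimension);
        }
        return Double.compare(a.baseValue, b.baseValue);
    }

    public static void main(String[] args) {
        Quantity cold = new Quantity(Dimension.TEMPERATURE, 269.26);
        Quantity warm = new Quantity(Dimension.TEMPERATURE, 298.15);
        Quantity mountain = new Quantity(Dimension.LENGTH, 1000.0);

        System.out.println(compare(cold, warm) < 0);
        try {
            compare(mountain, warm);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Mapping both literals to a bare double drops the dimension, so the check above can no longer be made.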

>
>               // If wired into the TypeMapper via RomanNumeralDatatype.enableAsFirstClassDatatype
>   //            if ( RomanNumeralDatatype.get().isValidLiteral(lit) )

Note that RomanNumeralDatatype is another way to write integers.  Same 
value space.

Creating an ordering function that returns a number (no units) means 
comparison is defined.

>   [1] https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java

Re: On unit of measurement and custom datatypes...

Posted by Paolo Castagna <ca...@googlemail.com>.
Hi Andy,
thanks for the quick reply.

Andy Seaborne wrote:
> ARQ has its own evaluation engine.  Currently, it needs code changes to
> extend it.  It covers most of XSD already.

I had a look into NodeValue.java [1], would it be possible to do something
along these lines?

Index: src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java
===================================================================
--- src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java	(revision 1214849)
+++ src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java	(working copy)
@@ -1036,6 +1036,14 @@
                 boolean b = ((Boolean)lit.getValue()).booleanValue() ;
                 return new NodeValueBoolean(b, node) ;
             }
+
+
+            Object clazz = lit.getDatatype().getJavaClass();
+            if ( clazz.getClass().isInstance(Double.class) )
+            {
+                double d = ((Number)lit.getValue()).doubleValue() ;
+                return new NodeValueDouble(d, node) ;
+            }

             // If wired into the TypeMapper via RomanNumeralDatatype.enableAsFirstClassDatatype
 //            if ( RomanNumeralDatatype.get().isValidLiteral(lit) )
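
A side note on the condition in the patch as posted: getJavaClass() returns a Class object, so clazz.getClass() is always Class.class, and Double.class is itself an instance of Class. The test therefore appears to pass for any non-null Java class, not only Double. The sketch below shows this in plain Java. The class and method names are made up, and the alternative check is one possible reading of the intent rather than a vetted fix for ARQ.

```java
// Illustrative only: IsInstanceCheck is a hypothetical helper class.
class IsInstanceCheck {
    // The condition as written in the patch. When clazz is a Class object,
    // clazz.getClass() is Class.class, and Double.class is an instance of Class.
    static boolean asPosted(Object clazz) {
        return clazz.getClass().isInstance(Double.class);
    }

    // One possible alternative: is the datatype's Java class Double?
    static boolean alternative(Class<?> clazz) {
        return clazz != null && Double.class.isAssignableFrom(clazz);
    }

    public static void main(String[] args) {
        System.out.println(asPosted(String.class));    // a datatype mapped to String
        System.out.println(asPosted(Double.class));    // a datatype mapped to Double
        System.out.println(alternative(String.class));
        System.out.println(alternative(Double.class));
    }
}
```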


 [1] https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/main/java/com/hp/hpl/jena/sparql/expr/NodeValue.java

>
> notes/typed-literals.html does not describe comparison, only
> value-equality.
> 
> Use SPARQL custom functions.
> 
> ORDER BY my:orderingFunction(?temperature)
> 
> where my:orderingFunction returns the value as a number on, say, the
> Kelvin scale.

Ack, I'll follow this advice.

Thanks,
Paolo

> 
>     Andy
> 
> On 15/12/11 15:37, Paolo Castagna wrote:
>> Hi,
>> I tried to create a custom datatype with Jena and use it from a SPARQL
>> query.
>> For example, I used temperatures in °C or °F. The complete example is
>> here [1].
>>
>> I found the documentation here quite useful:
>>
>>   - Typed literals how-to
>>    
>> http://incubator.apache.org/jena/documentation/notes/typed-literals.html
>>
>> and I looked at the RomanNumeralDatatype [2] implementation in ARQ.
>>
>> I created a TemperatureCelsius and TemperatureFahrenheit which extend
>> BaseDatatype.
>> I'd like to be able to automatically compare or sort temperatures from
>> my SPARQL
>> queries. For example, with a query like this:
>>
>>    PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>>    SELECT * WHERE {
>>      ?s rdf:value ?temperature .
>>    }
>>    ORDER BY ?temperature
>>
>> Currently I get:
>>
>> ( ?temperature =
>> "15"^^<http://jena.apache.org/datatypes/temperature/celsius>  ) ( ?s
>> =<x2>  ) ->  [Root]
>> ( ?temperature =
>> "25"^^<http://jena.apache.org/datatypes/temperature/celsius>  ) ( ?s
>> =<x1>  ) ->  [Root]
>> ( ?temperature =
>> "25"^^<http://jena.apache.org/datatypes/temperature/fahrenheit>  ) (
>> ?s =<x3>  ) ->  [Root]
>>
>> But, 25 °F = -3.89 °C, therefore 25 °F should be the first on the list.
>>
>> Is there a way I can automatically apply the necessary conversions and
>> ensure that
>> the ORDER BY sorts temperatures correctly?
>>
>> An alternative is to not use custom datatypes and simply have
>> something like this:
>>
>>    :x :temperature [
>>      rdf:value "25.0" ;
>>      :unit :Celsius ;
>>    ]
>>
>>
>> A similar question is answered on http://answers.semanticweb.com/ [3],
>> I wanted
>> to try the custom datatype approach anyway... and probably it's not a
>> good idea.
>>
>> What do you recommend in order to model units of measurement and
>> conversions
>> between those, making sure people can use SPARQL and/or inference to
>> work with
>> their data?
>>
>> I found these vocabularies/ontologies:
>>
>>   - http://www.w3.org/2007/ont/unit
>>   - http://qudt.org/vocab/unit#
>>   - ...
>>
>> Do you have other vocabularies/ontologies to suggest?
>>
>> Thanks,
>> Paolo
>>
>>
>>   [1]
>> https://github.com/castagna/jena-examples/blob/master/src/main/java/org/apache/jena/examples/ExampleDT_01.java
>>
>>   [2]
>> https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/main/java/com/hp/hpl/jena/sparql/util/RomanNumeralDatatype.java
>>
>>   [3] http://answers.semanticweb.com/questions/3572/xsd-or-vocabulary
> 

Re: On unit of measurement and custom datatypes...

Posted by Andy Seaborne <an...@apache.org>.
ARQ has its own evaluation engine.  Currently, it needs code changes to 
extend it.  It covers most of XSD already.

notes/typed-literals.html does not describe comparison, only value-equality.

Use SPARQL custom functions.

ORDER BY my:orderingFunction(?temperature)

where my:orderingFunction returns the value as a number on, say, the 
Kelvin scale.

	Andy

On 15/12/11 15:37, Paolo Castagna wrote:
> Hi,
> I tried to create a custom datatype with Jena and use it from a SPARQL query.
> For example, I used temperatures in °C or °F. The complete example is here [1].
>
> I found the documentation here quite useful:
>
>   - Typed literals how-to
>     http://incubator.apache.org/jena/documentation/notes/typed-literals.html
>
> and I looked at the RomanNumeralDatatype [2] implementation in ARQ.
>
> I created a TemperatureCelsius and TemperatureFahrenheit which extend BaseDatatype.
> I'd like to be able to automatically compare or sort temperatures from my SPARQL
> queries. For example, with a query like this:
>
>    PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>    SELECT * WHERE {
>      ?s rdf:value ?temperature .
>    }
>    ORDER BY ?temperature
>
> Currently I get:
>
> ( ?temperature = "15"^^<http://jena.apache.org/datatypes/temperature/celsius>  ) ( ?s =<x2>  ) ->  [Root]
> ( ?temperature = "25"^^<http://jena.apache.org/datatypes/temperature/celsius>  ) ( ?s =<x1>  ) ->  [Root]
> ( ?temperature = "25"^^<http://jena.apache.org/datatypes/temperature/fahrenheit>  ) ( ?s =<x3>  ) ->  [Root]
>
> But, 25 °F = -3.89 °C, therefore 25 °F should be the first on the list.
>
> Is there a way I can automatically apply the necessary conversions and ensure that
> the ORDER BY sorts temperatures correctly?
>
> An alternative is to not use custom datatypes and simply have something like this:
>
>    :x :temperature [
>      rdf:value "25.0" ;
>      :unit :Celsius ;
>    ]
>
>
> A similar question is answered on http://answers.semanticweb.com/ [3], I wanted
> to try the custom datatype approach anyway... and probably it's not a good idea.
>
> What do you recommend in order to model units of measurement and conversions
> between those, making sure people can use SPARQL and/or inference to work with
> their data?
>
> I found these vocabularies/ontologies:
>
>   - http://www.w3.org/2007/ont/unit
>   - http://qudt.org/vocab/unit#
>   - ...
>
> Do you have other vocabularies/ontologies to suggest?
>
> Thanks,
> Paolo
>
>
>   [1] https://github.com/castagna/jena-examples/blob/master/src/main/java/org/apache/jena/examples/ExampleDT_01.java
>   [2] https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/main/java/com/hp/hpl/jena/sparql/util/RomanNumeralDatatype.java
>   [3] http://answers.semanticweb.com/questions/3572/xsd-or-vocabulary