Posted to users@jena.apache.org by Tim Harsch <ha...@gmail.com> on 2014/05/12 19:26:52 UTC
typed literals to java classes, problem with short and byte.
According to the docs:
http://jena.apache.org/documentation/notes/typed-literals.html
These are all available as static member variables from
com.hp.hpl.jena.datatypes.xsd.XSDDatatype <http://jena.apache.org/documentation/javadoc/jena/com/hp/hpl/jena/datatypes/xsd/XSDDatatype.html>.
Of these types, the following are registered as the default type to use to
represent certain Java classes:
  Java class    xsd type
  Float         float
  Double        double
  Integer       int
  Long          long
  Short         short
  Byte          byte
  BigInteger    integer
  BigDecimal    decimal
  Boolean       boolean
  String        string
This is what I am seeing for xsd:short and xsd:byte. I'm puzzled by the
type from getValue.
CODE:
System.out.println( "RDFDatatype: " + literal.getDatatype().toString() );
System.out.println( "Datatype URI: " + literal.getDatatypeURI() );
System.out.println( "getValue java class: " + ((Literal)literal).getValue().getClass() );
OUTPUT:
RDFDatatype: Datatype[http://www.w3.org/2001/XMLSchema#byte -> class java.lang.Byte]
Datatype URI: http://www.w3.org/2001/XMLSchema#byte
getValue java class: class java.lang.Integer

RDFDatatype: Datatype[http://www.w3.org/2001/XMLSchema#short -> class java.lang.Short]
Datatype URI: http://www.w3.org/2001/XMLSchema#short
getValue java class: class java.lang.Integer
So, is this the expected behavior?
Thanks,
Tim
Re: typed literals to java classes, problem with short and byte.
Posted by Dave Reynolds <da...@gmail.com>.
Hi Tim,
On 13/05/14 18:53, Tim Harsch wrote:
> Thanks Dave. Makes sense. Why, though, does RDFDatatype say the class
> would be Byte and Short? I guess there is no code that consults
> RDFDatatype to ask what the type should be before creating it. Is this
> just an inconsistency in the API? Or a bug in the code?
Arguably an insufficiently clear javadoc.
The issue is that the TypeMapper, which tells you what datatype to use
when *encoding* a java type, is currently initialized from the
getJavaClass() for those datatypes. We wanted people to be able to use
shorts and bytes in java and still get them encoded appropriately.
Which is why the javadoc for RDFDatatype#getJavaClass says:
"""
If this datatype is used as the canonical representation for a
particular java datatype then return that java type, otherwise returns null.
"""
I.e. it records the java to xsd mapping, which is not the same as the
xsd to java mapping if we don't enforce strict round tripping.
In fact the type mapper allows you to register types directly, which
allows us to have a many-to-one map from java class to RDF datatype. So
the use of getJavaClass is not really necessary and arguably confusing
in a world without round tripping guarantees.
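To make the asymmetry concrete, here is a toy model in plain Java. It is not Jena code: the class `MappingSketch` and its `register`, `datatypeFor` and `decode` methods are hypothetical names invented for this illustration, and the decoding rule is simplified to "Integer if it fits, otherwise Long". It only shows that an encoding map keyed by Java class can be many-to-one while decoding never consults it, so a Short does not come back as a Short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of the asymmetry described above; this is not Jena code. */
final class MappingSketch {
    static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    // Encoding direction: Java class -> datatype URI. Several classes may share one URI.
    private static final Map<Class<?>, String> ENCODE = new HashMap<>();

    static {
        register(Byte.class, XSD + "byte");
        register(Short.class, XSD + "short");
        register(Integer.class, XSD + "int");
        register(Long.class, XSD + "long");
    }

    /** Hypothetical direct registration, standing in for registering types with a type mapper. */
    static void register(Class<?> javaClass, String datatypeUri) {
        ENCODE.put(javaClass, datatypeUri);
    }

    /** Returns the datatype URI used to encode the given value, or null if none is registered. */
    static String datatypeFor(Object value) {
        return ENCODE.get(value.getClass());
    }

    /** Decoding direction: ignores ENCODE and the datatype URI, and picks a type by magnitude only. */
    static Number decode(String lexicalForm, String datatypeUri) {
        long v = Long.parseLong(lexicalForm);
        if (v >= Integer.MIN_VALUE && v <= Integer.MAX_VALUE) {
            return Integer.valueOf((int) v);
        }
        return Long.valueOf(v);
    }

    public static void main(String[] args) {
        Short original = Short.valueOf((short) 7);
        String uri = datatypeFor(original);
        Number back = decode(original.toString(), uri);
        // The value went out as a Short but comes back as an Integer.
        System.out.println(original.getClass().getSimpleName() + " -> " + uri
                + " -> " + back.getClass().getSimpleName());

        // Many-to-one: a second Java class registered against an already-used datatype.
        register(AtomicInteger.class, XSD + "int");
        System.out.println(datatypeFor(new AtomicInteger(1)));
    }
}
```

In this model the Java-to-xsd table can grow freely without any effect on what decoding returns, which is the sense in which the two mappings are independent.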
Dave
Re: typed literals to java classes, problem with short and byte.
Posted by Tim Harsch <ha...@gmail.com>.
Thanks Dave. Makes sense. Why, though, does RDFDatatype say the class
would be Byte and Short? I guess there is no code that consults
RDFDatatype to ask what the type should be before creating it. Is this
just an inconsistency in the API? Or a bug in the code?
Thanks,
Tim
Re: typed literals to java classes, problem with short and byte.
Posted by Dave Reynolds <da...@gmail.com>.
On 12/05/14 18:26, Tim Harsch wrote:
> According to the docs:
> http://jena.apache.org/documentation/notes/typed-literals.html
>
> These are all available as static member variables from
> com.hp.hpl.jena.datatypes.xsd.XSDDatatype <http://jena.apache.org/documentation/javadoc/jena/com/hp/hpl/jena/datatypes/xsd/XSDDatatype.html>.
>
> Of these types, the following are registered as the default type to use to
> represent certain Java classes:
>   Java class    xsd type
>   Float         float
>   Double        double
>   Integer       int
>   Long          long
>   Short         short
>   Byte          byte
>   BigInteger    integer
>   BigDecimal    decimal
>   Boolean       boolean
>   String        string
>
> This is what I am seeing for xsd:short and xsd:byte. I'm puzzled by the
> type from getValue.
>
> CODE:
>
> System.out.println( "RDFDatatype: " + literal.getDatatype().toString() );
> System.out.println( "Datatype URI: " + literal.getDatatypeURI() );
> System.out.println( "getValue java class: " +
> ((Literal)literal).getValue().getClass()
> );
>
> OUTPUT:
>
> RDFDatatype: Datatype[http://www.w3.org/2001/XMLSchema#byte -> class java.lang.Byte]
> Datatype URI: http://www.w3.org/2001/XMLSchema#byte
> getValue java class: class java.lang.Integer
>
> RDFDatatype: Datatype[http://www.w3.org/2001/XMLSchema#short -> class java.lang.Short]
> Datatype URI: http://www.w3.org/2001/XMLSchema#short
> getValue java class: class java.lang.Integer
>
> So, is this the expected behavior?
Yes, or at least that's the implemented behaviour and has been for some
time.
The getValue() code picks a Java datatype big enough for the actual
value out of Integer, Long and BigInteger.
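As a rough sketch of that selection rule in plain Java (not Jena's actual implementation; the class `NarrowingSketch` and its `narrow` method are hypothetical names used only here), the lexical form is parsed as an arbitrary-size integer and the smallest of Integer, Long and BigInteger that can hold it is returned. Note that the declared datatype (xsd:byte, xsd:short, xsd:integer, ...) plays no part in the choice.

```java
import java.math.BigInteger;

/** Illustrative only: mimics the "big enough" selection described above; not Jena's code. */
final class NarrowingSketch {

    /** Parses an integer lexical form and returns it as Integer, Long or BigInteger. */
    static Number narrow(String lexicalForm) {
        BigInteger value = new BigInteger(lexicalForm.trim());
        // bitLength() excludes the sign bit, so 31 bits or fewer fits in an int.
        if (value.bitLength() <= 31) {
            return Integer.valueOf(value.intValue());
        }
        // Likewise, 63 bits or fewer fits in a long.
        if (value.bitLength() <= 63) {
            return Long.valueOf(value.longValue());
        }
        return value;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"5", "3000000000", "99999999999999999999"}) {
            System.out.println(s + " -> " + narrow(s).getClass().getSimpleName());
        }
    }
}
```

Under this rule the literals "5"^^xsd:byte and "5"^^xsd:short would both yield an Integer, which matches the output reported in the original question.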
Arguably it would be better if it round tripped so that a java short
would become an xsd:short and would return a Short from getValue.
The issue is largely historical. Partly it's that the code was developed
while the RDF datatype handling was still in flux. Partly it's
convenience: a lot of people use xsd:integer (i.e. arbitrary size) in
their RDF (because that's what you get in Turtle if you use number
syntax) but expect the values to be Integers in Java "unless they are too
big". Round-tripping from Java was never a requirement. Having once
implemented it that way, we would create a backward compatibility issue if
we changed it.
I suspect that changing so that short and byte round tripped would be
OK. But equally I suspect that dropping the truncation of smaller
BigIntegers to Integers would cause problems.
This might be something to revisit in any future Jena 3, though it doesn't
seem like much of a priority: xsd:byte and xsd:short don't seem to be
used much in RDF in the wild.
Dave