Posted to users@jena.apache.org by "Ozga, Rafal" <r....@nature.com> on 2012/02/01 16:53:20 UTC
RIOT, quads and quotes in literals
Hi,
I'm getting the following exception:
com.hp.hpl.jena.datatypes.DatatypeFormatException: Lexical form '<div><oid
id="aff1"/>Department of Medical Scienconoma de Madrid</div>' is not a legal
instance of Datatype[http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral]
Bad rdf:XMLLiteral
while trying to load the following quad:
<http://test.com/subj> <http://test.com/prop> "<div><oid
id=\"aff1\"/>Department of Medical Scienconoma de
Madrid</div>"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>
<http://ns.nature.com/graphs/somegraph> .
using the RIOT library. The following piece of the code reproduces that
problem:
public static void main(String[] args) throws IOException {
    // Load the N-Quads test file from the classpath.
    final InputStream io =
        Thread.currentThread().getContextClassLoader().getResourceAsStream("test.nq");
    SysRIOT.wireIntoJena();
    RiotLoader.readQuads(io, Lang.NQUADS, "", new Sink<Quad>() {
        @Override
        public void send(Quad item) {
            System.out.println(item.toString());
            // This call throws DatatypeFormatException for the quad above.
            System.out.println(item.getObject().getLiteralValue());
        }
        @Override
        public void close() {
        }
        @Override
        public void flush() {
        }
    });
}
It seems that the problem lies in those quotes around aff1: for some reason
RIOT omits the escaping backslashes while reading the input stream (at least
item.toString() shows the item value without quotes).
Rafal
********************************************************************************
DISCLAIMER: This e-mail is confidential and should not be used by anyone who is
not the original intended recipient. If you have received this e-mail in error
please inform the sender and delete it from your mailbox or any other storage
mechanism. Neither Macmillan Publishers Limited nor any of its agents accept
liability for any statements made which are clearly the sender's own and not
expressly made on behalf of Macmillan Publishers Limited or one of its agents.
Please note that neither Macmillan Publishers Limited nor any of its agents
accept any responsibility for viruses that may be contained in this e-mail or
its attachments and it is your responsibility to scan the e-mail and
attachments (if any). No contracts may be concluded on behalf of Macmillan
Publishers Limited or its agents by means of e-mail communication. Macmillan
Publishers Limited Registered in England and Wales with registered number 785998
Registered Office Brunel Road, Houndmills, Basingstoke RG21 6XS
********************************************************************************
Re: RIOT, quads and quotes in literals
Posted by Andy Seaborne <an...@apache.org>.
On 01/02/12 15:53, Ozga, Rafal wrote:
> Hi,
>
> I'm getting the following exception:
>
> com.hp.hpl.jena.datatypes.DatatypeFormatException: Lexical form '<div><oid
> id="aff1"/>Department of Medical Scienconoma de Madrid</div>' is not a legal
> instance of Datatype[http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral]
> Bad rdf:XMLLiteral
>
> while trying to load the following quad:
>
> <http://test.com/subj> <http://test.com/prop> "<div><oid
> id=\"aff1\"/>Department of Medical Scienconoma de
> Madrid</div>"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>
> <http://ns.nature.com/graphs/somegraph> .
The rules for XMLLiterals are bizarre, complicated and basically out to
get you. It's not the quotes, but you're right to worry about them.
An rdf:XMLLiteral must be canonical by the rules of [1], which includes
writing
<oid id=\"aff1\"/>
as:
<oid id=\"aff1\"></oid>
With that, it works for me.
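To illustrate just the empty-element rule mentioned above, here is a minimal sketch using only the JDK's DOM parser. The class name EmptyElementExpander and its expand method are hypothetical, not part of Jena or RIOT, and this is not a full XML canonicalizer: it assumes a fragment with a single root element and no namespaces, skips comments and processing instructions, and only rewrites each element as a start/end tag pair with attributes sorted by name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: NOT a complete canonicalizer, only a sketch of one rule.
public class EmptyElementExpander {

    /** Re-serializes the fragment so every element has explicit start and end tags. */
    public static String expand(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        write(doc.getDocumentElement(), out);
        return out.toString();
    }

    private static void write(Node node, StringBuilder out) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            out.append(escape(node.getNodeValue(), false));
            return;
        }
        if (type != Node.ELEMENT_NODE) {
            return; // comments and processing instructions are skipped in this sketch
        }
        out.append('<').append(node.getNodeName());
        // Collect and sort attributes by name so the output is deterministic.
        NamedNodeMap attrs = node.getAttributes();
        List<Attr> sorted = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            sorted.add((Attr) attrs.item(i));
        }
        sorted.sort(Comparator.comparing(Attr::getName));
        for (Attr a : sorted) {
            out.append(' ').append(a.getName())
               .append("=\"").append(escape(a.getValue(), true)).append('"');
        }
        out.append('>');
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            write(c, out);
        }
        // Always emit an end tag, even when the element has no children.
        out.append("</").append(node.getNodeName()).append('>');
    }

    private static String escape(String s, boolean attribute) {
        String r = s.replace("&", "&amp;").replace("<", "&lt;");
        return attribute ? r.replace("\"", "&quot;") : r.replace(">", "&gt;");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(expand(
            "<div><oid id=\"aff1\"/>Department of Medical Scienconoma de Madrid</div>"));
    }
}
```

The string returned by expand is the unescaped lexical form; when it is written into an N-Quads file, the double quotes inside it still have to be escaped as \" as in the quad above.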
The error you are getting is not from RIOT; it comes from
Node.getLiteralValue(). If you call that method on any illegal
literal, e.g. an integer with lexical form "foo", you get an exception.
If you parse with the command line tools, you just get a warning.
Try getLiteralLexicalForm() instead of getLiteralValue().
Avoiding XMLLiterals and using your own datatype might be worth
considering if you want to take XML strings from an existing data source.
Andy
[1] http://www.w3.org/TR/xml-c14n
> using the RIOT library. The following piece of the code reproduces that
> problem:
>
> public static void main(String[] args) throws IOException {
> final InputStream io =
> Thread.currentThread().getContextClassLoader().getResourceAsStream("test.nq"
> );
>
> SysRIOT.wireIntoJena();
> RiotLoader.readQuads(io, Lang.NQUADS, "", new Sink<Quad>() {
>
> @Override
> public void send(Quad item) {
> System.out.println(item.toString());
> System.out.println(item.getObject().getLiteralValue());
> }
>
> @Override
> public void close() {
> }
>
> @Override
> public void flush() {
> }
> });
> }
>
>
> It seems that the problem lies in those quotes around aff1: for some reason
> RIOT omits the escaping backslashes while reading the input stream (at least
> item.toString() shows the item value without quotes).
>
> Rafal