Posted to users@jena.apache.org by "Ozga, Rafal" <r....@nature.com> on 2012/02/01 16:53:20 UTC

RIOT, quads and quotes in literals

Hi,

I'm getting the following exception:

com.hp.hpl.jena.datatypes.DatatypeFormatException: Lexical form '<div><oid
id="aff1"/>Department of Medical Scienconoma de Madrid</div>' is not a legal
instance of Datatype[http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral]
Bad rdf:XMLLiteral

while trying to load the following quad:

<http://test.com/subj> <http://test.com/prop> "<div><oid
id=\"aff1\"/>Department of Medical Scienconoma de
Madrid</div>"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>
<http://ns.nature.com/graphs/somegraph> .

using the RIOT library. The following piece of the code reproduces that
problem:

public static void main(String[] args) throws IOException {
    final InputStream io = Thread.currentThread()
            .getContextClassLoader()
            .getResourceAsStream("test.nq");

    SysRIOT.wireIntoJena();
    RiotLoader.readQuads(io, Lang.NQUADS, "", new Sink<Quad>() {

        @Override
        public void send(Quad item) {
            System.out.println(item.toString());
            System.out.println(item.getObject().getLiteralValue());
        }

        @Override
        public void close() {
        }

        @Override
        public void flush() {
        }
    });
}


It seems that the problem lies in those quotes around aff1: for some reason
RIOT omits the escaping backslashes while reading the input stream (at least
item.toString() shows the item value without quotes).

Rafal 

 

********************************************************************************   
DISCLAIMER: This e-mail is confidential and should not be used by anyone who is
not the original intended recipient. If you have received this e-mail in error
please inform the sender and delete it from your mailbox or any other storage
mechanism. Neither Macmillan Publishers Limited nor any of its agents accept
liability for any statements made which are clearly the sender's own and not
expressly made on behalf of Macmillan Publishers Limited or one of its agents.
Please note that neither Macmillan Publishers Limited nor any of its agents
accept any responsibility for viruses that may be contained in this e-mail or
its attachments and it is your responsibility to scan the e-mail and 
attachments (if any). No contracts may be concluded on behalf of Macmillan 
Publishers Limited or its agents by means of e-mail communication. Macmillan 
Publishers Limited Registered in England and Wales with registered number 785998 
Registered Office Brunel Road, Houndmills, Basingstoke RG21 6XS   
********************************************************************************

Re: RIOT, quads and quotes in literals

Posted by Andy Seaborne <an...@apache.org>.
On 01/02/12 15:53, Ozga, Rafal wrote:
> Hi,
>
> I'm getting the following exception:
>
> com.hp.hpl.jena.datatypes.DatatypeFormatException: Lexical form '<div><oid
> id="aff1"/>Department of Medical Scienconoma de Madrid</div>' is not a legal
> instance of Datatype[http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral]
> Bad rdf:XMLLiteral
>
> while trying to load the following quad:
>
> <http://test.com/subj>  <http://test.com/prop>  "<div><oid
> id=\"aff1\"/>Department of Medical Scienconoma de
> Madrid</div>"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>
> <http://ns.nature.com/graphs/somegraph>  .

The rules for XMLLiterals are bizarre, complicated and basically out to 
get you.  It's not the quotes, but you're right to worry about them.
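
To illustrate why the quotes themselves are fine: in N-Quads, \" inside a
string is only the file-level escape for a quote character, so the lexical
form a parser hands back no longer contains the backslashes. The sketch
below is a minimal, standard-library-only decoder for the basic escapes. It
is not RIOT's code; the class and method names are made up for this example,
and \uXXXX escapes are deliberately left out.

```java
final class NQuadsEscapeDemo {

    // Decode the basic N-Triples/N-Quads string escapes: \" \\ \n \r \t.
    // Assumption: simplified for illustration, \uXXXX and \UXXXXXXXX are not handled.
    static String unescape(String raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c != '\\' || i + 1 == raw.length()) {
                out.append(c);          // ordinary character, or a trailing lone backslash
                continue;
            }
            char next = raw.charAt(++i);
            switch (next) {
                case '"':  out.append('"');  break;
                case '\\': out.append('\\'); break;
                case 'n':  out.append('\n'); break;
                case 'r':  out.append('\r'); break;
                case 't':  out.append('\t'); break;
                default:   out.append('\\').append(next); // leave unknown escapes untouched
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The characters as they appear in the .nq file: <oid id=\"aff1\"/>
        String inFile = "<oid id=\\\"aff1\\\"/>";
        System.out.println(inFile);            // prints <oid id=\"aff1\"/>
        System.out.println(unescape(inFile));  // prints <oid id="aff1"/>
    }
}
```

So a quad printed without the backslashes is what you would expect from a
correct parse; the decoded string is the literal's lexical form.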

An rdf:XMLLiteral must be in canonical form by the rules of [1], which
includes writing the empty element

<oid id=\"aff1\"/>

as a start and end tag pair:

<oid id=\"aff1\"></oid>

With that change, it works for me.
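
If the XML strings come from an existing source, the empty elements can be
expanded before the data is written out. The following is a rough sketch
using only the JDK's DOM parser. It is not a canonicalization
implementation: it handles elements, attributes and text only (no
namespaces, comments or processing instructions), and the class and method
names are invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.Text;

final class EmptyElementDemo {

    // Write a node with explicit start and end tags and attributes in sorted order.
    static void write(Node node, StringBuilder out) {
        if (node instanceof Text) {
            out.append(node.getNodeValue()
                    .replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;"));
            return;
        }
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // anything else is skipped in this simplified sketch
        }
        Element e = (Element) node;
        out.append('<').append(e.getTagName());
        TreeMap<String, String> attrs = new TreeMap<>();
        NamedNodeMap map = e.getAttributes();
        for (int i = 0; i < map.getLength(); i++) {
            attrs.put(map.item(i).getNodeName(), map.item(i).getNodeValue());
        }
        attrs.forEach((name, value) -> out.append(' ').append(name).append("=\"")
                .append(value.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;"))
                .append('"'));
        out.append('>');
        for (Node c = e.getFirstChild(); c != null; c = c.getNextSibling()) {
            write(c, out);
        }
        // Always emit an end tag, even when the element has no children.
        out.append("</").append(e.getTagName()).append('>');
    }

    // Parse a single-rooted XML string and re-serialize it with expanded empty elements.
    static String expand(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            StringBuilder out = new StringBuilder();
            write(root, out);
            return out.toString();
        } catch (Exception ex) {
            throw new IllegalArgumentException("not well-formed XML: " + xml, ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(expand("<div><oid id=\"aff1\"/>Department</div>"));
        // prints <div><oid id="aff1"></oid>Department</div>
    }
}
```

The result still has to have its quotes escaped as \" when it is written
into an N-Quads file.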

The error you are getting is not from RIOT; it comes from 
Node.getLiteralValue().  If you call that method on any illegal 
literal, e.g. an integer with lexical form "foo", you get an exception.

If you parse with the command line tools, you just get a warning.

Try getLiteralLexicalForm() instead of getLiteralValue().

Avoiding XMLLiterals and using your own datatype might be worth 
considering if you want to take XML strings from an existing data source.

	Andy

[1] http://www.w3.org/TR/xml-c14n

> using the RIOT library. The following piece of the code reproduces that
> problem:
>
> public static void main(String[] args) throws IOException {
>          final InputStream io =
> Thread.currentThread().getContextClassLoader().getResourceAsStream("test.nq"
> );
>
>          SysRIOT.wireIntoJena();
>          RiotLoader.readQuads(io, Lang.NQUADS, "", new Sink<Quad>() {
>
>              @Override
>              public void send(Quad item) {
>                 System.out.println(item.toString());
>                 System.out.println(item.getObject().getLiteralValue());
>              }
>
>              @Override
>              public void close() {
>              }
>
>              @Override
>              public void flush() {
>              }
>          });
>      }
>
>
> It seems that the problem lies in those quotes around aff1: for some reason
> RIOT omits the escaping backslashes while reading the input stream (at least
> item.toString() shows the item value without quotes).
>
> Rafal