Posted to users@jena.apache.org by Evan Patton <pa...@rpi.edu> on 2011/04/10 04:45:05 UTC

N3 XML Literals in ARQ SPARQL/Update

Hello,

I'm having some issues with SPARQL/Update support in ARQ using Jena 2.6.4 and Joseki 3.4.3. I obtain query results in N3 format from the endpoint using a DESCRIBE query and the results include some XMLLiterals specified using the ^^ syntax. However, if I attempt to load these files using a LOAD query, all of the < and > characters are encoded as &lt; and &gt; and when queried using the DESCRIBE no longer make sense to other tools. First, is this the expected behavior for loading XML Literals? If so, is there any easy way to change the behavior without having to extensively change the source (i.e. some configuration option for Jena I can set)?

Your advice is greatly appreciated,

Evan Patton

Re: N3 XML Literals in ARQ SPARQL/Update

Posted by Andy Seaborne <an...@epimorphics.com>.

If you parse your data with

riotcmd.riot --validate file.ttl

then it will do XML literal validation (which is quite expensive).

The rules for legal XMLLiterals are quite complicated and quite picky.

The SPARQL update does not check for legal literals - it takes what's in 
your query exactly as-is.

	Andy
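
[Editor's sketch, not part of the original message.] For anyone who wants a quick pre-check before sending literals through SPARQL Update, the following is a minimal Java sketch using only the JDK's XML parser. It tests well-formedness of a fragment and nothing more. As far as I understand the RDF 1.0 definition, a legal XMLLiteral must additionally be in exclusive canonical XML, which writes empty elements as start/end tag pairs, so a fragment can pass this check and still be treated as an ill-formed literal. The class name, method name and wrapper element are made up for illustration; this is not what riot --validate does internally.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: checks well-formedness only, not canonical form.
class XmlFragmentCheck {

    // Returns true if the fragment parses as XML content once it is wrapped
    // in a dummy root element. A fragment that closes the wrapper itself
    // could fool this check, so treat it as a rough filter.
    static boolean isWellFormedFragment(String fragment) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Refuse DOCTYPE declarations so untrusted input cannot pull in external entities.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader("<wrapper>" + fragment + "</wrapper>")));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input: not well-formed.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Well-formed, even though the empty-element tag is not in canonical form.
        System.out.println(isWellFormedFragment("First Line<br/>Second Line"));
        // Not well-formed: <br> is never closed.
        System.out.println(isWellFormedFragment("First Line<br>Second Line"));
    }
}
```

A literal that passes here may still need its empty elements expanded before the store treats it as a legal XMLLiteral.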

On 12/04/11 22:44, Dave Reynolds wrote:
> On Tue, 2011-04-12 at 16:58 -0400, Evan Patton wrote:
>> After further investigation I found that the problem seems to be due to the use of <br/>
>
> Ah yes, that's not well-formed xml so the canonicalization process
> bites.
>
> I believe <br /> would be OK (as when writing xhtml you leave a space
> before />) which may or may not be preferable!
>
> Dave

Re: N3 XML Literals in ARQ SPARQL/Update

Posted by Dave Reynolds <da...@gmail.com>.

On Tue, 2011-04-12 at 16:58 -0400, Evan Patton wrote: 
> After further investigation I found that the problem seems to be due to the use of <br/>

Ah yes, that's not well-formed xml so the canonicalization process
bites.

I believe <br /> would be OK (as when writing xhtml you leave a space
before />) which may or may not be preferable!

Dave

Re: N3 XML Literals in ARQ SPARQL/Update

Posted by Evan Patton <pa...@rpi.edu>.

After further investigation I found that the problem seems to be due to the use of <br/>

If I insert the following triple:

ex:Item dc:description "First Line<br/>Second Line"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>

the RDF returned by the endpoint is

<rdf:Description rdf:about="http://example.com/Item">
    <dc:description rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">First Line&lt;br/&gt;Second Line</dc:description>
</rdf:Description>

However, if I insert this triple:

ex:Item dc:description "First Line<br></br>Second Line"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>

then the RDF returned uses rdf:parseType="Literal", like so:

<rdf:Description rdf:about="http://example.com/Item">
    <dc:description rdf:parseType="Literal">First Line<br></br>Second Line</dc:description>
</rdf:Description>

I will just need to write a regex to identify when users input tags like <br/> and expand them into <br></br>

Thanks,
Evan
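
[Editor's sketch, not part of the original message.] The regex workaround described above could look roughly like the following Java sketch. It rewrites empty-element tags such as <br/> or <br /> into start/end pairs and keeps any attributes. The class name, method name and pattern are my own assumptions, not code from the thread. A regex is only a rough tool here: attribute values that contain < or >, comments and CDATA sections are not handled, so running the text through a real XML canonicalizer would be more robust.

```java
import java.util.regex.Pattern;

// Hypothetical helper: expands <name attrs/> into <name attrs></name>.
class SelfClosingTagExpander {

    // Group 1: element name. Group 2: attributes with their leading
    // whitespace, or empty. Whitespace directly before "/>" is dropped.
    private static final Pattern EMPTY_ELEMENT = Pattern.compile(
            "<([A-Za-z_][\\w.:-]*)((?:\\s+[^\\s<>/][^<>]*?)?)\\s*/>");

    static String expand(String xml) {
        return EMPTY_ELEMENT.matcher(xml).replaceAll("<$1$2></$1>");
    }

    public static void main(String[] args) {
        System.out.println(expand("First Line<br/>Second Line"));
        // prints: First Line<br></br>Second Line
    }
}
```

Tags that are already written as start/end pairs do not match the pattern and pass through unchanged.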

Re: N3 XML Literals in ARQ SPARQL/Update

Posted by Andy Seaborne <an...@epimorphics.com>.


On 10/04/11 09:09, Dave Reynolds wrote:
> On Sat, 2011-04-09 at 22:45 -0400, Evan Patton wrote:
>> Hello,
>>
>> I'm having some issues with SPARQL/Update support in ARQ using Jena
>> 2.6.4 and Joseki 3.4.3. I obtain query results in N3 format from
>> the endpoint using a DESCRIBE query and the results include some
>> XMLLiterals specified using the ^^ syntax. However, if I attempt to
>> load these files using a LOAD query, all of the < and > characters
>> are encoded as &lt; and &gt; and when queried using the DESCRIBE no
>> longer make sense to other tools. First, is this the expected
>> behavior for loading XML Literals?
>
> No, that's not normal, at least at the level of basic N3 (well
> Turtle) parsing it's not. For example:
>
> modelFromN3(":r :p '<a>foo</a>'^^rdf:XMLLiteral .")
> .write(System.out, "RDF/XML-ABBREV");
>
> generates
>
> <rdf:RDF
>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>     xmlns:owl="http://www.w3.org/2002/07/owl#"
>     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
>     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>     xmlns="http://jena.hpl.hp.com/eg#">
>   <rdf:Description rdf:about="http://jena.hpl.hp.com/eg#r">
>     <p rdf:parseType="Literal"><a>foo</a></p>
>   </rdf:Description>
> </rdf:RDF>
>
> No quoting of < >.
>
> Is the XMLLiteral syntax correct? Can you try using jena.rdfcat to
> check that the source data can be correctly parsed.
>
> Dave

It should work - do you have a complete, minimal example of what you're
trying?

	Andy

Re: N3 XML Literals in ARQ SPARQL/Update

Posted by Dave Reynolds <da...@gmail.com>.

On Sat, 2011-04-09 at 22:45 -0400, Evan Patton wrote: 
> Hello,
> 
> I'm having some issues with SPARQL/Update support in ARQ using Jena 2.6.4 and Joseki 3.4.3. I obtain query results in N3 format from the endpoint using a DESCRIBE query and the results include some XMLLiterals specified using the ^^ syntax. However, if I attempt to load these files using a LOAD query, all of the < and > characters are encoded as &lt; and &gt; and when queried using the DESCRIBE no longer make sense to other tools. First, is this the expected behavior for loading XML Literals? 

No, that's not normal, at least at the level of basic N3 (well Turtle)
parsing it's not. For example:

modelFromN3(":r :p '<a>foo</a>'^^rdf:XMLLiteral .")
       .write(System.out, "RDF/XML-ABBREV");

generates

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns="http://jena.hpl.hp.com/eg#">
  <rdf:Description rdf:about="http://jena.hpl.hp.com/eg#r">
    <p rdf:parseType="Literal"><a>foo</a></p>
  </rdf:Description>
</rdf:RDF>

No quoting of < >.

Is the XMLLiteral syntax correct? Can you try using jena.rdfcat to check
that the source data can be correctly parsed.

Dave