Posted to users@jena.apache.org by Evan Patton <pa...@rpi.edu> on 2011/04/10 04:45:05 UTC
N3 XML Literals in ARQ SPARQL/Update
Hello,
I'm having some issues with SPARQL/Update support in ARQ using Jena 2.6.4 and Joseki 3.4.3. I obtain query results in N3 format from the endpoint using a DESCRIBE query, and the results include some XMLLiterals specified using the ^^ syntax. However, if I attempt to load these files using a LOAD query, all of the < and > characters are encoded as &lt; and &gt; and, when queried using DESCRIBE, no longer make sense to other tools. First, is this the expected behavior for loading XML Literals? If so, is there any easy way to change the behavior without having to extensively change the source (i.e. some configuration option for Jena I can set)?
Your advice is greatly appreciated,
Evan Patton
Re: N3 XML Literals in ARQ SPARQL/Update
Posted by Andy Seaborne <an...@epimorphics.com>.
If you parse your data with
riotcmd.riot --validate file.ttl
then it will do XML literal validation (which is quite expensive).
The rules for legal XMLLiterals are quite complicated and quite picky.
The SPARQL update does not check for legal literals - it takes what's in
your query exactly as-is.
Andy
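For a quick local check before sending data to the endpoint, here is a minimal sketch that tests whether a literal's content is a well-formed XML fragment using only the JDK parser. This is not what riot does: the class name FragmentCheck and the <wrapper> element are made up for illustration, and the check is weaker than XMLLiteral validation because it knows nothing about canonical form. As far as I understand, RDF 2004 also requires the lexical form to be canonical XML, so a fragment can pass this check and still be treated as an ill-formed XMLLiteral.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Jena or riot.
final class FragmentCheck {

    // Returns true if the fragment parses as XML content once it is
    // wrapped in a dummy root element. Checks well-formedness only,
    // not the canonical form expected of rdf:XMLLiteral values.
    static boolean isWellFormedFragment(String fragment) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler stays quiet on warnings and rethrows fatal errors.
            builder.setErrorHandler(new DefaultHandler());
            String doc = "<wrapper>" + fragment + "</wrapper>";
            builder.parse(new InputSource(new StringReader(doc)));
            return true;
        } catch (SAXException e) {
            return false; // parser rejected the fragment
        } catch (Exception e) {
            // Parser configuration or I/O trouble, not a verdict on the input.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormedFragment("First Line<br/>Second Line"));
        System.out.println(isWellFormedFragment("First Line<br>Second Line"));
    }
}
```

An unclosed <br> is rejected here, while <br/> and <br></br> are both accepted, so this only catches the coarser mistakes.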
Re: N3 XML Literals in ARQ SPARQL/Update
Posted by Dave Reynolds <da...@gmail.com>.
On Tue, 2011-04-12 at 16:58 -0400, Evan Patton wrote:
> After further investigation I found that the problem seems to be due to the use of <br/>
Ah yes, that's not well-formed xml so the canonicalization process
bites.
I believe <br /> would be OK (as when writing xhtml you leave a space
before />) which may or may not be preferable!
Dave
Re: N3 XML Literals in ARQ SPARQL/Update
Posted by Evan Patton <pa...@rpi.edu>.
After further investigation I found that the problem seems to be due to the use of <br/>
If I insert the following triple:
ex:Item dc:description "First Line<br/>Second Line"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>
the RDF returned by the endpoint is
<rdf:Description rdf:about="http://example.com/Item">
<dc:description rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">First Line&lt;br/&gt;Second Line</dc:description>
</rdf:Description>
However, if I insert this triple:
ex:Item dc:description "First Line<br></br>Second Line"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>
then the RDF returned uses rdf:parseType="Literal", like so:
<rdf:Description rdf:about="http://example.com/Item">
<dc:description rdf:parseType="Literal">First Line<br></br>Second Line</dc:description>
</rdf:Description>
I will just need to write a regex to identify when users input tags like <br/> and expand them into <br></br>
Thanks,
Evan
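A minimal sketch of the regex rewrite described above, in Java since that is what Jena uses. The class name EmptyTagExpander and the pattern are illustrative assumptions, not anything from Jena. It expands empty-element tags such as <br/> or <br /> into a start tag followed by an end tag and leaves everything else alone. It does not handle comments, CDATA sections, or attribute values containing "<", and it does nothing about the other canonical-form rules, so treat it as a stopgap rather than a real canonicalizer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jena.
final class EmptyTagExpander {

    // Matches an empty-element tag such as <br/>, <br /> or <img src="a.png"/>.
    // Group 1 is the element name, group 2 the attributes (possibly empty).
    private static final Pattern EMPTY_TAG = Pattern.compile(
        "<([A-Za-z_][\\w:.-]*)"
        + "((?:\\s+[\\w:.-]+\\s*=\\s*(?:\"[^\"<]*\"|'[^'<]*'))*)"
        + "\\s*/>");

    // Rewrites every empty-element tag as <name attrs></name>.
    static String expand(String xml) {
        Matcher m = EMPTY_TAG.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String expanded = "<" + m.group(1) + m.group(2) + "></" + m.group(1) + ">";
            // quoteReplacement keeps any '$' or '\' in attributes literal.
            m.appendReplacement(out, Matcher.quoteReplacement(expanded));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: First Line<br></br>Second Line
        System.out.println(expand("First Line<br/>Second Line"));
    }
}
```

Running the user's input through expand() before building the INSERT would give the second form of the triple shown above.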
Re: N3 XML Literals in ARQ SPARQL/Update
Posted by Andy Seaborne <an...@epimorphics.com>.
On 10/04/11 09:09, Dave Reynolds wrote:
> On Sat, 2011-04-09 at 22:45 -0400, Evan Patton wrote:
>> Hello,
>>
>> I'm having some issues with SPARQL/Update support in ARQ using Jena
>> 2.6.4 and Joseki 3.4.3. I obtain query results in N3 format from
>> the endpoint using a DESCRIBE query and the results include some
>> XMLLiterals specified using the ^^ syntax. However, if I attempt to
>> load these files using a LOAD query, all of the < and > characters
>> are encoded as &lt; and &gt; and when queried using the DESCRIBE no
>> longer make sense to other tools. First, is this the expected
>> behavior for loading XML Literals?
>
> No, that's not normal, at least at the level of basic N3 (well
> Turtle) parsing it's not. For example:
>
> modelFromN3(":r :p '<a>foo</a>'^^rdf:XMLLiteral .")
> .write(System.out, "RDF/XML-ABBREV");
>
> generates
>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> xmlns:owl="http://www.w3.org/2002/07/owl#"
> xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
> xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
> xmlns="http://jena.hpl.hp.com/eg#"> <rdf:Description
> rdf:about="http://jena.hpl.hp.com/eg#r"> <p
> rdf:parseType="Literal"><a>foo</a></p> </rdf:Description> </rdf:RDF>
>
> No quoting of < >.
>
> Is the XMLLiteral syntax correct? Can you try using jena.rdfcat to
> check that the source data can be correctly parsed.
>
> Dave
It should work - do you have a complete, minimal example of what you're
trying?
Andy
Re: N3 XML Literals in ARQ SPARQL/Update
Posted by Dave Reynolds <da...@gmail.com>.
On Sat, 2011-04-09 at 22:45 -0400, Evan Patton wrote:
> Hello,
>
> I'm having some issues with SPARQL/Update support in ARQ using Jena 2.6.4 and Joseki 3.4.3. I obtain query results in N3 format from the endpoint using a DESCRIBE query and the results include some XMLLiterals specified using the ^^ syntax. However, if I attempt to load these files using a LOAD query, all of the < and > characters are encoded as &lt; and &gt; and when queried using the DESCRIBE no longer make sense to other tools. First, is this the expected behavior for loading XML Literals?
No, that's not normal, at least at the level of basic N3 (well Turtle)
parsing it's not. For example:
modelFromN3(":r :p '<a>foo</a>'^^rdf:XMLLiteral .")
.write(System.out, "RDF/XML-ABBREV");
generates
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns="http://jena.hpl.hp.com/eg#">
<rdf:Description rdf:about="http://jena.hpl.hp.com/eg#r">
<p rdf:parseType="Literal"><a>foo</a></p>
</rdf:Description>
</rdf:RDF>
No quoting of < >.
Is the XMLLiteral syntax correct? Can you try using jena.rdfcat to check
that the source data can be correctly parsed?
Dave