Posted to users@jena.apache.org by Erich Bremer <er...@ebremer.com> on 2016/06/06 17:46:09 UTC

Illegal character in IRI

Hi,

I used Jena (3.0) to read an RDF/XML file and then write that RDF back out to 
a ttl file.  When I try to read that ttl file back into a Jena Model using 
RDFDataMgr, the following error is thrown:

Exception in thread "main" org.apache.jena.riot.RiotException: [line: 9873, col: 68] Illegal character in IRI (codepoint 0x5E, '^'): <http://rdf.wwpdb.org/pdb/11BA/pdbx_struct_assembly_prop/1,ABSA_(A[^]...>
    at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:136)
    at org.apache.jena.riot.lang.LangEngine.raiseException(LangEngine.java:165)
    at org.apache.jena.riot.lang.LangEngine.nextToken(LangEngine.java:108)
    at org.apache.jena.riot.lang.LangEngine.expect(LangEngine.java:145)
    at org.apache.jena.riot.lang.LangEngine.expectOrEOF(LangEngine.java:130)
    at org.apache.jena.riot.lang.LangTurtleBase.expectEndOfTriplesTurtle(LangTurtleBase.java:264)
    at org.apache.jena.riot.lang.LangTurtle.expectEndOfTriples(LangTurtle.java:51)
    at org.apache.jena.riot.lang.LangTurtleBase.triples(LangTurtleBase.java:250)
    at org.apache.jena.riot.lang.LangTurtleBase.triplesSameSubject(LangTurtleBase.java:190)
    at org.apache.jena.riot.lang.LangTurtle.oneTopLevelElement(LangTurtle.java:46)
    at org.apache.jena.riot.lang.LangTurtleBase.runParser(LangTurtleBase.java:89)
    at org.apache.jena.riot.lang.LangBase.parse(LangBase.java:42)
    at org.apache.jena.riot.RDFParserRegistry$ReaderRIOTLang.read(RDFParserRegistry.java:176)
    at org.apache.jena.riot.RDFDataMgr.process(RDFDataMgr.java:861)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:259)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:233)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:223)
    at io.haylyn.atoz.TripleCount.main(TripleCount.java:31)

Line 9873 of the ttl file, plus a few more lines, is listed here:

<http://rdf.wwpdb.org/pdb/11BA/pdbx_struct_assembly_prop/1,ABSA_(A^2)>
         a                  PDBo:pdbx_struct_assembly_prop ;
         PDBo:of_datablock  <http://rdf.wwpdb.org/pdb/11BA> ;
         PDBo:pdbx_struct_assembly_prop.biol_id
                 "1" ;
         PDBo:pdbx_struct_assembly_prop.type
                 "ABSA (A^2)" ;
         PDBo:pdbx_struct_assembly_prop.value
                 "6120" .

Shouldn't I be able to read an RDF file that Jena itself wrote?  - Erich

Re: Illegal character in IRI

Posted by Erich Bremer <er...@ebremer.com>.
Thanks, that explains that.  I'll try to contact the author of the data to 
update his IRIs.  - Erich

On Tue, 7 Jun 2016 17:54:44 +0100
  Andy Seaborne <an...@apache.org> wrote:
> Turtle is defined to work with IRIs, and the grammar says:
>
> [18]  IRIREF  ::=  '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'
>
> '^' is in the excluded character set, so the string is illegal as an
> IRI.
>
> RDF/XML, an older standard written before the IRI specs were
> finalized, works with "RDF URI References".  They were designed in
> anticipation of where the IRI specs were going, but the IRI drafts
> changed before becoming final.
>
> Jena tends to favour compatibility: what could be read remains
> readable.  The alternative would be that, at some version, Jena
> stopped accepting certain RDF/XML that is nowadays not considered
> "good".
>
> Writing assumes the data is correct and simply attempts to get it
> out.
>
> There are various ways to get junk data in, including the API, which
> for efficiency does not check IRIs.
>
>     Andy



Re: Illegal character in IRI

Posted by Andy Seaborne <an...@apache.org>.
On 06/06/16 18:46, Erich Bremer wrote:
> Hi,
>
> I used Jena (3.0) to read an RDF/XML file and then write that RDF back
> out to a ttl file.  When I try to read that ttl file back into a Jena
> Model using RDFDataMgr, the following error is thrown:
>
> Exception in thread "main" org.apache.jena.riot.RiotException: [line:
> 9873, col: 68] Illegal character in IRI (codepoint 0x5E, '^'):
> <http://rdf.wwpdb.org/pdb/11BA/pdbx_struct_assembly_prop/1,ABSA_(A[^]...>

Turtle is defined to work with IRIs, and the grammar says:

[18]  IRIREF  ::=  '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'

'^' is in the excluded character set, so the string is illegal as an 
IRI.
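
To make the grammar rule concrete, here is a minimal Java sketch that 
scans a string for characters the IRIREF production excludes.  The class 
name IriRefCheck and the method firstIllegalChar are made up for 
illustration and are not part of Jena.  The sketch also ignores UCHAR 
escapes (\uXXXX), so it treats every backslash as illegal.

```java
public class IriRefCheck {
    // Characters excluded by the Turtle IRIREF production, in addition to
    // the range #x00-#x20. Hypothetical helper, not a Jena API.
    private static final String EXCLUDED = "<>\"{}|^`\\";

    /**
     * Returns the index of the first character that may not appear between
     * '<' and '>' in Turtle, or -1 if there is none.
     * Simplification: UCHAR escapes are not recognised.
     */
    public static int firstIllegalChar(String iri) {
        for (int i = 0; i < iri.length(); i++) {
            char c = iri.charAt(i);
            if (c <= 0x20 || EXCLUDED.indexOf(c) >= 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String iri =
            "http://rdf.wwpdb.org/pdb/11BA/pdbx_struct_assembly_prop/1,ABSA_(A^2)";
        int pos = firstIllegalChar(iri);
        if (pos >= 0) {
            System.out.println("Illegal character '" + iri.charAt(pos)
                + "' at index " + pos);
        } else {
            System.out.println("No excluded characters found");
        }
    }
}
```

Run against the subject from the failing Turtle line, the check stops at 
the '^' character, the same one the parser reports.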

RDF/XML, an older standard written before the IRI specs were finalized, 
works with "RDF URI References".  They were designed in anticipation of 
where the IRI specs were going, but the IRI drafts changed before 
becoming final.

Jena tends to favour compatibility: what could be read remains 
readable.  The alternative would be that, at some version, Jena stopped 
accepting certain RDF/XML that is nowadays not considered "good".

Writing assumes the data is correct and simply attempts to get it out.

There are various ways to get junk data in, including the API, which for 
efficiency does not check IRIs.

	Andy
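
One possible way to repair such data before it is written as Turtle is 
to percent-encode the characters that IRIREF excludes.  The sketch below 
is an assumption about how that could be done in plain Java; IriEscape 
and percentEncodeExcluded are hypothetical names, not Jena APIs.  Note 
that the encoded string is a different IRI from the original, so every 
reference to the resource would need the same treatment.

```java
public class IriEscape {
    // Characters excluded by the Turtle IRIREF production, in addition to
    // the range #x00-#x20. All of them are ASCII, so each one encodes to a
    // single %XX triplet.
    private static final String EXCLUDED = "<>\"{}|^`\\";

    /** Percent-encodes excluded characters and copies everything else as-is. */
    public static String percentEncodeExcluded(String iri) {
        StringBuilder out = new StringBuilder(iri.length());
        for (int i = 0; i < iri.length(); i++) {
            char c = iri.charAt(i);
            if (c <= 0x20 || EXCLUDED.indexOf(c) >= 0) {
                // Two upper-case hex digits, e.g. '^' (0x5E) becomes %5E.
                out.append(String.format("%%%02X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String iri =
            "http://rdf.wwpdb.org/pdb/11BA/pdbx_struct_assembly_prop/1,ABSA_(A^2)";
        System.out.println(percentEncodeExcluded(iri));
    }
}
```

Fixing the IRIs at the source, as the data author could, avoids having 
to apply a rewrite like this on every load.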
