Posted to dev@jena.apache.org by "Andy Seaborne (Commented) (JIRA)" <ji...@apache.org> on 2012/03/02 13:12:57 UTC

[jira] [Commented] (JENA-216) Official Turtle Test-18 does not parse

    [ https://issues.apache.org/jira/browse/JENA-216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220867#comment-13220867 ] 

Andy Seaborne commented on JENA-216:
------------------------------------

By the way:

The Turtle test suite is here:
https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/testing/RIOT/TurtleStd/

It includes these fixes:

* Illegal characters such as \n or \u0000 in URIs are not expected to parse.
    The tests assume the parser does no checking, but RIOT does check, so you get errors with line numbers.

* test-28.out in the Turtle test suite is (as you've already found out) just plain wrong.

IIRC \U00015678 isn't a legal code point (i.e. it is not allocated) in every version of Unicode.  As of Java 6, I think it's now OK.
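
To check how a given JVM classifies that code point, something like the following rough sketch may help. It uses only java.lang.Character calls; the object name CodePointCheck and its two methods are hypothetical helpers made up for illustration and are not part of Jena or RIOT. Character.isValidCodePoint only tests the range U+0000..U+10FFFF, while Character.isDefined depends on the Unicode tables shipped with the running JVM, so the "defined" answer may differ between Java versions.

```scala
// Hypothetical helper for illustration only; not part of Jena or RIOT.
object CodePointCheck {

  /** Parse a Turtle long escape (backslash, 'U', 8 hex digits) into a code point. */
  def parseLongEscape(escape: String): Int = {
    require(escape.startsWith("\\U") && escape.length == 10,
      "expected \\U followed by 8 hex digits")
    Integer.parseInt(escape.substring(2), 16)
  }

  /** Summarise how this JVM sees the code point. */
  def describe(cp: Int): String = {
    // In range U+0000..U+10FFFF; says nothing about allocation.
    val valid = Character.isValidCodePoint(cp)
    // Allocated according to the Unicode version this JVM ships with.
    val defined = valid && Character.isDefined(cp)
    // Number of UTF-16 chars needed (2 means a surrogate pair).
    val units = if (valid) Character.charCount(cp) else 0
    "U+%04X valid=%s defined=%s utf16Units=%d".format(cp, valid, defined, units)
  }

  def main(args: Array[String]): Unit = {
    val cp = parseLongEscape("\\U00015678")
    println(describe(cp))
  }
}
```

Running CodePointCheck.main on different Java versions would show whether "defined" changes; a parser that rejects the escape outright (as in the lexical error quoted below) fails before any such check is reached.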
                
> Official Turtle Test-18 does not parse
> --------------------------------------
>
>                 Key: JENA-216
>                 URL: https://issues.apache.org/jira/browse/JENA-216
>             Project: Apache Jena
>          Issue Type: Bug
>    Affects Versions: ARQ 2.9.0
>         Environment: Java 6, OSX
>            Reporter: Henry Story
>            Assignee: Andy Seaborne
>
> I am having trouble parsing http://www.w3.org/TR/turtle/tests/test-18.ttl, which contains the following two lines:
> <http://example.org/foo#a> <http://example.org/foo#b> "\nthis \ris a \U00015678long\t\nliteral\uABCD\n" .
> <http://example.org/foo#d> <http://example.org/foo#e> "\tThis \uABCDis\r \U00015678another\n\none\n" .
> scala> import java.io._
> import java.io._
> scala> import com.hp.hpl.jena.rdf.model._
> import com.hp.hpl.jena.rdf.model._
> scala> val f = "/Volumes/Dev/Programming/w3.org/git/pimp-my-rdf/n3-test-suite/target/scala-2.9.1/classes/www.w3.org/TR/turtle/tests/test-18.out"
> f: java.lang.String = /Volumes/Dev/Programming/w3.org/git/pimp-my-rdf/n3-test-suite/target/scala-2.9.1/classes/www.w3.org/TR/turtle/tests/test-18.out
> scala> val in = new InputStreamReader(new BufferedInputStream(new FileInputStream(f)),"UTF-8")
> in: java.io.InputStreamReader = java.io.InputStreamReader@1e392427
> scala> val model = ModelFactory.createDefaultModel()
> model: com.hp.hpl.jena.rdf.model.Model = <ModelCom   {} | >
> scala> model.read(in,"file:/"+f,"TTL")
> com.hp.hpl.jena.n3.turtle.TurtleParseException: Lexical error at line 1, column 71.  Encountered: "U" (85), after : "\"\\nthis \\ris a \\"
> 	at com.hp.hpl.jena.n3.turtle.ParserTurtle.parse(ParserTurtle.java:56)
> 	at com.hp.hpl.jena.n3.turtle.TurtleReader.readWorker(TurtleReader.java:33)
> 	at com.hp.hpl.jena.n3.JenaReaderBase.readImpl(JenaReaderBase.java:119)
> 	at com.hp.hpl.jena.n3.JenaReaderBase.read(JenaReaderBase.java:49)
> 	at com.hp.hpl.jena.rdf.model.impl.ModelCom.read(ModelCom.java:261)
> or more directly
>  scala> model.read("http://www.w3.org/TR/turtle/tests/test-18.ttl","TTL")
> com.hp.hpl.jena.n3.turtle.TurtleParseException: Lexical error at line 3, column 25.  Encountered: "U" (85), after : "\"\\nthis \\ris a \\"
> 	at com.hp.hpl.jena.n3.turtle.ParserTurtle.parse(ParserTurtle.java:56)
> 	at com.hp.hpl.jena.n3.turtle.TurtleReader.readWorker(TurtleReader.java:33)
> 	at com.hp.hpl.jena.n3.JenaReaderBase.readImpl(JenaReaderBase.java:119)
> 	at com.hp.hpl.jena.n3.JenaReaderBase.read(JenaReaderBase.java:49)
> 	at com.hp.hpl.jena.n3.JenaReaderBase.read(JenaReaderBase.java:60)
> 	at com.hp.hpl.jena.rdf.model.impl.ModelCom.read(ModelCom.java:241)
> This is with the December 2.9 release of Jena, which I imported into my project with
>     "org.apache.jena" % "jena-arq" % "2.9.0-incubating"
