Posted to dev@jena.apache.org by "Andy Seaborne (JIRA)" <ji...@apache.org> on 2017/05/28 08:51:04 UTC

[jira] [Commented] (JENA-1349) Regression: reading from HTTP with content-type text/plain with RDFDataMgr.loadGraph()

    [ https://issues.apache.org/jira/browse/JENA-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027747#comment-16027747 ] 

Andy Seaborne commented on JENA-1349:
-------------------------------------

> In 3.20, the real content-type used to be recognized from URL ending, but not anymore.

What is happening is that {{text/plain}}, the {{Content-type}} sent by the server, is being used as the declared syntax.  This is technically correct but also a bit of a nuisance, because {{text/plain}} is what unconfigured servers commonly send. Previously, for the special case of {{text/plain}} only, {{RDFDataMgr}} ignored the {{Content-type}} because it was so often wrong.

We should probably put this behaviour back.
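
As a rough illustration of the behaviour described above (this is not the actual {{RDFDataMgr}} code), here is a minimal sketch that trusts the declared {{Content-type}} except when it is {{text/plain}}, in which case it guesses from the URL ending. The class {{ContentTypeFallback}} and its methods are hypothetical names invented for this example, the extension table is a subset of common RDF file extensions, and treating a missing header like {{text/plain}} is an assumption of the sketch. It needs Java 9 or later for {{Map.of}}.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper, not part of Jena: chooses an RDF syntax for a fetched URL. */
final class ContentTypeFallback {

    // File extension -> media type, for a few common RDF syntaxes.
    private static final Map<String, String> BY_EXTENSION = Map.of(
        "rdf", "application/rdf+xml",
        "ttl", "text/turtle",
        "nt", "application/n-triples",
        "nq", "application/n-quads",
        "trig", "application/trig",
        "jsonld", "application/ld+json");

    /** Strips parameters such as "; charset=UTF-8" and lower-cases the media type. */
    static String mediaType(String contentTypeHeader) {
        if (contentTypeHeader == null) return null;
        int semi = contentTypeHeader.indexOf(';');
        String base = semi < 0 ? contentTypeHeader : contentTypeHeader.substring(0, semi);
        base = base.trim().toLowerCase(Locale.ROOT);
        return base.isEmpty() ? null : base;
    }

    /** Guesses a media type from the URL ending, or returns null if unknown. */
    static String byExtension(String url) {
        // Ignore any query string or fragment.
        int cut = url.length();
        int q = url.indexOf('?');
        if (q >= 0) cut = Math.min(cut, q);
        int h = url.indexOf('#');
        if (h >= 0) cut = Math.min(cut, h);
        String path = url.substring(0, cut);

        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        // The dot must be in the last path segment and not be the final character.
        if (dot <= slash || dot == path.length() - 1) return null;
        return BY_EXTENSION.get(path.substring(dot + 1).toLowerCase(Locale.ROOT));
    }

    /**
     * Uses the declared Content-type, except that text/plain (or, by assumption,
     * a missing header) falls back to the URL ending when that is recognised.
     */
    static String chooseSyntax(String contentTypeHeader, String url) {
        String declared = mediaType(contentTypeHeader);
        if (declared == null || declared.equals("text/plain")) {
            String guessed = byExtension(url);
            return guessed != null ? guessed : declared;
        }
        return declared;
    }

    public static void main(String[] args) {
        String url = "https://raw.githubusercontent.com/jmvanel/rdf-i18n/master/foaf/foaf.fr.ttl";
        System.out.println(chooseSyntax("text/plain; charset=UTF-8", url));
    }
}
```

With this rule, the URL from the report is treated as Turtle even though the server declares {{text/plain}}, while any other declared content type is still honoured as sent.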

For reference, here is an example set of {{web.xml}} settings for RDF and related syntaxes. Note that {{text/turtle}} has an explicit charset (the text/* default is not UTF-8). The list also includes the content types for [the RDF binary|http://afs.github.io/rdf-thrift/rdf-binary-thrift.html] format.

{noformat}
  <!-- For serving static files -->

  <mime-mapping>
    <extension>rdf</extension>
    <mime-type>application/rdf+xml</mime-type>
  </mime-mapping>
  <mime-mapping>
    <extension>ttl</extension>
    <mime-type>text/turtle;charset=utf-8</mime-type>
  </mime-mapping>
  <mime-mapping>
    <extension>nt</extension>
    <mime-type>application/n-triples</mime-type>
  </mime-mapping>
  <mime-mapping>
    <extension>nq</extension>
    <mime-type>application/n-quads</mime-type>
  </mime-mapping>
  <mime-mapping>
    <extension>trig</extension>
    <mime-type>application/trig</mime-type>
  </mime-mapping>
  <mime-mapping>
    <extension>jsonld</extension>
    <mime-type>application/ld+json</mime-type>
  </mime-mapping>

  <mime-mapping>
    <extension>tr</extension>
    <mime-type>application/rdf+thrift</mime-type>
  </mime-mapping>
  <mime-mapping>
    <extension>trdf</extension>
    <mime-type>application/rdf+thrift</mime-type>
  </mime-mapping>
  
  <mime-mapping>
    <extension>rq</extension>
    <mime-type>application/sparql-query</mime-type>
  </mime-mapping>
  <mime-mapping>
    <extension>ru</extension>
    <mime-type>application/sparql-update</mime-type>
  </mime-mapping>

  <mime-mapping>
    <extension>srx</extension>
    <mime-type>application/sparql-results+xml</mime-type>
  </mime-mapping>
  <mime-mapping>
    <extension>srj</extension>
    <mime-type>application/sparql-results+json</mime-type>
  </mime-mapping>
  <mime-mapping>
    <extension>srt</extension>
    <mime-type>application/sparql-results+thrift</mime-type>
  </mime-mapping>
{noformat}
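
To illustrate why the explicit charset on {{text/turtle}} matters, here is a small sketch of how a client following the older HTTP/1.1 rule (RFC 2616, where text/* without a charset parameter defaults to ISO-8859-1) might work out the charset. The class {{CharsetDefaults}} and method {{effectiveCharset}} are hypothetical names for this example; real clients differ, and later HTTP specifications dropped that default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper, not part of Jena or any servlet API. */
final class CharsetDefaults {

    /**
     * Returns the charset parameter if one is declared; otherwise ISO-8859-1
     * for text/* (the RFC 2616 default, assumed here); otherwise empty,
     * leaving the choice to the syntax itself.
     * Charset.forName throws for charset names the JVM does not know.
     */
    static Optional<Charset> effectiveCharset(String contentType) {
        String[] parts = contentType.split(";");
        String mediaType = parts[0].trim().toLowerCase(Locale.ROOT);

        // Look for a charset=... parameter after the media type.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq > 0 && param.substring(0, eq).trim().equalsIgnoreCase("charset")) {
                String value = param.substring(eq + 1).trim();
                // Parameter values may be quoted.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return Optional.of(Charset.forName(value));
            }
        }

        if (mediaType.startsWith("text/")) {
            return Optional.of(StandardCharsets.ISO_8859_1);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(effectiveCharset("text/turtle;charset=utf-8"));
        System.out.println(effectiveCharset("text/turtle"));
        System.out.println(effectiveCharset("application/n-triples"));
    }
}
```

Under that assumption, a bare {{text/turtle}} would be decoded as ISO-8859-1 by such a client, which is why the mapping above spells out {{charset=utf-8}}.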

> Regression: reading from HTTP with content-type text/plain with RDFDataMgr.loadGraph()
> --------------------------------------------------------------------------------------
>
>                 Key: JENA-1349
>                 URL: https://issues.apache.org/jira/browse/JENA-1349
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: RIOT
>    Affects Versions: Jena 3.3.0
>         Environment: uname -a
> Linux jmv-SMBIOSation 4.10.0-21-generic #23-Ubuntu SMP Fri Apr 28 16:14:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> java -version
> java version "1.8.0_121"
> Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
>            Reporter: Jean-Marc Vanel
>
> raw.githubusercontent.com sends file as text/plain,
>     and Jena 3.3.0 has a regression reading such URL:
> In 3.20, the real content-type used to be recognized from URL ending, but not anymore.
>  
> How to reproduce:  org.apache.jena.riot.RDFDataMgr.loadGraph("https://raw.githubusercontent.com/jmvanel/rdf-i18n/master/foaf/foaf.fr.ttl")
> Log and stack:
> contentType for <http://topbraid.org/schema/schema.ttl> "text/plain; charset=UTF-8" 
> 20:19:57.687 [main] ERROR org.apache.jena.riot - [line: 3, col: 1 ] Expected BNode or IRI: Got: [DIRECTIVE:prefix]
> readStoreURINoTransaction: after rdfLoader.load(http://topbraid.org/schema/schema.ttl): Failure(org.apache.jena.riot.RiotException: [line: 3, col: 1 ] Expected BNode or IRI: Got: [DIRECTIVE:prefix])
> org.apache.jena.riot.RiotException: [line: 3, col: 1 ] Expected BNode or IRI: Got: [DIRECTIVE:prefix]
> 	at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:136)
> 	at org.apache.jena.riot.lang.LangEngine.raiseException(LangEngine.java:148)
> 	at org.apache.jena.riot.lang.LangEngine.exceptionDirect(LangEngine.java:143)
> 	at org.apache.jena.riot.lang.LangEngine.exception(LangEngine.java:137)
> 	at org.apache.jena.riot.lang.LangNTuple.checkIRIOrBNode(LangNTuple.java:89)
> 	at org.apache.jena.riot.lang.LangNTriples.parseOne(LangNTriples.java:74)
> 	at org.apache.jena.riot.lang.LangNTriples.runParser(LangNTriples.java:53)
> 	at org.apache.jena.riot.lang.LangBase.parse(LangBase.java:41)
> 	at org.apache.jena.riot.RDFParserRegistry$ReaderRIOTLang.read(RDFParserRegistry.java:194)
> 	at org.apache.jena.riot.RDFParser.read(RDFParser.java:293)
> 	at org.apache.jena.riot.RDFParser.parseNotUri(RDFParser.java:283)
> 	at org.apache.jena.riot.RDFParser.parse(RDFParser.java:233)
> 	at org.apache.jena.riot.RDFParserBuilder.parse(RDFParserBuilder.java:405)
> 	at org.apache.jena.riot.RDFDataMgr.process(RDFDataMgr.java:862)
> 	at org.apache.jena.riot.RDFDataMgr.parse(RDFDataMgr.java:676)
> 	at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:222)
> 	at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:103)
> 	at org.apache.jena.riot.RDFDataMgr.loadGraph(RDFDataMgr.java:354)
> 	at org.w3.banana.jena.io.JenaRDFReader$$anon$2$$anonfun$load$1.apply(JenaRDFReader.scala:76)


