Posted to jira@jena.apache.org by "Andy Seaborne (Jira)" <ji...@apache.org> on 2023/04/16 14:59:00 UTC

[jira] [Comment Edited] (JENA-2351) Newline (U+000A) in IRIs not escaped during NT/TTL/NQ/TRIG serialization

    [ https://issues.apache.org/jira/browse/JENA-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17712793#comment-17712793 ] 

Andy Seaborne edited comment on JENA-2351 at 4/16/23 2:58 PM:
--------------------------------------------------------------

Hopefully the DBpedia extraction framework will fix the bug.

The principle here is "Be liberal in what you accept, and conservative in what you send." Postel's Law / https://en.wikipedia.org/wiki/Robustness_principle

The consumer of written output can't be expected to handle U+000A.

You can argue that Jena ought to raise an error on output, but not that it should write illegal output.
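
As a rough illustration of the "error on output" option, here is a minimal sketch of a check an application could run on an IRI string before handing it to a writer. It tests only the characters that the Turtle/N-Triples IRIREF production excludes in raw form (U+0000 to U+0020 and < > " { } | ^ ` \); it is not a full RFC 3987 validation. The class and method names (IriOutputCheck, firstForbiddenIndex, requireWritable) are hypothetical and are not part of Jena.

```java
/**
 * Sketch only: a pre-write check for characters that may not appear raw in an IRI.
 * IriOutputCheck and its methods are hypothetical names, not a Jena API.
 */
final class IriOutputCheck {

    private IriOutputCheck() {}

    /** True for characters excluded by the Turtle/N-Triples IRIREF production. */
    static boolean isForbidden(char c) {
        // IRIREF excludes U+0000..U+0020 and the characters < > " { } | ^ ` \
        return c <= 0x20 || "<>\"{}|^`\\".indexOf(c) >= 0;
    }

    /** Index of the first forbidden character, or -1 if there is none. */
    static int firstForbiddenIndex(String iri) {
        for (int i = 0; i < iri.length(); i++) {
            if (isForbidden(iri.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    /** Returns the IRI unchanged, or throws so a broken IRI never reaches the writer. */
    static String requireWritable(String iri) {
        int idx = firstForbiddenIndex(iri);
        if (idx >= 0) {
            throw new IllegalArgumentException(String.format(
                    "IRI not writable: U+%04X at index %d", (int) iri.charAt(idx), idx));
        }
        return iri;
    }
}
```

Calling IriOutputCheck.requireWritable("http://example.org/aaa/\nbbb") would throw at the point the resource is created, rather than producing a file that cannot be read back.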

bq. they at least meet the Turtle grammar rule.

But not the Turtle spec overall. The specs give a simple rule to avoid repeating the whole of RFC3986 updated by RFC3987.

In fact, for RDF 1.2, we are providing (in RDF Concepts), informatively, an ABNF grammar for IRIs, with the later RFCs applied.

> without further notice,

Could you show exactly what you ran? I do get a warning when making that DESCRIBE query.




> Newline (U+000A) in IRIs not escaped during NT/TTL/NQ/TRIG serialization 
> -------------------------------------------------------------------------
>
>                 Key: JENA-2351
>                 URL: https://issues.apache.org/jira/browse/JENA-2351
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: RIOT
>    Affects Versions: Jena 4.7.0
>            Reporter: Jan Martin Keil
>            Priority: Major
>
> [Newline characters (U+000A) in IRIs|https://github.com/dbpedia/extraction-framework/issues/748] are not escaped during the serialization of a model or dataset into a format of the Turtle family. This results in invalid files, which Jena is no longer able to read. Please note the following tests:
> {code:java}
> import org.apache.jena.query.Dataset;
> import org.apache.jena.query.DatasetFactory;
> import org.apache.jena.rdf.model.*;
> import org.apache.jena.riot.Lang;
> import org.apache.jena.riot.RDFDataMgr;
> import org.junit.jupiter.api.Test;
> import java.io.File;
> import java.io.FileInputStream;
> import java.io.FileOutputStream;
> import java.io.IOException;
> public class Example {
>     @Test
>     public void rdfXml() throws IOException {
>         Property someProperty = ResourceFactory.createProperty("http://example.org/property");
>         Model model = ModelFactory.createDefaultModel();
>         model.createResource("http://example.org/aaa/\nbbb").addProperty(someProperty,"a string");
>         System.out.println("\nRDF/XML:\n");
>         model.write(System.out,"RDF/XML");
>         // test write and read
>         File file = File.createTempFile("example",".rdf");
>         model.write(new FileOutputStream(file),"RDF/XML");
>         ModelFactory.createDefaultModel().read(new FileInputStream(file),"","RDF/XML");
>     }
>     @Test
>     public void ttl() throws IOException {
>         Property someProperty = ResourceFactory.createProperty("http://example.org/property");
>         Model model = ModelFactory.createDefaultModel();
>         model.createResource("http://example.org/aaa/\nbbb").addProperty(someProperty,"a string");
>         System.out.println("\nTTL:\n");
>         model.write(System.out,"TTL");
>         // test write and read
>         File file = File.createTempFile("example",".ttl");
>         model.write(new FileOutputStream(file),"TTL");
>         ModelFactory.createDefaultModel().read(new FileInputStream(file),"","TTL");
>     }
>     @Test
>     public void nTriples() throws IOException {
>         Property someProperty = ResourceFactory.createProperty("http://example.org/property");
>         Model model = ModelFactory.createDefaultModel();
>         model.createResource("http://example.org/aaa/\nbbb").addProperty(someProperty,"a string");
>         System.out.println("\nN-TRIPLE:\n");
>         model.write(System.out,"N-TRIPLE");
>         // test write and read
>         File file = File.createTempFile("example",".nt");
>         model.write(new FileOutputStream(file),"N-TRIPLE");
>         ModelFactory.createDefaultModel().read(new FileInputStream(file),"","N-TRIPLE");
>     }
>     @Test
>     public void nq() throws IOException {
>         Property someProperty = ResourceFactory.createProperty("http://example.org/property");
>         Model model1 = ModelFactory.createDefaultModel();
>         model1.createResource("http://example.org/aaa/\nbbb").addProperty(someProperty,"a string");
>         Model model2 = ModelFactory.createDefaultModel();
>         model2.createResource("http://example.org/aaa/\nbbb").addProperty(someProperty,"a string");
>         Dataset dataset = DatasetFactory.createGeneral();
>         dataset.setDefaultModel(model1);
>         dataset.addNamedModel("http://example.org/namedGraph",model2);
>         System.out.println("\nNQ:\n");
>         RDFDataMgr.write(System.out, dataset, Lang.NQ) ;
>         // test write and read
>         File file = File.createTempFile("example", ".nq");
>         RDFDataMgr.write(new FileOutputStream(file), dataset, Lang.NQ) ;
>         RDFDataMgr.read(DatasetFactory.createGeneral(), new FileInputStream(file), Lang.NQ) ;
>     }
>     @Test
>     public void trig() throws IOException {
>         Property someProperty = ResourceFactory.createProperty("http://example.org/property");
>         Model model1 = ModelFactory.createDefaultModel();
>         model1.createResource("http://example.org/aaa/\nbbb").addProperty(someProperty,"a string");
>         Model model2 = ModelFactory.createDefaultModel();
>         model2.createResource("http://example.org/aaa/\nbbb").addProperty(someProperty,"a string");
>         Dataset dataset = DatasetFactory.createGeneral();
>         dataset.setDefaultModel(model1);
>         dataset.addNamedModel("http://example.org/namedGraph",model2);
>         System.out.println("\nTRIG:\n");
>         RDFDataMgr.write(System.out, dataset, Lang.TRIG) ;
>         // test write and read
>         File file = File.createTempFile("example", ".trig");
>         RDFDataMgr.write(new FileOutputStream(file), dataset, Lang.TRIG) ;
>         RDFDataMgr.read(DatasetFactory.createGeneral(), new FileInputStream(file), Lang.TRIG) ;
>     }
> }
> {code}
> Outputs (stack traces truncated):
> {code:java}
> N-TRIPLE:
> <http://example.org/aaa/
> bbb> <http://example.org/property> "a string" .
> Apr. 15, 2023 10:01:45 PM org.apache.jena.riot.system.ErrorHandlerFactory$ErrorLogger logError
> SCHWERWIEGEND: [line: 2, col: 1 ] Broken IRI (newline): http://example.org/aaa/
> org.apache.jena.riot.RiotException: [line: 2, col: 1 ] Broken IRI (newline): http://example.org/aaa/
> 	at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:163)
> 	...
> {code}
> {code:java}
> RDF/XML:
> <rdf:RDF
>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>     xmlns:j.0="http://example.org/">
>   <rdf:Description rdf:about="http://example.org/aaa/&#xA;bbb">
>     <j.0:property>a string</j.0:property>
>   </rdf:Description>
> </rdf:RDF>
> {code}
> {code:java}
> NQ:
> <http://example.org/aaa/
> bbb> <http://example.org/property> "a string" .
> <http://example.org/aaa/
> bbb> <http://example.org/property> "a string" <http://example.org/namedGraph> .
> Apr. 15, 2023 10:01:45 PM org.apache.jena.riot.system.ErrorHandlerFactory$ErrorLogger logError
> SCHWERWIEGEND: [line: 2, col: 1 ] Broken IRI (newline): http://example.org/aaa/
> org.apache.jena.riot.RiotException: [line: 2, col: 1 ] Broken IRI (newline): http://example.org/aaa/
> 	at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:163)
> 	...
> {code}
> {code:java}
> TTL:
> <http://example.org/aaa/
> bbb>    <http://example.org/property>  "a string" .
> Apr. 15, 2023 10:01:45 PM org.apache.jena.riot.system.ErrorHandlerFactory$ErrorLogger logError
> SCHWERWIEGEND: [line: 2, col: 1 ] Broken IRI (newline): http://example.org/aaa/
> org.apache.jena.riot.RiotException: [line: 2, col: 1 ] Broken IRI (newline): http://example.org/aaa/
> 	at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:163)
> 	...
> {code}
> {code:java}
> TRIG:
> <http://example.org/aaa/
> bbb>    <http://example.org/property>  "a string" .
> <http://example.org/namedGraph> {
>     <http://example.org/aaa/
>     bbb>    <http://example.org/property>  "a string" .
> }
> Apr. 15, 2023 10:01:45 PM org.apache.jena.riot.system.ErrorHandlerFactory$ErrorLogger logError
> SCHWERWIEGEND: [line: 2, col: 1 ] Broken IRI (newline): http://example.org/aaa/
> org.apache.jena.riot.RiotException: [line: 2, col: 1 ] Broken IRI (newline): http://example.org/aaa/
> 	at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:163)
> 	...
> {code}


