Posted to commits@jena.apache.org by an...@apache.org on 2013/03/09 15:07:00 UTC
svn commit: r1454711 -
/jena/site/trunk/content/documentation/io/output.mdtext
Author: andy
Date: Sat Mar 9 14:06:59 2013
New Revision: 1454711
URL: http://svn.apache.org/r1454711
Log:
Add draft RIOT writer documentation
Added:
jena/site/trunk/content/documentation/io/output.mdtext
Added: jena/site/trunk/content/documentation/io/output.mdtext
URL: http://svn.apache.org/viewvc/jena/site/trunk/content/documentation/io/output.mdtext?rev=1454711&view=auto
==============================================================================
--- jena/site/trunk/content/documentation/io/output.mdtext (added)
+++ jena/site/trunk/content/documentation/io/output.mdtext Sat Mar 9 14:06:59 2013
@@ -0,0 +1,280 @@
+Title: Reading and Writing RDF in Apache Jena
+
+This page describes the RIOT (RDF I/O technology) output capabilities.
+
+See [Reading RDF](index.html) for details of the RIOT Reader system.
+
+See [Advanced RDF/XML Output](rdfxml_howto.html#advanced-rdfxml-output)
+for details of the Jena RDF/XML writer.
+
+- [API](#api)
+- [RDFFormat](#rdfformat)
+- [`RDFFormat`s and Jena syntax names](#rdfformats-and-jena-syntax-names)
+- [Formats](#formats)
+    - [Normal Printing](#normal-printing)
+    - [Pretty Printed Languages](#pretty-printed-languages)
+    - [Streamed Block Formats](#streamed-block-formats)
+    - [Line printed formats](#line-printed-formats)
+    - [N-Triples and N-Quads](#n-triples-and-n-quads)
+    - [RDF/XML](#rdfxml)
+- [Examples](#examples)
+- [Notes](#notes)
+
+## API
+
+There are two ways to write RDF data using Apache Jena RIOT:
+either via `RDFDataMgr`
+
+ RDFDataMgr.write(OutputStream, Model, RDFFormat) ;
+ RDFDataMgr.write(OutputStream, Dataset, RDFFormat) ;
+
+or using the `model` API:
+
+ model.write(output, "<i>format</i>") ;
+
+The `<i>format</i>` names are described below; they are a superset of the
+names Jena has supported before RIOT.
+
+Many variations of these methods exist. See the full javadoc for details.
+
+## `RDFFormat`
+
+Output using RIOT depends on the format, which involves both the language (syntax)
+being written and the variant of that syntax.
+
+The RIOT writer architecture is extensible. The following languages
+are available as part of the standard setup.
+
+* Turtle
+* N-Triples
+* RDF/XML
+* RDF/JSON
+* TriG
+* N-Quads
+
+In addition, there are variants of Turtle and TriG for pretty printing,
+streamed output and flat output. RDF/XML has variants for pretty printing
+and plain output. Jena RIOT uses `org.apache.jena.riot.RDFFormat` as a way
+to identify the language and variant to be written. The class contains constants
+for the standard supported formats.
+
+Note:
+
+* RDF/JSON is not JSON-LD. See the [description of RDF/JSON](rdf-json.html).
+* N3 is treated as Turtle for output.
+
+## `RDFFormat`s and Jena syntax names
+
+The string name traditionally used in `model.write` is mapped to RIOT `RDFFormat`
+as follows:
+
+| Jena writer name | RIOT RDFFormat |
+|----------------------|------------------|
+| `"TURTLE"` | `TURTLE` |
+| `"TTL"` | `TURTLE` |
+| `"Turtle"` | `TURTLE` |
+| `"N-TRIPLES"` | `NTRIPLES` |
+| `"N-TRIPLE"` | `NTRIPLES` |
+| `"NT"` | `NTRIPLES` |
+| `"RDF/XML-ABBREV"` | `RDFXML` |
+| `"RDF/XML"` | `RDFXML_PLAIN` |
+| `"N3"` | `N3` |
+| `"RDF/JSON"` | `RDFJSON` |
+
+## Formats
+
+### Normal Printing
+
+A `Lang` can be used for the writer format, in which case it is mapped to
+an `RDFFormat` internally. The normal writers are:
+
+| RDFFormat or Lang | Description             |
+|-------------------|-------------------------|
+| TURTLE            | Turtle, pretty printed  |
+| TTL               | Same as TURTLE          |
+| NTRIPLES          | N-Triples               |
+| TRIG              | TriG, pretty printed    |
+| NQUADS            | N-Quads                 |
+| RDFXML            | RDF/XML, pretty printed |
+
+Pretty printed RDF/XML is also known as RDF/XML-ABBREV.
+
+### Pretty Printed Languages
+
+All Turtle and TriG formats use
+prefix names, and short forms for literals.
+
+The pretty printed versions of Turtle and TriG print
+data with the same subject in the same graph together.
+All the properties for a given subject are sorted
+into a predefined order. RDF lists are printed as
+`(...)` and `[...]` is used for blank nodes where possible.
+
+The analysis for determining what can be pretty printed requires
+temporary data structures and also a scan of the whole graph before
+writing begins. Therefore, pretty printed formats are not suitable
+for writing persistent graphs and datasets.
+
+When writing at scale use either a "blocked" version of Turtle or TriG,
+or write N-triples/N-Quads.
+
+Example:
+
+ @prefix : <http://example/> .
+ @prefix dc: <http://purl.org/dc/elements/1.1/> .
+ @prefix foaf: <http://xmlns.com/foaf/0.1/> .
+ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
+
+ :book dc:author ( :a :b ) .
+
+ :a a foaf:Person ;
+ foaf:knows [ foaf:name "Bob" ] ;
+ foaf:name "Alice" .
+
+ :b foaf:knows :a .
+
+Pretty printed formats:
+
+| RDFFormat | Same as |
+|----------------|-----------------------|
+| TURTLE_PRETTY | TURTLE, TTL |
+| TRIG_PRETTY | TRIG |
+| RDFXML_PRETTY | RDFXML_ABBREV, RDFXML |
+
+### Streamed Block Formats
+
+The streamed formats write triples or quads as given.
+They group together data by adjacent subject or graph/subject
+in the output stream.
+
+The written data is like the pretty printed forms, but RDF lists
+are not written in the `(...)` form and blank nodes are not
+written in the `[...]` form.
+
+This gives some degree of readability while not requiring
+excessive temporary data structures. Data larger than the size of RAM
+can be written, but blank node labels need to be tracked in order
+to use the short label form.
+
+Example:
+
+ @prefix : <http://example/> .
+ @prefix dc: <http://purl.org/dc/elements/1.1/> .
+ @prefix foaf: <http://xmlns.com/foaf/0.1/> .
+ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
+
+ :book dc:author _:b0 .
+
+ _:b0 rdf:rest _:b1 ;
+ rdf:first :a .
+
+ :a foaf:knows _:b2 ;
+ foaf:name "Alice" ;
+ rdf:type foaf:Person .
+
+ _:b2 foaf:name "Bob" .
+
+ :b foaf:knows :a .
+
+ _:b1 rdf:rest rdf:nil ;
+ rdf:first :b .
+
+Formats:
+
+| RDFFormat |
+|----------------|
+| TURTLE_BLOCKS |
+| TRIG_BLOCKS |
+
+### Line printed formats
+
+There are writers for Turtle and TriG that use the abbreviated formats for
+prefix names and short forms for literals. They write each triple or quad
+on a single line.
+
+The regularity of the output can be useful for text processing of data.
+These formats do not offer more scalability than the streamed forms.
+
+Example:
+
+The FLAT writers abbreviate IRIs, literals and blank node labels
+but always write one complete triple on one line (no use of `;`).
+
+ @prefix : <http://example/> .
+ @prefix dc: <http://purl.org/dc/elements/1.1/> .
+ @prefix foaf: <http://xmlns.com/foaf/0.1/> .
+ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
+ _:b0 foaf:name "Bob" .
+ :book dc:author _:b1 .
+ _:b2 rdf:rest rdf:nil .
+ _:b2 rdf:first :b .
+ :a foaf:knows _:b0 .
+ :a foaf:name "Alice" .
+ :a rdf:type foaf:Person .
+ _:b1 rdf:rest _:b2 .
+ _:b1 rdf:first :a .
+ :b foaf:knows :a .
+
+
+| RDFFormat |
+|-------------|
+| TURTLE_FLAT |
+| TRIG_FLAT |
+
+### N-Triples and N-Quads
+
+These provide the formats that are fastest to write,
+and data of any size can be output. They do not use any
+internal state. They maximise the
+interoperability with other systems and are useful
+for database dumps. They are not human readable,
+even at moderate scale.
+
+The files can be large but they compress well with gzip.
+Compression ratios of 8x to 10x can often be obtained.
+
+Example:
+
+The N-Triples writer makes no attempt to make its output readable.
+It uses internal blank node labels to ensure correct labeling without
+needing any writer state.
+
+ _:BX2Dc2b3371X3A13cf8faaf53X3AX2D7fff <http://xmlns.com/foaf/0.1/name> "Bob" .
+ <http://example/book> <http://purl.org/dc/elements/1.1/author> _:BX2Dc2b3371X3A13cf8faaf53X3AX2D7ffe .
+ _:BX2Dc2b3371X3A13cf8faaf53X3AX2D7ffd <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .
+ _:BX2Dc2b3371X3A13cf8faaf53X3AX2D7ffd <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> <http://example/b> .
+ <http://example/a> <http://xmlns.com/foaf/0.1/knows> _:BX2Dc2b3371X3A13cf8faaf53X3AX2D7fff .
+ <http://example/a> <http://xmlns.com/foaf/0.1/name> "Alice" .
+ <http://example/a> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
+ _:BX2Dc2b3371X3A13cf8faaf53X3AX2D7ffe <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:BX2Dc2b3371X3A13cf8faaf53X3AX2D7ffd .
+ _:BX2Dc2b3371X3A13cf8faaf53X3AX2D7ffe <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> <http://example/a> .
+ <http://example/b> <http://xmlns.com/foaf/0.1/knows> <http://example/a> .
+
+
+| RDFFormat | Other names |
+|-----------|-------------|
+| NTRIPLES  | NTRIPLE, NT |
+| NQUADS    | NQ          |
+
+### RDF/XML
+
+RIOT supports output in RDF/XML. The RIOT `RDFFormat` defaults to pretty printed RDF/XML,
+while the Jena writer name defaults to a streaming plain output.
+
+| RDFFormat    | Other names                  | Jena writer name |
+|--------------|------------------------------|------------------|
+| RDFXML       | RDFXML_PRETTY, RDFXML_ABBREV | "RDF/XML-ABBREV" |
+| RDFXML_PLAIN |                              | "RDF/XML"        |
+
+## Examples
+
+@@TODO
+
+## Notes
+
+Using `OutputStream`s is strongly encouraged. This allows the writers
+to manage the character encoding using UTF-8. Using `java.io.Writer`
+does not allow this; on platforms such as MS Windows, the default
+configuration of a `Writer` is not suitable for Turtle because
+the character set is the platform default, and not UTF-8.
+The only useful case for writers is `java.io.StringWriter`.
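
The note on `OutputStream`s that closes the added page can be illustrated with a minimal sketch that uses only the Java standard library. It is not part of the commit and does not call any Jena API: `Utf8OutputExample`, `writeTurtle` and `toUtf8Bytes` are hypothetical names, and `writeTurtle` merely stands in for an RDF writer that is handed a byte stream and fixes the encoding to UTF-8 itself, rather than inheriting the platform default from a caller-supplied `Writer`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is not Jena code.
class Utf8OutputExample {

    // Stand-in for an RDF writer: because it receives an OutputStream,
    // it can choose UTF-8 explicitly instead of the platform default charset.
    static void writeTurtle(OutputStream out, String name) throws IOException {
        Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        w.write("<http://example/a> <http://xmlns.com/foaf/0.1/name> \"" + name + "\" .\n");
        w.flush(); // push buffered characters through to the byte stream
    }

    // Convenience wrapper that captures the bytes produced by writeTurtle.
    static byte[] toUtf8Bytes(String name) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            writeTurtle(bytes, name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // U+00EB is outside ASCII, so the chosen charset affects the bytes written.
        byte[] data = toUtf8Bytes("Zo\u00EB");
        System.out.println(data.length + " bytes written as UTF-8");
    }
}
```

Had the caller passed in a `java.io.FileWriter` created without an explicit charset, the bytes on disk would depend on the platform configuration, which is the situation the note warns about.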