Posted to commits@jena.apache.org by bu...@apache.org on 2014/08/15 08:52:10 UTC

svn commit: r919406 - in /websites/staging/jena/trunk/content: ./ documentation/csv/design.html documentation/csv/get_started.html

Author: buildbot
Date: Fri Aug 15 06:52:09 2014
New Revision: 919406

Log:
Staging update by buildbot for jena

Modified:
    websites/staging/jena/trunk/content/   (props changed)
    websites/staging/jena/trunk/content/documentation/csv/design.html
    websites/staging/jena/trunk/content/documentation/csv/get_started.html

Propchange: websites/staging/jena/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Aug 15 06:52:09 2014
@@ -1 +1 @@
-1617706
+1618109

Modified: websites/staging/jena/trunk/content/documentation/csv/design.html
==============================================================================
--- websites/staging/jena/trunk/content/documentation/csv/design.html (original)
+++ websites/staging/jena/trunk/content/documentation/csv/design.html Fri Aug 15 06:52:09 2014
@@ -184,7 +184,7 @@ They are supposed to be compatible with 
 <p><a href="https://svn.apache.org/repos/asf/jena/Experimental/jena-csv/src/main/java/org/apache/jena/propertytable/impl/GraphCSV.java">GraphCSV</a> is a sub class of GraphPropertyTable aiming at CSV data.
 Its constructor takes a CSV file path as the parameter, parses the file using a CSV Parser, and makes a <code>PropertyTable</code> through <code>PropertyTableBuilder</code>.</p>
 <p>For CSV to RDF mapping, we establish some basic principles:</p>
-<h3 id="single-value-and-regular-shaped-csv-only">Single-Value and Regular-Shaped CSV only</h3>
+<h3 id="single-value-and-regular-shaped-csv-only">Single-Value and Regular-Shaped CSV Only</h3>
 <p>In the <a href="https://www.w3.org/2013/csvw/wiki/Main_Page">CSV-WG</a>, it looks like duplicate column names are not going to be supported. Therefore, we just consider parsing single-valued CSV tables. 
 The current editor's working <a href="http://w3c.github.io/csvw/syntax/">draft</a> from the CSV on the Web Working Group defines more regular data out of CSV.
 This is the target for the CSV work of GraphCSV: tabular regular-shaped CSV; not arbitrary, irregularly shaped CSV.</p>
@@ -196,10 +196,15 @@ It's not necessary to have a defined pri
 <li>The triples for row N have a subject URI which is <code>&lt;FILE#_N&gt;</code>.</li>
 </ol>
 <h3 id="data-type-for-typed-literal">Data Type for Typed Literal</h3>
-<p>All the values in CSV are parsed as strings line by line. As a better option for the user to turn on, a dynamic choice which is a posh way of saying attempt to parse it as an integer (or decimal, double, date) and if it passes, it's an integer (or decimal, double, date).</p>
+<p>All the values in CSV are parsed as strings line by line. As an option the user can turn on, a dynamic choice attempts to parse each value as an integer (or decimal, double, date); if parsing succeeds, the value is typed as an integer (or decimal, double, date).
+Note that for the current release, all of the numbers are parsed as <code>double</code>, and <code>date</code> is not supported yet.</p>
 <h3 id="file-path-as-namespace">File Path as Namespace</h3>
 <p>RDF requires that the subjects and the predicates are URIs. We need to pass in the namespaces (or just the default namespaces) to make URIs by combining the namespaces with the values in CSV.
 We don’t have metadata of the namespaces for the columns, but subjects can be blank nodes, which is useful because each row is then a new blank node. For predicates, suppose the URL of the CSV file is <code>file:///c:/town.csv</code>; then the columns can be <code>&lt;file:///c:/town.csv#Town&gt;</code> and <code>&lt;file:///c:/town.csv#Population&gt;</code>, as shown in the illustration.</p>
+<h3 id="first-line-of-table-header-needed-as-predicates">First Line of Table Header Needed as Predicates</h3>
+<p>The first line of the CSV file must be the table header. The columns of the first line are parsed as the predicates of the RDF triples. The RDF triple data are parsed starting from the second line.</p>
+<h3 id="utf-8-encoded-only">UTF-8 Encoded Only</h3>
+<p>The CSV files must be UTF-8 encoded. If your CSV files are using Western European encodings, please change the encoding before using CSV PropertyTable.</p>
   </div>
 </div>
 

Modified: websites/staging/jena/trunk/content/documentation/csv/get_started.html
==============================================================================
--- websites/staging/jena/trunk/content/documentation/csv/get_started.html (original)
+++ websites/staging/jena/trunk/content/documentation/csv/get_started.html Fri Aug 15 06:52:09 2014
@@ -145,9 +145,28 @@
 	<div class="col-md-12">
 	<div id="breadcrumbs"></div>
 	<h1 class="title">CSV PropertyTable - Get Started
</h1>
-  <h2 id="using-csv-propertytable-from-java-through-the-api">Using CSV PropertyTable from Java through the API</h2>
-<p>For the users, <a href="https://svn.apache.org/repos/asf/jena/Experimental/jena-csv/src/main/java/org/apache/jena/propertytable/impl/GraphCSV.java">GraphCSV</a> is the only class required to run CSV PropertyTable.
-GraphCSV wrappers a CSV file as a Graph, which makes a Model for SPARQL query:</p>
+  <h2 id="using-csv-propertytable-with-apache-maven">Using CSV PropertyTable with Apache Maven</h2>
+<p>See <a href="http://jena.apache.org/download/maven.html">"Using Jena with Apache Maven"</a> for full details.</p>
+<div class="codehilite"><pre><span class="nt">&lt;dependency&gt;</span>
+   <span class="nt">&lt;groupId&gt;</span>org.apache.jena<span class="nt">&lt;/groupId&gt;</span>
+   <span class="nt">&lt;artifactId&gt;</span>jena-csv<span class="nt">&lt;/artifactId&gt;</span>
+   <span class="nt">&lt;version&gt;</span>X.Y.Z<span class="nt">&lt;/version&gt;</span>
+<span class="nt">&lt;/dependency&gt;</span>
+</pre></div>
+
+
+<h2 id="using-csv-propertytable-from-java-through-the-api">Using CSV PropertyTable from Java through the API</h2>
+<p>To switch on CSV PropertyTable, register <code>LangCSV</code> with <a href="http://jena.apache.org/documentation/io/">Jena RIOT</a> through a simple method call:</p>
+<div class="codehilite"><pre><span class="n">import</span> <span class="n">org</span><span class="p">.</span><span class="n">apache</span><span class="p">.</span><span class="n">jena</span><span class="p">.</span><span class="n">propertytable</span><span class="p">.</span><span class="n">lang</span><span class="p">.</span><span class="n">LangCSV</span><span class="p">;</span>
+<span class="p">...</span> 
+<span class="n">LangCSV</span><span class="p">.</span><span class="n">register</span><span class="p">();</span>
+</pre></div>
+
+
+<p>This static registration call needs to run only once per application, before using CSV PropertyTable (e.g. during the initialization phase).</p>
+<p>Once registered, CSV PropertyTable provides two ways to work with CSV data: GraphCSV and RIOT.</p>
+<h3 id="graphcsv">GraphCSV</h3>
+<p><a href="https://svn.apache.org/repos/asf/jena/Experimental/jena-csv/src/main/java/org/apache/jena/propertytable/graph/GraphCSV.java">GraphCSV</a> wraps a CSV file as a Graph, which can back a Model for SPARQL queries:</p>
 <div class="codehilite"><pre><span class="n">Model</span> <span class="n">model</span> <span class="p">=</span> <span class="n">ModelFactory</span><span class="p">.</span><span class="n">createModelForGraph</span><span class="p">(</span><span class="n">new</span> <span class="n">GraphCSV</span><span class="p">(</span>&quot;<span class="n">data</span><span class="p">.</span><span class="n">csv</span>&quot;<span class="p">))</span> <span class="p">;</span>
 <span class="n">QueryExecution</span> <span class="n">qExec</span> <span class="p">=</span> <span class="n">QueryExecutionFactory</span><span class="p">.</span><span class="n">create</span><span class="p">(</span><span class="n">query</span><span class="p">,</span> <span class="n">model</span><span class="p">)</span> <span class="p">;</span>
 </pre></div>
@@ -165,8 +184,22 @@ GraphCSV wrappers a CSV file as a Graph,
 </pre></div>
 
 
-<p>You can also find the full examples from <a href="https://svn.apache.org/repos/asf/jena/Experimental/jena-csv/src/test/java/org/apache/jena/propertytable/impl/GraphCSVTest.java">GraphCSVTest</a>.</p>
-<p>In short, for Jena ARP, a CSV table is actually a Graph (i.e. GraphCSV), without any differences from other types of Graphs when using it from the Jena ARQ API.</p>
+<p>You can also find the full examples from <a href="https://svn.apache.org/repos/asf/jena/Experimental/jena-csv/src/test/java/org/apache/jena/propertytable/graph/GraphCSVTest.java">GraphCSVTest</a>.</p>
+<p>In short, for Jena ARQ, a CSV table is actually a Graph (i.e. GraphCSV), without any differences from other types of Graphs when using it from the Jena ARQ API.</p>
+<h3 id="riot">RIOT</h3>
+<p>When LangCSV is registered with RIOT, CSV PropertyTable adds a new RDF syntax, '.csv', with the content type "text/csv".
+You can read ".csv" files into a Model following standard RIOT usage:</p>
+<div class="codehilite"><pre><span class="c1">// Usage 1: Direct reading through Model</span>
+<span class="n">Model</span> <span class="n">model_1</span> <span class="o">=</span> <span class="n">ModelFactory</span><span class="p">.</span><span class="n">createDefaultModel</span><span class="p">()</span> <span class="p">;</span>
+<span class="n">model_1</span><span class="p">.</span><span class="n">read</span><span class="p">(</span><span class="s">&quot;test.csv&quot;</span><span class="p">)</span> <span class="p">;</span>
+
+<span class="c1">// Usage 2: Reading using RDFDataMgr</span>
+<span class="n">Model</span> <span class="n">model_2</span> <span class="o">=</span> <span class="n">RDFDataMgr</span><span class="p">.</span><span class="n">loadModel</span><span class="p">(</span><span class="s">&quot;test.csv&quot;</span><span class="p">)</span> <span class="p">;</span>
+</pre></div>
+
+
+<p>For more information, see <a href="http://jena.apache.org/documentation/io/rdf-input.html">Reading RDF in Apache Jena</a>.</p>
+<p>Note that the requirements for the CSV files are listed in the <a href="design.html">Design</a> documentation. CSV PropertyTable only supports single-valued, regular-shaped, table-headed and UTF-8-encoded CSV files.</p>
 <h2 id="command-line-tool">Command Line Tool</h2>
 <p><a href="https://svn.apache.org/repos/asf/jena/Experimental/jena-csv/src/main/java/riotcmd/csv2rdf.java">csv2rdf</a> is a tool for directly transforming CSV into the RDF syntax of N-Triples.
 The script calls the <code>csv2rdf</code> java program in the <code>riotcmd</code> package in this way:</p>