Posted to commits@jena.apache.org by jp...@apache.org on 2014/07/31 13:45:21 UTC
svn commit: r1614863 - in /jena/site/trunk/content/documentation/csv: ./
index.mdtext
Author: jpz6311whu
Date: Thu Jul 31 11:45:20 2014
New Revision: 1614863
URL: http://svn.apache.org/r1614863
Log:
documentation for JENA-625
Added:
jena/site/trunk/content/documentation/csv/
jena/site/trunk/content/documentation/csv/index.mdtext
Added: jena/site/trunk/content/documentation/csv/index.mdtext
URL: http://svn.apache.org/viewvc/jena/site/trunk/content/documentation/csv/index.mdtext?rev=1614863&view=auto
==============================================================================
--- jena/site/trunk/content/documentation/csv/index.mdtext (added)
+++ jena/site/trunk/content/documentation/csv/index.mdtext Thu Jul 31 11:45:20 2014
@@ -0,0 +1,55 @@
+Title: CSV PropertyTable
+
+This module is about getting CSVs into a form that is amenable to Jena SPARQL processing, and doing so in a way that is not specific to CSV files.
+It includes getting the right architecture in place for regular, table-shaped data, using the core abstraction of PropertyTable.
+
+*Illustration*
+
+This module involves the basic mapping of CSV to RDF using a fixed algorithm, including interpreting data as numbers or strings.
+
+Suppose we have a CSV file located at "file:///c:/town.csv", with one header row and two data rows:
+
+    Town,Population
+    Southton,123000
+    Northville,654000
+
+As RDF this might be viewable as:
+
+    @prefix : <file:///c:/town.csv#> .
+    @prefix csv: <http://w3c/future-csv-vocab/> .
+    [ csv:row 1 ; :Town "Southton" ; :Population "123000"^^<http://www.w3.org/2001/XMLSchema#int> ] .
+    [ csv:row 2 ; :Town "Northville" ; :Population "654000"^^<http://www.w3.org/2001/XMLSchema#int> ] .
+
+or without the bnode abbreviation:
+
+    @prefix : <file:///c:/town.csv#> .
+    @prefix csv: <http://w3c/future-csv-vocab/> .
+    _:b0 csv:row 1 ;
+         :Town "Southton" ;
+         :Population "123000"^^<http://www.w3.org/2001/XMLSchema#int> .
+    _:b1 csv:row 2 ;
+         :Town "Northville" ;
+         :Population "654000"^^<http://www.w3.org/2001/XMLSchema#int> .
+
+Each row models one "entity" (here, a population observation).
+There is a subject (a blank node) and one predicate-value pair for each cell of the row.
+Row numbers are added because the order of rows in the file can be important.
+Now that the CSV file is viewed as a graph, normal, unmodified SPARQL can be used.
+Multiple CSV files can be loaded as multiple graphs in one dataset, allowing queries across different data sources.
+
+We can use the following SPARQL query for "Towns over 500,000 people" mentioned in the CSV file:
+
+    SELECT ?townName ?pop {
+      GRAPH <file:///c:/town.csv> {
+        ?x :Town ?townName ;
+           :Population ?pop .
+        FILTER(?pop > 500000)
+      }
+    }
+
+## Get Started
+
+## Design
+
+## Implementation
+
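The mapping described in the committed page (one blank node per row, a `csv:row` number, and one predicate-value pair per cell) can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the Jena module's implementation: the function names `parse_cell`, `csv_to_triples` and `towns_over` are hypothetical, the rule "try integer, then float, otherwise string" is an assumed reading of "interpreting data as numbers or strings", and the final filter is a plain-Python stand-in for the SPARQL query rather than a SPARQL engine.

```python
import csv
import io

# The example file from the page, inlined so the sketch is self-contained.
TOWN_CSV = "Town,Population\nSouthton,123000\nNorthville,654000\n"


def parse_cell(text):
    """Assumed typing rule: try integer, then float, otherwise keep the string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def csv_to_triples(source):
    """Yield (subject, predicate, value) triples, one blank node per data row."""
    reader = csv.reader(source)
    header = next(reader)  # the single header row supplies the predicate names
    for row_number, cells in enumerate(reader, start=1):
        subject = f"_:b{row_number - 1}"  # blank node label, as in the Turtle example
        yield (subject, "csv:row", row_number)
        for column, cell in zip(header, cells):
            yield (subject, f":{column}", parse_cell(cell))


def towns_over(triples, threshold):
    """Group triples by subject, then keep towns whose population exceeds threshold."""
    rows = {}
    for subject, predicate, value in triples:
        rows.setdefault(subject, {})[predicate] = value
    return [
        (row[":Town"], row[":Population"])
        for row in rows.values()
        if isinstance(row.get(":Population"), (int, float))
        and row[":Population"] > threshold
    ]


triples = list(csv_to_triples(io.StringIO(TOWN_CSV)))
result = towns_over(triples, 500000)
print(result)  # [('Northville', 654000)]
```

Keeping the row number as an ordinary triple, rather than relying on list position, mirrors the page's point that row order is preserved explicitly in the graph.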