Posted to commits@age.apache.org by jo...@apache.org on 2022/03/29 20:01:07 UTC

[incubator-age-website] branch master updated: Age laod documentation (#31)

This is an automated email from the ASF dual-hosted git repository.

joshinnis pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-age-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 5bed73c  Age laod documentation (#31)
5bed73c is described below

commit 5bed73c5cdab7af74005e05af708ddce385e82a4
Author: Shoaib <mu...@gmail.com>
AuthorDate: Tue Mar 29 22:01:04 2022 +0200

    Age laod documentation (#31)
    
    * added ageload documentation
---
 docs/index.rst       |   1 +
 docs/intro/agload.md | 158 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 159 insertions(+)

diff --git a/docs/index.rst b/docs/index.rst
index 109064c..854b1fc 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -14,6 +14,7 @@ Apache AGE's documentation
    intro/comparability
    intro/operators
    intro/aggregation
+   intro/agload
 
 .. toctree::
    :caption: Clauses
diff --git a/docs/intro/agload.md b/docs/intro/agload.md
new file mode 100644
index 0000000..e37595f
--- /dev/null
+++ b/docs/intro/agload.md
@@ -0,0 +1,158 @@
+# Importing a graph from files
+The following instructions describe how to create a graph from CSV files. This document covers:
+- the branch that currently contains the functions for loading graphs from files
+- the functions that create vertices and edges from files
+- the structure of the CSV files that the load functions take as input, including dos and don'ts
+- a simple example script that loads countries and cities from files
+
+## Getting the current branch
+The current implementation is available on a fork I have created and has not yet been merged into the master branch. You can download the fork from the link below. Please make sure that you use the `create-graph-from-files-b1` branch and not some other branch, as it has been prepared specifically for this purpose.
+
+https://github.com/muhammadshoaib/incubator-age/tree/create-graph-from-files-b1
+
+## Load Graph functions 
+The following functions create vertices and edges from a file.
+
+The function `load_labels_from_file` loads vertices from a CSV file.
+
+```sql
+load_labels_from_file('<graph name>', 
+                      '<label name>',
+                      '<file path>')
+```
+
+Passing `false` as the optional fourth parameter (`id_field_exists`) excludes the id field. ***Use this when there is no id field in the file.***
+
+```sql
+load_labels_from_file('<graph name>', 
+                      '<label name>',
+                      '<file path>', 
+                      false)
+```
+
+The function `load_edges_from_file` loads edges from a CSV file. The expected file structure is described below.
+
+Note: make sure that the ids in the edge file match the ids in the vertex files.
+
+```sql
+load_edges_from_file('<graph name>',
+                     '<label name>',
+                     '<file path>');
+```
+
+## Explanation of the CSV format
+The CSV files for vertices and edges must be structured as follows.
+
+- A CSV file for nodes shall be formatted as follows:
+
+| Field name | Field description                                            |
+| ---------- | ------------------------------------------------------------ |
+| id         | It shall be the first column of the file, and all values shall be positive integers. This field is optional when `id_field_exists` is ***false***. However, it must be present when `id_field_exists` is ***not*** set to false.  |
+| Properties | All other columns contain the properties of the nodes. The header row shall contain the property names. |
+
+- Similarly, a CSV file for edges shall be formatted as follows:
+
+| Field name        | Field description                                            |
+| ----------------- | ------------------------------------------------------------ |
+| start_id          | Id of the node at which the edge starts. This id shall be present in the nodes.csv file. |
+| start_vertex_type | Class of the start node                                      |
+| end_id            | Id of the node at which the edge ends                        |
+| end_vertex_type   | Class of the end node                                        |
+| properties        | Properties of the edge. The header row shall contain the property names. |
+
+Example files can be found in `regress/age_load/data`.
+
+## Example SQL script 
+
+- Load the extension and create a graph
+```sql
+LOAD 'age';
+
+SET search_path TO ag_catalog;
+SELECT create_graph('agload_test_graph');
+```
+
+- Create the label `Country` and load vertices from a CSV file. ***Note: this CSV file has an id field.***
+
+```sql
+SELECT create_vlabel('agload_test_graph','Country');
+SELECT load_labels_from_file('agload_test_graph',
+                             'Country',
+                             'age_load/countries.csv');
+```
+
+- Create the label `City` and load vertices from a CSV file. ***Note: this CSV file has an id field.***
+
+```sql
+SELECT create_vlabel('agload_test_graph','City');
+SELECT load_labels_from_file('agload_test_graph',
+                             'City', 
+                             'age_load/cities.csv');
+```
+
+- Create the label `has_city` and load edges from a CSV file.
+
+```sql
+SELECT create_elabel('agload_test_graph','has_city');
+SELECT load_edges_from_file('agload_test_graph', 'has_city',
+     'age_load/edges.csv');
+```
+
+- Check that the graph has been loaded properly
+
+```sql
+SELECT table_catalog, table_schema, table_name, table_type
+FROM information_schema.tables
+WHERE table_schema = 'agload_test_graph';
+
+SELECT COUNT(*) FROM agload_test_graph."Country";
+SELECT COUNT(*) FROM agload_test_graph."City";
+SELECT COUNT(*) FROM agload_test_graph."has_city";
+
+SELECT COUNT(*) FROM cypher('agload_test_graph', $$MATCH(n) RETURN n$$) as (n agtype);
+SELECT COUNT(*) FROM cypher('agload_test_graph', $$MATCH (a)-[e]->(b) RETURN e$$) as (n agtype);
+```
+
+### Creating vertices without an id field in the file
+
+- Create the label `Country2` and load vertices from a CSV file. ***Note: this CSV file has no id field.***
+
+```sql
+SELECT create_vlabel('agload_test_graph','Country2');
+SELECT load_labels_from_file('agload_test_graph',
+                             'Country2',
+                             'age_load/countries.csv', 
+                             false);
+```
+
+- Create the label `City2` and load vertices from a CSV file. ***Note: the fourth parameter is `false`, so the ids are created automatically.***
+```sql
+SELECT create_vlabel('agload_test_graph','City2');
+SELECT load_labels_from_file('agload_test_graph',
+                             'City2',
+                             'age_load/cities.csv', 
+                             false);
+```
+- Check that the graph has been loaded properly and compare the automatically created ids with the ids picked from the files.
+
+- Labels `Country` and `City` were created with the id field in the file.
+- Labels `Country2` and `City2` were created with no id field in the file.
+```sql
+SELECT COUNT(*) FROM agload_test_graph."Country2";
+SELECT COUNT(*) FROM agload_test_graph."City2";
+
+SELECT id FROM agload_test_graph."Country" LIMIT 10;
+SELECT id FROM agload_test_graph."Country2" LIMIT 10;
+
+SELECT * FROM cypher('agload_test_graph', $$MATCH(n:Country {iso2 : 'BE'})
+    RETURN id(n), n.name, n.iso2 $$) as ("id(n)" agtype, "n.name" agtype, "n.iso2" agtype);
+SELECT * FROM cypher('agload_test_graph', $$MATCH(n:Country2 {iso2 : 'BE'})
+    RETURN id(n), n.name, n.iso2 $$) as ("id(n)" agtype, "n.name" agtype, "n.iso2" agtype);
+
+SELECT * FROM cypher('agload_test_graph', $$MATCH(n:Country {iso2 : 'AT'})
+    RETURN id(n), n.name, n.iso2 $$) as ("id(n)" agtype, "n.name" agtype, "n.iso2" agtype);
+SELECT * FROM cypher('agload_test_graph', $$MATCH(n:Country2 {iso2 : 'AT'})
+    RETURN id(n), n.name, n.iso2 $$) as ("id(n)" agtype, "n.name" agtype, "n.iso2" agtype);
+
+SELECT drop_graph('agload_test_graph', true);
+```
\ No newline at end of file