You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@age.apache.org by GitBox <gi...@apache.org> on 2021/08/20 13:58:11 UTC

[GitHub] [incubator-age] pdpotter commented on issue #45: unique properties / indexing

pdpotter commented on issue #45:
URL: https://github.com/apache/incubator-age/issues/45#issuecomment-902711546


   I found a (1000 times faster) workaround for the creation of edges between vertices that were added earlier. The workaround consists of inserting the data directly in the underlying tables that are used by Apache AGE.
   
   First, the underlying ids of the vertices are retrieved:
   ```
   SELECT * FROM cyper('graph_name',$$
   MATCH (n:LabelA)
   return id(n), n.id
   $$) as (id agtype, prop agtype);
   ```
   A single edge is then created using the cypher function to make sure the underlying tables are created correctly (the id values are made up)
   ```
   SELECT * FROM cypher('graph_name', $$
   MATCH (d:LabelA), (r:LabelB)
   WHERE id(d) = 11111 and id(r) = 11112
   CREATE (d)-[:RelationA {prop: 'value'}]->(r)
   $$) as (a agtype);
   ```
   Indices on start_id and end_id are then created in the underlying table for the edges of a certain type, to speed up the inserts in the `_ag_label_edge` table later on:
   ```
   CREATE INDEX RelationA__start_id ON graph_name.RelationA(start_id);
   CREATE INDEX RelationA__end_id ON graph_name.RelationA(end_id);
   ```
   All other edges are created by direct insertion, first in the specific edge table (using [executemany](https://magicstack.github.io/asyncpg/current/api/index.html#asyncpg.connection.Connection.executemany)):
   ```
   INSERT INTO graph_name.RelationA (start_id, end_id, properties)
   VALUES ($1, $2, $3)
   ```
   Where `$1` are the domain_ids, `$2` are the range_ids and `$3` are the properties (as json dump).
   
   Secondly, the edges are also directly inserted into the `_ag_label_edge` table (also using executemany). In my use case, each edge has an `id` property that can be used to select the correct edge when there are multiple edges between two vertices:
   ```
   INSERT INTO graph_name._ag_label_edge (id, start_id, end_id, properties)
   VALUES (
       (
           SELECT id from graph_name.RelationA
           WHERE start_id = $1
           AND end_id = $2
           AND properties::text::json->>'id' = $3
       ),
       $1,
       $2,
       $4
   )
   ```
   Where `$1` are the domain_ids, `$2` are the range_ids, `$3` are the relation ids and `$4` are the properties (as json dump).
   
   Any thoughts on this workaround?
   Are there any plans to add property indexes to Apache AGE later on?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@age.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org