Posted to dev@age.apache.org by GitBox <gi...@apache.org> on 2022/10/12 19:08:08 UTC

[GitHub] [age] TropicalPenguin opened a new issue, #330: Invalid Result Returned for OPTIONAL MATCH ... WITH ...

TropicalPenguin opened a new issue, #330:
URL: https://github.com/apache/age/issues/330

   **Describe the bug**
   
   After setting up the following data:
   `CREATE (a:A)-[:incs]->(:C), (a)-[:incs]->(:C) RETURN a`
   
   The node labelled A has two outgoing edges, each going to a node labelled C.
   
   With AGE, I can successfully match these two nodes with the following query, substituting for 0 the ID of a:
   
   `MATCH (a:A) WHERE ID(a)=0 WITH a OPTIONAL MATCH (a)-[:incs]->(c) RETURN c`.
   
   _However_, a slightly more complex query (again substituting the generated ID as needed), meant to capture only the C nodes that have at most one 'incs' edge, makes AGE return an invalid result:
   `MATCH (a:A) WHERE ID(a)=0 WITH a OPTIONAL MATCH (a)-[:incs]->(c)-[d:incs]-() WITH a,c,COUNT(d) AS deps WHERE deps<=1 RETURN c,deps`
   
   **How are you accessing AGE (Command line, driver, etc.)?**
   - Golang Driver, Command Line
   
   **What data setup do we need to do?**
   
   Try running the following Go program:
   ```go
   package main
   
   import (
   	"database/sql"
   	"fmt"
   
   	_ "github.com/lib/pq"
   	"github.com/rhizome-ai/apache-age-go/age"
   )
   
   func main() {
   	dsn := "postgres://postgres:postgres@127.0.0.1:5434/postgres?sslmode=disable"
   	db, err := sql.Open("postgres", dsn)
   	if err != nil {
   		panic(err)
   	}
   
   	graphName := "test"
   
   	_, err = age.GetReady(db, graphName)
   	if err != nil {
   		panic(err)
   	}
   
   	tx, err := db.Begin()
   	if err != nil {
   		panic(err)
   	}
   
   	cursor, err := age.ExecCypher(tx, graphName, 0, "MATCH (n) DETACH DELETE n")
   	if err != nil {
   		panic(err)
   	}
   
   	cursor, err = age.ExecCypher(tx, graphName, 1, "CREATE (a:A)-[:incs]->(:C), (a)-[:incs]->(:C) RETURN a")
   	if err != nil {
   		panic(err)
   	}
   
   	var row []age.Entity
   	if cursor.Next() {
   		row, err = cursor.GetRow()
   		if err != nil {
   			panic(err)
   		}
   		aid := row[0].(*age.Vertex).Id()
   
   		tx.Commit()
   
   		tx, err = db.Begin()
   		if err != nil {
   			panic(err)
   		}
   
   		// This query can work with AGE, returning each of the two nodes with label 'C':
   		//q := fmt.Sprintf("MATCH (a:A) WHERE ID(a)=%d WITH a OPTIONAL MATCH (a)-[:incs]->(c) RETURN c", aid)
   
   		// Whereas this returns a single row, with a null and a 0
   		// Expected: should return each node labelled C, each with a value of 1 for 'deps'
   		q := fmt.Sprintf("MATCH (a:A) WHERE ID(a)=%d WITH a OPTIONAL MATCH (a)-[:incs]->(c)-[d:incs]-() WITH a,c,COUNT(d) AS deps WHERE deps<=1 RETURN c,deps", aid)
   		fmt.Println("Test query:", q)
   
   		cursor, err = age.ExecCypher(tx, graphName, 2, q)
   		if err == nil {
   			for cursor.Next() {
   				row, err = cursor.GetRow()
   				if err != nil {
   					panic(err)
   				}
   
   				for rowIdx, v := range row {
   					fmt.Println(rowIdx, len(row), v)
   				}
   			}
   		} else {
   			fmt.Println("ERROR:", err)
   		}
   	} else {
   		tx.Commit()
   
   		tx, err = db.Begin()
   		if err != nil {
   			panic(err)
   		}
   	}
   
   	_, err = tx.Exec(fmt.Sprintf("SELECT drop_graph('%s', true);", graphName))
   	if err != nil {
   		panic(err)
   	}
   	tx.Commit()
   }
   ```
   
   **What is the necessary configuration info needed?**
   - N/A
   
   **What is the command that caused the error?**
   ```go
   		q := fmt.Sprintf("MATCH (a:A) WHERE ID(a)=%d WITH a OPTIONAL MATCH (a)-[:incs]->(c)-[d:incs]-() WITH a,c,COUNT(d) AS deps WHERE deps<=1 RETURN c,deps", aid)
   
   		cursor, err = age.ExecCypher(tx, graphName, 2, q)
   ```
   When the cursor is traversed in the lines that follow, this prints:
   ```
   0 2 <nil>
   1 2 0
   ```
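   
   A possible explanation for the single `<nil>`/`0` row, assuming standard Cypher semantics: when the OPTIONAL MATCH pattern matches nothing, it still yields one row with `c` and `d` set to null, `COUNT(d)` skips nulls and gives 0, and 0 passes `deps<=1`. The Go sketch below mimics that pipeline in memory. The names `row`, `optional` and `countDeps` are hypothetical, and grouping is done by `c` only because there is a single `a`.
   
   ```go
   package main
   
   import "fmt"
   
   // row is one binding of the pattern variables c and d; nil stands for Cypher null.
   type row struct{ c, d *int }
   
   // optional returns the matches unchanged, or a single all-null row when
   // there are none, mirroring what OPTIONAL MATCH does.
   func optional(matches []row) []row {
   	if len(matches) == 0 {
   		return []row{{}}
   	}
   	return matches
   }
   
   // countDeps groups rows by c and counts the non-null d values, like COUNT(d).
   func countDeps(rows []row) map[string]int {
   	deps := make(map[string]int)
   	for _, r := range rows {
   		key := "null"
   		if r.c != nil {
   			key = fmt.Sprint(*r.c)
   		}
   		if _, ok := deps[key]; !ok {
   			deps[key] = 0 // the group exists even if every d is null
   		}
   		if r.d != nil {
   			deps[key]++
   		}
   	}
   	return deps
   }
   
   func main() {
   	// Assumption: the pattern (a)-[:incs]->(c)-[d:incs]-() matched nothing.
   	for c, n := range countDeps(optional(nil)) {
   		if n <= 1 { // WHERE deps<=1
   			fmt.Println(c, n)
   		}
   	}
   }
   ```
   
   Running it prints `null 0`. With `deps=1` as the filter instead, no row would survive.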
   
   **Expected behavior**
   Whereas it is expected to print something more like:
   ```
   0 2 V{id:1407374883553283, label:C, props:map[]}
   1 2 1
   0 2 V{id:1407374883553284, label:C, props:map[]}
   1 2 1
   ```
   
   **Environment (please complete the following information):**
   - Version: 1.0.0
   
   **Additional context**
   N/A
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@age.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [age] TropicalPenguin commented on issue #330: Invalid Result Returned for OPTIONAL MATCH ... WITH ...

Posted by GitBox <gi...@apache.org>.
TropicalPenguin commented on issue #330:
URL: https://github.com/apache/age/issues/330#issuecomment-1283474368

   > Btw, as it is always possible that we got something incorrect or overlooked something, I ran your query on Neo4j to make sure that our implementation conforms with theirs and it also returned zero rows.
   
   Plot twist! Huh. This is interesting. I tested these in RedisGraph before creating this issue. And in RedisGraph, these commands give me the following result:
   
   ```
   127.0.0.1:6379> GRAPH.QUERY test "CREATE (a:A)-[:incs]->(:C), (a)-[:incs]->(:C) RETURN a"
   ```
   ```
   1) 1) "a"
   2) 1) 1) 1) 1) "id"
               2) (integer) 0
            2) 1) "labels"
               2) 1) "A"
            3) 1) "properties"
               2) (empty array)
   3) 1) "Labels added: 2"
      2) "Nodes created: 3"
      3) "Relationships created: 2"
      4) "Cached execution: 0"
      5) "Query internal execution time: 6.650151 milliseconds"
   ```
   ```
   127.0.0.1:6379> GRAPH.QUERY test "MATCH (a:A) WITH a OPTIONAL MATCH (a)-[:incs]->(c) RETURN c"
   ```
   ```
   1) 1) "c"
   2) 1) 1) 1) 1) "id"
               2) (integer) 1
            2) 1) "labels"
               2) 1) "C"
            3) 1) "properties"
               2) (empty array)
      2) 1) 1) 1) "id"
               2) (integer) 2
            2) 1) "labels"
               2) 1) "C"
            3) 1) "properties"
               2) (empty array)
   3) 1) "Cached execution: 0"
      2) "Query internal execution time: 6.739759 milliseconds"
   ```
   ```
   127.0.0.1:6379> GRAPH.QUERY test "MATCH (a:A) WITH a OPTIONAL MATCH (a)-[:incs]->(c)-[d:incs]-() WITH a,c,COUNT(d) AS deps WHERE deps=1 RETURN c,deps"
   ```
   ```
   1) 1) "c"
      2) "deps"
   2) 1) 1) 1) 1) "id"
               2) (integer) 2
            2) 1) "labels"
               2) 1) "C"
            3) 1) "properties"
               2) (empty array)
         2) (integer) 1
      2) 1) 1) 1) "id"
               2) (integer) 1
            2) 1) "labels"
               2) 1) "C"
            3) 1) "properties"
               2) (empty array)
         2) (integer) 1
   3) 1) "Cached execution: 0"
      2) "Query internal execution time: 2.113974 milliseconds"
   ```
   
   It seems you're right about the Neo4j behaviour though. I was able to get the behaviour I was expecting there (and on AGE) by splitting that clause in two:
   `MATCH (a:A) WITH a OPTIONAL MATCH (a)-[:incs]->(c) WITH c OPTIONAL MATCH (c)-[d:incs]-() WITH c,COUNT(d) AS deps WHERE deps=1 RETURN c,deps`
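   
   Why the split query behaves differently can be sketched in Go with a toy in-memory graph. This is only an illustration under the assumption that edge uniqueness applies within a single MATCH pattern and not across separate clauses. The `edge` type and `depsPerTarget` function are hypothetical and not part of AGE or its driver.
   
   ```go
   package main
   
   import "fmt"
   
   // edge is a directed relationship between two vertex IDs.
   type edge struct{ id, start, end int }
   
   // depsPerTarget mimics the two-clause query: the first OPTIONAL MATCH
   // finds each c reached from a, the second counts every edge touching c.
   // Because the clauses are separate, the edge used to reach c is counted too.
   func depsPerTarget(edges []edge, a int) map[int]int {
   	deps := make(map[int]int)
   	for _, e := range edges {
   		if e.start != a {
   			continue
   		}
   		c := e.end
   		for _, d := range edges {
   			if d.start == c || d.end == c {
   				deps[c]++
   			}
   		}
   	}
   	return deps
   }
   
   func main() {
   	// The graph from this issue: a(0)-[:incs]->c(1), a(0)-[:incs]->c(2).
   	edges := []edge{{id: 10, start: 0, end: 1}, {id: 11, start: 0, end: 2}}
   	deps := depsPerTarget(edges, 0)
   	for _, c := range []int{1, 2} {
   		if deps[c] == 1 { // WHERE deps=1
   			fmt.Println("c:", c, "deps:", deps[c])
   		}
   	}
   }
   ```
   
   Both C vertices come back with `deps` equal to 1, which is the result I was after.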
   
   Given that such a workaround exists, I'm now less concerned about the difference in behaviour from RedisGraph. Thanks for making the source of the failure clearer (whether limitation or intent), because that showed me how to approach a workaround.
   
   Still, the fact that this is supported in some Cypher implementations raises some interesting questions about your other observations.
   
   > It is not a limitation, it depends on the graph given and the "technical" definition of a path.
   
   This makes me wonder whether RedisGraph's implementation, which is based on sparse matrices, handles the evaluation of such structures more natively than whatever approach Neo4j is using...
   
   > But, just for a moment, think of the implications of allowing an edge to be reused. Since vertices can be reused, that would cause endless loops.
   
   Yeah, I can definitely see the possibility of an implementation running into cycles. However, intuition (and the fact that at least one implementation can evaluate it) suggests that the query specifies enough constraints to infer when to stop traversing.
   
   But especially given that a major vendor doesn't handle this, I'm now totally content if this isn't treated as a priority.
   
   Would you like me to close this issue?
   
   




[GitHub] [age] jrgemignani commented on issue #330: Invalid Result Returned for OPTIONAL MATCH ... WITH ...

Posted by GitBox <gi...@apache.org>.
jrgemignani commented on issue #330:
URL: https://github.com/apache/age/issues/330#issuecomment-1284292692

   If you feel the issue is resolved, then yes, please close it.




[GitHub] [age] jrgemignani commented on issue #330: Invalid Result Returned for OPTIONAL MATCH ... WITH ...

Posted by GitBox <gi...@apache.org>.
jrgemignani commented on issue #330:
URL: https://github.com/apache/age/issues/330#issuecomment-1282947056

   I modified my answer above slightly to make it a little more clear.
   
   The bidirectional edge just means any **available** edge connected to this node. It doesn't say to reuse an edge that has already been traversed.




[GitHub] [age] TropicalPenguin commented on issue #330: Invalid Result Returned for OPTIONAL MATCH ... WITH ...

Posted by GitBox <gi...@apache.org>.
TropicalPenguin commented on issue #330:
URL: https://github.com/apache/age/issues/330#issuecomment-1282950694

   While it is true that it doesn't _mandate_ reusing an edge, what I've come to expect from other implementations of Cypher is that matching also isn't restricted to following only untraversed edges. If it is in AGE, this is a definite limitation.




[GitHub] [age] TropicalPenguin commented on issue #330: Invalid Result Returned for OPTIONAL MATCH ... WITH ...

Posted by GitBox <gi...@apache.org>.
TropicalPenguin commented on issue #330:
URL: https://github.com/apache/age/issues/330#issuecomment-1282942228

   In other words, it is expected to reuse the edge so as to be able to count the number of such 'dependencies'




[GitHub] [age] TropicalPenguin commented on issue #330: Invalid Result Returned for OPTIONAL MATCH ... WITH ...

Posted by GitBox <gi...@apache.org>.
TropicalPenguin commented on issue #330:
URL: https://github.com/apache/age/issues/330#issuecomment-1282802614

   Sorry @jrgemignani; in trying to simplify the example (by using 0 as a placeholder for the actual ID), I hadn't realised I had just made the actual issue harder to reproduce 😅.
   
   I've updated the description to more easily reveal the problem.




[GitHub] [age] jrgemignani commented on issue #330: Invalid Result Returned for OPTIONAL MATCH ... WITH ...

Posted by GitBox <gi...@apache.org>.
jrgemignani commented on issue #330:
URL: https://github.com/apache/age/issues/330#issuecomment-1284290832

   I can't speak for RedisGraph, as I have never used it, so I can't really say why it chose that route for MATCH, especially since most traversal algorithms would preclude those results, as in AGE and Neo4j. What I can say is that our implementation follows Neo4j **and** graph definitions as closely as possible, where applicable.




[GitHub] [age] jrgemignani commented on issue #330: Invalid Result Returned for OPTIONAL MATCH ... WITH ...

Posted by GitBox <gi...@apache.org>.
jrgemignani commented on issue #330:
URL: https://github.com/apache/age/issues/330#issuecomment-1282992051

   It is not a limitation, it **depends** on the graph given and the "technical" **definition** of a path.
   
   But, just for a moment, think of the implications of allowing an edge to be reused. Since vertices can be reused, that would cause endless loops.
   
   Btw, as it is always possible that we got something incorrect or overlooked something, I ran your query on Neo4j to make sure that our implementation conforms with theirs and it also returned zero rows.




[GitHub] [age] jrgemignani commented on issue #330: Invalid Result Returned for OPTIONAL MATCH ... WITH ...

Posted by GitBox <gi...@apache.org>.
jrgemignani commented on issue #330:
URL: https://github.com/apache/age/issues/330#issuecomment-1281405834

   I would be wary of specifying a match on a specific ID number unless you have verified that the vertex actually has that ID.
   
   Note the following example -
   
   ```
   psql-11.5-5432-pgsql=# SELECT * from cypher('test', $$ MATCH (a:A) WITH a OPTIONAL MATCH (a)-[:incs]->(c) RETURN c, id(c) $$) as (c agtype, id agtype);
                                   c                                 |        id
   ------------------------------------------------------------------+------------------
    {"id": 1407374883553281, "label": "C", "properties": {}}::vertex | 1407374883553281
    {"id": 1407374883553282, "label": "C", "properties": {}}::vertex | 1407374883553282
   (2 rows)
   
   psql-11.5-5432-pgsql=# SELECT * from cypher('test', $$ MATCH (a:A) WHERE ID(a)=0 WITH a OPTIONAL MATCH (a)-[:incs]->(c) RETURN c, id(c) $$) as (c agtype, id agtype);
    c | id
   ---+----
   (0 rows)
   
   psql-11.5-5432-pgsql=#
   ```
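   
   A likely reason `ID(a)=0` matches nothing: as far as I understand the ID layout, an AGE graphid packs the label's ID into the upper 16 bits and a per-label sequence number into the lower 48 bits, so vertices do not get small IDs such as 0. The Go sketch below decodes the IDs from the output above under that assumption. `decodeGraphID` is a hypothetical helper, not part of AGE or its Go driver.
   
   ```go
   package main
   
   import "fmt"
   
   // entryBits is the assumed width of the per-label sequence number in a graphid.
   const entryBits = 48
   
   // decodeGraphID splits an ID into its assumed label ID and entry ID parts.
   // Hypothetical helper, for illustration only.
   func decodeGraphID(id int64) (labelID, entryID int64) {
   	return id >> entryBits, id & (1<<entryBits - 1)
   }
   
   func main() {
   	for _, id := range []int64{1407374883553281, 1407374883553282} {
   		labelID, entryID := decodeGraphID(id)
   		fmt.Printf("id=%d label_id=%d entry_id=%d\n", id, labelID, entryID)
   	}
   }
   ```
   
   Under that assumption both vertices decode to label ID 5, with entry IDs 1 and 2, so it is safer to read the ID back from the CREATE result than to hard-code one.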
   




[GitHub] [age] TropicalPenguin commented on issue #330: Invalid Result Returned for OPTIONAL MATCH ... WITH ...

Posted by GitBox <gi...@apache.org>.
TropicalPenguin commented on issue #330:
URL: https://github.com/apache/age/issues/330#issuecomment-1282939163

   > There isn't a path from a to c _and then back to a_, without reusing an edge. That's why it returns nothing.
   
   I've added emphasis to where this is wrong: the 'd' edge is a bidirectional match.




[GitHub] [age] jrgemignani commented on issue #330: Invalid Result Returned for OPTIONAL MATCH ... WITH ...

Posted by GitBox <gi...@apache.org>.
jrgemignani commented on issue #330:
URL: https://github.com/apache/age/issues/330#issuecomment-1282936299

   The following -
   
   `CREATE (a:A {foo:1})-[:incs]->(:C), (a)-[:incs]->(:C) RETURN a` 
   
   creates 2 **distinct** paths joined at node **a**. So, 3 **distinct** nodes and 2 **distinct** edges.
   
   ```
   psql-11.5-5432-pgsql=# SELECT * FROM cypher('test', $$ MATCH (u)-[e]->(v) RETURN e $$) AS (e agtype);
                                                                e
   ----------------------------------------------------------------------------------------------------------------------------
    {"id": 1125899906842658, "label": "incs", "end_id": 1407374883553314, "start_id": 844424930132002, "properties": {}}::edge
    {"id": 1125899906842659, "label": "incs", "end_id": 1407374883553315, "start_id": 844424930132002, "properties": {}}::edge
   (2 rows)
   ```
   
   `()<-[]-(a)-[]->()`
   
   The following -
   
   `MATCH (a:A) WHERE exists(a.foo) WITH a OPTIONAL MATCH (a)-[:incs]->(c) RETURN c`
   
   matches those 2 **distinct** paths joined at **a** and returns **c** - each path's endpoint.
   
   The following -
   
   `MATCH (a:A) WHERE exists(a.foo) WITH a OPTIONAL MATCH (a)-[:incs]->(c)-[d:incs]-() WITH a,c,COUNT(d) AS deps WHERE deps=1 RETURN c,deps`
   
   doesn't do what you may think it does. The result that you are expecting is based on there being a **c** and a **d** value matched. However, there isn't.
   
   ```
   psql-11.5-5432-pgsql=# SELECT * FROM cypher('test', $$ MATCH (a)-[:incs]->(c)-[d:incs]-() RETURN c,d $$) AS (c agtype, d agtype);
    c | d
   ---+---
   (0 rows)
   
   psql-11.5-5432-pgsql=#
   ```
   
   There isn't a path from **a** to **c** and then back to **a**, without reusing an edge. That's why it returns nothing. 
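   
   The effect of not reusing an edge can be reproduced with a toy matcher in Go. This is only a sketch of edge uniqueness within a single MATCH pattern, not AGE's actual implementation. The `edge` type and `countMatches` function are made up for illustration.
   
   ```go
   package main
   
   import "fmt"
   
   // edge is a directed relationship between two vertex IDs.
   type edge struct{ id, start, end int }
   
   // countMatches counts matches of (a)-[e]->(c)-[d]-() over the given edges.
   // When uniqueEdges is true, d must be a different edge from e.
   func countMatches(edges []edge, uniqueEdges bool) int {
   	n := 0
   	for _, e := range edges {
   		c := e.end
   		for _, d := range edges {
   			if d.start != c && d.end != c {
   				continue // d does not touch c in either direction
   			}
   			if uniqueEdges && d.id == e.id {
   				continue // d would reuse the edge already travelled
   			}
   			n++
   		}
   	}
   	return n
   }
   
   func main() {
   	// The graph from this issue: a(0)-[:incs]->c(1), a(0)-[:incs]->c(2).
   	edges := []edge{{id: 10, start: 0, end: 1}, {id: 11, start: 0, end: 2}}
   	fmt.Println("unique edges:", countMatches(edges, true))
   	fmt.Println("edges reusable:", countMatches(edges, false))
   }
   ```
   
   For this graph it prints 0 matches when edges must be unique and 2 when they may be reused, which lines up with the zero rows shown above.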
   
   Hopefully this is helpful.




[GitHub] [age] TropicalPenguin closed issue #330: Invalid Result Returned for OPTIONAL MATCH ... WITH ...

Posted by GitBox <gi...@apache.org>.
TropicalPenguin closed issue #330: Invalid Result Returned for OPTIONAL MATCH ... WITH ...
URL: https://github.com/apache/age/issues/330

