Posted to users@age.apache.org by Lewis John McGibbney <le...@apache.org> on 2022/01/02 19:11:40 UTC

Re: AGE Tutorial?

Thank you all VERY much for the resources and information. I’ll get back around to replying more thoroughly this coming week.
lewismc

On 2021/12/26 17:59:47 Nicholas Sorrell wrote:
> Lewis,
> 
> Joe has some great tips there. I would also point you in the direction of this repo I created: https://github.com/sorrell/age-compose
> 
> 
>   *   It loads a database with Northwind sample data, and then loads the graph from existing tables (so to answer your question, AGE is needed to ingest/populate the graph)
>   *   There is a Python function that loads data from existing tables here: https://github.com/sorrell/age-compose/blob/a8933bea36276d6e04fb814acb74aff282e7493d/docker-entrypoint/initdb.d/20-initgraph.sql#L10
>   *   That file, 20-initgraph.sql, also has example queries of how to create relationships
> 
> Alain is right that new tables are created for graph edges and nodes, as evidenced by this screenshot (where regular tables are in the 'public' schema and graph tables are in the 'northwind_graph' schema).
> [cid:68cba9fa-652b-44c8-a42a-f613f94ce8c6]
> 
> I'd like to hear more from Alain about the concern regarding this. I would defer to the core team on their vision here, but perhaps Alain's concern is that the data isn't as easy to index? Because the data is stored as 'agtype' rather than JSONB, the PG accessors/indexes aren't readily available on these tables/fields (as far as I know) and require explicit casts.
> 
> 
> --
> Nick Sorrell
> 513.288.1084
> https://cint.io
> 
> 
> ________________________________
> From: Joe Fagan <jo...@bitnine.net>
> Sent: Friday, December 24, 2021 10:32 AM
> To: users@age.apache.org <us...@age.apache.org>
> Subject: Re: AGE Tutorial?
> 
> Lewis
> To be clear, graph data must be ingested by PostgreSQL using the AGE extension - not by Postgres alone.
> Of course you can create data with an openCypher CREATE query, but this is not realistic for large data sets. See the example below.
> 
>   *   For very large files there is a function that can read CSV files, one for nodes and another for relationships, but it is not in the main branch yet. If you need it we can provide it, or if you can wait until mid-January we expect it to be publicly available and documented.
>   *   Alternatively, if your data is not private, you can send it to me; I will ingest it for you and send you back a pg_dump file containing your data, which you can load straight back into your PostgreSQL instance with the AGE extension.
> 
> I'm based in the UK so 8 hours ahead of you in CA. Feel free to call me if you wish
> Joe +447788148772
> 
> example friends graph
> SELECT CREATE_GRAPH('agc_graph');
> 
> SELECT * from cypher('agc_graph', $$
>         CREATE
>         (a :Male {Name: 'Adam', Age: 25}),
>         (b :Male {Name: 'Bert', Age: 26, Eyes: 'Brown'}),
>         (c :Male {Name: 'Carl', Age: 27}),
>         (d :Female {Name: 'Dee', Age: 25}),
>         (e :Female {Name: 'Eve', Age: 25}),
>         (f :Female {Name: 'Fia', Age: 28}),
>         (a)-[:follows { Duration: 5}]->(b),
>         (b)-[:follows]->(a),
>         (b)-[:follows]->(c),
>         (b)-[:follows]->(d),
>         (d)-[:follows]->(a),
>         (c)-[:follows]->(d),
>         (c)-[:follows]->(e),
>         (c)-[:follows]->(f),
>         (f)-[:follows]->(d),
>         (e)-[:follows]->(b)
> 
> $$) as (foo agtype);
> 
> On Fri, 24 Dec 2021 at 11:34, Joe Fagan <jo...@bitnine.net> wrote:
> Lewis
> 
> How are you using AGE? For example, if you are using Docker, there is a good README at https://hub.docker.com/repository/docker/joefagan/incubator-age
> 
> You can also find documentation at https://age.apache.org/docs/master/index.html
> 
> Are you already familiar with PostgreSQL and/or OpenCypher?
> 
> How much data do you wish to ingest? What format is the data currently in? How many nodes?
> 
> A contributor, Muhammad Shoaib (https://github.com/muhammadshoaib), has developed a very fast CSV ingest utility that is just about to be made public. We can make that available to you.
> 
> Thanks
> Joe
> 
> 
> 
> On Fri, 24 Dec 2021 at 01:12, lewis john mcgibbney <le...@apache.org> wrote:
> Hi users@,
> Does anyone know if a getting started AGE tutorial exists?
> What’s not clear to me is whether data is ingested directly into PostgreSQL or via AGE…?
> I’ve read the documentation regarding graph creation but I’ve not found documentation related to populating the graph or data ingestion generally.
> Can anyone shine some light on this for me?
> Thanks
> lewismc
> --
> http://home.apache.org/~lewismc/
> http://people.apache.org/keys/committer/lewismc
>
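
As a rough illustration of the point made above, that hand-written CREATE queries do not scale to larger data sets, here is a minimal Python sketch that turns CSV rows into batched cypher() statements of the same shape as Joe's example. It is not the CSV loader mentioned in the thread (which was not yet public). The function names (coerce, cypher_literal, node_clause, create_statements, rows_from_csv), the batch size, and the backslash escaping of quotes are all assumptions made for illustration. The script only builds SQL text and does not connect to a database.

```python
import csv
import io


def coerce(text):
    """Turn a CSV field into an int when possible, otherwise keep the string."""
    try:
        return int(text)
    except ValueError:
        return text


def cypher_literal(value):
    """Render a value as a Cypher literal: numbers bare, strings single-quoted.

    Assumption: backslash escaping of quotes is accepted by the target parser.
    """
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def node_clause(label, props):
    """Build one node pattern, e.g. (:Male {Name: 'Adam', Age: 25})."""
    body = ", ".join(f"{key}: {cypher_literal(val)}" for key, val in props.items())
    return f"(:{label} {{{body}}})"


def rows_from_csv(text):
    """Parse CSV text with a header row into a list of property dicts."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: coerce(val) for key, val in record.items()} for record in reader]


def create_statements(graph, label, rows, batch_size=1000):
    """Yield one SELECT ... cypher() statement per batch of rows.

    Hypothetical helper: the batch size is arbitrary, and values containing
    '$$' would break the dollar-quoted body (not handled here).
    """
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        clauses = ",\n    ".join(node_clause(label, row) for row in batch)
        yield (
            f"SELECT * FROM cypher('{graph}', $$\n"
            f"    CREATE\n    {clauses}\n"
            f"$$) AS (v agtype);"
        )


if __name__ == "__main__":
    sample = "Name,Age\nAdam,25\nBert,26\nCarl,27\n"
    for statement in create_statements("agc_graph", "Male", rows_from_csv(sample), batch_size=2):
        print(statement)
        print()
```

Each generated statement could then be run through any PostgreSQL client in a session where the AGE extension is loaded. Relationships would need a second pass that matches the two endpoint nodes before creating the edge, which this sketch leaves out.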

Re: AGE Tutorial?

Posted by "Alain (Gmail)" <al...@gmail.com>.

Hi Joe (and others),

(To answer your question of the last paragraph ;-) - Thanks for asking ...)

Disclaimer: I wouldn't consider myself a PostgreSQL expert (I know more 
about other databases) but I have been in the IT sector for more than 30 
years, in various roles (developer in a number of languages [including 
functional ones], software designer, trainer/courseware writer, 
architect of various kinds, including enterprise architect - what I now 
do most of the time, middleware team leader, project manager, 
application manager ...). This is not to expose my "service records" but 
just to give an idea of my experience ;-).

My main concern is that this approach won't scale very well - not 
necessarily in terms of raw performance [I just cannot say] but at least 
in terms of "overall manageability" of a database instance. If one 
considers a knowledge graph representing a complex domain with several 
hundred thousand nodes and possibly several thousand labels, I'm not 
sure that the database will be easy to tune or that it will be able to 
make the best possible use of the available computing resources. Now, 
it's more of a "gut feeling"; call it a "code smell" - or rather a "design 
smell" if you prefer ... I also seem to remember - but just from a 
glance at the physical database, and that was several months ago - 
that it led to (managed, for sure, but significant) duplication of 
information in the database instance (which I don't like much, either).

Yet, I like the idea of having this layer on top of a strong player like 
PostgreSQL. I have played quite a bit with Neo4j - which I like too - but 
I'm not sure Java is the right language in the long run for demanding 
applications. I would rather bet on Go or Rust (or C++) ... (I have also 
had a quick look at OrientDB and wasn't convinced; I haven't examined 
ArangoDB yet).

Of course, I'm interested in hearing and discussing with people of 
differing opinions. Any opinion or argument is welcome ...

Best regards,
Alain

