Posted to dev@manifoldcf.apache.org by "Markus Schuch (JIRA)" <ji...@apache.org> on 2017/02/25 21:42:44 UTC

[jira] [Comment Edited] (CONNECTORS-1379) TinkerPop Output Connector

    [ https://issues.apache.org/jira/browse/CONNECTORS-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884388#comment-15884388 ] 

Markus Schuch edited comment on CONNECTORS-1379 at 2/25/17 9:42 PM:
--------------------------------------------------------------------

I've learned that the TinkerPop Java API is not built for batch-writing large graphs. Its focus is processing on top of existing graphs.

Writing graphs per se is possible, but only as an embedded (in-memory) graph that can be persisted to a file (GraphML, GraphSON, ...) when finished. That approach has no sessions or transactions, is hard to handle in a multi-threaded environment, and does not scale well.
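
To illustrate the embedded pattern described above, here is a minimal sketch that uses only the Java standard library. The class InMemoryGraphSketch and all of its methods are hypothetical and are not the TinkerPop API; the sketch only assumes that vertices carry string properties and that the target file format is GraphML. It shows why the approach is limited: the whole graph lives on the heap, writers have to be serialized by a lock because there are no transactions, and nothing is persisted until the final write.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an embedded graph. This is NOT the TinkerPop API.
class InMemoryGraphSketch {

    // Every vertex and its properties stay on the heap until the final write.
    private final Map<String, Map<String, String>> vertices = new LinkedHashMap<>();

    // No transactions here, so concurrent writers are serialized with a lock.
    public synchronized void addVertex(String id, String key, String value) {
        vertices.computeIfAbsent(id, k -> new LinkedHashMap<>()).put(key, value);
    }

    public synchronized int vertexCount() {
        return vertices.size();
    }

    // Serializes the whole graph at once as GraphML (vertices only, no edges).
    public synchronized String toGraphMl() {
        // GraphML requires a <key> declaration for every property name in use.
        Set<String> keys = new LinkedHashSet<>();
        for (Map<String, String> props : vertices.values()) {
            keys.addAll(props.keySet());
        }
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<graphml xmlns=\"http://graphml.graphdrawing.org/xmlns\">\n");
        for (String key : keys) {
            sb.append("  <key id=\"").append(escape(key))
              .append("\" for=\"node\" attr.name=\"").append(escape(key))
              .append("\" attr.type=\"string\"/>\n");
        }
        sb.append("  <graph id=\"G\" edgedefault=\"directed\">\n");
        for (Map.Entry<String, Map<String, String>> v : vertices.entrySet()) {
            sb.append("    <node id=\"").append(escape(v.getKey())).append("\">\n");
            for (Map.Entry<String, String> p : v.getValue().entrySet()) {
                sb.append("      <data key=\"").append(escape(p.getKey())).append("\">")
                  .append(escape(p.getValue())).append("</data>\n");
            }
            sb.append("    </node>\n");
        }
        sb.append("  </graph>\n</graphml>\n");
        return sb.toString();
    }

    // Nothing reaches disk before this call; a crash earlier loses the graph.
    public void persist(Path file) throws IOException {
        Files.write(file, toGraphMl().getBytes(StandardCharsets.UTF_8));
    }

    // Escapes the XML special characters; '&' must be replaced first.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) throws IOException {
        InMemoryGraphSketch graph = new InMemoryGraphSketch();
        graph.addVertex("doc-1", "title", "First document");
        graph.addVertex("doc-2", "title", "Second document");
        Path out = Files.createTempFile("graph", ".graphml");
        graph.persist(out);
        System.out.println("Wrote " + graph.vertexCount() + " vertices to " + out);
    }
}
```

In a crawler that emits documents from many worker threads, every addVertex call would contend for the same lock, and memory use would grow with the size of the crawl.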

There is no truly database-agnostic Java API for writing to remote graph databases (Neo4j, Titan, ...).
The gremlin-server only provides database-agnostic access and processing.

One would have to select a specific graph database and implement an output connector specifically for that.

We will explore this topic in a new direction: output to a large-scale document store (e.g. MongoDB) as a staging repository. Graph processing can then be done on top of that by importing documents from there.

So I am closing this ticket.


was (Author: schuchm):
I've learned that the TinkerPop Java API is not built for batch-writing large graphs. Its focus is processing on top of existing graphs.

Writing graphs per se is possible, but only as an embedded (in-memory) graph that can be persisted to a file (GraphML, GraphSON, ...) when finished. That approach has no sessions or transactions, is hard to handle in a multi-threaded environment, and does not scale well.

There is no truly database-agnostic Java API for writing to remote graph databases (Neo4j, Titan, ...).
The gremlin-server only provides database-agnostic access and processing.

One would have to select a specific graph database and implement an output connector specifically for that.

So I am closing this ticket.

> TinkerPop Output Connector
> --------------------------
>
>                 Key: CONNECTORS-1379
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1379
>             Project: ManifoldCF
>          Issue Type: New Feature
>            Reporter: Markus Schuch
>            Assignee: Markus Schuch
>
> An output connector for a https://tinkerpop.apache.org/ graph database.
> Emits {{RepositoryDocuments}} as vertices.
> *Background*
> We are experimenting with pushing docs to TinkerPop instead of pushing to Solr directly. This is very experimental.
> Development will start here: https://github.com/dbsystel/manifoldcf/tree/CONNECTORS-1397 and will then be committed to manifoldcf if something good comes out of it.


