You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tinkerpop.apache.org by Mike Personick <mi...@systap.com> on 2015/12/19 18:51:35 UTC

Blazegraph / TP3

Still have a few kinds to work through but our basic Blazegraph / TP3
integration is more or less working at this point.

Everything went pretty smoothly - my only major frustration with the API is
the liberal use of Iterators, which are not AutoCloseable, and the built-in
assumption widespread throughout the Tinkerpop codebase that iterations
provided by the Graph implementation do not have any resource cleanup
requirements.

For example - Graph.edges() and Graph.vertices().  I am returning iterators
backed by a database connection, which very much needs to be released when
the iteration is done.  I've done the best I can to protect the caller, by
strengthening the return type to CloseableIterator (extends Iterator and
AutoCloseable), and I've even gone so far as to auto-close the iterator
inside next() if there is no hasNext().  But this does not fully protect
the caller against resource leaks if they are writing to the basic Graph
API, (e.g. they do not fully drain the iterator).

Humbly asking you to please take this into account in future version of the
API.

Thanks,
Mike

Re: Blazegraph / TP3

Posted by Stephen Mallette <sp...@gmail.com>.
We don't have much written on developing TraversalStrategy implementations
at the moment.  You might want to look at some of the existing ones in the
TinkerPop repo or checkout the work in Titan

https://github.com/thinkaurelius/titan/tree/titan11/titan-core/src/main/java/com/thinkaurelius/titan/graphdb/tinkerpop/optimize

to get a feel for how they are doing stuff.  I also think Pieter Martin did
some work in this area in sqlg - perhaps he can provide some pointers.

I think that the general feeling right now is that writing
TraversalStrategy implementations is harder than it should be at them
moment.  Feel free to ask questions and let us know your pain points as you
get into it.  We have some tickets out there to make improvements for
3.2.0, but are still trying to wrap our heads around how to make things
"easier".  A big step towards easier was introduced by marko with the
ability to "explain" a traversal:

http://tinkerpop.apache.org/docs/3.1.1-SNAPSHOT/reference/#explain-step

That's huge for debugging these things - I believe that Marko even found
bugs in existing strategies while he was building explain().

We'll look for the formal announcement of BlazeGraph's TP3 support before
we put it on the home page - looking forward to seeing that!



On Mon, Dec 21, 2015 at 10:31 AM, Mike Personick <mi...@systap.com> wrote:

> I commented on Ticket 790 and created a new one:
>
> https://issues.apache.org/jira/browse/TINKERPOP-1058
>
> -MP
>
> On Mon, Dec 21, 2015 at 8:13 AM, Mike Personick <mi...@systap.com> wrote:
>
> > Thanks for the response Stephen.  Yes I feel very strongly that any
> > open-ended result sets provided by the graph must be assumed to be backed
> > by a database resource that needs to be closed.  It's true for Blazegraph
> > and it will certainly be true for others.  When you ask for a tuple
> > iterator on a database you must close it, and that is what asking for an
> > iterator over edges, vertices, properties is.  Materializing the result
> set
> > fully into memory first to manage this problem is not a viable solution
> for
> > a large graph.  As I said, I hid a close() inside the last call to
> next(),
> > but this does not protect callers that do not fully drain the iterator.
> >
> > The EmptyVertexProperty and EmptyProperty is more of a nit - I need to
> > extend those to make certain tests in the StructureStandardSuite pass
> > without overriding them and changing their logic a bit.   Those tests do
> > things like this:
> >
> > assertEquals(VertexProperty.empty(), p);
> >
> > Where in my case p is a BlazeVertexProperty (so that it can strengthen
> the
> > return type on any iterators to CloseableIterator), but the equality
> method
> > on EmptyVertexProperty checks the class of other.  So I'd need to
> subclass
> > EmptyVertexProperty to make it work.  Of course if we changed the
> GraphAPI
> > to use CloseableIterators then this change would become unnecessary as I
> > could use the "stock" API instead.
> >
> > We'll probably release/announce in early/mid January.  I have the basics
> > working now but I'd like to dive in a little bit to optimizing the
> > implementation for traversals.  I'd like to combine traversal steps and
> > execute them as SPARQL queries instead.  Could you suggest a starting
> point
> > to look at for that?  I have MatchStep on my list - what else should be
> > high on the list?
> >
> > Thanks,
> > Mike
> >
> >
> >
> > On Mon, Dec 21, 2015 at 6:08 AM, Stephen Mallette <sp...@gmail.com>
> > wrote:
> >
> >> Mike - we've often discussed AutoCloseable on iterators (among other
> >> things).  To yell out a name, I believe Pieter Martin is in favor of
> that.
> >> We have a bit of a general discussion on the topic going here with the
> >> hope
> >> that we could make a change like that to get things straight for
> >> 3.2.0...please feel free to add your thoughts:
> >>
> >> https://issues.apache.org/jira/browse/TINKERPOP-790
> >>
> >> > EmptyVertexProperty and EmptyProperty are declared final
> >>
> >> I'm not sure if we had a specific reason for finalizing those.  We tend
> to
> >> finalize until someone yells and provides reasoning for removing the
> >> "final".  Please add a ticket in JIRA and if no one objects, we could
> >> probably remove that modifier for 3.1.1.
> >>
> >> Thanks for the update btw - glad to hear that BlazeGraph is basically
> >> working under TP3 now.  It would be nice to add BlazeGraph to the list
> of
> >> supporting graphs on the TinkerPop home page - will there be a general
> >> announcement of that anytime soon?
> >>
> >>
> >>
> >> On Sat, Dec 19, 2015 at 2:54 PM, Mike Personick <mi...@systap.com>
> wrote:
> >>
> >> > Another much more minor problem is that EmptyVertexProperty and
> >> > EmptyProperty are declared final.  Since the test suites specifically
> >> check
> >> > for instances of these classes, it makes it impossible for me to
> extend
> >> > them so that I can strengthen the return types on methods that return
> >> > VertexProperty and Property to Blaze-specific implementations of those
> >> > interfaces (ones that use CloseableIterator instead of Iterator).  I'm
> >> just
> >> > going to have to skip a few tests in the test suite to get around this
> >> for
> >> > now.
> >> >
> >> > On Sat, Dec 19, 2015 at 10:51 AM, Mike Personick <mi...@systap.com>
> >> wrote:
> >> >
> >> > > Still have a few kinds to work through but our basic Blazegraph /
> TP3
> >> > > integration is more or less working at this point.
> >> > >
> >> > > Everything went pretty smoothly - my only major frustration with the
> >> API
> >> > > is the liberal use of Iterators, which are not AutoCloseable, and
> the
> >> > > built-in assumption widespread throughout the Tinkerpop codebase
> that
> >> > > iterations provided by the Graph implementation do not have any
> >> resource
> >> > > cleanup requirements.
> >> > >
> >> > > For example - Graph.edges() and Graph.vertices().  I am returning
> >> > > iterators backed by a database connection, which very much needs to
> be
> >> > > released when the iteration is done.  I've done the best I can to
> >> protect
> >> > > the caller, by strengthening the return type to CloseableIterator
> >> > (extends
> >> > > Iterator and AutoCloseable), and I've even gone so far as to
> >> auto-close
> >> > the
> >> > > iterator inside next() if there is no hasNext().  But this does not
> >> fully
> >> > > protect the caller against resource leaks if they are writing to the
> >> > basic
> >> > > Graph API, (e.g. they do not fully drain the iterator).
> >> > >
> >> > > Humbly asking you to please take this into account in future version
> >> of
> >> > > the API.
> >> > >
> >> > > Thanks,
> >> > > Mike
> >> > >
> >> >
> >>
> >
> >
>

Re: Blazegraph / TP3

Posted by Mike Personick <mi...@systap.com>.
I commented on Ticket 790 and created a new one:

https://issues.apache.org/jira/browse/TINKERPOP-1058

-MP

On Mon, Dec 21, 2015 at 8:13 AM, Mike Personick <mi...@systap.com> wrote:

> Thanks for the response Stephen.  Yes I feel very strongly that any
> open-ended result sets provided by the graph must be assumed to be backed
> by a database resource that needs to be closed.  It's true for Blazegraph
> and it will certainly be true for others.  When you ask for a tuple
> iterator on a database you must close it, and that is what asking for an
> iterator over edges, vertices, properties is.  Materializing the result set
> fully into memory first to manage this problem is not a viable solution for
> a large graph.  As I said, I hid a close() inside the last call to next(),
> but this does not protect callers that do not fully drain the iterator.
>
> The EmptyVertexProperty and EmptyProperty is more of a nit - I need to
> extend those to make certain tests in the StructureStandardSuite pass
> without overriding them and changing their logic a bit.   Those tests do
> things like this:
>
> assertEquals(VertexProperty.empty(), p);
>
> Where in my case p is a BlazeVertexProperty (so that it can strengthen the
> return type on any iterators to CloseableIterator), but the equality method
> on EmptyVertexProperty checks the class of other.  So I'd need to subclass
> EmptyVertexProperty to make it work.  Of course if we changed the GraphAPI
> to use CloseableIterators then this change would become unnecessary as I
> could use the "stock" API instead.
>
> We'll probably release/announce in early/mid January.  I have the basics
> working now but I'd like to dive in a little bit to optimizing the
> implementation for traversals.  I'd like to combine traversal steps and
> execute them as SPARQL queries instead.  Could you suggest a starting point
> to look at for that?  I have MatchStep on my list - what else should be
> high on the list?
>
> Thanks,
> Mike
>
>
>
> On Mon, Dec 21, 2015 at 6:08 AM, Stephen Mallette <sp...@gmail.com>
> wrote:
>
>> Mike - we've often discussed AutoCloseable on iterators (among other
>> things).  To yell out a name, I believe Pieter Martin is in favor of that.
>> We have a bit of a general discussion on the topic going here with the
>> hope
>> that we could make a change like that to get things straight for
>> 3.2.0...please feel free to add your thoughts:
>>
>> https://issues.apache.org/jira/browse/TINKERPOP-790
>>
>> > EmptyVertexProperty and EmptyProperty are declared final
>>
>> I'm not sure if we had a specific reason for finalizing those.  We tend to
>> finalize until someone yells and provides reasoning for removing the
>> "final".  Please add a ticket in JIRA and if no one objects, we could
>> probably remove that modifier for 3.1.1.
>>
>> Thanks for the update btw - glad to hear that BlazeGraph is basically
>> working under TP3 now.  It would be nice to add BlazeGraph to the list of
>> supporting graphs on the TinkerPop home page - will there be a general
>> announcement of that anytime soon?
>>
>>
>>
>> On Sat, Dec 19, 2015 at 2:54 PM, Mike Personick <mi...@systap.com> wrote:
>>
>> > Another much more minor problem is that EmptyVertexProperty and
>> > EmptyProperty are declared final.  Since the test suites specifically
>> check
>> > for instances of these classes, it makes it impossible for me to extend
>> > them so that I can strengthen the return types on methods that return
>> > VertexProperty and Property to Blaze-specific implementations of those
>> > interfaces (ones that use CloseableIterator instead of Iterator).  I'm
>> just
>> > going to have to skip a few tests in the test suite to get around this
>> for
>> > now.
>> >
>> > On Sat, Dec 19, 2015 at 10:51 AM, Mike Personick <mi...@systap.com>
>> wrote:
>> >
>> > > Still have a few kinds to work through but our basic Blazegraph / TP3
>> > > integration is more or less working at this point.
>> > >
>> > > Everything went pretty smoothly - my only major frustration with the
>> API
>> > > is the liberal use of Iterators, which are not AutoCloseable, and the
>> > > built-in assumption widespread throughout the Tinkerpop codebase that
>> > > iterations provided by the Graph implementation do not have any
>> resource
>> > > cleanup requirements.
>> > >
>> > > For example - Graph.edges() and Graph.vertices().  I am returning
>> > > iterators backed by a database connection, which very much needs to be
>> > > released when the iteration is done.  I've done the best I can to
>> protect
>> > > the caller, by strengthening the return type to CloseableIterator
>> > (extends
>> > > Iterator and AutoCloseable), and I've even gone so far as to
>> auto-close
>> > the
>> > > iterator inside next() if there is no hasNext().  But this does not
>> fully
>> > > protect the caller against resource leaks if they are writing to the
>> > basic
>> > > Graph API, (e.g. they do not fully drain the iterator).
>> > >
>> > > Humbly asking you to please take this into account in future version
>> of
>> > > the API.
>> > >
>> > > Thanks,
>> > > Mike
>> > >
>> >
>>
>
>

Re: Blazegraph / TP3

Posted by Mike Personick <mi...@systap.com>.
Thanks for the response Stephen.  Yes I feel very strongly that any
open-ended result sets provided by the graph must be assumed to be backed
by a database resource that needs to be closed.  It's true for Blazegraph
and it will certainly be true for others.  When you ask for a tuple
iterator on a database you must close it, and that is what asking for an
iterator over edges, vertices, properties is.  Materializing the result set
fully into memory first to manage this problem is not a viable solution for
a large graph.  As I said, I hid a close() inside the last call to next(),
but this does not protect callers that do not fully drain the iterator.

The EmptyVertexProperty and EmptyProperty is more of a nit - I need to
extend those to make certain tests in the StructureStandardSuite pass
without overriding them and changing their logic a bit.   Those tests do
things like this:

assertEquals(VertexProperty.empty(), p);

Where in my case p is a BlazeVertexProperty (so that it can strengthen the
return type on any iterators to CloseableIterator), but the equality method
on EmptyVertexProperty checks the class of other.  So I'd need to subclass
EmptyVertexProperty to make it work.  Of course if we changed the GraphAPI
to use CloseableIterators then this change would become unnecessary as I
could use the "stock" API instead.

We'll probably release/announce in early/mid January.  I have the basics
working now but I'd like to dive in a little bit to optimizing the
implementation for traversals.  I'd like to combine traversal steps and
execute them as SPARQL queries instead.  Could you suggest a starting point
to look at for that?  I have MatchStep on my list - what else should be
high on the list?

Thanks,
Mike



On Mon, Dec 21, 2015 at 6:08 AM, Stephen Mallette <sp...@gmail.com>
wrote:

> Mike - we've often discussed AutoCloseable on iterators (among other
> things).  To yell out a name, I believe Pieter Martin is in favor of that.
> We have a bit of a general discussion on the topic going here with the hope
> that we could make a change like that to get things straight for
> 3.2.0...please feel free to add your thoughts:
>
> https://issues.apache.org/jira/browse/TINKERPOP-790
>
> > EmptyVertexProperty and EmptyProperty are declared final
>
> I'm not sure if we had a specific reason for finalizing those.  We tend to
> finalize until someone yells and provides reasoning for removing the
> "final".  Please add a ticket in JIRA and if no one objects, we could
> probably remove that modifier for 3.1.1.
>
> Thanks for the update btw - glad to hear that BlazeGraph is basically
> working under TP3 now.  It would be nice to add BlazeGraph to the list of
> supporting graphs on the TinkerPop home page - will there be a general
> announcement of that anytime soon?
>
>
>
> On Sat, Dec 19, 2015 at 2:54 PM, Mike Personick <mi...@systap.com> wrote:
>
> > Another much more minor problem is that EmptyVertexProperty and
> > EmptyProperty are declared final.  Since the test suites specifically
> check
> > for instances of these classes, it makes it impossible for me to extend
> > them so that I can strengthen the return types on methods that return
> > VertexProperty and Property to Blaze-specific implementations of those
> > interfaces (ones that use CloseableIterator instead of Iterator).  I'm
> just
> > going to have to skip a few tests in the test suite to get around this
> for
> > now.
> >
> > On Sat, Dec 19, 2015 at 10:51 AM, Mike Personick <mi...@systap.com>
> wrote:
> >
> > > Still have a few kinds to work through but our basic Blazegraph / TP3
> > > integration is more or less working at this point.
> > >
> > > Everything went pretty smoothly - my only major frustration with the
> API
> > > is the liberal use of Iterators, which are not AutoCloseable, and the
> > > built-in assumption widespread throughout the Tinkerpop codebase that
> > > iterations provided by the Graph implementation do not have any
> resource
> > > cleanup requirements.
> > >
> > > For example - Graph.edges() and Graph.vertices().  I am returning
> > > iterators backed by a database connection, which very much needs to be
> > > released when the iteration is done.  I've done the best I can to
> protect
> > > the caller, by strengthening the return type to CloseableIterator
> > (extends
> > > Iterator and AutoCloseable), and I've even gone so far as to auto-close
> > the
> > > iterator inside next() if there is no hasNext().  But this does not
> fully
> > > protect the caller against resource leaks if they are writing to the
> > basic
> > > Graph API, (e.g. they do not fully drain the iterator).
> > >
> > > Humbly asking you to please take this into account in future version of
> > > the API.
> > >
> > > Thanks,
> > > Mike
> > >
> >
>

Re: Blazegraph / TP3

Posted by Stephen Mallette <sp...@gmail.com>.
Mike - we've often discussed AutoCloseable on iterators (among other
things).  To yell out a name, I believe Pieter Martin is in favor of that.
We have a bit of a general discussion on the topic going here with the hope
that we could make a change like that to get things straight for
3.2.0...please feel free to add your thoughts:

https://issues.apache.org/jira/browse/TINKERPOP-790

> EmptyVertexProperty and EmptyProperty are declared final

I'm not sure if we had a specific reason for finalizing those.  We tend to
finalize until someone yells and provides reasoning for removing the
"final".  Please add a ticket in JIRA and if no one objects, we could
probably remove that modifier for 3.1.1.

Thanks for the update btw - glad to hear that BlazeGraph is basically
working under TP3 now.  It would be nice to add BlazeGraph to the list of
supporting graphs on the TinkerPop home page - will there be a general
announcement of that anytime soon?



On Sat, Dec 19, 2015 at 2:54 PM, Mike Personick <mi...@systap.com> wrote:

> Another much more minor problem is that EmptyVertexProperty and
> EmptyProperty are declared final.  Since the test suites specifically check
> for instances of these classes, it makes it impossible for me to extend
> them so that I can strengthen the return types on methods that return
> VertexProperty and Property to Blaze-specific implementations of those
> interfaces (ones that use CloseableIterator instead of Iterator).  I'm just
> going to have to skip a few tests in the test suite to get around this for
> now.
>
> On Sat, Dec 19, 2015 at 10:51 AM, Mike Personick <mi...@systap.com> wrote:
>
> > Still have a few kinds to work through but our basic Blazegraph / TP3
> > integration is more or less working at this point.
> >
> > Everything went pretty smoothly - my only major frustration with the API
> > is the liberal use of Iterators, which are not AutoCloseable, and the
> > built-in assumption widespread throughout the Tinkerpop codebase that
> > iterations provided by the Graph implementation do not have any resource
> > cleanup requirements.
> >
> > For example - Graph.edges() and Graph.vertices().  I am returning
> > iterators backed by a database connection, which very much needs to be
> > released when the iteration is done.  I've done the best I can to protect
> > the caller, by strengthening the return type to CloseableIterator
> (extends
> > Iterator and AutoCloseable), and I've even gone so far as to auto-close
> the
> > iterator inside next() if there is no hasNext().  But this does not fully
> > protect the caller against resource leaks if they are writing to the
> basic
> > Graph API, (e.g. they do not fully drain the iterator).
> >
> > Humbly asking you to please take this into account in future version of
> > the API.
> >
> > Thanks,
> > Mike
> >
>

Re: Blazegraph / TP3

Posted by Mike Personick <mi...@systap.com>.
Another much more minor problem is that EmptyVertexProperty and
EmptyProperty are declared final.  Since the test suites specifically check
for instances of these classes, it makes it impossible for me to extend
them so that I can strengthen the return types on methods that return
VertexProperty and Property to Blaze-specific implementations of those
interfaces (ones that use CloseableIterator instead of Iterator).  I'm just
going to have to skip a few tests in the test suite to get around this for
now.

On Sat, Dec 19, 2015 at 10:51 AM, Mike Personick <mi...@systap.com> wrote:

> Still have a few kinds to work through but our basic Blazegraph / TP3
> integration is more or less working at this point.
>
> Everything went pretty smoothly - my only major frustration with the API
> is the liberal use of Iterators, which are not AutoCloseable, and the
> built-in assumption widespread throughout the Tinkerpop codebase that
> iterations provided by the Graph implementation do not have any resource
> cleanup requirements.
>
> For example - Graph.edges() and Graph.vertices().  I am returning
> iterators backed by a database connection, which very much needs to be
> released when the iteration is done.  I've done the best I can to protect
> the caller, by strengthening the return type to CloseableIterator (extends
> Iterator and AutoCloseable), and I've even gone so far as to auto-close the
> iterator inside next() if there is no hasNext().  But this does not fully
> protect the caller against resource leaks if they are writing to the basic
> Graph API, (e.g. they do not fully drain the iterator).
>
> Humbly asking you to please take this into account in future version of
> the API.
>
> Thanks,
> Mike
>