You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tinkerpop.apache.org by David Bechberger <da...@bechberger.com> on 2020/09/17 01:25:41 UTC

[DISCUSS] Creating pattern steps to codify best practices

I've had a few on and off discussions with a few people here, so I wanted
to send this out to everyone for feedback.

What are people's thoughts on creating a new set of steps that codify
common Gremlin best practices?

I think there are several common Gremlin patterns where users would benefit
from the additional guidance that these codified steps represent.  The
first one I would recommend though is codifying the element existence
pattern into a single Gremlin step, something like:

mergeV(String, Map, Map)
     String - The vertex label
     Map (first) - The properties to match existing vertices on
     Map (second) - Any additional properties to set if a new vertex is
created (optional)
mergeE(String, Map, Map)
     String - The edge label
     Map (first) - The properties to match existing edge on
     Map (second) - Any additional properties to set if a new edge is
created (optional)

In each of these cases these steps would perform the same upsert
functionality as the element existence pattern.

Example:

g.V().has('person','name','stephen').
           fold().
           coalesce(unfold(),
                    addV('person').
                      property('name','stephen').
                      property('age',34))

would become:

g.mergeV('person', {'name': 'stephen'}, {'age', 34})

I think that this change would be a good addition to the language for
several reasons:

* This codifies the best practice for a specific action/recipe, which
reduces the chance that someone uses the pattern incorrectly
* Most complex Gremlin traversals are verbose.  Reducing the amount of code
that needs to be written and maintained allows for a better developer
experience.
* It will lower the bar of entry for a developer by making these actions
more discoverable.  The more we can help bring these patterns to the
forefront of the language via these pattern/meta steps the more we guide
users towards writing better Gremlin faster
* This allows DB vendors to optimize for this pattern

I know that this would likely be the first step in Gremlin that codifies a
pattern, so I'd like to get other's thoughts on this?

Dave

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by Stephen Mallette <sp...@gmail.com>.
I gave a bit more thought to this one today. I think the stream case is
still stuck:

https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31#upsert-with-stream

Gremlin can only produce a Map<String,E2> which precludes T as a key. If
you look at the select()/project() signatures none of them nicely shift to
specification of T as part of the existing String arguments. Obviously we
could switch the String types in select() to wide open Object arguments but
I'm not sure that's a great thing as we already have enough type safety
sorts of issues with Gremlin in Java. I suppose it is less of an issue in
other languages like Python. Anyway, would still like to get this "stream"
part of upsert right....

On Tue, Dec 15, 2020 at 8:13 AM Stephen Mallette <sp...@gmail.com>
wrote:

> We may yet need to keep upsertV/E() as a new step because without that
> addV() does get a bit odd as we would then lose the nice implicit match
> behavior of:
>
> // implicitly match on name/age without having to specify by()
> g.upsertV('person', [name: 'marko', age: 29])
>
> If we were to try to trigger that functionality with Map arguments but not
> the current method we'd likely fall deeper into kelvin's area of concerns.
> I've updated the gist again with the by() options and tweaked some other
> things:
>
> https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
>
>
>
> On Mon, Dec 14, 2020 at 3:20 PM Kelvin Lawrence <gf...@yahoo.com.invalid>
> wrote:
>
>> In general I like the idea of having a simpler way to express an upsert
>> without needing to know the coalesce pattern.
>> I am a little worried that if the addV and addE steps are overloaded to
>> to perform the upsert task that suddenly, steps that have always worked one
>> way, can now sometimes, do something different - not create but just find
>> an existing element. My concern is mainly around readability of queries and
>> a user knowing that a step can do certain extra things based on the
>> parameterization. This kind of goes a little against the Linux style
>> philosophy of each command doing one thing and doing it well.  Again, I'm
>> mostly raising this not because I am against the idea but wanting to make
>> sure it is very clear (maybe due to the presence of a by() modulator, that
>> a given addV/addE step is in "upsert mode". We have a few other cases where
>> steps work differently based on subtle parameterizations and people get
>> confused. Consider the case of.
>> "where(is(eq('A'))"  versus  "where(eq('A'))"
>> I also want to think a bit about how/if these can be nested. Today a see
>> a lot of nested coalesce steps in queries people write trying to do large
>> (complex) multi-part upserts.
>> Cheers,Kelvin
>> Kelvin R. Lawrence
>>
>>     On Friday, December 11, 2020, 12:36:45 PM CST, Stephen Mallette <
>> spmallette@gmail.com> wrote:
>>
>>  +1 to no new step. I haven't yet thought of a reason why this shouldn't
>> just be a variation on addV/E().
>>
>> On Fri, Dec 11, 2020 at 1:03 PM David Bechberger <da...@bechberger.com>
>> wrote:
>>
>> > I agree with Josh's description of what the upsertV() functionality is
>> > intended to be.
>> >
>> > I also fully support the simplification provided by the by() modulation
>> > that Stephen suggested, removing the second map.  I think that provides
>> a
>> > much cleaner and easier to comprehend syntax.
>> >
>> > With that agreement, I think this does beg the question of if this
>> should
>> > be a new step (upsertV()) or just an additional signature on the addV()
>> > step? I
>> >
>> > Stephen alluded to this on the dev list and after thinking about this a
>> bit
>> > I think that I am favor of not adding a step and just adding a new
>> > signature to the existing step (if possible).  Thoughts?
>> >
>> >
>> > Dave
>> >
>> > On Wed, Dec 9, 2020 at 4:33 AM Stephen Mallette <sp...@gmail.com>
>> > wrote:
>> >
>> > > Josh, thanks for your thoughts - some responses inline:
>> > >
>> > > On Tue, Dec 8, 2020 at 10:16 PM Josh Perryman <joshperryman@gmail.com
>> >
>> > > wrote:
>> > >
>> > > > I'll offer some thoughts. I'm seeing upsertV() as an idempotent
>> > > getOrCreate
>> > > > call which always returns a vertex with the label/property values
>> > > specified
>> > > > within the step. It's sort of a declarative pattern: "return this
>> > vertex
>> > > to
>> > > > me, find it if you can, create it if you must."
>> > > >
>> > >
>> > > I like this description - I've added it to the gist, though it's a
>> bit at
>> > > odds with Dave's previous post, so we'll consider it a temporary
>> addition
>> > > until he responds.
>> > >
>> > >
>> > > > On that account, I do like the simplification in 1. Repetition
>> > shouldn't
>> > > be
>> > > > necessary. In an ideal world, the engine should know the primary
>> > > > identifiers (name or id) and find/create the vertex based on them.
>> Any
>> > > > other included values will be "trued up" as well. But this may be a
>> > > bridge
>> > > > too far for TinkerPop since knowing identifiers may require a
>> specified
>> > > > schema. I'd prefer to omit the third input, but it might be
>> necessary
>> > to
>> > > > keep it so that the second input can be for the matching use case.
>> > > >
>> > >
>> > > In my most recent post on gremlin-users I think I came up with a nice
>> way
>> > > to get rid of the second Map. One Map that forms the full list of
>> > > properties for upserting is easier than partitioning two Maps that
>> > > essentially merge together. I imagine it's unlikely that application
>> code
>> > > will have that separation naturally so users will have the added step
>> of
>> > > trying to separate their data into searchable vs "just data". Getting
>> us
>> > to
>> > > one Map argument will simplify APIs for us and reduce complexity to
>> > users.
>> > > Here is what I'd proposed for those not following over there:
>> > >
>> > > // match on name and age (or perhaps whatever the underlying graph
>> system
>> > > thinks is best?)
>> > > g.upsertV('person', [name:'marko',age:29])
>> > >
>> > > // match on name only
>> > > g.upsertV('person', [name:'marko',age:29]).by('name')
>> > >
>> > > // explicitly match on name and age
>> > > g.upsertV('person', [name:'marko',age:29]).
>> > >  by('name').by('age')
>> > >
>> > > // match on id only
>> > > g.upsertV('person', [(T.id): 100, name:'marko',age:29]).by(T.id)
>> > >
>> > > // match on whatever the by(Traversal) predicate defines
>> > > g.upsertV('person', [name:'marko',age:29]).
>> > >  by(has('name', 'marko'))
>> > >
>> > > // match on id, then update age
>> > > g.upsertV('person', [(T.id): 100, name:'marko']).by(T.id).
>> > >  property('age',29)
>> > >
>> > > With this model, we get one Map argument that represents the complete
>> > > property set to be added/updated to the graph and the user can hint on
>> > what
>> > > key they wish to match on using by() where that sort of step
>> modulation
>> > > should be a well understood and familiar concept in Gremlin at this
>> > point.
>> > >
>> > > So that means I think 2 should always match or update the additional
>> > > > values. Again, we're specifying the expected result and letting the
>> > > engine
>> > > > figure out best how to return that results and appropriately
>> maintain
>> > > > state.
>> > > >
>> > >
>> > > I again like this description, but we'll see what Dave's thoughts are
>> > since
>> > > he's a bit behind on the threads at this point I think.
>> > >
>> > >
>> > > > I'm also presuming that anything not included as inputs to the
>> > upsertV()
>> > > > step are then to be handled by following steps. I'm hoping that is a
>> > > > sufficient approach for addressing the multi/meta property use cases
>> > > > brought up in 3.
>> > > >
>> > >
>> > > yeah................it needs more thought. I spent more time thinking
>> on
>> > > this issue yesterday than I have for all the previous posts combined
>> and
>> > I
>> > > think it yielded something good in that revised syntax. It's going to
>> > take
>> > > more of that kind of elbow grease to dig into these lesser use cases
>> to
>> > > make sure we aren't coding ourselves into corners.
>> > >
>> > >
>> > > > I do like the idea of using modulators (with(), by()) for more
>> > > > sophisticated usage and advanced use cases. Also, the streaming
>> > examples
>> > > > are quite elegant allowing for a helpful separation of data and
>> logic.
>> > > >
>> > >
>> > > cool - hope you like the revised syntax I posted then. :)
>> > >
>> > >
>> > > > That's my humble take. This is a very welcome addition to the
>> language
>> > > and
>> > > > I appreciate the thoughtful & collaborative approach to the design
>> > > > considerations.
>> > > >
>> > >
>> > > Thanks again and please keep the thoughts coming. Lots of other
>> > interesting
>> > > design discussions seem to be brewing.
>> > >
>> > >
>> > > >
>> > > > Josh
>> > > >
>> > > > On Tue, Dec 8, 2020 at 8:57 AM Stephen Mallette <
>> spmallette@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > I started a expanded this discussion to gremlin-users for a wider
>> > > > audience
>> > > > > and the thread is starting to grow:
>> > > > >
>> > > > >
>> > https://groups.google.com/g/gremlin-users/c/QBmiOUkA0iI/m/pj5Ukiq6AAAJ
>> > > > >
>> > > > > I guess we'll need to summarize that discussion back here now....
>> > > > >
>> > > > > I did have some more thoughts to hang out there and figured that I
>> > > > wouldn't
>> > > > > convolute the discussion on gremlin-users with it so I will
>> continue
>> > > the
>> > > > > discussion here.
>> > > > >
>> > > > > 1, The very first couple of examples seem wrong (or at least not
>> best
>> > > > > demonstrating the usage):
>> > > > >
>> > > > > g.upsertV('person', [name: 'marko'],
>> > > > >                    [name: 'marko', age: 29])
>> > > > > g.upsertV('person', [(T.id): 1],
>> > > > >                    [(T.id): 1, name: 'Marko'])
>> > > > >
>> > > > > should instead be:
>> > > > >
>> > > > > g.upsertV('person', [name: 'marko'],
>> > > > >                    [age: 29])
>> > > > > g.upsertV('person', [(T.id): 1],
>> > > > >                    [name: 'Marko'])
>> > > > >
>> > > > > 2. I can't recall if we made this clear anywhere but in situations
>> > > where
>> > > > we
>> > > > > "get" rather than "create" do the additional properties act in an
>> > > update
>> > > > > fashion to the element that was found? I think I've been working
>> on
>> > the
>> > > > > assumption that it would, though perhaps that is not always
>> > desirable?
>> > > > >
>> > > > > 3. We really never settled up how to deal with
>> multi/meta-properties.
>> > > > That
>> > > > > story should be clear so that when we document upsert() we include
>> > the
>> > > > > approaches for the fallback patterns that don't meet the 90% of
>> use
>> > > cases
>> > > > > we are targeting and I sense that we're saying that
>> > > meta/multi-properties
>> > > > > don't fit in that bucket. So with that in mind, I don't think that
>> > the
>> > > > > following works for metaproperties:
>> > > > >
>> > > > > g.upsertV('person', [(T.id): 1],
>> > > > >                    [name:[acl: 'public'])
>> > > > >
>> > > > > as it doesn't let us set the value for the "name" just the pairs
>> for
>> > > the
>> > > > > meta-properties. I guess a user would have to fall back to:
>> > > > >
>> > > > > g.upsertV('person', [(T.id): 1]).
>> > > > >  property('name','marko','acl','public')
>> > > > >
>> > > > > // or use the additional properties syntax
>> > > > > g.upsertV('person', [(T.id): 1],[name:'marko']).
>> > > > >  properties('name').property('acl','public')
>> > > > >
>> > > > > // or if there were multi-properties then maybe...
>> > > > > g.upsertV('person', [(T.id): 1],[name:'marko']).
>> > > > >  properties('name').hasValue('marko').property('acl','public')
>> > > > >
>> > > > > As for multi-properties, I dont think we should assume that a List
>> > > object
>> > > > > should be interpreted by Gremlin as a multi-property. Perhaps we
>> just
>> > > > rely
>> > > > > on the underlying graph to properly deal with that given a schema
>> or
>> > > the
>> > > > > user falls back to:
>> > > > >
>> > > > > // this ends up as however the graph deals with a List object
>> > > > > g.upsertV('person', [(T.id): 1], [lang: ['java', 'scala', 'java'])
>> > > > >
>> > > > > // this is explicit
>> > > > > g.upsertV('person', [(T.id): 1]).
>> > > > >  property(list, 'lang', 'java').
>> > > > >  property(list, 'lang, 'scala').
>> > > > >  property(list, 'lang', 'java')
>> > > > >
>> > > > > If that makes sense to everyone, I will update the gist.
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Tue, Oct 27, 2020 at 11:51 PM David Bechberger <
>> > dave@bechberger.com
>> > > >
>> > > > > wrote:
>> > > > >
>> > > > > > Hello Stephen,
>> > > > > >
>> > > > > > Thanks for making that gist, its much easier to follow along
>> with
>> > the
>> > > > > > proposed syntax there.  To your specific comments:
>> > > > > >
>> > > > > > #1 - My only worry with the term upsert is that it is not as
>> > widely a
>> > > > > used
>> > > > > > term for this sort of pattern as "Merge" (e.g. SQL, Cypher).
>> > > However I
>> > > > > > don't have a strong opinion on this, so I am fine with either.
>> > > > > > #2 - My only real objective here is to make sure that we make
>> the
>> > > > 80-90%
>> > > > > > case easy and straightforward.  I think that having the fallback
>> > > option
>> > > > > of
>> > > > > > using the current syntax for any complicated edge cases should
>> be
>> > > > > > considered here as well. I'd appreciate your thoughts here as
>> these
>> > > are
>> > > > > > good points you bring up that definitely fall into the 80-90%
>> use
>> > > case.
>> > > > > > #3 - Those points make sense to me, not sure I have anything
>> > further
>> > > to
>> > > > > add
>> > > > > > #4 - I don't think I would expect to have the values extracted
>> from
>> > > the
>> > > > > > traversal filters but I'd be interested in other opinions on
>> that.
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Dave
>> > > > > >
>> > > > > > On Thu, Oct 15, 2020 at 5:30 AM Stephen Mallette <
>> > > spmallette@gmail.com
>> > > > >
>> > > > > > wrote:
>> > > > > >
>> > > > > > > It's been a couple weeks since we moved this thread so I went
>> > back
>> > > > > > through
>> > > > > > > it and refreshed my mind with what's there. Dave, nice job
>> > putting
>> > > > all
>> > > > > > > those examples together. I've recollected them all together
>> and
>> > > have
>> > > > > just
>> > > > > > > been reviewing them in a more formatted style with proper
>> Groovy
>> > > > syntax
>> > > > > > > than what can be accomplished in this mailing list. Anyone who
>> > > cares
>> > > > to
>> > > > > > > review in that form can do so here (i've pasted its markdown
>> > > contents
>> > > > > at
>> > > > > > > the bottom of this post as well):
>> > > > > > >
>> > > > > > >
>> > > https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
>> > > > > > >
>> > > > > > > In this light, I have the following comments and thoughts:
>> > > > > > >
>> > > > > > > 1. I feel more inclined toward saving "merge" for a more
>> > > generalized
>> > > > > step
>> > > > > > > of that name and have thus renamed the examples to use
>> "upsert".
>> > > > > > >
>> > > > > > > 2. I still don't quite feel comfortable with
>> > meta/multi-properties
>> > > > > > > examples. Perhaps you could explain further as it's possible
>> I'm
>> > > not
>> > > > > > > thinking things through properly. For meta-properties the
>> example
>> > > is:
>> > > > > > >
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, name:[first: Marko', last:
>> 'R'])
>> > > > > > >
>> > > > > > > So I see that "name" takes a Map here, but how does Gremlin
>> know
>> > if
>> > > > > that
>> > > > > > > means "meta-property" or "Map value" and if the former, then
>> how
>> > do
>> > > > you
>> > > > > > set
>> > > > > > > the value of "name"?  My question is similar to the
>> > multi-property
>> > > > > > examples
>> > > > > > > - using this one:
>> > > > > > >
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1,lang: ['java', 'scala'])
>> > > > > > >
>> > > > > > > How does Gremlin know if the user means that "lang" should be
>> a
>> > > > > > > "multi-property" with Cardinality.list (or Cardinality.set for
>> > that
>> > > > > > matter)
>> > > > > > > or a value of Cardinality.single with a List/Array value? I
>> can
>> > > > perhaps
>> > > > > > > suggest some solutions here but wanted to make sure I wasn't
>> just
>> > > > > > > misunderstanding something.
>> > > > > > >
>> > > > > > > 3. The "upsert with stream" example is:
>> > > > > > >
>> > > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
>> > > > > > >          [(id): 2, (label): 'person', name: 'josh']).
>> > > > > > >  upsertV(values(label), valueMap(id), valueMap())
>> > > > > > >
>> > > > > > > That would establish an API of upsertV(Traversal, Traversal,
>> > > > Traversal)
>> > > > > > > which would enable your final thought of your last post with:
>> > > > > > >
>> > > > > > > g.V().upsertV('person',
>> > __.has('person','age',gt(29)).has('state',
>> > > > > 'NY'),
>> > > > > > >                        [active: true])
>> > > > > > >
>> > > > > > > The open-ended nature of that is neat but comes with some
>> > > complexity
>> > > > > with
>> > > > > > > steps that traditionally take an "open" Traversal (and this
>> step
>> > > now
>> > > > > has
>> > > > > > up
>> > > > > > > to three of them) - perhaps, that's our own fault in some
>> ways.
>> > For
>> > > > the
>> > > > > > > "stream" concept, values()/valueMap() won't immediately work
>> that
>> > > way
>> > > > > as
>> > > > > > > they only apply to Element...given today's Gremlin semantics,
>> I
>> > > think
>> > > > > > we'd
>> > > > > > > re-write that as:
>> > > > > > >
>> > > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
>> > > > > > >          [(id): 2, (label): 'person', name: 'josh']).
>> > > > > > >  upsertV(select(label), select(id), select(id,label,'name'))
>> > > > > > >
>> > > > > > > Though, even that doesn't quite work because you can't
>> select(T)
>> > > > right
>> > > > > > now
>> > > > > > > which means you'd need to go with:
>> > > > > > >
>> > > > > > > g.inject([id: 1, label: 'person', name: 'marko'],
>> > > > > > >          [id: 2, label: 'person', name: 'josh']).
>> > > > > > >  upsertV(select('label'), select('id'),
>> > > select('id','label','name'))
>> > > > > > >
>> > > > > > > not terrible in a way and perhaps we could fix select(T) -
>> can't
>> > > > > remember
>> > > > > > > why that isn't allowed except that select() operates over
>> > multiple
>> > > > > scopes
>> > > > > > > and perhaps trying T through those scopes causes trouble.
>> > > > > > >
>> > > > > > > 4. Continuing with the use of Traversal as arguments, let's go
>> > back
>> > > > to:
>> > > > > > >
>> > > > > > > g.V().upsertV('person',
>> > __.has('person','age',gt(29)).has('state',
>> > > > > 'NY'),
>> > > > > > >                        [active: true])
>> > > > > > >
>> > > > > > > I originally had it in my mind to prefer this approach over a
>> Map
>> > > for
>> > > > > the
>> > > > > > > search as it provides a great deal of flexibility and fits the
>> > > common
>> > > > > > > filtering patterns that users follow really well. I suppose
>> the
>> > > > > question
>> > > > > > is
>> > > > > > > what to do in the situation of "create". Would we extract all
>> the
>> > > > > strict
>> > > > > > > equality filter values from the initial has(), thus, in the
>> > > example,
>> > > > > > > "state" but not "age", and create a vertex that has "state=NY,
>> > > > > > > active=true"? That is possible with how things are designed in
>> > > > Gremlin
>> > > > > > > today I think.
>> > > > > > >
>> > > > > > > This syntax does create some conflict with the subject of 3
>> and
>> > > > streams
>> > > > > > > because the traverser in the streams case is the incoming Map
>> but
>> > > for
>> > > > > > this
>> > > > > > > context it's meant as a filter of V(). Getting too whacky?
>> > > > > > >
>> > > > > > > 5. I like the idea of using a side-effect to capture whether
>> the
>> > > > > element
>> > > > > > > was created or not. that makes sense. I think that can work.
>> > > > > > >
>> > > > > > > ==============================
>> > > > > > > = gist contents
>> > > > > > > ==============================
>> > > > > > >
>> > > > > > > # API
>> > > > > > >
>> > > > > > > ```java
>> > > > > > > upsertV(String label, Map matchOrCreateProperties)
>> > > > > > > upsertV(String label, Map matchOrCreateProperties, Map
>> > > > > > > additionalProperties)
>> > > > > > > upsertV(Traversal label, Traversal matchOrCreateProperties)
>> > > > > > > upsertV(Traversal label, Traversal matchOrCreateProperties,
>> > > Traversal
>> > > > > > > additionalProperties)
>> > > > > > > upsertV(...).
>> > > > > > >  with(WithOptions.sideEffectLabel, 'a')
>> > > > > > >
>> > > > > > > upsertE(String label, Map matchOrCreateProperties)
>> > > > > > > upsertE(Traversal label, Traversal matchOrCreateProperties)
>> > > > > > > upsertE(String label, Map matchOrCreateProperties, Map
>> > > > > > > additionalProperties)
>> > > > > > > upsertE(Traversal label, Traversal matchOrCreateProperties,
>> > > Traversal
>> > > > > > > additionalProperties)
>> > > > > > > upsertE(...).
>> > > > > > >  from(Vertex incidentOut).to(Vertex incidentIn)
>> > > > > > >  with(WithOptions.sideEffectLabel, 'a')
>> > > > > > > ```
>> > > > > > >
>> > > > > > > # Examples
>> > > > > > >
>> > > > > > > ## upsert
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [name: 'marko'],
>> > > > > > >                    [name: 'marko', age: 29])
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with id
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, name: 'Marko'])
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with Meta Properties
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, name:[first: Marko', last:
>> 'R'])
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with Multi Properties
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1,lang: ['java', 'scala'])
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with Single Cardinality
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, lang: 'java')
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with List Cardinality
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, lang: ['java', 'scala',
>> 'java'])
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with Set Cardinality
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, lang: ['java','scala'])
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with List Cardinality - and add value to list
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, lang: ['java','scala','java']).
>> > > > > > >  property(Cardinality.list, 'lang', 'java')
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with stream
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > // doesn't work with today's Gremlin semantics
>> > > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
>> > > > > > >          [(id): 2, (label): 'person', name: 'josh']).
>> > > > > > >  upsertV(values(label), valueMap(id), valueMap())
>> > > > > > >
>> > > > > > > // the following would be more in line with what's possible
>> with
>> > > > > existing
>> > > > > > > Gremlin semantics:
>> > > > > > > g.inject([id: 1, label: 'person', name: 'marko'],
>> > > > > > >          [id: 2, label: 'person', name: 'josh']).
>> > > > > > >  upsertV(select('label'), select('id'),
>> > > select('id','label','name'))
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with reporting added or updated side effect
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [name: 'marko'],
>> > > > > > >                    [name: 'marko', age: 29]).
>> > > > > > >    with(WithOptions.sideEffectLabel, 'a').
>> > > > > > >  project('vertex', 'added').
>> > > > > > >    by().
>> > > > > > >    by(cap('a'))
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert Edge assuming a self-relation of "marko"
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.V().has('person','name','marko').
>> > > > > > >  upsertE( 'self', [weight:0.5])
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert Edge with incident vertex checks
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertE('knows', [weight:0.5]).
>> > > > > > >    from(V(1)).to(V(2))
>> > > > > > > g.V().has('person','name','marko').
>> > > > > > >  upsertE( 'knows', [weight:0.5]).
>> > > > > > >    to(V(2))
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert using a Traversal for the match
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.V().upsertV('person',
>> > __.has('person','age',gt(29)).has('state',
>> > > > > 'NY'),
>> > > > > > >                        [active: true])
>> > > > > > > ```
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > On Mon, Sep 28, 2020 at 6:49 PM David Bechberger <
>> > > > dave@bechberger.com>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > So, I've been doing some additional thinking about the ways
>> > that
>> > > > this
>> > > > > > > could
>> > > > > > > > work based on the comments below and have put my comments
>> > inline.
>> > > > > > > >
>> > > > > > > > Dave
>> > > > > > > >
>> > > > > > > > On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <
>> > > > > spmallette@gmail.com
>> > > > > > >
>> > > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > I added some thoughts inline below:
>> > > > > > > > >
>> > > > > > > > > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <
>> > > > > > dave@bechberger.com>
>> > > > > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > > Thanks for the detailed comments Stephen.  I have
>> addressed
>> > > > them
>> > > > > > > inline
>> > > > > > > > > > below.
>> > > > > > > > > >
>> > > > > > > > > > I did read the proposal from earlier and I think that we
>> > are
>> > > in
>> > > > > > close
>> > > > > > > > > > agreement with what we are trying to accomplish.  I also
>> > > fully
>> > > > > > > support
>> > > > > > > > > > Josh's comment on providing a mechanism for submitting a
>> > map
>> > > of
>> > > > > > > > > properties
>> > > > > > > > > > as manually unrolling this all right now leads to a lot
>> of
>> > > > > > potential
>> > > > > > > > for
>> > > > > > > > > > error and a long messy traversal.
>> > > > > > > > > >
>> > > > > > > > > > I'm looking forward to this discussion on how to merge
>> > these
>> > > > two
>> > > > > > > > > proposals.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > 1. How would multi/meta-properties fit into the API
>> you've
>> > > > > > proposed?
>> > > > > > > > > >
>> > > > > > > > > > My first thought here is that multi-properties would be
>> > > > > represented
>> > > > > > > as
>> > > > > > > > > > lists in the map, e.g.
>> > > > > > > > > >
>> > > > > > > > > > {names: ['Dave', 'David']}
>> > > > > > > > > >
>> > > > > > > > > > and meta-properties would be represented as maps in the
>> > maps,
>> > > > > e.g.
>> > > > > > > > > >
>> > > > > > > > > > {name: {first: 'Dave', last: 'Bechberger'}}
>> > > > > > > > > >
>> > > > > > > > > > I can't say I've thought through all the implications of
>> > this
>> > > > > > though
>> > > > > > > so
>> > > > > > > > > it
>> > > > > > > > > > is an area we would need to explore.
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > The first implication that comes to mind is that it makes
>> the
>> > > > > > > assumption
>> > > > > > > > > the user wants multi/meta-properties as opposed to a
>> single
>> > > > > > cardinality
>> > > > > > > > of
>> > > > > > > > > List in the first case and a Map as a property value in
>> the
>> > > > second
>> > > > > > > case.
>> > > > > > > > I
>> > > > > > > > > suppose that graphs with a schema could resolve those
>> > > assumptions
>> > > > > but
>> > > > > > > > > graphs that are schemaless would have a problem. The issue
>> > > could
>> > > > be
>> > > > > > > > > resolved by specialized configuration of "g" or per
>> merge()
>> > > step
>> > > > > > using
>> > > > > > > a
>> > > > > > > > > with() modulator I suppose but that goes into a yet
>> another
>> > > level
>> > > > > of
>> > > > > > > > > implications to consider. I've often wondered if the start
>> > > point
>> > > > > for
>> > > > > > > > > getting types/schema into TP3 without a full rewrite
>> would be
>> > > in
>> > > > > this
>> > > > > > > > form
>> > > > > > > > > where Gremlin would be given hints as to what to expect
>> as to
>> > > the
>> > > > > > types
>> > > > > > > > of
>> > > > > > > > > data it might encounter while traversing. Anyway, I'd be
>> > > hesitant
>> > > > > to
>> > > > > > go
>> > > > > > > > > down paths that don't account for multi/metaproperties
>> well.
>> > > > They
>> > > > > > are
>> > > > > > > > > first class citizens in TP3 (with those hoping for
>> extension
>> > of
>> > > > at
>> > > > > > > least
>> > > > > > > > > multiproperties to edges) and while I find them a constant
>> > > > > annoyance
>> > > > > > > for
>> > > > > > > > so
>> > > > > > > > > many reasons, we're kinda stuck with them.
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > > I agree we need to account for multi/meta properties as 1st
>> > class
>> > > > > > > > citizens.  This is my current thinking on the syntax for the
>> > > > > situations
>> > > > > > > we
>> > > > > > > > have laid out so far:
>> > > > > > > >
>> > > > > > > > *Merge*
>> > > > > > > > g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age':
>> > 29})
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1,
>> 'name':
>> > > > > 'Marko'})
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with Meta Properties* g.mergeV('name', {T.id: 1},
>> {T.id:
>> > > 1,
>> > > > > > > 'name':
>> > > > > > > > {'first': Marko', 'last': 'R'})
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with Multi Properties * g.mergeV('name', {T.id: 1},
>> > {T.id:
>> > > > 1,
>> > > > > > > > 'lang': ['java', 'scala'])
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with Single Cardinality* g.mergeV('name', {T.id: 1},
>> > > {T.id:
>> > > > 1,
>> > > > > > > > 'lang': 'java')
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with List Cardinality* g.mergeV('name', {T.id: 1},
>> > {T.id:
>> > > 1,
>> > > > > > > 'lang':
>> > > > > > > > ['java', 'scala', 'java'])
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with Set Cardinality * g.mergeV('name', {T.id: 1},
>> > {T.id:
>> > > 1,
>> > > > > > > 'lang':
>> > > > > > > > ['java', 'scala'])
>> > > > > > > >
>> > > > > > > > Since in a mergeV() scenario we are only ever adding
>> whatever
>> > > > values
>> > > > > > are
>> > > > > > > > passed in there would be no need to specify the cardinality
>> of
>> > > the
>> > > > > > > property
>> > > > > > > > being added.  If they wanted to add a value to an existing
>> > > property
>> > > > > > then
>> > > > > > > > the current property() method would still be available on
>> the
>> > > > output
>> > > > > > > > traversal. e.g.
>> > > > > > > > g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java',
>> 'scala',
>> > > > > > > > 'java']).property(Cardinality.list, 'lang, 'java')
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with stream * g.inject([{id: 1, label: 'person',
>> 'name':
>> > > > > > 'marko'},
>> > > > > > > > {id: 2, label: 'person', 'name': 'josh'}]).
>> > > > > > > >  mergeV(values('label'), valueMap('id'), valueMap())
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with reporting added or updated side effect*
>> > > > g.mergeV('name',
>> > > > > > > > {'name': 'marko'}, {'name': 'marko', 'age': 29}).
>> > > > > > > >  with(WithOptions.sideEffectLabel, 'a').
>> > > > > > > >  project('vertex', 'added').
>> > > > > > > >    by().
>> > > > > > > >    by(cap('a'))
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > 2. How would users set the T.id on creation? would that
>> > T.id
>> > > > just
>> > > > > > be
>> > > > > > > a
>> > > > > > > > > key
>> > > > > > > > > > in the first Map argument?
>> > > > > > > > > >
>> > > > > > > > > > Yes, I was thinking that T.id would be the key name,
>> e.g.:
>> > > > > > > > > >
>> > > > > > > > > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > ok - I can't say I see a problem with that atm.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > 3. I do like the general idea of a match on multiple
>> > > properties
>> > > > > for
>> > > > > > > the
>> > > > > > > > > > first argument as a convenience but wonder about the
>> > > > specificity
>> > > > > of
>> > > > > > > > this
>> > > > > > > > > > API a bit as it focuses heavily on equality - I suppose
>> > > that's
>> > > > > most
>> > > > > > > > cases
>> > > > > > > > > > for get-or-create, so perhaps that's ok.
>> > > > > > > > > >
>> > > > > > > > > > In most cases I've seen use exact matches on the vertex
>> or
>> > > > > edge.  I
>> > > > > > > > think
>> > > > > > > > > > it might be best to keep this straightforward as any
>> > complex
>> > > > edge
>> > > > > > > cases
>> > > > > > > > > > still can perform the same functionality using the
>> > coalesce()
>> > > > > > > pattern.
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > I played around with the idea to use modulators to apply
>> > > > additional
>> > > > > > > > > constraints:
>> > > > > > > > >
>> > > > > > > > > g.mergeV('person', {'state':'NY'}, {'active': true}).
>> > > > > > > > >    by(has('age', gt(29))
>> > > > > > > > >
>> > > > > > > > > I suppose that's neat but maybe not...we could easily get
>> the
>> > > > same
>> > > > > > > thing
>> > > > > > > > > from:
>> > > > > > > > >
>> > > > > > > > > g.V().has('person','age',gt(29)).
>> > > > > > > > >  mergeV('person', {'state':'NY'}, {'active': true}).
>> > > > > > > > >
>> > > > > > > > > which is more readable though it does make it harder for
>> > > > providers
>> > > > > to
>> > > > > > > > > optimize upsert operations if they could make use of the
>> > has()
>> > > as
>> > > > > > part
>> > > > > > > of
>> > > > > > > > > that. I think I like has() as part of the main query
>> rather
>> > > than
>> > > > > in a
>> > > > > > > > by()
>> > > > > > > > > - just thinking out loud.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > Another potential option here would be to allow the second
>> > > > parameter
>> > > > > of
>> > > > > > > > mergeV() accept a traversal.  Not sure how much complexity
>> that
>> > > > would
>> > > > > > add
>> > > > > > > > though
>> > > > > > > >
>> > > > > > > > g.V().mergeV('person',
>> > __.has('person','age',gt(29)).has('state',
>> > > > > > 'NY'}.,
>> > > > > > > > {'active': true}).
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > 4. I think your suggestion points to one of the troubles
>> > > > Gremlin
>> > > > > > has
>> > > > > > > > > which
>> > > > > > > > > > we see with "algorithms" - extending the language with
>> new
>> > > > steps
>> > > > > > that
>> > > > > > > > > > provides a form of "sugar" (e.g. in algorithms we end up
>> > with
>> > > > > > > > > > shortestPath() step) pollutes the core language a bit,
>> > hence
>> > > my
>> > > > > > > > > > generalization of "merging" in my link above which fits
>> > into
>> > > > the
>> > > > > > core
>> > > > > > > > > > Gremlin language style. There is a bigger picture where
>> we
>> > > are
>> > > > > > > missing
>> > > > > > > > > > something in Gremlin that lets us extend the language in
>> > ways
>> > > > > that
>> > > > > > > let
>> > > > > > > > us
>> > > > > > > > > > easily introduce new steps that aren't for general
>> purpose.
>> > > > This
>> > > > > > > issue
>> > > > > > > > is
>> > > > > > > > > > discussed in terms of "algorithms" here:
>> > > > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991
>> but I
>> > > > think
>> > > > > > we
>> > > > > > > > > could
>> > > > > > > > > > see how there might be some "mutation" extension steps
>> that
>> > > > would
>> > > > > > > cover
>> > > > > > > > > > your suggested API, plus batch operations, etc. We need
>> a
>> > way
>> > > > to
>> > > > > > add
>> > > > > > > > > > "sugar" without it interfering with the consistency of
>> the
>> > > > core.
>> > > > > > > > > Obviously
>> > > > > > > > > > this is a bigger issue but perhaps important to solve to
>> > > > > implement
>> > > > > > > > steps
>> > > > > > > > > in
>> > > > > > > > > > the fashion you describe.
>> > > > > > > > > >
>> > > > > > > > > > I agree that the "algorithm" steps do seem a bit odd in
>> the
>> > > > core
>> > > > > > > > language
>> > > > > > > > > > but I think the need is there.  I'd be interested in
>> > > furthering
>> > > > > > this
>> > > > > > > > > > discussion but I think these "pattern" steps may or may
>> not
>> > > be
>> > > > > the
>> > > > > > > same
>> > > > > > > > > as
>> > > > > > > > > > the algorithm steps.  I'll have to think on that.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
>> > > > specify
>> > > > > > what
>> > > > > > > > > > element type the first Map argument should be applied
>> to?
>> > > what
>> > > > > > about
>> > > > > > > > > > mergeVP (i.e. vertex property as it too is an element) ?
>> > > That's
>> > > > > > > tricky
>> > > > > > > > > but
>> > > > > > > > > > I don't think we should miss that. Perhaps merge() could
>> > be a
>> > > > > > > "complex
>> > > > > > > > > > modulator"?? that's a new concept of course, but you
>> would
>> > do
>> > > > > > > > > g.V().merge()
>> > > > > > > > > > and the label and first Map would fold to
>> VertexStartStep
>> > > (i.e.
>> > > > > > V())
>> > > > > > > > for
>> > > > > > > > > > the lookup and then a MergeStep would follow - thus a
>> > > "complex
>> > > > > > > > modulator"
>> > > > > > > > > > as it does more than just change the behavior of the
>> > previous
>> > > > > step
>> > > > > > -
>> > > > > > > it
>> > > > > > > > > > also adds its own. I suppose it could also add has()
>> steps
>> > > > > followed
>> > > > > > > by
>> > > > > > > > > the
>> > > > > > > > > > MergeStep and then the has() operations would fold in
>> > > normally
>> > > > as
>> > > > > > > they
>> > > > > > > > do
>> > > > > > > > > > today. In this way, we can simplify to just one single
>> > > > > > > > > > merge(String,Map,Map). ??
>> > > > > > > > > >
>> > > > > > > > > > I agree that we should also think about how to include
>> > > > properties
>> > > > > > in
>> > > > > > > > this
>> > > > > > > > > > merge construct.  The reason I was thinking about
>> mergeV()
>> > > and
>> > > > > > > mergeE()
>> > > > > > > > > is
>> > > > > > > > > > that it follows the same pattern as the already well
>> > > understood
>> > > > > > > > > > addV()/addE() steps.  I am a bit wary of trying to
>> > generalize
>> > > > > this
>> > > > > > > down
>> > > > > > > > > to
>> > > > > > > > > > a single merge() step as these sorts of complex
>> overloads
>> > > make
>> > > > it
>> > > > > > > hard
>> > > > > > > > to
>> > > > > > > > > > figure out which Gremlin step you should use for a
>> > particular
>> > > > > > pattern
>> > > > > > > > > (e.g.
>> > > > > > > > > > upsert vertex or upsert edge).  One thing that I think
>> > > > customers
>> > > > > > find
>> > > > > > > > > > useful about the addV/addE step is that they are very
>> > > > > discoverable,
>> > > > > > > the
>> > > > > > > > > > name tells me what functionality to expect from that
>> step.
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > I'm starting to agree with the idea that we have to do
>> > > something
>> > > > > like
>> > > > > > > > > mergeV()/E() as it wouldn't quite work as a start step
>> > > otherwise.
>> > > > > We
>> > > > > > > > > couldn't do:
>> > > > > > > > >
>> > > > > > > > > g.merge()
>> > > > > > > > >
>> > > > > > > > > as we wouldnt know what sort of Element it applied to. If
>> > so, I
>> > > > > > wonder
>> > > > > > > if
>> > > > > > > > > it would be better to preserve "merge" for my more general
>> > > > use-case
>> > > > > > and
>> > > > > > > > > prefer upsertV()/E()?
>> > > > > > > > >
>> > > > > > > > > Also, perhaps mergeVP() doesn't need to exist as perhaps
>> it
>> > is
>> > > an
>> > > > > > > > uncommon
>> > > > > > > > > case (compared to V/E()) and we don't really have addVP()
>> as
>> > an
>> > > > > > > analogous
>> > > > > > > > > step. Perhaps existing coalesce() patterns and/or my more
>> > > general
>> > > > > > > purpose
>> > > > > > > > > merge() step would satisfy those situations??
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > > I think that a mergeVP() type step may require a different
>> > > pattern
>> > > > > > since
>> > > > > > > it
>> > > > > > > > would really need to accept a map of values otherwise it
>> would
>> > > > > > > essentially
>> > > > > > > > perform the same function as property()
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > 6. One thing neither my approach nor yours seems to do
>> is
>> > > tell
>> > > > > the
>> > > > > > > user
>> > > > > > > > > if
>> > > > > > > > > > they created something or updated something - that's
>> > another
>> > > > > thing
>> > > > > > > I've
>> > > > > > > > > > seen users want to have in get-or-create. Here again we
>> go
>> > > > deeper
>> > > > > > > into
>> > > > > > > > a
>> > > > > > > > > > less general step specification as alluded to in 4, but
>> a
>> > > > merge()
>> > > > > > > step
>> > > > > > > > as
>> > > > > > > > > > proposed in 5, might return [Element,boolean] so as to
>> > > provide
>> > > > an
>> > > > > > > > > indicator
>> > > > > > > > > > of creation?
>> > > > > > > > > >
>> > > > > > > > > > Hmm, I'll have to think about that.  How would returning
>> > > > multiple
>> > > > > > > > values
>> > > > > > > > > > work if we want to chain these together. e.g. Add a
>> vertex
>> > > > > between
>> > > > > > > two
>> > > > > > > > > > edges but make sure the vertices exist?
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > > Added a thought on how to accomplish this above that I'd
>> like
>> > to
>> > > > get
>> > > > > > > > thoughts on.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > 7. You were just introducing your ideas here, so perhaps
>> > you
>> > > > > > haven't
>> > > > > > > > > gotten
>> > > > > > > > > > this far yet, but a shortcoming to doing
>> > > merge(String,Map,Map)
>> > > > is
>> > > > > > > that
>> > > > > > > > it
>> > > > > > > > > > leaves open no opportunity to stream a List of Maps to a
>> > > > merge()
>> > > > > > for
>> > > > > > > a
>> > > > > > > > > form
>> > > > > > > > > > of batch loading which is mighty common and one of the
>> > > > variations
>> > > > > > of
>> > > > > > > > the
>> > > > > > > > > > coalesce() pattern that I alluded to at the start of all
>> > > this.
>> > > > I
>> > > > > > > think
>> > > > > > > > > that
>> > > > > > > > > > we would want to be sure that we left open the option
>> to do
>> > > > that
>> > > > > > > > somehow.
>> > > > > > > > > > 8. If we had a general purpose merge() step I wonder if
>> it
>> > > > makes
>> > > > > > > > > developing
>> > > > > > > > > > the API as you suggested easier to do?
>> > > > > > > > > >
>> > > > > > > > > > Hmm, let me think about that one.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > I will add an item 8 to think about which I didn't mention
>> > > > before:
>> > > > > > > > >
>> > > > > > > > > 8. The signature you suggested for mergeE() is:
>> > > > > > > > >
>> > > > > > > > > mergeE(String, Map, Map)
>> > > > > > > > >      String - The edge label
>> > > > > > > > >      Map (first) - The properties to match existing edge
>> on
>> > > > > > > > >      Map (second) - Any additional properties to set if a
>> new
>> > > > edge
>> > > > > is
>> > > > > > > > > created (optional)
>> > > > > > > > >
>> > > > > > > > > but does that exactly answer the need? Typically the idea
>> is
>> > to
>> > > > > > detect
>> > > > > > > an
>> > > > > > > > > edge between two vertices not globally in the graph by
>> way of
>> > > > some
>> > > > > > > > > properties. This signature doesn't seem to allow for that
>> as
>> > it
>> > > > > > doesn't
>> > > > > > > > > allow specification of the vertices to test against.
>> Perhaps
>> > > the
>> > > > > > answer
>> > > > > > > > is
>> > > > > > > > > to use from() and to() modulators?
>> > > > > > > > >
>> > > > > > > > > g.mergeE('knows', {'weight':0.5}).
>> > > > > > > > >  from(V(1)).to(V(2))
>> > > > > > > > > g.V().has('person','name','marko').
>> > > > > > > > >  mergeE( 'knows', {'weight':0.5}).
>> > > > > > > > >    to(V(2))
>> > > > > > > > > g.V().has('person','name','marko').
>> > > > > > > > >  mergeE( 'self', {'weight':0.5})
>> > > > > > > > >
>> > > > > > > > > That seems to work?
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > > I think that is an elegant solution to that problem.  I
>> like it
>> > > and
>> > > > > it
>> > > > > > > > keeps in line with the way that addE() works.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <
>> > > > > > > spmallette@gmail.com
>> > > > > > > > >
>> > > > > > > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > > > I like our coalesce() pattern but it is verbose and
>> over
>> > > time
>> > > > > it
>> > > > > > > has
>> > > > > > > > > gone
>> > > > > > > > > > > from a simple pattern to one with numerous variations
>> for
>> > > all
>> > > > > > > manner
>> > > > > > > > of
>> > > > > > > > > > > different sorts of merge-like operations. As such, I
>> do
>> > > think
>> > > > > we
>> > > > > > > > should
>> > > > > > > > > > > introduce something to cover this pattern.
>> > > > > > > > > > >
>> > > > > > > > > > > I like that you used the word "merge" in your
>> description
>> > > of
>> > > > > this
>> > > > > > > as
>> > > > > > > > it
>> > > > > > > > > > is
>> > > > > > > > > > > the word I've liked using. You might want to give a
>> look
>> > at
>> > > > my
>> > > > > > > > proposed
>> > > > > > > > > > > merge() step from earlier in the year:
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
>> > > > > > > > > > >
>> > > > > > > > > > > I'm just going to dump thoughts as they come regarding
>> > what
>> > > > you
>> > > > > > > > wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > 1. How would multi/meta-properties fit into the API
>> > you've
>> > > > > > > proposed?
>> > > > > > > > > > > 2. How would users set the T.id on creation? would
>> that
>> > > T.id
>> > > > > just
>> > > > > > > be
>> > > > > > > > a
>> > > > > > > > > > key
>> > > > > > > > > > > in the first Map argument?
>> > > > > > > > > > > 3. I do like the general idea of a match on multiple
>> > > > properties
>> > > > > > for
>> > > > > > > > the
>> > > > > > > > > > > first argument as a convenience but wonder about the
>> > > > > specificity
>> > > > > > of
>> > > > > > > > > this
>> > > > > > > > > > > API a bit as it focuses heavily on equality - I
>> suppose
>> > > > that's
>> > > > > > most
>> > > > > > > > > cases
>> > > > > > > > > > > for get-or-create, so perhaps that's ok.
>> > > > > > > > > > > 4. I think your suggestion points to one of the
>> troubles
>> > > > > Gremlin
>> > > > > > > has
>> > > > > > > > > > which
>> > > > > > > > > > > we see with "algorithms" - extending the language with
>> > new
>> > > > > steps
>> > > > > > > that
>> > > > > > > > > > > provides a form of "sugar" (e.g. in algorithms we end
>> up
>> > > with
>> > > > > > > > > > > shortestPath() step) pollutes the core language a bit,
>> > > hence
>> > > > my
>> > > > > > > > > > > generalization of "merging" in my link above which
>> fits
>> > > into
>> > > > > the
>> > > > > > > core
>> > > > > > > > > > > Gremlin language style. There is a bigger picture
>> where
>> > we
>> > > > are
>> > > > > > > > missing
>> > > > > > > > > > > something in Gremlin that lets us extend the language
>> in
>> > > ways
>> > > > > > that
>> > > > > > > > let
>> > > > > > > > > us
>> > > > > > > > > > > easily introduce new steps that aren't for general
>> > purpose.
>> > > > > This
>> > > > > > > > issue
>> > > > > > > > > is
>> > > > > > > > > > > discussed in terms of "algorithms" here:
>> > > > > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991
>> > but I
>> > > > > think
>> > > > > > > we
>> > > > > > > > > > could
>> > > > > > > > > > > see how there might be some "mutation" extension steps
>> > that
>> > > > > would
>> > > > > > > > cover
>> > > > > > > > > > > your suggested API, plus batch operations, etc. We
>> need a
>> > > way
>> > > > > to
>> > > > > > > add
>> > > > > > > > > > > "sugar" without it interfering with the consistency of
>> > the
>> > > > > core.
>> > > > > > > > > > Obviously
>> > > > > > > > > > > this is a bigger issue but perhaps important to solve
>> to
>> > > > > > implement
>> > > > > > > > > steps
>> > > > > > > > > > in
>> > > > > > > > > > > the fashion you describe.
>> > > > > > > > > > > 5. I suppose that the reason for mergeE and mergeV is
>> to
>> > > > > specify
>> > > > > > > what
>> > > > > > > > > > > element type the first Map argument should be applied
>> to?
>> > > > what
>> > > > > > > about
>> > > > > > > > > > > mergeVP (i.e. vertex property as it too is an
>> element) ?
>> > > > That's
>> > > > > > > > tricky
>> > > > > > > > > > but
>> > > > > > > > > > > I don't think we should miss that. Perhaps merge()
>> could
>> > > be a
>> > > > > > > > "complex
>> > > > > > > > > > > modulator"?? that's a new concept of course, but you
>> > would
>> > > do
>> > > > > > > > > > g.V().merge()
>> > > > > > > > > > > and the label and first Map would fold to
>> VertexStartStep
>> > > > (i.e.
>> > > > > > > V())
>> > > > > > > > > for
>> > > > > > > > > > > the lookup and then a MergeStep would follow - thus a
>> > > > "complex
>> > > > > > > > > modulator"
>> > > > > > > > > > > as it does more than just change the behavior of the
>> > > previous
>> > > > > > step
>> > > > > > > -
>> > > > > > > > it
>> > > > > > > > > > > also adds its own. I suppose it could also add has()
>> > steps
>> > > > > > followed
>> > > > > > > > by
>> > > > > > > > > > the
>> > > > > > > > > > > MergeStep and then the has() operations would fold in
>> > > > normally
>> > > > > as
>> > > > > > > > they
>> > > > > > > > > do
>> > > > > > > > > > > today. In this way, we can simplify to just one single
>> > > > > > > > > > > merge(String,Map,Map). ??
>> > > > > > > > > > > 6. One thing neither my approach nor yours seems to
>> do is
>> > > > tell
>> > > > > > the
>> > > > > > > > user
>> > > > > > > > > > if
>> > > > > > > > > > > they created something or updated something - that's
>> > > another
>> > > > > > thing
>> > > > > > > > I've
>> > > > > > > > > > > seen users want to have in get-or-create. Here again
>> we
>> > go
>> > > > > deeper
>> > > > > > > > into
>> > > > > > > > > a
>> > > > > > > > > > > less general step specification as alluded to in 4,
>> but a
>> > > > > merge()
>> > > > > > > > step
>> > > > > > > > > as
>> > > > > > > > > > > proposed in 5, might return [Element,boolean] so as to
>> > > > provide
>> > > > > an
>> > > > > > > > > > indicator
>> > > > > > > > > > > of creation?
>> > > > > > > > > > > 7. You were just introducing your ideas here, so
>> perhaps
>> > > you
>> > > > > > > haven't
>> > > > > > > > > > gotten
>> > > > > > > > > > > this far yet, but a shortcoming to doing
>> > > > merge(String,Map,Map)
>> > > > > is
>> > > > > > > > that
>> > > > > > > > > it
>> > > > > > > > > > > leaves open no opportunity to stream a List of Maps
>> to a
>> > > > > merge()
>> > > > > > > for
>> > > > > > > > a
>> > > > > > > > > > form
>> > > > > > > > > > > of batch loading which is mighty common and one of the
>> > > > > variations
>> > > > > > > of
>> > > > > > > > > the
>> > > > > > > > > > > coalesce() pattern that I alluded to at the start of
>> all
>> > > > this.
>> > > > > I
>> > > > > > > > think
>> > > > > > > > > > that
>> > > > > > > > > > > we would want to be sure that we left open the option
>> to
>> > do
>> > > > > that
>> > > > > > > > > somehow.
>> > > > > > > > > > > 8. If we had a general purpose merge() step I wonder
>> if
>> > it
>> > > > > makes
>> > > > > > > > > > developing
>> > > > > > > > > > > the API as you suggested easier to do?
>> > > > > > > > > > >
>> > > > > > > > > > > I think I'd like to solve the problems you describe in
>> > your
>> > > > > post
>> > > > > > as
>> > > > > > > > > well
>> > > > > > > > > > as
>> > > > > > > > > > > the ones in mine. There is some relation there, but
>> gaps
>> > as
>> > > > > well.
>> > > > > > > > With
>> > > > > > > > > > more
>> > > > > > > > > > > discussion here we can figure something out.
>> > > > > > > > > > >
>> > > > > > > > > > > Thanks for starting this talk - good one!
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <
>> > > > > > > > dave@bechberger.com>
>> > > > > > > > > > > wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > I've had a few on and off discussions with a few
>> people
>> > > > here,
>> > > > > > so
>> > > > > > > I
>> > > > > > > > > > wanted
>> > > > > > > > > > > > to send this out to everyone for feedback.
>> > > > > > > > > > > >
>> > > > > > > > > > > > What are people's thoughts on creating a new set of
>> > steps
>> > > > > that
>> > > > > > > > codify
>> > > > > > > > > > > > common Gremlin best practices?
>> > > > > > > > > > > >
>> > > > > > > > > > > > I think there are several common Gremlin patterns
>> where
>> > > > users
>> > > > > > > would
>> > > > > > > > > > > benefit
>> > > > > > > > > > > > from the additional guidance that these codified
>> steps
>> > > > > > represent.
>> > > > > > > > > The
>> > > > > > > > > > > > first one I would recommend though is codifying the
>> > > element
>> > > > > > > > existence
>> > > > > > > > > > > > pattern into a single Gremlin step, something like:
>> > > > > > > > > > > >
>> > > > > > > > > > > > mergeV(String, Map, Map)
>> > > > > > > > > > > >      String - The vertex label
>> > > > > > > > > > > >      Map (first) - The properties to match existing
>> > > > vertices
>> > > > > on
>> > > > > > > > > > > >      Map (second) - Any additional properties to set
>> > if a
>> > > > new
>> > > > > > > > vertex
>> > > > > > > > > is
>> > > > > > > > > > > > created (optional)
>> > > > > > > > > > > > mergeE(String, Map, Map)
>> > > > > > > > > > > >      String - The edge label
>> > > > > > > > > > > >      Map (first) - The properties to match existing
>> > edge
>> > > on
>> > > > > > > > > > > >      Map (second) - Any additional properties to set
>> > if a
>> > > > new
>> > > > > > > edge
>> > > > > > > > is
>> > > > > > > > > > > > created (optional)
>> > > > > > > > > > > >
>> > > > > > > > > > > > In each of these cases these steps would perform the
>> > same
>> > > > > > upsert
>> > > > > > > > > > > > functionality as the element existence pattern.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Example:
>> > > > > > > > > > > >
>> > > > > > > > > > > > g.V().has('person','name','stephen').
>> > > > > > > > > > > >            fold().
>> > > > > > > > > > > >            coalesce(unfold(),
>> > > > > > > > > > > >                    addV('person').
>> > > > > > > > > > > >                      property('name','stephen').
>> > > > > > > > > > > >                      property('age',34))
>> > > > > > > > > > > >
>> > > > > > > > > > > > would become:
>> > > > > > > > > > > >
>> > > > > > > > > > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
>> > > > > > > > > > > >
>> > > > > > > > > > > > I think that this change would be a good addition to
>> > the
>> > > > > > language
>> > > > > > > > for
>> > > > > > > > > > > > several reasons:
>> > > > > > > > > > > >
>> > > > > > > > > > > > * This codifies the best practice for a specific
>> > > > > action/recipe,
>> > > > > > > > which
>> > > > > > > > > > > > reduces the chance that someone uses the pattern
>> > > > incorrectly
>> > > > > > > > > > > > * Most complex Gremlin traversals are verbose.
>> > Reducing
>> > > > the
>> > > > > > > amount
>> > > > > > > > > of
>> > > > > > > > > > > code
>> > > > > > > > > > > > that needs to be written and maintained allows for a
>> > > better
>> > > > > > > > developer
>> > > > > > > > > > > > experience.
>> > > > > > > > > > > > * It will lower the bar of entry for a developer by
>> > > making
>> > > > > > these
>> > > > > > > > > > actions
>> > > > > > > > > > > > more discoverable.  The more we can help bring these
>> > > > patterns
>> > > > > > to
>> > > > > > > > the
>> > > > > > > > > > > > forefront of the language via these pattern/meta
>> steps
>> > > the
>> > > > > more
>> > > > > > > we
>> > > > > > > > > > guide
>> > > > > > > > > > > > users towards writing better Gremlin faster
>> > > > > > > > > > > > * This allows DB vendors to optimize for this
>> pattern
>> > > > > > > > > > > >
>> > > > > > > > > > > > I know that this would likely be the first step in
>> > > Gremlin
>> > > > > that
>> > > > > > > > > > codifies
>> > > > > > > > > > > a
>> > > > > > > > > > > > pattern, so I'd like to get other's thoughts on
>> this?
>> > > > > > > > > > > >
>> > > > > > > > > > > > Dave
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by Stephen Mallette <sp...@gmail.com>.
We may yet need to keep upsertV/E() as a new step because without that
addV() does get a bit odd as we would then lose the nice implicit match
behavior of:

// implicitly match on name/age without having to specify by()
g.upsertV('person', [name: 'marko', age: 29])

If we were to try to trigger that functionality with Map arguments but not
the current method we'd likely fall deeper into kelvin's area of concerns.
I've updated the gist again with the by() options and tweaked some other
things:

https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31



On Mon, Dec 14, 2020 at 3:20 PM Kelvin Lawrence <gf...@yahoo.com.invalid>
wrote:

> In general I like the idea of having a simpler way to express an upsert
> without needing to know the coalesce pattern.
> I am a little worried that if the addV and addE steps are overloaded to to
> perform the upsert task that suddenly, steps that have always worked one
> way, can now sometimes, do something different - not create but just find
> an existing element. My concern is mainly around readability of queries and
> a user knowing that a step can do certain extra things based on the
> parameterization. This kind of goes a little against the Linux style
> philosophy of each command doing one thing and doing it well.  Again, I'm
> mostly raising this not because I am against the idea but wanting to make
> sure it is very clear (maybe due to the presence of a by() modulator, that
> a given addV/addE step is in "upsert mode". We have a few other cases where
> steps work differently based on subtle parameterizations and people get
> confused. Consider the case of.
> "where(is(eq('A'))"  versus  "where(eq('A'))"
> I also want to think a bit about how/if these can be nested. Today a see a
> lot of nested coalesce steps in queries people write trying to do large
> (complex) multi-part upserts.
> Cheers,Kelvin
> Kelvin R. Lawrence
>
>     On Friday, December 11, 2020, 12:36:45 PM CST, Stephen Mallette <
> spmallette@gmail.com> wrote:
>
>  +1 to no new step. I haven't yet thought of a reason why this shouldn't
> just be a variation on addV/E().
>
> On Fri, Dec 11, 2020 at 1:03 PM David Bechberger <da...@bechberger.com>
> wrote:
>
> > I agree with Josh's description of what the upsertV() functionality is
> > intended to be.
> >
> > I also fully support the simplification provided by the by() modulation
> > that Stephen suggested, removing the second map.  I think that provides a
> > much cleaner and easier to comprehend syntax.
> >
> > With that agreement, I think this does beg the question of if this should
> > be a new step (upsertV()) or just an additional signature on the addV()
> > step? I
> >
> > Stephen alluded to this on the dev list and after thinking about this a
> bit
> > I think that I am favor of not adding a step and just adding a new
> > signature to the existing step (if possible).  Thoughts?
> >
> >
> > Dave
> >
> > On Wed, Dec 9, 2020 at 4:33 AM Stephen Mallette <sp...@gmail.com>
> > wrote:
> >
> > > Josh, thanks for your thoughts - some responses inline:
> > >
> > > On Tue, Dec 8, 2020 at 10:16 PM Josh Perryman <jo...@gmail.com>
> > > wrote:
> > >
> > > > I'll offer some thoughts. I'm seeing upsertV() as an idempotent
> > > getOrCreate
> > > > call which always returns a vertex with the label/property values
> > > specified
> > > > within the step. It's sort of a declarative pattern: "return this
> > vertex
> > > to
> > > > me, find it if you can, create it if you must."
> > > >
> > >
> > > I like this description - I've added it to the gist, though it's a bit
> at
> > > odds with Dave's previous post, so we'll consider it a temporary
> addition
> > > until he responds.
> > >
> > >
> > > > On that account, I do like the simplification in 1. Repetition
> > shouldn't
> > > be
> > > > necessary. In an ideal world, the engine should know the primary
> > > > identifiers (name or id) and find/create the vertex based on them.
> Any
> > > > other included values will be "trued up" as well. But this may be a
> > > bridge
> > > > too far for TinkerPop since knowing identifiers may require a
> specified
> > > > schema. I'd prefer to omit the third input, but it might be necessary
> > to
> > > > keep it so that the second input can be for the matching use case.
> > > >
> > >
> > > In my most recent post on gremlin-users I think I came up with a nice
> way
> > > to get rid of the second Map. One Map that forms the full list of
> > > properties for upserting is easier than partitioning two Maps that
> > > essentially merge together. I imagine it's unlikely that application
> code
> > > will have that separation naturally so users will have the added step
> of
> > > trying to separate their data into searchable vs "just data". Getting
> us
> > to
> > > one Map argument will simplify APIs for us and reduce complexity to
> > users.
> > > Here is what I'd proposed for those not following over there:
> > >
> > > // match on name and age (or perhaps whatever the underlying graph
> system
> > > thinks is best?)
> > > g.upsertV('person', [name:'marko',age:29])
> > >
> > > // match on name only
> > > g.upsertV('person', [name:'marko',age:29]).by('name')
> > >
> > > // explicitly match on name and age
> > > g.upsertV('person', [name:'marko',age:29]).
> > >  by('name').by('age')
> > >
> > > // match on id only
> > > g.upsertV('person', [(T.id): 100, name:'marko',age:29]).by(T.id)
> > >
> > > // match on whatever the by(Traversal) predicate defines
> > > g.upsertV('person', [name:'marko',age:29]).
> > >  by(has('name', 'marko'))
> > >
> > > // match on id, then update age
> > > g.upsertV('person', [(T.id): 100, name:'marko']).by(T.id).
> > >  property('age',29)
> > >
> > > With this model, we get one Map argument that represents the complete
> > > property set to be added/updated to the graph and the user can hint on
> > what
> > > key they wish to match on using by() where that sort of step modulation
> > > should be a well understood and familiar concept in Gremlin at this
> > point.
> > >
> > > So that means I think 2 should always match or update the additional
> > > > values. Again, we're specifying the expected result and letting the
> > > engine
> > > > figure out best how to return that results and appropriately maintain
> > > > state.
> > > >
> > >
> > > I again like this description, but we'll see what Dave's thoughts are
> > since
> > > he's a bit behind on the threads at this point I think.
> > >
> > >
> > > > I'm also presuming that anything not included as inputs to the
> > upsertV()
> > > > step are then to be handled by following steps. I'm hoping that is a
> > > > sufficient approach for addressing the multi/meta property use cases
> > > > brought up in 3.
> > > >
> > >
> > > yeah................it needs more thought. I spent more time thinking
> on
> > > this issue yesterday than I have for all the previous posts combined
> and
> > I
> > > think it yielded something good in that revised syntax. It's going to
> > take
> > > more of that kind of elbow grease to dig into these lesser use cases to
> > > make sure we aren't coding ourselves into corners.
> > >
> > >
> > > > I do like the idea of using modulators (with(), by()) for more
> > > > sophisticated usage and advanced use cases. Also, the streaming
> > examples
> > > > are quite elegant allowing for a helpful separation of data and
> logic.
> > > >
> > >
> > > cool - hope you like the revised syntax I posted then. :)
> > >
> > >
> > > > That's my humble take. This is a very welcome addition to the
> language
> > > and
> > > > I appreciate the thoughtful & collaborative approach to the design
> > > > considerations.
> > > >
> > >
> > > Thanks again and please keep the thoughts coming. Lots of other
> > interesting
> > > design discussions seem to be brewing.
> > >
> > >
> > > >
> > > > Josh
> > > >
> > > > On Tue, Dec 8, 2020 at 8:57 AM Stephen Mallette <
> spmallette@gmail.com>
> > > > wrote:
> > > >
> > > > > I started a expanded this discussion to gremlin-users for a wider
> > > > audience
> > > > > and the thread is starting to grow:
> > > > >
> > > > >
> > https://groups.google.com/g/gremlin-users/c/QBmiOUkA0iI/m/pj5Ukiq6AAAJ
> > > > >
> > > > > I guess we'll need to summarize that discussion back here now....
> > > > >
> > > > > I did have some more thoughts to hang out there and figured that I
> > > > wouldn't
> > > > > convolute the discussion on gremlin-users with it so I will
> continue
> > > the
> > > > > discussion here.
> > > > >
> > > > > 1, The very first couple of examples seem wrong (or at least not
> best
> > > > > demonstrating the usage):
> > > > >
> > > > > g.upsertV('person', [name: 'marko'],
> > > > >                    [name: 'marko', age: 29])
> > > > > g.upsertV('person', [(T.id): 1],
> > > > >                    [(T.id): 1, name: 'Marko'])
> > > > >
> > > > > should instead be:
> > > > >
> > > > > g.upsertV('person', [name: 'marko'],
> > > > >                    [age: 29])
> > > > > g.upsertV('person', [(T.id): 1],
> > > > >                    [name: 'Marko'])
> > > > >
> > > > > 2. I can't recall if we made this clear anywhere but in situations
> > > where
> > > > we
> > > > > "get" rather than "create" do the additional properties act in an
> > > update
> > > > > fashion to the element that was found? I think I've been working on
> > the
> > > > > assumption that it would, though perhaps that is not always
> > desirable?
> > > > >
> > > > > 3. We really never settled up how to deal with
> multi/meta-properties.
> > > > That
> > > > > story should be clear so that when we document upsert() we include
> > the
> > > > > approaches for the fallback patterns that don't meet the 90% of use
> > > cases
> > > > > we are targeting and I sense that we're saying that
> > > meta/multi-properties
> > > > > don't fit in that bucket. So with that in mind, I don't think that
> > the
> > > > > following works for metaproperties:
> > > > >
> > > > > g.upsertV('person', [(T.id): 1],
> > > > >                    [name:[acl: 'public'])
> > > > >
> > > > > as it doesn't let us set the value for the "name" just the pairs
> for
> > > the
> > > > > meta-properties. I guess a user would have to fall back to:
> > > > >
> > > > > g.upsertV('person', [(T.id): 1]).
> > > > >  property('name','marko','acl','public')
> > > > >
> > > > > // or use the additional properties syntax
> > > > > g.upsertV('person', [(T.id): 1],[name:'marko']).
> > > > >  properties('name').property('acl','public')
> > > > >
> > > > > // or if there were multi-properties then maybe...
> > > > > g.upsertV('person', [(T.id): 1],[name:'marko']).
> > > > >  properties('name').hasValue('marko').property('acl','public')
> > > > >
> > > > > As for multi-properties, I dont think we should assume that a List
> > > object
> > > > > should be interpreted by Gremlin as a multi-property. Perhaps we
> just
> > > > rely
> > > > > on the underlying graph to properly deal with that given a schema
> or
> > > the
> > > > > user falls back to:
> > > > >
> > > > > // this ends up as however the graph deals with a List object
> > > > > g.upsertV('person', [(T.id): 1], [lang: ['java', 'scala', 'java'])
> > > > >
> > > > > // this is explicit
> > > > > g.upsertV('person', [(T.id): 1]).
> > > > >  property(list, 'lang', 'java').
> > > > >  property(list, 'lang, 'scala').
> > > > >  property(list, 'lang', 'java')
> > > > >
> > > > > If that makes sense to everyone, I will update the gist.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Oct 27, 2020 at 11:51 PM David Bechberger <
> > dave@bechberger.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > Hello Stephen,
> > > > > >
> > > > > > Thanks for making that gist, its much easier to follow along with
> > the
> > > > > > proposed syntax there.  To your specific comments:
> > > > > >
> > > > > > #1 - My only worry with the term upsert is that it is not as
> > widely a
> > > > > used
> > > > > > term for this sort of pattern as "Merge" (e.g. SQL, Cypher).
> > > However I
> > > > > > don't have a strong opinion on this, so I am fine with either.
> > > > > > #2 - My only real objective here is to make sure that we make the
> > > > 80-90%
> > > > > > case easy and straightforward.  I think that having the fallback
> > > option
> > > > > of
> > > > > > using the current syntax for any complicated edge cases should be
> > > > > > considered here as well. I'd appreciate your thoughts here as
> these
> > > are
> > > > > > good points you bring up that definitely fall into the 80-90% use
> > > case.
> > > > > > #3 - Those points make sense to me, not sure I have anything
> > further
> > > to
> > > > > add
> > > > > > #4 - I don't think I would expect to have the values extracted
> from
> > > the
> > > > > > traversal filters but I'd be interested in other opinions on
> that.
> > > > > >
> > > > > > Thanks,
> > > > > > Dave
> > > > > >
> > > > > > On Thu, Oct 15, 2020 at 5:30 AM Stephen Mallette <
> > > spmallette@gmail.com
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > It's been a couple weeks since we moved this thread so I went
> > back
> > > > > > through
> > > > > > > it and refreshed my mind with what's there. Dave, nice job
> > putting
> > > > all
> > > > > > > those examples together. I've recollected them all together and
> > > have
> > > > > just
> > > > > > > been reviewing them in a more formatted style with proper
> Groovy
> > > > syntax
> > > > > > > than what can be accomplished in this mailing list. Anyone who
> > > cares
> > > > to
> > > > > > > review in that form can do so here (i've pasted its markdown
> > > contents
> > > > > at
> > > > > > > the bottom of this post as well):
> > > > > > >
> > > > > > >
> > > https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
> > > > > > >
> > > > > > > In this light, I have the following comments and thoughts:
> > > > > > >
> > > > > > > 1. I feel more inclined toward saving "merge" for a more
> > > generalized
> > > > > step
> > > > > > > of that name and have thus renamed the examples to use
> "upsert".
> > > > > > >
> > > > > > > 2. I still don't quite feel comfortable with
> > meta/multi-properties
> > > > > > > examples. Perhaps you could explain further as it's possible
> I'm
> > > not
> > > > > > > thinking things through properly. For meta-properties the
> example
> > > is:
> > > > > > >
> > > > > > > g.upsertV('person', [(T.id): 1],
> > > > > > >                    [(T.id): 1, name:[first: Marko', last: 'R'])
> > > > > > >
> > > > > > > So I see that "name" takes a Map here, but how does Gremlin
> know
> > if
> > > > > that
> > > > > > > means "meta-property" or "Map value" and if the former, then
> how
> > do
> > > > you
> > > > > > set
> > > > > > > the value of "name"?  My question is similar to the
> > multi-property
> > > > > > examples
> > > > > > > - using this one:
> > > > > > >
> > > > > > > g.upsertV('person', [(T.id): 1],
> > > > > > >                    [(T.id): 1,lang: ['java', 'scala'])
> > > > > > >
> > > > > > > How does Gremlin know if the user means that "lang" should be a
> > > > > > > "multi-property" with Cardinality.list (or Cardinality.set for
> > that
> > > > > > matter)
> > > > > > > or a value of Cardinality.single with a List/Array value? I can
> > > > perhaps
> > > > > > > suggest some solutions here but wanted to make sure I wasn't
> just
> > > > > > > misunderstanding something.
> > > > > > >
> > > > > > > 3. The "upsert with stream" example is:
> > > > > > >
> > > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > > > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > > > > >  upsertV(values(label), valueMap(id), valueMap())
> > > > > > >
> > > > > > > That would establish an API of upsertV(Traversal, Traversal,
> > > > Traversal)
> > > > > > > which would enable your final thought of your last post with:
> > > > > > >
> > > > > > > g.V().upsertV('person',
> > __.has('person','age',gt(29)).has('state',
> > > > > 'NY'),
> > > > > > >                        [active: true])
> > > > > > >
> > > > > > > The open-ended nature of that is neat but comes with some
> > > complexity
> > > > > with
> > > > > > > steps that traditionally take an "open" Traversal (and this
> step
> > > now
> > > > > has
> > > > > > up
> > > > > > > to three of them) - perhaps, that's our own fault in some ways.
> > For
> > > > the
> > > > > > > "stream" concept, values()/valueMap() won't immediately work
> that
> > > way
> > > > > as
> > > > > > > they only apply to Element...given today's Gremlin semantics, I
> > > think
> > > > > > we'd
> > > > > > > re-write that as:
> > > > > > >
> > > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > > > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > > > > >  upsertV(select(label), select(id), select(id,label,'name'))
> > > > > > >
> > > > > > > Though, even that doesn't quite work because you can't
> select(T)
> > > > right
> > > > > > now
> > > > > > > which means you'd need to go with:
> > > > > > >
> > > > > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > > > > >          [id: 2, label: 'person', name: 'josh']).
> > > > > > >  upsertV(select('label'), select('id'),
> > > select('id','label','name'))
> > > > > > >
> > > > > > > not terrible in a way and perhaps we could fix select(T) -
> can't
> > > > > remember
> > > > > > > why that isn't allowed except that select() operates over
> > multiple
> > > > > scopes
> > > > > > > and perhaps trying T through those scopes causes trouble.
> > > > > > >
> > > > > > > 4. Continuing with the use of Traversal as arguments, let's go
> > back
> > > > to:
> > > > > > >
> > > > > > > g.V().upsertV('person',
> > __.has('person','age',gt(29)).has('state',
> > > > > 'NY'),
> > > > > > >                        [active: true])
> > > > > > >
> > > > > > > I originally had it in my mind to prefer this approach over a
> Map
> > > for
> > > > > the
> > > > > > > search as it provides a great deal of flexibility and fits the
> > > common
> > > > > > > filtering patterns that users follow really well. I suppose the
> > > > > question
> > > > > > is
> > > > > > > what to do in the situation of "create". Would we extract all
> the
> > > > > strict
> > > > > > > equality filter values from the initial has(), thus, in the
> > > example,
> > > > > > > "state" but not "age", and create a vertex that has "state=NY,
> > > > > > > active=true"? That is possible with how things are designed in
> > > > Gremlin
> > > > > > > today I think.
> > > > > > >
> > > > > > > This syntax does create some conflict with the subject of 3 and
> > > > streams
> > > > > > > because the traverser in the streams case is the incoming Map
> but
> > > for
> > > > > > this
> > > > > > > context it's meant as a filter of V(). Getting too whacky?
> > > > > > >
> > > > > > > 5. I like the idea of using a side-effect to capture whether
> the
> > > > > element
> > > > > > > was created or not. that makes sense. I think that can work.
> > > > > > >
> > > > > > > ==============================
> > > > > > > = gist contents
> > > > > > > ==============================
> > > > > > >
> > > > > > > # API
> > > > > > >
> > > > > > > ```java
> > > > > > > upsertV(String label, Map matchOrCreateProperties)
> > > > > > > upsertV(String label, Map matchOrCreateProperties, Map
> > > > > > > additionalProperties)
> > > > > > > upsertV(Traversal label, Traversal matchOrCreateProperties)
> > > > > > > upsertV(Traversal label, Traversal matchOrCreateProperties,
> > > Traversal
> > > > > > > additionalProperties)
> > > > > > > upsertV(...).
> > > > > > >  with(WithOptions.sideEffectLabel, 'a')
> > > > > > >
> > > > > > > upsertE(String label, Map matchOrCreateProperties)
> > > > > > > upsertE(Traversal label, Traversal matchOrCreateProperties)
> > > > > > > upsertE(String label, Map matchOrCreateProperties, Map
> > > > > > > additionalProperties)
> > > > > > > upsertE(Traversal label, Traversal matchOrCreateProperties,
> > > Traversal
> > > > > > > additionalProperties)
> > > > > > > upsertE(...).
> > > > > > >  from(Vertex incidentOut).to(Vertex incidentIn)
> > > > > > >  with(WithOptions.sideEffectLabel, 'a')
> > > > > > > ```
> > > > > > >
> > > > > > > # Examples
> > > > > > >
> > > > > > > ## upsert
> > > > > > >
> > > > > > > ```groovy
> > > > > > > g.upsertV('person', [name: 'marko'],
> > > > > > >                    [name: 'marko', age: 29])
> > > > > > > ```
> > > > > > >
> > > > > > > ## upsert with id
> > > > > > >
> > > > > > > ```groovy
> > > > > > > g.upsertV('person', [(T.id): 1],
> > > > > > >                    [(T.id): 1, name: 'Marko'])
> > > > > > > ```
> > > > > > >
> > > > > > > ## upsert with Meta Properties
> > > > > > >
> > > > > > > ```groovy
> > > > > > > g.upsertV('person', [(T.id): 1],
> > > > > > >                    [(T.id): 1, name:[first: Marko', last: 'R'])
> > > > > > > ```
> > > > > > >
> > > > > > > ## upsert with Multi Properties
> > > > > > >
> > > > > > > ```groovy
> > > > > > > g.upsertV('person', [(T.id): 1],
> > > > > > >                    [(T.id): 1,lang: ['java', 'scala'])
> > > > > > > ```
> > > > > > >
> > > > > > > ## upsert with Single Cardinality
> > > > > > >
> > > > > > > ```groovy
> > > > > > > g.upsertV('person', [(T.id): 1],
> > > > > > >                    [(T.id): 1, lang: 'java')
> > > > > > > ```
> > > > > > >
> > > > > > > ## upsert with List Cardinality
> > > > > > >
> > > > > > > ```groovy
> > > > > > > g.upsertV('person', [(T.id): 1],
> > > > > > >                    [(T.id): 1, lang: ['java', 'scala', 'java'])
> > > > > > > ```
> > > > > > >
> > > > > > > ## upsert with Set Cardinality
> > > > > > >
> > > > > > > ```groovy
> > > > > > > g.upsertV('person', [(T.id): 1],
> > > > > > >                    [(T.id): 1, lang: ['java','scala'])
> > > > > > > ```
> > > > > > >
> > > > > > > ## upsert with List Cardinality - and add value to list
> > > > > > >
> > > > > > > ```groovy
> > > > > > > g.upsertV('person', [(T.id): 1],
> > > > > > >                    [(T.id): 1, lang: ['java','scala','java']).
> > > > > > >  property(Cardinality.list, 'lang', 'java')
> > > > > > > ```
> > > > > > >
> > > > > > > ## upsert with stream
> > > > > > >
> > > > > > > ```groovy
> > > > > > > // doesn't work with today's Gremlin semantics
> > > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > > > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > > > > >  upsertV(values(label), valueMap(id), valueMap())
> > > > > > >
> > > > > > > // the following would be more in line with what's possible
> with
> > > > > existing
> > > > > > > Gremlin semantics:
> > > > > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > > > > >          [id: 2, label: 'person', name: 'josh']).
> > > > > > >  upsertV(select('label'), select('id'),
> > > select('id','label','name'))
> > > > > > > ```
> > > > > > >
> > > > > > > ## upsert with reporting added or updated side effect
> > > > > > >
> > > > > > > ```groovy
> > > > > > > g.upsertV('person', [name: 'marko'],
> > > > > > >                    [name: 'marko', age: 29]).
> > > > > > >    with(WithOptions.sideEffectLabel, 'a').
> > > > > > >  project('vertex', 'added').
> > > > > > >    by().
> > > > > > >    by(cap('a'))
> > > > > > > ```
> > > > > > >
> > > > > > > ## upsert Edge assuming a self-relation of "marko"
> > > > > > >
> > > > > > > ```groovy
> > > > > > > g.V().has('person','name','marko').
> > > > > > >  upsertE( 'self', [weight:0.5])
> > > > > > > ```
> > > > > > >
> > > > > > > ## upsert Edge with incident vertex checks
> > > > > > >
> > > > > > > ```groovy
> > > > > > > g.upsertE('knows', [weight:0.5]).
> > > > > > >    from(V(1)).to(V(2))
> > > > > > > g.V().has('person','name','marko').
> > > > > > >  upsertE( 'knows', [weight:0.5]).
> > > > > > >    to(V(2))
> > > > > > > ```
> > > > > > >
> > > > > > > ## upsert using a Traversal for the match
> > > > > > >
> > > > > > > ```groovy
> > > > > > > g.V().upsertV('person',
> > __.has('person','age',gt(29)).has('state',
> > > > > 'NY'),
> > > > > > >                        [active: true])
> > > > > > > ```
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Sep 28, 2020 at 6:49 PM David Bechberger <
> > > > dave@bechberger.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > So, I've been doing some additional thinking about the ways
> > that
> > > > this
> > > > > > > could
> > > > > > > > work based on the comments below and have put my comments
> > inline.
> > > > > > > >
> > > > > > > > Dave
> > > > > > > >
> > > > > > > > On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <
> > > > > spmallette@gmail.com
> > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > I added some thoughts inline below:
> > > > > > > > >
> > > > > > > > > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <
> > > > > > dave@bechberger.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Thanks for the detailed comments Stephen.  I have
> addressed
> > > > them
> > > > > > > inline
> > > > > > > > > > below.
> > > > > > > > > >
> > > > > > > > > > I did read the proposal from earlier and I think that we
> > are
> > > in
> > > > > > close
> > > > > > > > > > agreement with what we are trying to accomplish.  I also
> > > fully
> > > > > > > support
> > > > > > > > > > Josh's comment on providing a mechanism for submitting a
> > map
> > > of
> > > > > > > > > properties
> > > > > > > > > > as manually unrolling this all right now leads to a lot
> of
> > > > > > potential
> > > > > > > > for
> > > > > > > > > > error and a long messy traversal.
> > > > > > > > > >
> > > > > > > > > > I'm looking forward to this discussion on how to merge
> > these
> > > > two
> > > > > > > > > proposals.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 1. How would multi/meta-properties fit into the API
> you've
> > > > > > proposed?
> > > > > > > > > >
> > > > > > > > > > My first thought here is that multi-properties would be
> > > > > represented
> > > > > > > as
> > > > > > > > > > lists in the map, e.g.
> > > > > > > > > >
> > > > > > > > > > {names: ['Dave', 'David']}
> > > > > > > > > >
> > > > > > > > > > and meta-properties would be represented as maps in the
> > maps,
> > > > > e.g.
> > > > > > > > > >
> > > > > > > > > > {name: {first: 'Dave', last: 'Bechberger'}}
> > > > > > > > > >
> > > > > > > > > > I can't say I've thought through all the implications of
> > this
> > > > > > though
> > > > > > > so
> > > > > > > > > it
> > > > > > > > > > is an area we would need to explore.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > The first implication that comes to mind is that it makes
> the
> > > > > > > assumption
> > > > > > > > > the user wants multi/meta-properties as opposed to a single
> > > > > > cardinality
> > > > > > > > of
> > > > > > > > > List in the first case and a Map as a property value in the
> > > > second
> > > > > > > case.
> > > > > > > > I
> > > > > > > > > suppose that graphs with a schema could resolve those
> > > assumptions
> > > > > but
> > > > > > > > > graphs that are schemaless would have a problem. The issue
> > > could
> > > > be
> > > > > > > > > resolved by specialized configuration of "g" or per merge()
> > > step
> > > > > > using
> > > > > > > a
> > > > > > > > > with() modulator I suppose but that goes into a yet another
> > > level
> > > > > of
> > > > > > > > > implications to consider. I've often wondered if the start
> > > point
> > > > > for
> > > > > > > > > getting types/schema into TP3 without a full rewrite would
> be
> > > in
> > > > > this
> > > > > > > > form
> > > > > > > > > where Gremlin would be given hints as to what to expect as
> to
> > > the
> > > > > > types
> > > > > > > > of
> > > > > > > > > data it might encounter while traversing. Anyway, I'd be
> > > hesitant
> > > > > to
> > > > > > go
> > > > > > > > > down paths that don't account for multi/metaproperties
> well.
> > > > They
> > > > > > are
> > > > > > > > > first class citizens in TP3 (with those hoping for
> extension
> > of
> > > > at
> > > > > > > least
> > > > > > > > > multiproperties to edges) and while I find them a constant
> > > > > annoyance
> > > > > > > for
> > > > > > > > so
> > > > > > > > > many reasons, we're kinda stuck with them.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I agree we need to account for multi/meta properties as 1st
> > class
> > > > > > > > citizens.  This is my current thinking on the syntax for the
> > > > > situations
> > > > > > > we
> > > > > > > > have laid out so far:
> > > > > > > >
> > > > > > > > *Merge*
> > > > > > > > g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age':
> > 29})
> > > > > > > >
> > > > > > > >
> > > > > > > > *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name':
> > > > > 'Marko'})
> > > > > > > >
> > > > > > > >
> > > > > > > > *Merge with Meta Properties* g.mergeV('name', {T.id: 1},
> {T.id:
> > > 1,
> > > > > > > 'name':
> > > > > > > > {'first': Marko', 'last': 'R'})
> > > > > > > >
> > > > > > > >
> > > > > > > > *Merge with Multi Properties * g.mergeV('name', {T.id: 1},
> > {T.id:
> > > > 1,
> > > > > > > > 'lang': ['java', 'scala'])
> > > > > > > >
> > > > > > > >
> > > > > > > > *Merge with Single Cardinality* g.mergeV('name', {T.id: 1},
> > > {T.id:
> > > > 1,
> > > > > > > > 'lang': 'java')
> > > > > > > >
> > > > > > > >
> > > > > > > > *Merge with List Cardinality* g.mergeV('name', {T.id: 1},
> > {T.id:
> > > 1,
> > > > > > > 'lang':
> > > > > > > > ['java', 'scala', 'java'])
> > > > > > > >
> > > > > > > >
> > > > > > > > *Merge with Set Cardinality * g.mergeV('name', {T.id: 1},
> > {T.id:
> > > 1,
> > > > > > > 'lang':
> > > > > > > > ['java', 'scala'])
> > > > > > > >
> > > > > > > > Since in a mergeV() scenario we are only ever adding whatever
> > > > values
> > > > > > are
> > > > > > > > passed in there would be no need to specify the cardinality
> of
> > > the
> > > > > > > property
> > > > > > > > being added.  If they wanted to add a value to an existing
> > > property
> > > > > > then
> > > > > > > > the current property() method would still be available on the
> > > > output
> > > > > > > > traversal. e.g.
> > > > > > > > g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java',
> 'scala',
> > > > > > > > 'java']).property(Cardinality.list, 'lang, 'java')
> > > > > > > >
> > > > > > > >
> > > > > > > > *Merge with stream * g.inject([{id: 1, label: 'person',
> 'name':
> > > > > > 'marko'},
> > > > > > > > {id: 2, label: 'person', 'name': 'josh'}]).
> > > > > > > >  mergeV(values('label'), valueMap('id'), valueMap())
> > > > > > > >
> > > > > > > >
> > > > > > > > *Merge with reporting added or updated side effect*
> > > > g.mergeV('name',
> > > > > > > > {'name': 'marko'}, {'name': 'marko', 'age': 29}).
> > > > > > > >  with(WithOptions.sideEffectLabel, 'a').
> > > > > > > >  project('vertex', 'added').
> > > > > > > >    by().
> > > > > > > >    by(cap('a'))
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > 2. How would users set the T.id on creation? would that
> > T.id
> > > > just
> > > > > > be
> > > > > > > a
> > > > > > > > > key
> > > > > > > > > > in the first Map argument?
> > > > > > > > > >
> > > > > > > > > > Yes, I was thinking that T.id would be the key name,
> e.g.:
> > > > > > > > > >
> > > > > > > > > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > ok - I can't say I see a problem with that atm.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > 3. I do like the general idea of a match on multiple
> > > properties
> > > > > for
> > > > > > > the
> > > > > > > > > > first argument as a convenience but wonder about the
> > > > specificity
> > > > > of
> > > > > > > > this
> > > > > > > > > > API a bit as it focuses heavily on equality - I suppose
> > > that's
> > > > > most
> > > > > > > > cases
> > > > > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > > > > >
> > > > > > > > > > In most cases I've seen use exact matches on the vertex
> or
> > > > > edge.  I
> > > > > > > > think
> > > > > > > > > > it might be best to keep this straightforward as any
> > complex
> > > > edge
> > > > > > > cases
> > > > > > > > > > still can perform the same functionality using the
> > coalesce()
> > > > > > > pattern.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I played around with the idea to use modulators to apply
> > > > additional
> > > > > > > > > constraints:
> > > > > > > > >
> > > > > > > > > g.mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > > > > >    by(has('age', gt(29))
> > > > > > > > >
> > > > > > > > > I suppose that's neat but maybe not...we could easily get
> the
> > > > same
> > > > > > > thing
> > > > > > > > > from:
> > > > > > > > >
> > > > > > > > > g.V().has('person','age',gt(29)).
> > > > > > > > >  mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > > > > >
> > > > > > > > > which is more readable though it does make it harder for
> > > > providers
> > > > > to
> > > > > > > > > optimize upsert operations if they could make use of the
> > has()
> > > as
> > > > > > part
> > > > > > > of
> > > > > > > > > that. I think I like has() as part of the main query rather
> > > than
> > > > > in a
> > > > > > > > by()
> > > > > > > > > - just thinking out loud.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > Another potential option here would be to allow the second
> > > > parameter
> > > > > of
> > > > > > > > mergeV() accept a traversal.  Not sure how much complexity
> that
> > > > would
> > > > > > add
> > > > > > > > though
> > > > > > > >
> > > > > > > > g.V().mergeV('person',
> > __.has('person','age',gt(29)).has('state',
> > > > > > 'NY'}.,
> > > > > > > > {'active': true}).
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > 4. I think your suggestion points to one of the troubles
> > > > Gremlin
> > > > > > has
> > > > > > > > > which
> > > > > > > > > > we see with "algorithms" - extending the language with
> new
> > > > steps
> > > > > > that
> > > > > > > > > > provides a form of "sugar" (e.g. in algorithms we end up
> > with
> > > > > > > > > > shortestPath() step) pollutes the core language a bit,
> > hence
> > > my
> > > > > > > > > > generalization of "merging" in my link above which fits
> > into
> > > > the
> > > > > > core
> > > > > > > > > > Gremlin language style. There is a bigger picture where
> we
> > > are
> > > > > > > missing
> > > > > > > > > > something in Gremlin that lets us extend the language in
> > ways
> > > > > that
> > > > > > > let
> > > > > > > > us
> > > > > > > > > > easily introduce new steps that aren't for general
> purpose.
> > > > This
> > > > > > > issue
> > > > > > > > is
> > > > > > > > > > discussed in terms of "algorithms" here:
> > > > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991
> but I
> > > > think
> > > > > > we
> > > > > > > > > could
> > > > > > > > > > see how there might be some "mutation" extension steps
> that
> > > > would
> > > > > > > cover
> > > > > > > > > > your suggested API, plus batch operations, etc. We need a
> > way
> > > > to
> > > > > > add
> > > > > > > > > > "sugar" without it interfering with the consistency of
> the
> > > > core.
> > > > > > > > > Obviously
> > > > > > > > > > this is a bigger issue but perhaps important to solve to
> > > > > implement
> > > > > > > > steps
> > > > > > > > > in
> > > > > > > > > > the fashion you describe.
> > > > > > > > > >
> > > > > > > > > > I agree that the "algorithm" steps do seem a bit odd in
> the
> > > > core
> > > > > > > > language
> > > > > > > > > > but I think the need is there.  I'd be interested in
> > > furthering
> > > > > > this
> > > > > > > > > > discussion but I think these "pattern" steps may or may
> not
> > > be
> > > > > the
> > > > > > > same
> > > > > > > > > as
> > > > > > > > > > the algorithm steps.  I'll have to think on that.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> > > > specify
> > > > > > what
> > > > > > > > > > element type the first Map argument should be applied to?
> > > what
> > > > > > about
> > > > > > > > > > mergeVP (i.e. vertex property as it too is an element) ?
> > > That's
> > > > > > > tricky
> > > > > > > > > but
> > > > > > > > > > I don't think we should miss that. Perhaps merge() could
> > be a
> > > > > > > "complex
> > > > > > > > > > modulator"?? that's a new concept of course, but you
> would
> > do
> > > > > > > > > g.V().merge()
> > > > > > > > > > and the label and first Map would fold to VertexStartStep
> > > (i.e.
> > > > > > V())
> > > > > > > > for
> > > > > > > > > > the lookup and then a MergeStep would follow - thus a
> > > "complex
> > > > > > > > modulator"
> > > > > > > > > > as it does more than just change the behavior of the
> > previous
> > > > > step
> > > > > > -
> > > > > > > it
> > > > > > > > > > also adds its own. I suppose it could also add has()
> steps
> > > > > followed
> > > > > > > by
> > > > > > > > > the
> > > > > > > > > > MergeStep and then the has() operations would fold in
> > > normally
> > > > as
> > > > > > > they
> > > > > > > > do
> > > > > > > > > > today. In this way, we can simplify to just one single
> > > > > > > > > > merge(String,Map,Map). ??
> > > > > > > > > >
> > > > > > > > > > I agree that we should also think about how to include
> > > > properties
> > > > > > in
> > > > > > > > this
> > > > > > > > > > merge construct.  The reason I was thinking about
> mergeV()
> > > and
> > > > > > > mergeE()
> > > > > > > > > is
> > > > > > > > > > that it follows the same pattern as the already well
> > > understood
> > > > > > > > > > addV()/addE() steps.  I am a bit wary of trying to
> > generalize
> > > > > this
> > > > > > > down
> > > > > > > > > to
> > > > > > > > > > a single merge() step as these sorts of complex overloads
> > > make
> > > > it
> > > > > > > hard
> > > > > > > > to
> > > > > > > > > > figure out which Gremlin step you should use for a
> > particular
> > > > > > pattern
> > > > > > > > > (e.g.
> > > > > > > > > > upsert vertex or upsert edge).  One thing that I think
> > > > customers
> > > > > > find
> > > > > > > > > > useful about the addV/addE step is that they are very
> > > > > discoverable,
> > > > > > > the
> > > > > > > > > > name tells me what functionality to expect from that
> step.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I'm starting to agree with the idea that we have to do
> > > something
> > > > > like
> > > > > > > > > mergeV()/E() as it wouldn't quite work as a start step
> > > otherwise.
> > > > > We
> > > > > > > > > couldn't do:
> > > > > > > > >
> > > > > > > > > g.merge()
> > > > > > > > >
> > > > > > > > > as we wouldnt know what sort of Element it applied to. If
> > so, I
> > > > > > wonder
> > > > > > > if
> > > > > > > > > it would be better to preserve "merge" for my more general
> > > > use-case
> > > > > > and
> > > > > > > > > prefer upsertV()/E()?
> > > > > > > > >
> > > > > > > > > Also, perhaps mergeVP() doesn't need to exist as perhaps it
> > is
> > > an
> > > > > > > > uncommon
> > > > > > > > > case (compared to V/E()) and we don't really have addVP()
> as
> > an
> > > > > > > analogous
> > > > > > > > > step. Perhaps existing coalesce() patterns and/or my more
> > > general
> > > > > > > purpose
> > > > > > > > > merge() step would satisfy those situations??
> > > > > > > > >
> > > > > > > >
> > > > > > > > I think that a mergeVP() type step may require a different
> > > pattern
> > > > > > since
> > > > > > > it
> > > > > > > > would really need to accept a map of values otherwise it
> would
> > > > > > > essentially
> > > > > > > > perform the same function as property()
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > 6. One thing neither my approach nor yours seems to do is
> > > tell
> > > > > the
> > > > > > > user
> > > > > > > > > if
> > > > > > > > > > they created something or updated something - that's
> > another
> > > > > thing
> > > > > > > I've
> > > > > > > > > > seen users want to have in get-or-create. Here again we
> go
> > > > deeper
> > > > > > > into
> > > > > > > > a
> > > > > > > > > > less general step specification as alluded to in 4, but a
> > > > merge()
> > > > > > > step
> > > > > > > > as
> > > > > > > > > > proposed in 5, might return [Element,boolean] so as to
> > > provide
> > > > an
> > > > > > > > > indicator
> > > > > > > > > > of creation?
> > > > > > > > > >
> > > > > > > > > > Hmm, I'll have to think about that.  How would returning
> > > > multiple
> > > > > > > > values
> > > > > > > > > > work if we want to chain these together. e.g. Add a
> vertex
> > > > > between
> > > > > > > two
> > > > > > > > > > edges but make sure the vertices exist?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > > Added a thought on how to accomplish this above that I'd like
> > to
> > > > get
> > > > > > > > thoughts on.
> > > > > > > >
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 7. You were just introducing your ideas here, so perhaps
> > you
> > > > > > haven't
> > > > > > > > > gotten
> > > > > > > > > > this far yet, but a shortcoming to doing
> > > merge(String,Map,Map)
> > > > is
> > > > > > > that
> > > > > > > > it
> > > > > > > > > > leaves open no opportunity to stream a List of Maps to a
> > > > merge()
> > > > > > for
> > > > > > > a
> > > > > > > > > form
> > > > > > > > > > of batch loading which is mighty common and one of the
> > > > variations
> > > > > > of
> > > > > > > > the
> > > > > > > > > > coalesce() pattern that I alluded to at the start of all
> > > this.
> > > > I
> > > > > > > think
> > > > > > > > > that
> > > > > > > > > > we would want to be sure that we left open the option to
> do
> > > > that
> > > > > > > > somehow.
> > > > > > > > > > 8. If we had a general purpose merge() step I wonder if
> it
> > > > makes
> > > > > > > > > developing
> > > > > > > > > > the API as you suggested easier to do?
> > > > > > > > > >
> > > > > > > > > > Hmm, let me think about that one.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > I will add an item 8 to think about which I didn't mention
> > > > before:
> > > > > > > > >
> > > > > > > > > 8. The signature you suggested for mergeE() is:
> > > > > > > > >
> > > > > > > > > mergeE(String, Map, Map)
> > > > > > > > >      String - The edge label
> > > > > > > > >      Map (first) - The properties to match existing edge on
> > > > > > > > >      Map (second) - Any additional properties to set if a
> new
> > > > edge
> > > > > is
> > > > > > > > > created (optional)
> > > > > > > > >
> > > > > > > > > but does that exactly answer the need? Typically the idea
> is
> > to
> > > > > > detect
> > > > > > > an
> > > > > > > > > edge between two vertices not globally in the graph by way
> of
> > > > some
> > > > > > > > > properties. This signature doesn't seem to allow for that
> as
> > it
> > > > > > doesn't
> > > > > > > > > allow specification of the vertices to test against.
> Perhaps
> > > the
> > > > > > answer
> > > > > > > > is
> > > > > > > > > to use from() and to() modulators?
> > > > > > > > >
> > > > > > > > > g.mergeE('knows', {'weight':0.5}).
> > > > > > > > >  from(V(1)).to(V(2))
> > > > > > > > > g.V().has('person','name','marko').
> > > > > > > > >  mergeE( 'knows', {'weight':0.5}).
> > > > > > > > >    to(V(2))
> > > > > > > > > g.V().has('person','name','marko').
> > > > > > > > >  mergeE( 'self', {'weight':0.5})
> > > > > > > > >
> > > > > > > > > That seems to work?
> > > > > > > > >
> > > > > > > >
> > > > > > > > I think that is an elegant solution to that problem.  I like
> it
> > > and
> > > > > it
> > > > > > > > keeps in line with the way that addE() works.
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <
> > > > > > > spmallette@gmail.com
> > > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > I like our coalesce() pattern but it is verbose and
> over
> > > time
> > > > > it
> > > > > > > has
> > > > > > > > > gone
> > > > > > > > > > > from a simple pattern to one with numerous variations
> for
> > > all
> > > > > > > manner
> > > > > > > > of
> > > > > > > > > > > different sorts of merge-like operations. As such, I do
> > > think
> > > > > we
> > > > > > > > should
> > > > > > > > > > > introduce something to cover this pattern.
> > > > > > > > > > >
> > > > > > > > > > > I like that you used the word "merge" in your
> description
> > > of
> > > > > this
> > > > > > > as
> > > > > > > > it
> > > > > > > > > > is
> > > > > > > > > > > the word I've liked using. You might want to give a
> look
> > at
> > > > my
> > > > > > > > proposed
> > > > > > > > > > > merge() step from earlier in the year:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> > > > > > > > > > >
> > > > > > > > > > > I'm just going to dump thoughts as they come regarding
> > what
> > > > you
> > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > 1. How would multi/meta-properties fit into the API
> > you've
> > > > > > > proposed?
> > > > > > > > > > > 2. How would users set the T.id on creation? would that
> > > T.id
> > > > > just
> > > > > > > be
> > > > > > > > a
> > > > > > > > > > key
> > > > > > > > > > > in the first Map argument?
> > > > > > > > > > > 3. I do like the general idea of a match on multiple
> > > > properties
> > > > > > for
> > > > > > > > the
> > > > > > > > > > > first argument as a convenience but wonder about the
> > > > > specificity
> > > > > > of
> > > > > > > > > this
> > > > > > > > > > > API a bit as it focuses heavily on equality - I suppose
> > > > that's
> > > > > > most
> > > > > > > > > cases
> > > > > > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > > > > > > 4. I think your suggestion points to one of the
> troubles
> > > > > Gremlin
> > > > > > > has
> > > > > > > > > > which
> > > > > > > > > > > we see with "algorithms" - extending the language with
> > new
> > > > > steps
> > > > > > > that
> > > > > > > > > > > provides a form of "sugar" (e.g. in algorithms we end
> up
> > > with
> > > > > > > > > > > shortestPath() step) pollutes the core language a bit,
> > > hence
> > > > my
> > > > > > > > > > > generalization of "merging" in my link above which fits
> > > into
> > > > > the
> > > > > > > core
> > > > > > > > > > > Gremlin language style. There is a bigger picture where
> > we
> > > > are
> > > > > > > > missing
> > > > > > > > > > > something in Gremlin that lets us extend the language
> in
> > > ways
> > > > > > that
> > > > > > > > let
> > > > > > > > > us
> > > > > > > > > > > easily introduce new steps that aren't for general
> > purpose.
> > > > > This
> > > > > > > > issue
> > > > > > > > > is
> > > > > > > > > > > discussed in terms of "algorithms" here:
> > > > > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991
> > but I
> > > > > think
> > > > > > > we
> > > > > > > > > > could
> > > > > > > > > > > see how there might be some "mutation" extension steps
> > that
> > > > > would
> > > > > > > > cover
> > > > > > > > > > > your suggested API, plus batch operations, etc. We
> need a
> > > way
> > > > > to
> > > > > > > add
> > > > > > > > > > > "sugar" without it interfering with the consistency of
> > the
> > > > > core.
> > > > > > > > > > Obviously
> > > > > > > > > > > this is a bigger issue but perhaps important to solve
> to
> > > > > > implement
> > > > > > > > > steps
> > > > > > > > > > in
> > > > > > > > > > > the fashion you describe.
> > > > > > > > > > > 5. I suppose that the reason for mergeE and mergeV is
> to
> > > > > specify
> > > > > > > what
> > > > > > > > > > > element type the first Map argument should be applied
> to?
> > > > what
> > > > > > > about
> > > > > > > > > > > mergeVP (i.e. vertex property as it too is an element)
> ?
> > > > That's
> > > > > > > > tricky
> > > > > > > > > > but
> > > > > > > > > > > I don't think we should miss that. Perhaps merge()
> could
> > > be a
> > > > > > > > "complex
> > > > > > > > > > > modulator"?? that's a new concept of course, but you
> > would
> > > do
> > > > > > > > > > g.V().merge()
> > > > > > > > > > > and the label and first Map would fold to
> VertexStartStep
> > > > (i.e.
> > > > > > > V())
> > > > > > > > > for
> > > > > > > > > > > the lookup and then a MergeStep would follow - thus a
> > > > "complex
> > > > > > > > > modulator"
> > > > > > > > > > > as it does more than just change the behavior of the
> > > previous
> > > > > > step
> > > > > > > -
> > > > > > > > it
> > > > > > > > > > > also adds its own. I suppose it could also add has()
> > steps
> > > > > > followed
> > > > > > > > by
> > > > > > > > > > the
> > > > > > > > > > > MergeStep and then the has() operations would fold in
> > > > normally
> > > > > as
> > > > > > > > they
> > > > > > > > > do
> > > > > > > > > > > today. In this way, we can simplify to just one single
> > > > > > > > > > > merge(String,Map,Map). ??
> > > > > > > > > > > 6. One thing neither my approach nor yours seems to do
> is
> > > > tell
> > > > > > the
> > > > > > > > user
> > > > > > > > > > if
> > > > > > > > > > > they created something or updated something - that's
> > > another
> > > > > > thing
> > > > > > > > I've
> > > > > > > > > > > seen users want to have in get-or-create. Here again we
> > go
> > > > > deeper
> > > > > > > > into
> > > > > > > > > a
> > > > > > > > > > > less general step specification as alluded to in 4,
> but a
> > > > > merge()
> > > > > > > > step
> > > > > > > > > as
> > > > > > > > > > > proposed in 5, might return [Element,boolean] so as to
> > > > provide
> > > > > an
> > > > > > > > > > indicator
> > > > > > > > > > > of creation?
> > > > > > > > > > > 7. You were just introducing your ideas here, so
> perhaps
> > > you
> > > > > > > haven't
> > > > > > > > > > gotten
> > > > > > > > > > > this far yet, but a shortcoming to doing
> > > > merge(String,Map,Map)
> > > > > is
> > > > > > > > that
> > > > > > > > > it
> > > > > > > > > > > leaves open no opportunity to stream a List of Maps to
> a
> > > > > merge()
> > > > > > > for
> > > > > > > > a
> > > > > > > > > > form
> > > > > > > > > > > of batch loading which is mighty common and one of the
> > > > > variations
> > > > > > > of
> > > > > > > > > the
> > > > > > > > > > > coalesce() pattern that I alluded to at the start of
> all
> > > > this.
> > > > > I
> > > > > > > > think
> > > > > > > > > > that
> > > > > > > > > > > we would want to be sure that we left open the option
> to
> > do
> > > > > that
> > > > > > > > > somehow.
> > > > > > > > > > > 8. If we had a general purpose merge() step I wonder if
> > it
> > > > > makes
> > > > > > > > > > developing
> > > > > > > > > > > the API as you suggested easier to do?
> > > > > > > > > > >
> > > > > > > > > > > I think I'd like to solve the problems you describe in
> > your
> > > > > post
> > > > > > as
> > > > > > > > > well
> > > > > > > > > > as
> > > > > > > > > > > the ones in mine. There is some relation there, but
> gaps
> > as
> > > > > well.
> > > > > > > > With
> > > > > > > > > > more
> > > > > > > > > > > discussion here we can figure something out.
> > > > > > > > > > >
> > > > > > > > > > > Thanks for starting this talk - good one!
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <
> > > > > > > > dave@bechberger.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > I've had a few on and off discussions with a few
> people
> > > > here,
> > > > > > so
> > > > > > > I
> > > > > > > > > > wanted
> > > > > > > > > > > > to send this out to everyone for feedback.
> > > > > > > > > > > >
> > > > > > > > > > > > What are people's thoughts on creating a new set of
> > steps
> > > > > that
> > > > > > > > codify
> > > > > > > > > > > > common Gremlin best practices?
> > > > > > > > > > > >
> > > > > > > > > > > > I think there are several common Gremlin patterns
> where
> > > > users
> > > > > > > would
> > > > > > > > > > > benefit
> > > > > > > > > > > > from the additional guidance that these codified
> steps
> > > > > > represent.
> > > > > > > > > The
> > > > > > > > > > > > first one I would recommend though is codifying the
> > > element
> > > > > > > > existence
> > > > > > > > > > > > pattern into a single Gremlin step, something like:
> > > > > > > > > > > >
> > > > > > > > > > > > mergeV(String, Map, Map)
> > > > > > > > > > > >      String - The vertex label
> > > > > > > > > > > >      Map (first) - The properties to match existing
> > > > vertices
> > > > > on
> > > > > > > > > > > >      Map (second) - Any additional properties to set
> > if a
> > > > new
> > > > > > > > vertex
> > > > > > > > > is
> > > > > > > > > > > > created (optional)
> > > > > > > > > > > > mergeE(String, Map, Map)
> > > > > > > > > > > >      String - The edge label
> > > > > > > > > > > >      Map (first) - The properties to match existing
> > edge
> > > on
> > > > > > > > > > > >      Map (second) - Any additional properties to set
> > if a
> > > > new
> > > > > > > edge
> > > > > > > > is
> > > > > > > > > > > > created (optional)
> > > > > > > > > > > >
> > > > > > > > > > > > In each of these cases these steps would perform the
> > same
> > > > > > upsert
> > > > > > > > > > > > functionality as the element existence pattern.
> > > > > > > > > > > >
> > > > > > > > > > > > Example:
> > > > > > > > > > > >
> > > > > > > > > > > > g.V().has('person','name','stephen').
> > > > > > > > > > > >            fold().
> > > > > > > > > > > >            coalesce(unfold(),
> > > > > > > > > > > >                    addV('person').
> > > > > > > > > > > >                      property('name','stephen').
> > > > > > > > > > > >                      property('age',34))
> > > > > > > > > > > >
> > > > > > > > > > > > would become:
> > > > > > > > > > > >
> > > > > > > > > > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > > > > > > > > > > >
> > > > > > > > > > > > I think that this change would be a good addition to
> > the
> > > > > > language
> > > > > > > > for
> > > > > > > > > > > > several reasons:
> > > > > > > > > > > >
> > > > > > > > > > > > * This codifies the best practice for a specific
> > > > > action/recipe,
> > > > > > > > which
> > > > > > > > > > > > reduces the chance that someone uses the pattern
> > > > incorrectly
> > > > > > > > > > > > * Most complex Gremlin traversals are verbose.
> > Reducing
> > > > the
> > > > > > > amount
> > > > > > > > > of
> > > > > > > > > > > code
> > > > > > > > > > > > that needs to be written and maintained allows for a
> > > better
> > > > > > > > developer
> > > > > > > > > > > > experience.
> > > > > > > > > > > > * It will lower the bar of entry for a developer by
> > > making
> > > > > > these
> > > > > > > > > > actions
> > > > > > > > > > > > more discoverable.  The more we can help bring these
> > > > patterns
> > > > > > to
> > > > > > > > the
> > > > > > > > > > > > forefront of the language via these pattern/meta
> steps
> > > the
> > > > > more
> > > > > > > we
> > > > > > > > > > guide
> > > > > > > > > > > > users towards writing better Gremlin faster
> > > > > > > > > > > > * This allows DB vendors to optimize for this pattern
> > > > > > > > > > > >
> > > > > > > > > > > > I know that this would likely be the first step in
> > > Gremlin
> > > > > that
> > > > > > > > > > codifies
> > > > > > > > > > > a
> > > > > > > > > > > > pattern, so I'd like to get other's thoughts on this?
> > > > > > > > > > > >
> > > > > > > > > > > > Dave
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by Kelvin Lawrence <gf...@yahoo.com.INVALID>.
In general I like the idea of having a simpler way to express an upsert without needing to know the coalesce pattern.
I am a little worried that if the addV and addE steps are overloaded to to perform the upsert task that suddenly, steps that have always worked one way, can now sometimes, do something different - not create but just find an existing element. My concern is mainly around readability of queries and a user knowing that a step can do certain extra things based on the parameterization. This kind of goes a little against the Linux style philosophy of each command doing one thing and doing it well.  Again, I'm mostly raising this not because I am against the idea but wanting to make sure it is very clear (maybe due to the presence of a by() modulator, that a given addV/addE step is in "upsert mode". We have a few other cases where steps work differently based on subtle parameterizations and people get confused. Consider the case of.
"where(is(eq('A'))"  versus  "where(eq('A'))"
I also want to think a bit about how/if these can be nested. Today a see a lot of nested coalesce steps in queries people write trying to do large (complex) multi-part upserts.
Cheers,Kelvin
Kelvin R. Lawrence  

    On Friday, December 11, 2020, 12:36:45 PM CST, Stephen Mallette <sp...@gmail.com> wrote:  
 
 +1 to no new step. I haven't yet thought of a reason why this shouldn't
just be a variation on addV/E().

On Fri, Dec 11, 2020 at 1:03 PM David Bechberger <da...@bechberger.com>
wrote:

> I agree with Josh's description of what the upsertV() functionality is
> intended to be.
>
> I also fully support the simplification provided by the by() modulation
> that Stephen suggested, removing the second map.  I think that provides a
> much cleaner and easier to comprehend syntax.
>
> With that agreement, I think this does beg the question of if this should
> be a new step (upsertV()) or just an additional signature on the addV()
> step? I
>
> Stephen alluded to this on the dev list and after thinking about this a bit
> I think that I am favor of not adding a step and just adding a new
> signature to the existing step (if possible).  Thoughts?
>
>
> Dave
>
> On Wed, Dec 9, 2020 at 4:33 AM Stephen Mallette <sp...@gmail.com>
> wrote:
>
> > Josh, thanks for your thoughts - some responses inline:
> >
> > On Tue, Dec 8, 2020 at 10:16 PM Josh Perryman <jo...@gmail.com>
> > wrote:
> >
> > > I'll offer some thoughts. I'm seeing upsertV() as an idempotent
> > getOrCreate
> > > call which always returns a vertex with the label/property values
> > specified
> > > within the step. It's sort of a declarative pattern: "return this
> vertex
> > to
> > > me, find it if you can, create it if you must."
> > >
> >
> > I like this description - I've added it to the gist, though it's a bit at
> > odds with Dave's previous post, so we'll consider it a temporary addition
> > until he responds.
> >
> >
> > > On that account, I do like the simplification in 1. Repetition
> shouldn't
> > be
> > > necessary. In an ideal world, the engine should know the primary
> > > identifiers (name or id) and find/create the vertex based on them. Any
> > > other included values will be "trued up" as well. But this may be a
> > bridge
> > > too far for TinkerPop since knowing identifiers may require a specified
> > > schema. I'd prefer to omit the third input, but it might be necessary
> to
> > > keep it so that the second input can be for the matching use case.
> > >
> >
> > In my most recent post on gremlin-users I think I came up with a nice way
> > to get rid of the second Map. One Map that forms the full list of
> > properties for upserting is easier than partitioning two Maps that
> > essentially merge together. I imagine it's unlikely that application code
> > will have that separation naturally so users will have the added step of
> > trying to separate their data into searchable vs "just data". Getting us
> to
> > one Map argument will simplify APIs for us and reduce complexity to
> users.
> > Here is what I'd proposed for those not following over there:
> >
> > // match on name and age (or perhaps whatever the underlying graph system
> > thinks is best?)
> > g.upsertV('person', [name:'marko',age:29])
> >
> > // match on name only
> > g.upsertV('person', [name:'marko',age:29]).by('name')
> >
> > // explicitly match on name and age
> > g.upsertV('person', [name:'marko',age:29]).
> >  by('name').by('age')
> >
> > // match on id only
> > g.upsertV('person', [(T.id): 100, name:'marko',age:29]).by(T.id)
> >
> > // match on whatever the by(Traversal) predicate defines
> > g.upsertV('person', [name:'marko',age:29]).
> >  by(has('name', 'marko'))
> >
> > // match on id, then update age
> > g.upsertV('person', [(T.id): 100, name:'marko']).by(T.id).
> >  property('age',29)
> >
> > With this model, we get one Map argument that represents the complete
> > property set to be added/updated to the graph and the user can hint on
> what
> > key they wish to match on using by() where that sort of step modulation
> > should be a well understood and familiar concept in Gremlin at this
> point.
> >
> > So that means I think 2 should always match or update the additional
> > > values. Again, we're specifying the expected result and letting the
> > engine
> > > figure out best how to return that results and appropriately maintain
> > > state.
> > >
> >
> > I again like this description, but we'll see what Dave's thoughts are
> since
> > he's a bit behind on the threads at this point I think.
> >
> >
> > > I'm also presuming that anything not included as inputs to the
> upsertV()
> > > step are then to be handled by following steps. I'm hoping that is a
> > > sufficient approach for addressing the multi/meta property use cases
> > > brought up in 3.
> > >
> >
> > yeah................it needs more thought. I spent more time thinking on
> > this issue yesterday than I have for all the previous posts combined and
> I
> > think it yielded something good in that revised syntax. It's going to
> take
> > more of that kind of elbow grease to dig into these lesser use cases to
> > make sure we aren't coding ourselves into corners.
> >
> >
> > > I do like the idea of using modulators (with(), by()) for more
> > > sophisticated usage and advanced use cases. Also, the streaming
> examples
> > > are quite elegant allowing for a helpful separation of data and logic.
> > >
> >
> > cool - hope you like the revised syntax I posted then. :)
> >
> >
> > > That's my humble take. This is a very welcome addition to the language
> > and
> > > I appreciate the thoughtful & collaborative approach to the design
> > > considerations.
> > >
> >
> > Thanks again and please keep the thoughts coming. Lots of other
> interesting
> > design discussions seem to be brewing.
> >
> >
> > >
> > > Josh
> > >
> > > On Tue, Dec 8, 2020 at 8:57 AM Stephen Mallette <sp...@gmail.com>
> > > wrote:
> > >
> > > > I started a expanded this discussion to gremlin-users for a wider
> > > audience
> > > > and the thread is starting to grow:
> > > >
> > > >
> https://groups.google.com/g/gremlin-users/c/QBmiOUkA0iI/m/pj5Ukiq6AAAJ
> > > >
> > > > I guess we'll need to summarize that discussion back here now....
> > > >
> > > > I did have some more thoughts to hang out there and figured that I
> > > wouldn't
> > > > convolute the discussion on gremlin-users with it so I will continue
> > the
> > > > discussion here.
> > > >
> > > > 1, The very first couple of examples seem wrong (or at least not best
> > > > demonstrating the usage):
> > > >
> > > > g.upsertV('person', [name: 'marko'],
> > > >                    [name: 'marko', age: 29])
> > > > g.upsertV('person', [(T.id): 1],
> > > >                    [(T.id): 1, name: 'Marko'])
> > > >
> > > > should instead be:
> > > >
> > > > g.upsertV('person', [name: 'marko'],
> > > >                    [age: 29])
> > > > g.upsertV('person', [(T.id): 1],
> > > >                    [name: 'Marko'])
> > > >
> > > > 2. I can't recall if we made this clear anywhere but in situations
> > where
> > > we
> > > > "get" rather than "create" do the additional properties act in an
> > update
> > > > fashion to the element that was found? I think I've been working on
> the
> > > > assumption that it would, though perhaps that is not always
> desirable?
> > > >
> > > > 3. We really never settled up how to deal with multi/meta-properties.
> > > That
> > > > story should be clear so that when we document upsert() we include
> the
> > > > approaches for the fallback patterns that don't meet the 90% of use
> > cases
> > > > we are targeting and I sense that we're saying that
> > meta/multi-properties
> > > > don't fit in that bucket. So with that in mind, I don't think that
> the
> > > > following works for metaproperties:
> > > >
> > > > g.upsertV('person', [(T.id): 1],
> > > >                    [name:[acl: 'public'])
> > > >
> > > > as it doesn't let us set the value for the "name" just the pairs for
> > the
> > > > meta-properties. I guess a user would have to fall back to:
> > > >
> > > > g.upsertV('person', [(T.id): 1]).
> > > >  property('name','marko','acl','public')
> > > >
> > > > // or use the additional properties syntax
> > > > g.upsertV('person', [(T.id): 1],[name:'marko']).
> > > >  properties('name').property('acl','public')
> > > >
> > > > // or if there were multi-properties then maybe...
> > > > g.upsertV('person', [(T.id): 1],[name:'marko']).
> > > >  properties('name').hasValue('marko').property('acl','public')
> > > >
> > > > As for multi-properties, I dont think we should assume that a List
> > object
> > > > should be interpreted by Gremlin as a multi-property. Perhaps we just
> > > rely
> > > > on the underlying graph to properly deal with that given a schema or
> > the
> > > > user falls back to:
> > > >
> > > > // this ends up as however the graph deals with a List object
> > > > g.upsertV('person', [(T.id): 1], [lang: ['java', 'scala', 'java'])
> > > >
> > > > // this is explicit
> > > > g.upsertV('person', [(T.id): 1]).
> > > >  property(list, 'lang', 'java').
> > > >  property(list, 'lang, 'scala').
> > > >  property(list, 'lang', 'java')
> > > >
> > > > If that makes sense to everyone, I will update the gist.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Oct 27, 2020 at 11:51 PM David Bechberger <
> dave@bechberger.com
> > >
> > > > wrote:
> > > >
> > > > > Hello Stephen,
> > > > >
> > > > > Thanks for making that gist, its much easier to follow along with
> the
> > > > > proposed syntax there.  To your specific comments:
> > > > >
> > > > > #1 - My only worry with the term upsert is that it is not as
> widely a
> > > > used
> > > > > term for this sort of pattern as "Merge" (e.g. SQL, Cypher).
> > However I
> > > > > don't have a strong opinion on this, so I am fine with either.
> > > > > #2 - My only real objective here is to make sure that we make the
> > > 80-90%
> > > > > case easy and straightforward.  I think that having the fallback
> > option
> > > > of
> > > > > using the current syntax for any complicated edge cases should be
> > > > > considered here as well. I'd appreciate your thoughts here as these
> > are
> > > > > good points you bring up that definitely fall into the 80-90% use
> > case.
> > > > > #3 - Those points make sense to me, not sure I have anything
> further
> > to
> > > > add
> > > > > #4 - I don't think I would expect to have the values extracted from
> > the
> > > > > traversal filters but I'd be interested in other opinions on that.
> > > > >
> > > > > Thanks,
> > > > > Dave
> > > > >
> > > > > On Thu, Oct 15, 2020 at 5:30 AM Stephen Mallette <
> > spmallette@gmail.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > It's been a couple weeks since we moved this thread so I went
> back
> > > > > through
> > > > > > it and refreshed my mind with what's there. Dave, nice job
> putting
> > > all
> > > > > > those examples together. I've recollected them all together and
> > have
> > > > just
> > > > > > been reviewing them in a more formatted style with proper Groovy
> > > syntax
> > > > > > than what can be accomplished in this mailing list. Anyone who
> > cares
> > > to
> > > > > > review in that form can do so here (i've pasted its markdown
> > contents
> > > > at
> > > > > > the bottom of this post as well):
> > > > > >
> > > > > >
> > https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
> > > > > >
> > > > > > In this light, I have the following comments and thoughts:
> > > > > >
> > > > > > 1. I feel more inclined toward saving "merge" for a more
> > generalized
> > > > step
> > > > > > of that name and have thus renamed the examples to use "upsert".
> > > > > >
> > > > > > 2. I still don't quite feel comfortable with
> meta/multi-properties
> > > > > > examples. Perhaps you could explain further as it's possible I'm
> > not
> > > > > > thinking things through properly. For meta-properties the example
> > is:
> > > > > >
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                    [(T.id): 1, name:[first: Marko', last: 'R'])
> > > > > >
> > > > > > So I see that "name" takes a Map here, but how does Gremlin know
> if
> > > > that
> > > > > > means "meta-property" or "Map value" and if the former, then how
> do
> > > you
> > > > > set
> > > > > > the value of "name"?  My question is similar to the
> multi-property
> > > > > examples
> > > > > > - using this one:
> > > > > >
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                    [(T.id): 1,lang: ['java', 'scala'])
> > > > > >
> > > > > > How does Gremlin know if the user means that "lang" should be a
> > > > > > "multi-property" with Cardinality.list (or Cardinality.set for
> that
> > > > > matter)
> > > > > > or a value of Cardinality.single with a List/Array value? I can
> > > perhaps
> > > > > > suggest some solutions here but wanted to make sure I wasn't just
> > > > > > misunderstanding something.
> > > > > >
> > > > > > 3. The "upsert with stream" example is:
> > > > > >
> > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > > > >  upsertV(values(label), valueMap(id), valueMap())
> > > > > >
> > > > > > That would establish an API of upsertV(Traversal, Traversal,
> > > Traversal)
> > > > > > which would enable your final thought of your last post with:
> > > > > >
> > > > > > g.V().upsertV('person',
> __.has('person','age',gt(29)).has('state',
> > > > 'NY'),
> > > > > >                        [active: true])
> > > > > >
> > > > > > The open-ended nature of that is neat but comes with some
> > complexity
> > > > with
> > > > > > steps that traditionally take an "open" Traversal (and this step
> > now
> > > > has
> > > > > up
> > > > > > to three of them) - perhaps, that's our own fault in some ways.
> For
> > > the
> > > > > > "stream" concept, values()/valueMap() won't immediately work that
> > way
> > > > as
> > > > > > they only apply to Element...given today's Gremlin semantics, I
> > think
> > > > > we'd
> > > > > > re-write that as:
> > > > > >
> > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > > > >  upsertV(select(label), select(id), select(id,label,'name'))
> > > > > >
> > > > > > Though, even that doesn't quite work because you can't select(T)
> > > right
> > > > > now
> > > > > > which means you'd need to go with:
> > > > > >
> > > > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > > > >          [id: 2, label: 'person', name: 'josh']).
> > > > > >  upsertV(select('label'), select('id'),
> > select('id','label','name'))
> > > > > >
> > > > > > not terrible in a way and perhaps we could fix select(T) - can't
> > > > remember
> > > > > > why that isn't allowed except that select() operates over
> multiple
> > > > scopes
> > > > > > and perhaps trying T through those scopes causes trouble.
> > > > > >
> > > > > > 4. Continuing with the use of Traversal as arguments, let's go
> back
> > > to:
> > > > > >
> > > > > > g.V().upsertV('person',
> __.has('person','age',gt(29)).has('state',
> > > > 'NY'),
> > > > > >                        [active: true])
> > > > > >
> > > > > > I originally had it in my mind to prefer this approach over a Map
> > for
> > > > the
> > > > > > search as it provides a great deal of flexibility and fits the
> > common
> > > > > > filtering patterns that users follow really well. I suppose the
> > > > question
> > > > > is
> > > > > > what to do in the situation of "create". Would we extract all the
> > > > strict
> > > > > > equality filter values from the initial has(), thus, in the
> > example,
> > > > > > "state" but not "age", and create a vertex that has "state=NY,
> > > > > > active=true"? That is possible with how things are designed in
> > > Gremlin
> > > > > > today I think.
> > > > > >
> > > > > > This syntax does create some conflict with the subject of 3 and
> > > streams
> > > > > > because the traverser in the streams case is the incoming Map but
> > for
> > > > > this
> > > > > > context it's meant as a filter of V(). Getting too whacky?
> > > > > >
> > > > > > 5. I like the idea of using a side-effect to capture whether the
> > > > element
> > > > > > was created or not. that makes sense. I think that can work.
> > > > > >
> > > > > > ==============================
> > > > > > = gist contents
> > > > > > ==============================
> > > > > >
> > > > > > # API
> > > > > >
> > > > > > ```java
> > > > > > upsertV(String label, Map matchOrCreateProperties)
> > > > > > upsertV(String label, Map matchOrCreateProperties, Map
> > > > > > additionalProperties)
> > > > > > upsertV(Traversal label, Traversal matchOrCreateProperties)
> > > > > > upsertV(Traversal label, Traversal matchOrCreateProperties,
> > Traversal
> > > > > > additionalProperties)
> > > > > > upsertV(...).
> > > > > >  with(WithOptions.sideEffectLabel, 'a')
> > > > > >
> > > > > > upsertE(String label, Map matchOrCreateProperties)
> > > > > > upsertE(Traversal label, Traversal matchOrCreateProperties)
> > > > > > upsertE(String label, Map matchOrCreateProperties, Map
> > > > > > additionalProperties)
> > > > > > upsertE(Traversal label, Traversal matchOrCreateProperties,
> > Traversal
> > > > > > additionalProperties)
> > > > > > upsertE(...).
> > > > > >  from(Vertex incidentOut).to(Vertex incidentIn)
> > > > > >  with(WithOptions.sideEffectLabel, 'a')
> > > > > > ```
> > > > > >
> > > > > > # Examples
> > > > > >
> > > > > > ## upsert
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [name: 'marko'],
> > > > > >                    [name: 'marko', age: 29])
> > > > > > ```
> > > > > >
> > > > > > ## upsert with id
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                    [(T.id): 1, name: 'Marko'])
> > > > > > ```
> > > > > >
> > > > > > ## upsert with Meta Properties
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                    [(T.id): 1, name:[first: Marko', last: 'R'])
> > > > > > ```
> > > > > >
> > > > > > ## upsert with Multi Properties
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                    [(T.id): 1,lang: ['java', 'scala'])
> > > > > > ```
> > > > > >
> > > > > > ## upsert with Single Cardinality
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                    [(T.id): 1, lang: 'java')
> > > > > > ```
> > > > > >
> > > > > > ## upsert with List Cardinality
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                    [(T.id): 1, lang: ['java', 'scala', 'java'])
> > > > > > ```
> > > > > >
> > > > > > ## upsert with Set Cardinality
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                    [(T.id): 1, lang: ['java','scala'])
> > > > > > ```
> > > > > >
> > > > > > ## upsert with List Cardinality - and add value to list
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                    [(T.id): 1, lang: ['java','scala','java']).
> > > > > >  property(Cardinality.list, 'lang', 'java')
> > > > > > ```
> > > > > >
> > > > > > ## upsert with stream
> > > > > >
> > > > > > ```groovy
> > > > > > // doesn't work with today's Gremlin semantics
> > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > > > >  upsertV(values(label), valueMap(id), valueMap())
> > > > > >
> > > > > > // the following would be more in line with what's possible with
> > > > existing
> > > > > > Gremlin semantics:
> > > > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > > > >          [id: 2, label: 'person', name: 'josh']).
> > > > > >  upsertV(select('label'), select('id'),
> > select('id','label','name'))
> > > > > > ```
> > > > > >
> > > > > > ## upsert with reporting added or updated side effect
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [name: 'marko'],
> > > > > >                    [name: 'marko', age: 29]).
> > > > > >    with(WithOptions.sideEffectLabel, 'a').
> > > > > >  project('vertex', 'added').
> > > > > >    by().
> > > > > >    by(cap('a'))
> > > > > > ```
> > > > > >
> > > > > > ## upsert Edge assuming a self-relation of "marko"
> > > > > >
> > > > > > ```groovy
> > > > > > g.V().has('person','name','marko').
> > > > > >  upsertE( 'self', [weight:0.5])
> > > > > > ```
> > > > > >
> > > > > > ## upsert Edge with incident vertex checks
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertE('knows', [weight:0.5]).
> > > > > >    from(V(1)).to(V(2))
> > > > > > g.V().has('person','name','marko').
> > > > > >  upsertE( 'knows', [weight:0.5]).
> > > > > >    to(V(2))
> > > > > > ```
> > > > > >
> > > > > > ## upsert using a Traversal for the match
> > > > > >
> > > > > > ```groovy
> > > > > > g.V().upsertV('person',
> __.has('person','age',gt(29)).has('state',
> > > > 'NY'),
> > > > > >                        [active: true])
> > > > > > ```
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Sep 28, 2020 at 6:49 PM David Bechberger <
> > > dave@bechberger.com>
> > > > > > wrote:
> > > > > >
> > > > > > > So, I've been doing some additional thinking about the ways
> that
> > > this
> > > > > > could
> > > > > > > work based on the comments below and have put my comments
> inline.
> > > > > > >
> > > > > > > Dave
> > > > > > >
> > > > > > > On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <
> > > > spmallette@gmail.com
> > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I added some thoughts inline below:
> > > > > > > >
> > > > > > > > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <
> > > > > dave@bechberger.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Thanks for the detailed comments Stephen.  I have addressed
> > > them
> > > > > > inline
> > > > > > > > > below.
> > > > > > > > >
> > > > > > > > > I did read the proposal from earlier and I think that we
> are
> > in
> > > > > close
> > > > > > > > > agreement with what we are trying to accomplish.  I also
> > fully
> > > > > > support
> > > > > > > > > Josh's comment on providing a mechanism for submitting a
> map
> > of
> > > > > > > > properties
> > > > > > > > > as manually unrolling this all right now leads to a lot of
> > > > > potential
> > > > > > > for
> > > > > > > > > error and a long messy traversal.
> > > > > > > > >
> > > > > > > > > I'm looking forward to this discussion on how to merge
> these
> > > two
> > > > > > > > proposals.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 1. How would multi/meta-properties fit into the API you've
> > > > > proposed?
> > > > > > > > >
> > > > > > > > > My first thought here is that multi-properties would be
> > > > represented
> > > > > > as
> > > > > > > > > lists in the map, e.g.
> > > > > > > > >
> > > > > > > > > {names: ['Dave', 'David']}
> > > > > > > > >
> > > > > > > > > and meta-properties would be represented as maps in the
> maps,
> > > > e.g.
> > > > > > > > >
> > > > > > > > > {name: {first: 'Dave', last: 'Bechberger'}}
> > > > > > > > >
> > > > > > > > > I can't say I've thought through all the implications of
> this
> > > > > though
> > > > > > so
> > > > > > > > it
> > > > > > > > > is an area we would need to explore.
> > > > > > > > >
> > > > > > > >
> > > > > > > > The first implication that comes to mind is that it makes the
> > > > > > assumption
> > > > > > > > the user wants multi/meta-properties as opposed to a single
> > > > > cardinality
> > > > > > > of
> > > > > > > > List in the first case and a Map as a property value in the
> > > second
> > > > > > case.
> > > > > > > I
> > > > > > > > suppose that graphs with a schema could resolve those
> > assumptions
> > > > but
> > > > > > > > graphs that are schemaless would have a problem. The issue
> > could
> > > be
> > > > > > > > resolved by specialized configuration of "g" or per merge()
> > step
> > > > > using
> > > > > > a
> > > > > > > > with() modulator I suppose but that goes into a yet another
> > level
> > > > of
> > > > > > > > implications to consider. I've often wondered if the start
> > point
> > > > for
> > > > > > > > getting types/schema into TP3 without a full rewrite would be
> > in
> > > > this
> > > > > > > form
> > > > > > > > where Gremlin would be given hints as to what to expect as to
> > the
> > > > > types
> > > > > > > of
> > > > > > > > data it might encounter while traversing. Anyway, I'd be
> > hesitant
> > > > to
> > > > > go
> > > > > > > > down paths that don't account for multi/metaproperties well.
> > > They
> > > > > are
> > > > > > > > first class citizens in TP3 (with those hoping for extension
> of
> > > at
> > > > > > least
> > > > > > > > multiproperties to edges) and while I find them a constant
> > > > annoyance
> > > > > > for
> > > > > > > so
> > > > > > > > many reasons, we're kinda stuck with them.
> > > > > > > >
> > > > > > >
> > > > > > > I agree we need to account for multi/meta properties as 1st
> class
> > > > > > > citizens.  This is my current thinking on the syntax for the
> > > > situations
> > > > > > we
> > > > > > > have laid out so far:
> > > > > > >
> > > > > > > *Merge*
> > > > > > > g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age':
> 29})
> > > > > > >
> > > > > > >
> > > > > > > *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name':
> > > > 'Marko'})
> > > > > > >
> > > > > > >
> > > > > > > *Merge with Meta Properties* g.mergeV('name', {T.id: 1}, {T.id:
> > 1,
> > > > > > 'name':
> > > > > > > {'first': Marko', 'last': 'R'})
> > > > > > >
> > > > > > >
> > > > > > > *Merge with Multi Properties * g.mergeV('name', {T.id: 1},
> {T.id:
> > > 1,
> > > > > > > 'lang': ['java', 'scala'])
> > > > > > >
> > > > > > >
> > > > > > > *Merge with Single Cardinality* g.mergeV('name', {T.id: 1},
> > {T.id:
> > > 1,
> > > > > > > 'lang': 'java')
> > > > > > >
> > > > > > >
> > > > > > > *Merge with List Cardinality* g.mergeV('name', {T.id: 1},
> {T.id:
> > 1,
> > > > > > 'lang':
> > > > > > > ['java', 'scala', 'java'])
> > > > > > >
> > > > > > >
> > > > > > > *Merge with Set Cardinality * g.mergeV('name', {T.id: 1},
> {T.id:
> > 1,
> > > > > > 'lang':
> > > > > > > ['java', 'scala'])
> > > > > > >
> > > > > > > Since in a mergeV() scenario we are only ever adding whatever
> > > values
> > > > > are
> > > > > > > passed in there would be no need to specify the cardinality of
> > the
> > > > > > property
> > > > > > > being added.  If they wanted to add a value to an existing
> > property
> > > > > then
> > > > > > > the current property() method would still be available on the
> > > output
> > > > > > > traversal. e.g.
> > > > > > > g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java', 'scala',
> > > > > > > 'java']).property(Cardinality.list, 'lang, 'java')
> > > > > > >
> > > > > > >
> > > > > > > *Merge with stream * g.inject([{id: 1, label: 'person', 'name':
> > > > > 'marko'},
> > > > > > > {id: 2, label: 'person', 'name': 'josh'}]).
> > > > > > >  mergeV(values('label'), valueMap('id'), valueMap())
> > > > > > >
> > > > > > >
> > > > > > > *Merge with reporting added or updated side effect*
> > > g.mergeV('name',
> > > > > > > {'name': 'marko'}, {'name': 'marko', 'age': 29}).
> > > > > > >  with(WithOptions.sideEffectLabel, 'a').
> > > > > > >  project('vertex', 'added').
> > > > > > >    by().
> > > > > > >    by(cap('a'))
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > 2. How would users set the T.id on creation? would that
> T.id
> > > just
> > > > > be
> > > > > > a
> > > > > > > > key
> > > > > > > > > in the first Map argument?
> > > > > > > > >
> > > > > > > > > Yes, I was thinking that T.id would be the key name, e.g.:
> > > > > > > > >
> > > > > > > > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
> > > > > > > > >
> > > > > > > >
> > > > > > > > ok - I can't say I see a problem with that atm.
> > > > > > > >
> > > > > > > >
> > > > > > > > > 3. I do like the general idea of a match on multiple
> > properties
> > > > for
> > > > > > the
> > > > > > > > > first argument as a convenience but wonder about the
> > > specificity
> > > > of
> > > > > > > this
> > > > > > > > > API a bit as it focuses heavily on equality - I suppose
> > that's
> > > > most
> > > > > > > cases
> > > > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > > > >
> > > > > > > > > In most cases I've seen use exact matches on the vertex or
> > > > edge.  I
> > > > > > > think
> > > > > > > > > it might be best to keep this straightforward as any
> complex
> > > edge
> > > > > > cases
> > > > > > > > > still can perform the same functionality using the
> coalesce()
> > > > > > pattern.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I played around with the idea to use modulators to apply
> > > additional
> > > > > > > > constraints:
> > > > > > > >
> > > > > > > > g.mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > > > >    by(has('age', gt(29))
> > > > > > > >
> > > > > > > > I suppose that's neat but maybe not...we could easily get the
> > > same
> > > > > > thing
> > > > > > > > from:
> > > > > > > >
> > > > > > > > g.V().has('person','age',gt(29)).
> > > > > > > >  mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > > > >
> > > > > > > > which is more readable though it does make it harder for
> > > providers
> > > > to
> > > > > > > > optimize upsert operations if they could make use of the
> has()
> > as
> > > > > part
> > > > > > of
> > > > > > > > that. I think I like has() as part of the main query rather
> > than
> > > > in a
> > > > > > > by()
> > > > > > > > - just thinking out loud.
> > > > > > > >
> > > > > > > >
> > > > > > > Another potential option here would be to allow the second
> > > parameter
> > > > of
> > > > > > > mergeV() accept a traversal.  Not sure how much complexity that
> > > would
> > > > > add
> > > > > > > though
> > > > > > >
> > > > > > > g.V().mergeV('person',
> __.has('person','age',gt(29)).has('state',
> > > > > 'NY'}.,
> > > > > > > {'active': true}).
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > 4. I think your suggestion points to one of the troubles
> > > Gremlin
> > > > > has
> > > > > > > > which
> > > > > > > > > we see with "algorithms" - extending the language with new
> > > steps
> > > > > that
> > > > > > > > > provides a form of "sugar" (e.g. in algorithms we end up
> with
> > > > > > > > > shortestPath() step) pollutes the core language a bit,
> hence
> > my
> > > > > > > > > generalization of "merging" in my link above which fits
> into
> > > the
> > > > > core
> > > > > > > > > Gremlin language style. There is a bigger picture where we
> > are
> > > > > > missing
> > > > > > > > > something in Gremlin that lets us extend the language in
> ways
> > > > that
> > > > > > let
> > > > > > > us
> > > > > > > > > easily introduce new steps that aren't for general purpose.
> > > This
> > > > > > issue
> > > > > > > is
> > > > > > > > > discussed in terms of "algorithms" here:
> > > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I
> > > think
> > > > > we
> > > > > > > > could
> > > > > > > > > see how there might be some "mutation" extension steps that
> > > would
> > > > > > cover
> > > > > > > > > your suggested API, plus batch operations, etc. We need a
> way
> > > to
> > > > > add
> > > > > > > > > "sugar" without it interfering with the consistency of the
> > > core.
> > > > > > > > Obviously
> > > > > > > > > this is a bigger issue but perhaps important to solve to
> > > > implement
> > > > > > > steps
> > > > > > > > in
> > > > > > > > > the fashion you describe.
> > > > > > > > >
> > > > > > > > > I agree that the "algorithm" steps do seem a bit odd in the
> > > core
> > > > > > > language
> > > > > > > > > but I think the need is there.  I'd be interested in
> > furthering
> > > > > this
> > > > > > > > > discussion but I think these "pattern" steps may or may not
> > be
> > > > the
> > > > > > same
> > > > > > > > as
> > > > > > > > > the algorithm steps.  I'll have to think on that.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> > > specify
> > > > > what
> > > > > > > > > element type the first Map argument should be applied to?
> > what
> > > > > about
> > > > > > > > > mergeVP (i.e. vertex property as it too is an element) ?
> > That's
> > > > > > tricky
> > > > > > > > but
> > > > > > > > > I don't think we should miss that. Perhaps merge() could
> be a
> > > > > > "complex
> > > > > > > > > modulator"?? that's a new concept of course, but you would
> do
> > > > > > > > g.V().merge()
> > > > > > > > > and the label and first Map would fold to VertexStartStep
> > (i.e.
> > > > > V())
> > > > > > > for
> > > > > > > > > the lookup and then a MergeStep would follow - thus a
> > "complex
> > > > > > > modulator"
> > > > > > > > > as it does more than just change the behavior of the
> previous
> > > > step
> > > > > -
> > > > > > it
> > > > > > > > > also adds its own. I suppose it could also add has() steps
> > > > followed
> > > > > > by
> > > > > > > > the
> > > > > > > > > MergeStep and then the has() operations would fold in
> > normally
> > > as
> > > > > > they
> > > > > > > do
> > > > > > > > > today. In this way, we can simplify to just one single
> > > > > > > > > merge(String,Map,Map). ??
> > > > > > > > >
> > > > > > > > > I agree that we should also think about how to include
> > > properties
> > > > > in
> > > > > > > this
> > > > > > > > > merge construct.  The reason I was thinking about mergeV()
> > and
> > > > > > mergeE()
> > > > > > > > is
> > > > > > > > > that it follows the same pattern as the already well
> > understood
> > > > > > > > > addV()/addE() steps.  I am a bit wary of trying to
> generalize
> > > > this
> > > > > > down
> > > > > > > > to
> > > > > > > > > a single merge() step as these sorts of complex overloads
> > make
> > > it
> > > > > > hard
> > > > > > > to
> > > > > > > > > figure out which Gremlin step you should use for a
> particular
> > > > > pattern
> > > > > > > > (e.g.
> > > > > > > > > upsert vertex or upsert edge).  One thing that I think
> > > customers
> > > > > find
> > > > > > > > > useful about the addV/addE step is that they are very
> > > > discoverable,
> > > > > > the
> > > > > > > > > name tells me what functionality to expect from that step.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I'm starting to agree with the idea that we have to do
> > something
> > > > like
> > > > > > > > mergeV()/E() as it wouldn't quite work as a start step
> > otherwise.
> > > > We
> > > > > > > > couldn't do:
> > > > > > > >
> > > > > > > > g.merge()
> > > > > > > >
> > > > > > > > as we wouldnt know what sort of Element it applied to. If
> so, I
> > > > > wonder
> > > > > > if
> > > > > > > > it would be better to preserve "merge" for my more general
> > > use-case
> > > > > and
> > > > > > > > prefer upsertV()/E()?
> > > > > > > >
> > > > > > > > Also, perhaps mergeVP() doesn't need to exist as perhaps it
> is
> > an
> > > > > > > uncommon
> > > > > > > > case (compared to V/E()) and we don't really have addVP() as
> an
> > > > > > analogous
> > > > > > > > step. Perhaps existing coalesce() patterns and/or my more
> > general
> > > > > > purpose
> > > > > > > > merge() step would satisfy those situations??
> > > > > > > >
> > > > > > >
> > > > > > > I think that a mergeVP() type step may require a different
> > pattern
> > > > > since
> > > > > > it
> > > > > > > would really need to accept a map of values otherwise it would
> > > > > > essentially
> > > > > > > perform the same function as property()
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > 6. One thing neither my approach nor yours seems to do is
> > tell
> > > > the
> > > > > > user
> > > > > > > > if
> > > > > > > > > they created something or updated something - that's
> another
> > > > thing
> > > > > > I've
> > > > > > > > > seen users want to have in get-or-create. Here again we go
> > > deeper
> > > > > > into
> > > > > > > a
> > > > > > > > > less general step specification as alluded to in 4, but a
> > > merge()
> > > > > > step
> > > > > > > as
> > > > > > > > > proposed in 5, might return [Element,boolean] so as to
> > provide
> > > an
> > > > > > > > indicator
> > > > > > > > > of creation?
> > > > > > > > >
> > > > > > > > > Hmm, I'll have to think about that.  How would returning
> > > multiple
> > > > > > > values
> > > > > > > > > work if we want to chain these together. e.g. Add a vertex
> > > > between
> > > > > > two
> > > > > > > > > edges but make sure the vertices exist?
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > > Added a thought on how to accomplish this above that I'd like
> to
> > > get
> > > > > > > thoughts on.
> > > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > > 7. You were just introducing your ideas here, so perhaps
> you
> > > > > haven't
> > > > > > > > gotten
> > > > > > > > > this far yet, but a shortcoming to doing
> > merge(String,Map,Map)
> > > is
> > > > > > that
> > > > > > > it
> > > > > > > > > leaves open no opportunity to stream a List of Maps to a
> > > merge()
> > > > > for
> > > > > > a
> > > > > > > > form
> > > > > > > > > of batch loading which is mighty common and one of the
> > > variations
> > > > > of
> > > > > > > the
> > > > > > > > > coalesce() pattern that I alluded to at the start of all
> > this.
> > > I
> > > > > > think
> > > > > > > > that
> > > > > > > > > we would want to be sure that we left open the option to do
> > > that
> > > > > > > somehow.
> > > > > > > > > 8. If we had a general purpose merge() step I wonder if it
> > > makes
> > > > > > > > developing
> > > > > > > > > the API as you suggested easier to do?
> > > > > > > > >
> > > > > > > > > Hmm, let me think about that one.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > I will add an item 8 to think about which I didn't mention
> > > before:
> > > > > > > >
> > > > > > > > 8. The signature you suggested for mergeE() is:
> > > > > > > >
> > > > > > > > mergeE(String, Map, Map)
> > > > > > > >      String - The edge label
> > > > > > > >      Map (first) - The properties to match existing edge on
> > > > > > > >      Map (second) - Any additional properties to set if a new
> > > edge
> > > > is
> > > > > > > > created (optional)
> > > > > > > >
> > > > > > > > but does that exactly answer the need? Typically the idea is
> to
> > > > > detect
> > > > > > an
> > > > > > > > edge between two vertices not globally in the graph by way of
> > > some
> > > > > > > > properties. This signature doesn't seem to allow for that as
> it
> > > > > doesn't
> > > > > > > > allow specification of the vertices to test against. Perhaps
> > the
> > > > > answer
> > > > > > > is
> > > > > > > > to use from() and to() modulators?
> > > > > > > >
> > > > > > > > g.mergeE('knows', {'weight':0.5}).
> > > > > > > >  from(V(1)).to(V(2))
> > > > > > > > g.V().has('person','name','marko').
> > > > > > > >  mergeE( 'knows', {'weight':0.5}).
> > > > > > > >    to(V(2))
> > > > > > > > g.V().has('person','name','marko').
> > > > > > > >  mergeE( 'self', {'weight':0.5})
> > > > > > > >
> > > > > > > > That seems to work?
> > > > > > > >
> > > > > > >
> > > > > > > I think that is an elegant solution to that problem.  I like it
> > and
> > > > it
> > > > > > > keeps in line with the way that addE() works.
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <
> > > > > > spmallette@gmail.com
> > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > I like our coalesce() pattern but it is verbose and over
> > time
> > > > it
> > > > > > has
> > > > > > > > gone
> > > > > > > > > > from a simple pattern to one with numerous variations for
> > all
> > > > > > manner
> > > > > > > of
> > > > > > > > > > different sorts of merge-like operations. As such, I do
> > think
> > > > we
> > > > > > > should
> > > > > > > > > > introduce something to cover this pattern.
> > > > > > > > > >
> > > > > > > > > > I like that you used the word "merge" in your description
> > of
> > > > this
> > > > > > as
> > > > > > > it
> > > > > > > > > is
> > > > > > > > > > the word I've liked using. You might want to give a look
> at
> > > my
> > > > > > > proposed
> > > > > > > > > > merge() step from earlier in the year:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> > > > > > > > > >
> > > > > > > > > > I'm just going to dump thoughts as they come regarding
> what
> > > you
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > 1. How would multi/meta-properties fit into the API
> you've
> > > > > > proposed?
> > > > > > > > > > 2. How would users set the T.id on creation? would that
> > T.id
> > > > just
> > > > > > be
> > > > > > > a
> > > > > > > > > key
> > > > > > > > > > in the first Map argument?
> > > > > > > > > > 3. I do like the general idea of a match on multiple
> > > properties
> > > > > for
> > > > > > > the
> > > > > > > > > > first argument as a convenience but wonder about the
> > > > specificity
> > > > > of
> > > > > > > > this
> > > > > > > > > > API a bit as it focuses heavily on equality - I suppose
> > > that's
> > > > > most
> > > > > > > > cases
> > > > > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > > > > > 4. I think your suggestion points to one of the troubles
> > > > Gremlin
> > > > > > has
> > > > > > > > > which
> > > > > > > > > > we see with "algorithms" - extending the language with
> new
> > > > steps
> > > > > > that
> > > > > > > > > > provides a form of "sugar" (e.g. in algorithms we end up
> > with
> > > > > > > > > > shortestPath() step) pollutes the core language a bit,
> > hence
> > > my
> > > > > > > > > > generalization of "merging" in my link above which fits
> > into
> > > > the
> > > > > > core
> > > > > > > > > > Gremlin language style. There is a bigger picture where
> we
> > > are
> > > > > > > missing
> > > > > > > > > > something in Gremlin that lets us extend the language in
> > ways
> > > > > that
> > > > > > > let
> > > > > > > > us
> > > > > > > > > > easily introduce new steps that aren't for general
> purpose.
> > > > This
> > > > > > > issue
> > > > > > > > is
> > > > > > > > > > discussed in terms of "algorithms" here:
> > > > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991
> but I
> > > > think
> > > > > > we
> > > > > > > > > could
> > > > > > > > > > see how there might be some "mutation" extension steps
> that
> > > > would
> > > > > > > cover
> > > > > > > > > > your suggested API, plus batch operations, etc. We need a
> > way
> > > > to
> > > > > > add
> > > > > > > > > > "sugar" without it interfering with the consistency of
> the
> > > > core.
> > > > > > > > > Obviously
> > > > > > > > > > this is a bigger issue but perhaps important to solve to
> > > > > implement
> > > > > > > > steps
> > > > > > > > > in
> > > > > > > > > > the fashion you describe.
> > > > > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> > > > specify
> > > > > > what
> > > > > > > > > > element type the first Map argument should be applied to?
> > > what
> > > > > > about
> > > > > > > > > > mergeVP (i.e. vertex property as it too is an element) ?
> > > That's
> > > > > > > tricky
> > > > > > > > > but
> > > > > > > > > > I don't think we should miss that. Perhaps merge() could
> > be a
> > > > > > > "complex
> > > > > > > > > > modulator"?? that's a new concept of course, but you
> would
> > do
> > > > > > > > > g.V().merge()
> > > > > > > > > > and the label and first Map would fold to VertexStartStep
> > > (i.e.
> > > > > > V())
> > > > > > > > for
> > > > > > > > > > the lookup and then a MergeStep would follow - thus a
> > > "complex
> > > > > > > > modulator"
> > > > > > > > > > as it does more than just change the behavior of the
> > previous
> > > > > step
> > > > > > -
> > > > > > > it
> > > > > > > > > > also adds its own. I suppose it could also add has()
> steps
> > > > > followed
> > > > > > > by
> > > > > > > > > the
> > > > > > > > > > MergeStep and then the has() operations would fold in
> > > normally
> > > > as
> > > > > > > they
> > > > > > > > do
> > > > > > > > > > today. In this way, we can simplify to just one single
> > > > > > > > > > merge(String,Map,Map). ??
> > > > > > > > > > 6. One thing neither my approach nor yours seems to do is
> > > tell
> > > > > the
> > > > > > > user
> > > > > > > > > if
> > > > > > > > > > they created something or updated something - that's
> > another
> > > > > thing
> > > > > > > I've
> > > > > > > > > > seen users want to have in get-or-create. Here again we
> go
> > > > deeper
> > > > > > > into
> > > > > > > > a
> > > > > > > > > > less general step specification as alluded to in 4, but a
> > > > merge()
> > > > > > > step
> > > > > > > > as
> > > > > > > > > > proposed in 5, might return [Element,boolean] so as to
> > > provide
> > > > an
> > > > > > > > > indicator
> > > > > > > > > > of creation?
> > > > > > > > > > 7. You were just introducing your ideas here, so perhaps
> > you
> > > > > > haven't
> > > > > > > > > gotten
> > > > > > > > > > this far yet, but a shortcoming to doing
> > > merge(String,Map,Map)
> > > > is
> > > > > > > that
> > > > > > > > it
> > > > > > > > > > leaves open no opportunity to stream a List of Maps to a
> > > > merge()
> > > > > > for
> > > > > > > a
> > > > > > > > > form
> > > > > > > > > > of batch loading which is mighty common and one of the
> > > > variations
> > > > > > of
> > > > > > > > the
> > > > > > > > > > coalesce() pattern that I alluded to at the start of all
> > > this.
> > > > I
> > > > > > > think
> > > > > > > > > that
> > > > > > > > > > we would want to be sure that we left open the option to
> do
> > > > that
> > > > > > > > somehow.
> > > > > > > > > > 8. If we had a general purpose merge() step I wonder if
> it
> > > > makes
> > > > > > > > > developing
> > > > > > > > > > the API as you suggested easier to do?
> > > > > > > > > >
> > > > > > > > > > I think I'd like to solve the problems you describe in
> your
> > > > post
> > > > > as
> > > > > > > > well
> > > > > > > > > as
> > > > > > > > > > the ones in mine. There is some relation there, but gaps
> as
> > > > well.
> > > > > > > With
> > > > > > > > > more
> > > > > > > > > > discussion here we can figure something out.
> > > > > > > > > >
> > > > > > > > > > Thanks for starting this talk - good one!
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <
> > > > > > > dave@bechberger.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > I've had a few on and off discussions with a few people
> > > here,
> > > > > so
> > > > > > I
> > > > > > > > > wanted
> > > > > > > > > > > to send this out to everyone for feedback.
> > > > > > > > > > >
> > > > > > > > > > > What are people's thoughts on creating a new set of
> steps
> > > > that
> > > > > > > codify
> > > > > > > > > > > common Gremlin best practices?
> > > > > > > > > > >
> > > > > > > > > > > I think there are several common Gremlin patterns where
> > > users
> > > > > > would
> > > > > > > > > > benefit
> > > > > > > > > > > from the additional guidance that these codified steps
> > > > > represent.
> > > > > > > > The
> > > > > > > > > > > first one I would recommend though is codifying the
> > element
> > > > > > > existence
> > > > > > > > > > > pattern into a single Gremlin step, something like:
> > > > > > > > > > >
> > > > > > > > > > > mergeV(String, Map, Map)
> > > > > > > > > > >      String - The vertex label
> > > > > > > > > > >      Map (first) - The properties to match existing
> > > vertices
> > > > on
> > > > > > > > > > >      Map (second) - Any additional properties to set
> if a
> > > new
> > > > > > > vertex
> > > > > > > > is
> > > > > > > > > > > created (optional)
> > > > > > > > > > > mergeE(String, Map, Map)
> > > > > > > > > > >      String - The edge label
> > > > > > > > > > >      Map (first) - The properties to match existing
> edge
> > on
> > > > > > > > > > >      Map (second) - Any additional properties to set
> if a
> > > new
> > > > > > edge
> > > > > > > is
> > > > > > > > > > > created (optional)
> > > > > > > > > > >
> > > > > > > > > > > In each of these cases these steps would perform the
> same
> > > > > upsert
> > > > > > > > > > > functionality as the element existence pattern.
> > > > > > > > > > >
> > > > > > > > > > > Example:
> > > > > > > > > > >
> > > > > > > > > > > g.V().has('person','name','stephen').
> > > > > > > > > > >            fold().
> > > > > > > > > > >            coalesce(unfold(),
> > > > > > > > > > >                    addV('person').
> > > > > > > > > > >                      property('name','stephen').
> > > > > > > > > > >                      property('age',34))
> > > > > > > > > > >
> > > > > > > > > > > would become:
> > > > > > > > > > >
> > > > > > > > > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > > > > > > > > > >
> > > > > > > > > > > I think that this change would be a good addition to
> the
> > > > > language
> > > > > > > for
> > > > > > > > > > > several reasons:
> > > > > > > > > > >
> > > > > > > > > > > * This codifies the best practice for a specific
> > > > action/recipe,
> > > > > > > which
> > > > > > > > > > > reduces the chance that someone uses the pattern
> > > incorrectly
> > > > > > > > > > > * Most complex Gremlin traversals are verbose.
> Reducing
> > > the
> > > > > > amount
> > > > > > > > of
> > > > > > > > > > code
> > > > > > > > > > > that needs to be written and maintained allows for a
> > better
> > > > > > > developer
> > > > > > > > > > > experience.
> > > > > > > > > > > * It will lower the bar of entry for a developer by
> > making
> > > > > these
> > > > > > > > > actions
> > > > > > > > > > > more discoverable.  The more we can help bring these
> > > patterns
> > > > > to
> > > > > > > the
> > > > > > > > > > > forefront of the language via these pattern/meta steps
> > the
> > > > more
> > > > > > we
> > > > > > > > > guide
> > > > > > > > > > > users towards writing better Gremlin faster
> > > > > > > > > > > * This allows DB vendors to optimize for this pattern
> > > > > > > > > > >
> > > > > > > > > > > I know that this would likely be the first step in
> > Gremlin
> > > > that
> > > > > > > > > codifies
> > > > > > > > > > a
> > > > > > > > > > > pattern, so I'd like to get other's thoughts on this?
> > > > > > > > > > >
> > > > > > > > > > > Dave
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
  

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by Stephen Mallette <sp...@gmail.com>.
+1 to no new step. I haven't yet thought of a reason why this shouldn't
just be a variation on addV/E().

On Fri, Dec 11, 2020 at 1:03 PM David Bechberger <da...@bechberger.com>
wrote:

> I agree with Josh's description of what the upsertV() functionality is
> intended to be.
>
> I also fully support the simplification provided by the by() modulation
> that Stephen suggested, removing the second map.  I think that provides a
> much cleaner and easier to comprehend syntax.
>
> With that agreement, I think this does beg the question of if this should
> be a new step (upsertV()) or just an additional signature on the addV()
> step? I
>
> Stephen alluded to this on the dev list and after thinking about this a bit
> I think that I am favor of not adding a step and just adding a new
> signature to the existing step (if possible).  Thoughts?
>
>
> Dave
>
> On Wed, Dec 9, 2020 at 4:33 AM Stephen Mallette <sp...@gmail.com>
> wrote:
>
> > Josh, thanks for your thoughts - some responses inline:
> >
> > On Tue, Dec 8, 2020 at 10:16 PM Josh Perryman <jo...@gmail.com>
> > wrote:
> >
> > > I'll offer some thoughts. I'm seeing upsertV() as an idempotent
> > getOrCreate
> > > call which always returns a vertex with the label/property values
> > specified
> > > within the step. It's sort of a declarative pattern: "return this
> vertex
> > to
> > > me, find it if you can, create it if you must."
> > >
> >
> > I like this description - I've added it to the gist, though it's a bit at
> > odds with Dave's previous post, so we'll consider it a temporary addition
> > until he responds.
> >
> >
> > > On that account, I do like the simplification in 1. Repetition
> shouldn't
> > be
> > > necessary. In an ideal world, the engine should know the primary
> > > identifiers (name or id) and find/create the vertex based on them. Any
> > > other included values will be "trued up" as well. But this may be a
> > bridge
> > > too far for TinkerPop since knowing identifiers may require a specified
> > > schema. I'd prefer to omit the third input, but it might be necessary
> to
> > > keep it so that the second input can be for the matching use case.
> > >
> >
> > In my most recent post on gremlin-users I think I came up with a nice way
> > to get rid of the second Map. One Map that forms the full list of
> > properties for upserting is easier than partitioning two Maps that
> > essentially merge together. I imagine it's unlikely that application code
> > will have that separation naturally so users will have the added step of
> > trying to separate their data into searchable vs "just data". Getting us
> to
> > one Map argument will simplify APIs for us and reduce complexity to
> users.
> > Here is what I'd proposed for those not following over there:
> >
> > // match on name and age (or perhaps whatever the underlying graph system
> > thinks is best?)
> > g.upsertV('person', [name:'marko',age:29])
> >
> > // match on name only
> > g.upsertV('person', [name:'marko',age:29]).by('name')
> >
> > // explicitly match on name and age
> > g.upsertV('person', [name:'marko',age:29]).
> >   by('name').by('age')
> >
> > // match on id only
> > g.upsertV('person', [(T.id): 100, name:'marko',age:29]).by(T.id)
> >
> > // match on whatever the by(Traversal) predicate defines
> > g.upsertV('person', [name:'marko',age:29]).
> >   by(has('name', 'marko'))
> >
> > // match on id, then update age
> > g.upsertV('person', [(T.id): 100, name:'marko']).by(T.id).
> >   property('age',29)
> >
> > With this model, we get one Map argument that represents the complete
> > property set to be added/updated to the graph and the user can hint on
> what
> > key they wish to match on using by() where that sort of step modulation
> > should be a well understood and familiar concept in Gremlin at this
> point.
> >
> > So that means I think 2 should always match or update the additional
> > > values. Again, we're specifying the expected result and letting the
> > engine
> > > figure out best how to return that results and appropriately maintain
> > > state.
> > >
> >
> > I again like this description, but we'll see what Dave's thoughts are
> since
> > he's a bit behind on the threads at this point I think.
> >
> >
> > > I'm also presuming that anything not included as inputs to the
> upsertV()
> > > step are then to be handled by following steps. I'm hoping that is a
> > > sufficient approach for addressing the multi/meta property use cases
> > > brought up in 3.
> > >
> >
> > yeah................it needs more thought. I spent more time thinking on
> > this issue yesterday than I have for all the previous posts combined and
> I
> > think it yielded something good in that revised syntax. It's going to
> take
> > more of that kind of elbow grease to dig into these lesser use cases to
> > make sure we aren't coding ourselves into corners.
> >
> >
> > > I do like the idea of using modulators (with(), by()) for more
> > > sophisticated usage and advanced use cases. Also, the streaming
> examples
> > > are quite elegant allowing for a helpful separation of data and logic.
> > >
> >
> > cool - hope you like the revised syntax I posted then. :)
> >
> >
> > > That's my humble take. This is a very welcome addition to the language
> > and
> > > I appreciate the thoughtful & collaborative approach to the design
> > > considerations.
> > >
> >
> > Thanks again and please keep the thoughts coming. Lots of other
> interesting
> > design discussions seem to be brewing.
> >
> >
> > >
> > > Josh
> > >
> > > On Tue, Dec 8, 2020 at 8:57 AM Stephen Mallette <sp...@gmail.com>
> > > wrote:
> > >
> > > > I started a expanded this discussion to gremlin-users for a wider
> > > audience
> > > > and the thread is starting to grow:
> > > >
> > > >
> https://groups.google.com/g/gremlin-users/c/QBmiOUkA0iI/m/pj5Ukiq6AAAJ
> > > >
> > > > I guess we'll need to summarize that discussion back here now....
> > > >
> > > > I did have some more thoughts to hang out there and figured that I
> > > wouldn't
> > > > convolute the discussion on gremlin-users with it so I will continue
> > the
> > > > discussion here.
> > > >
> > > > 1, The very first couple of examples seem wrong (or at least not best
> > > > demonstrating the usage):
> > > >
> > > > g.upsertV('person', [name: 'marko'],
> > > >                     [name: 'marko', age: 29])
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, name: 'Marko'])
> > > >
> > > > should instead be:
> > > >
> > > > g.upsertV('person', [name: 'marko'],
> > > >                     [age: 29])
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [name: 'Marko'])
> > > >
> > > > 2. I can't recall if we made this clear anywhere but in situations
> > where
> > > we
> > > > "get" rather than "create" do the additional properties act in an
> > update
> > > > fashion to the element that was found? I think I've been working on
> the
> > > > assumption that it would, though perhaps that is not always
> desirable?
> > > >
> > > > 3. We really never settled up how to deal with multi/meta-properties.
> > > That
> > > > story should be clear so that when we document upsert() we include
> the
> > > > approaches for the fallback patterns that don't meet the 90% of use
> > cases
> > > > we are targeting and I sense that we're saying that
> > meta/multi-properties
> > > > don't fit in that bucket. So with that in mind, I don't think that
> the
> > > > following works for metaproperties:
> > > >
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [name:[acl: 'public'])
> > > >
> > > > as it doesn't let us set the value for the "name" just the pairs for
> > the
> > > > meta-properties. I guess a user would have to fall back to:
> > > >
> > > > g.upsertV('person', [(T.id): 1]).
> > > >   property('name','marko','acl','public')
> > > >
> > > > // or use the additional properties syntax
> > > > g.upsertV('person', [(T.id): 1],[name:'marko']).
> > > >   properties('name').property('acl','public')
> > > >
> > > > // or if there were multi-properties then maybe...
> > > > g.upsertV('person', [(T.id): 1],[name:'marko']).
> > > >   properties('name').hasValue('marko').property('acl','public')
> > > >
> > > > As for multi-properties, I dont think we should assume that a List
> > object
> > > > should be interpreted by Gremlin as a multi-property. Perhaps we just
> > > rely
> > > > on the underlying graph to properly deal with that given a schema or
> > the
> > > > user falls back to:
> > > >
> > > > // this ends up as however the graph deals with a List object
> > > > g.upsertV('person', [(T.id): 1], [lang: ['java', 'scala', 'java'])
> > > >
> > > > // this is explicit
> > > > g.upsertV('person', [(T.id): 1]).
> > > >   property(list, 'lang', 'java').
> > > >   property(list, 'lang, 'scala').
> > > >   property(list, 'lang', 'java')
> > > >
> > > > If that makes sense to everyone, I will update the gist.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Oct 27, 2020 at 11:51 PM David Bechberger <
> dave@bechberger.com
> > >
> > > > wrote:
> > > >
> > > > > Hello Stephen,
> > > > >
> > > > > Thanks for making that gist, its much easier to follow along with
> the
> > > > > proposed syntax there.  To your specific comments:
> > > > >
> > > > > #1 - My only worry with the term upsert is that it is not as
> widely a
> > > > used
> > > > > term for this sort of pattern as "Merge" (e.g. SQL, Cypher).
> > However I
> > > > > don't have a strong opinion on this, so I am fine with either.
> > > > > #2 - My only real objective here is to make sure that we make the
> > > 80-90%
> > > > > case easy and straightforward.  I think that having the fallback
> > option
> > > > of
> > > > > using the current syntax for any complicated edge cases should be
> > > > > considered here as well. I'd appreciate your thoughts here as these
> > are
> > > > > good points you bring up that definitely fall into the 80-90% use
> > case.
> > > > > #3 - Those points make sense to me, not sure I have anything
> further
> > to
> > > > add
> > > > > #4 - I don't think I would expect to have the values extracted from
> > the
> > > > > traversal filters but I'd be interested in other opinions on that.
> > > > >
> > > > > Thanks,
> > > > > Dave
> > > > >
> > > > > On Thu, Oct 15, 2020 at 5:30 AM Stephen Mallette <
> > spmallette@gmail.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > It's been a couple weeks since we moved this thread so I went
> back
> > > > > through
> > > > > > it and refreshed my mind with what's there. Dave, nice job
> putting
> > > all
> > > > > > those examples together. I've recollected them all together and
> > have
> > > > just
> > > > > > been reviewing them in a more formatted style with proper Groovy
> > > syntax
> > > > > > than what can be accomplished in this mailing list. Anyone who
> > cares
> > > to
> > > > > > review in that form can do so here (i've pasted its markdown
> > contents
> > > > at
> > > > > > the bottom of this post as well):
> > > > > >
> > > > > >
> > https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
> > > > > >
> > > > > > In this light, I have the following comments and thoughts:
> > > > > >
> > > > > > 1. I feel more inclined toward saving "merge" for a more
> > generalized
> > > > step
> > > > > > of that name and have thus renamed the examples to use "upsert".
> > > > > >
> > > > > > 2. I still don't quite feel comfortable with
> meta/multi-properties
> > > > > > examples. Perhaps you could explain further as it's possible I'm
> > not
> > > > > > thinking things through properly. For meta-properties the example
> > is:
> > > > > >
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > > > > >
> > > > > > So I see that "name" takes a Map here, but how does Gremlin know
> if
> > > > that
> > > > > > means "meta-property" or "Map value" and if the former, then how
> do
> > > you
> > > > > set
> > > > > > the value of "name"?  My question is similar to the
> multi-property
> > > > > examples
> > > > > > - using this one:
> > > > > >
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                     [(T.id): 1,lang: ['java', 'scala'])
> > > > > >
> > > > > > How does Gremlin know if the user means that "lang" should be a
> > > > > > "multi-property" with Cardinality.list (or Cardinality.set for
> that
> > > > > matter)
> > > > > > or a value of Cardinality.single with a List/Array value? I can
> > > perhaps
> > > > > > suggest some solutions here but wanted to make sure I wasn't just
> > > > > > misunderstanding something.
> > > > > >
> > > > > > 3. The "upsert with stream" example is:
> > > > > >
> > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > > > >   upsertV(values(label), valueMap(id), valueMap())
> > > > > >
> > > > > > That would establish an API of upsertV(Traversal, Traversal,
> > > Traversal)
> > > > > > which would enable your final thought of your last post with:
> > > > > >
> > > > > > g.V().upsertV('person',
> __.has('person','age',gt(29)).has('state',
> > > > 'NY'),
> > > > > >                         [active: true])
> > > > > >
> > > > > > The open-ended nature of that is neat but comes with some
> > complexity
> > > > with
> > > > > > steps that traditionally take an "open" Traversal (and this step
> > now
> > > > has
> > > > > up
> > > > > > to three of them) - perhaps, that's our own fault in some ways.
> For
> > > the
> > > > > > "stream" concept, values()/valueMap() won't immediately work that
> > way
> > > > as
> > > > > > they only apply to Element...given today's Gremlin semantics, I
> > think
> > > > > we'd
> > > > > > re-write that as:
> > > > > >
> > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > > > >   upsertV(select(label), select(id), select(id,label,'name'))
> > > > > >
> > > > > > Though, even that doesn't quite work because you can't select(T)
> > > right
> > > > > now
> > > > > > which means you'd need to go with:
> > > > > >
> > > > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > > > >          [id: 2, label: 'person', name: 'josh']).
> > > > > >   upsertV(select('label'), select('id'),
> > select('id','label','name'))
> > > > > >
> > > > > > not terrible in a way and perhaps we could fix select(T) - can't
> > > > remember
> > > > > > why that isn't allowed except that select() operates over
> multiple
> > > > scopes
> > > > > > and perhaps trying T through those scopes causes trouble.
> > > > > >
> > > > > > 4. Continuing with the use of Traversal as arguments, let's go
> back
> > > to:
> > > > > >
> > > > > > g.V().upsertV('person',
> __.has('person','age',gt(29)).has('state',
> > > > 'NY'),
> > > > > >                         [active: true])
> > > > > >
> > > > > > I originally had it in my mind to prefer this approach over a Map
> > for
> > > > the
> > > > > > search as it provides a great deal of flexibility and fits the
> > common
> > > > > > filtering patterns that users follow really well. I suppose the
> > > > question
> > > > > is
> > > > > > what to do in the situation of "create". Would we extract all the
> > > > strict
> > > > > > equality filter values from the initial has(), thus, in the
> > example,
> > > > > > "state" but not "age", and create a vertex that has "state=NY,
> > > > > > active=true"? That is possible with how things are designed in
> > > Gremlin
> > > > > > today I think.
> > > > > >
> > > > > > This syntax does create some conflict with the subject of 3 and
> > > streams
> > > > > > because the traverser in the streams case is the incoming Map but
> > for
> > > > > this
> > > > > > context it's meant as a filter of V(). Getting too whacky?
> > > > > >
> > > > > > 5. I like the idea of using a side-effect to capture whether the
> > > > element
> > > > > > was created or not. that makes sense. I think that can work.
> > > > > >
> > > > > > ==============================
> > > > > > = gist contents
> > > > > > ==============================
> > > > > >
> > > > > > # API
> > > > > >
> > > > > > ```java
> > > > > > upsertV(String label, Map matchOrCreateProperties)
> > > > > > upsertV(String label, Map matchOrCreateProperties, Map
> > > > > > additionalProperties)
> > > > > > upsertV(Traversal label, Traversal matchOrCreateProperties)
> > > > > > upsertV(Traversal label, Traversal matchOrCreateProperties,
> > Traversal
> > > > > > additionalProperties)
> > > > > > upsertV(...).
> > > > > >   with(WithOptions.sideEffectLabel, 'a')
> > > > > >
> > > > > > upsertE(String label, Map matchOrCreateProperties)
> > > > > > upsertE(Traversal label, Traversal matchOrCreateProperties)
> > > > > > upsertE(String label, Map matchOrCreateProperties, Map
> > > > > > additionalProperties)
> > > > > > upsertE(Traversal label, Traversal matchOrCreateProperties,
> > Traversal
> > > > > > additionalProperties)
> > > > > > upsertE(...).
> > > > > >   from(Vertex incidentOut).to(Vertex incidentIn)
> > > > > >   with(WithOptions.sideEffectLabel, 'a')
> > > > > > ```
> > > > > >
> > > > > > # Examples
> > > > > >
> > > > > > ## upsert
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [name: 'marko'],
> > > > > >                     [name: 'marko', age: 29])
> > > > > > ```
> > > > > >
> > > > > > ## upsert with id
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                     [(T.id): 1, name: 'Marko'])
> > > > > > ```
> > > > > >
> > > > > > ## upsert with Meta Properties
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > > > > > ```
> > > > > >
> > > > > > ## upsert with Multi Properties
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                     [(T.id): 1,lang: ['java', 'scala'])
> > > > > > ```
> > > > > >
> > > > > > ## upsert with Single Cardinality
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                     [(T.id): 1, lang: 'java')
> > > > > > ```
> > > > > >
> > > > > > ## upsert with List Cardinality
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                     [(T.id): 1, lang: ['java', 'scala', 'java'])
> > > > > > ```
> > > > > >
> > > > > > ## upsert with Set Cardinality
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                     [(T.id): 1, lang: ['java','scala'])
> > > > > > ```
> > > > > >
> > > > > > ## upsert with List Cardinality - and add value to list
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [(T.id): 1],
> > > > > >                     [(T.id): 1, lang: ['java','scala','java']).
> > > > > >   property(Cardinality.list, 'lang', 'java')
> > > > > > ```
> > > > > >
> > > > > > ## upsert with stream
> > > > > >
> > > > > > ```groovy
> > > > > > // doesn't work with today's Gremlin semantics
> > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > > > >   upsertV(values(label), valueMap(id), valueMap())
> > > > > >
> > > > > > // the following would be more in line with what's possible with
> > > > existing
> > > > > > Gremlin semantics:
> > > > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > > > >          [id: 2, label: 'person', name: 'josh']).
> > > > > >   upsertV(select('label'), select('id'),
> > select('id','label','name'))
> > > > > > ```
> > > > > >
> > > > > > ## upsert with reporting added or updated side effect
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertV('person', [name: 'marko'],
> > > > > >                     [name: 'marko', age: 29]).
> > > > > >     with(WithOptions.sideEffectLabel, 'a').
> > > > > >   project('vertex', 'added').
> > > > > >     by().
> > > > > >     by(cap('a'))
> > > > > > ```
> > > > > >
> > > > > > ## upsert Edge assuming a self-relation of "marko"
> > > > > >
> > > > > > ```groovy
> > > > > > g.V().has('person','name','marko').
> > > > > >   upsertE( 'self', [weight:0.5])
> > > > > > ```
> > > > > >
> > > > > > ## upsert Edge with incident vertex checks
> > > > > >
> > > > > > ```groovy
> > > > > > g.upsertE('knows', [weight:0.5]).
> > > > > >    from(V(1)).to(V(2))
> > > > > > g.V().has('person','name','marko').
> > > > > >   upsertE( 'knows', [weight:0.5]).
> > > > > >     to(V(2))
> > > > > > ```
> > > > > >
> > > > > > ## upsert using a Traversal for the match
> > > > > >
> > > > > > ```groovy
> > > > > > g.V().upsertV('person',
> __.has('person','age',gt(29)).has('state',
> > > > 'NY'),
> > > > > >                         [active: true])
> > > > > > ```
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Sep 28, 2020 at 6:49 PM David Bechberger <
> > > dave@bechberger.com>
> > > > > > wrote:
> > > > > >
> > > > > > > So, I've been doing some additional thinking about the ways
> that
> > > this
> > > > > > could
> > > > > > > work based on the comments below and have put my comments
> inline.
> > > > > > >
> > > > > > > Dave
> > > > > > >
> > > > > > > On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <
> > > > spmallette@gmail.com
> > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I added some thoughts inline below:
> > > > > > > >
> > > > > > > > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <
> > > > > dave@bechberger.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Thanks for the detailed comments Stephen.  I have addressed
> > > them
> > > > > > inline
> > > > > > > > > below.
> > > > > > > > >
> > > > > > > > > I did read the proposal from earlier and I think that we
> are
> > in
> > > > > close
> > > > > > > > > agreement with what we are trying to accomplish.  I also
> > fully
> > > > > > support
> > > > > > > > > Josh's comment on providing a mechanism for submitting a
> map
> > of
> > > > > > > > properties
> > > > > > > > > as manually unrolling this all right now leads to a lot of
> > > > > potential
> > > > > > > for
> > > > > > > > > error and a long messy traversal.
> > > > > > > > >
> > > > > > > > > I'm looking forward to this discussion on how to merge
> these
> > > two
> > > > > > > > proposals.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 1. How would multi/meta-properties fit into the API you've
> > > > > proposed?
> > > > > > > > >
> > > > > > > > > My first thought here is that multi-properties would be
> > > > represented
> > > > > > as
> > > > > > > > > lists in the map, e.g.
> > > > > > > > >
> > > > > > > > > {names: ['Dave', 'David']}
> > > > > > > > >
> > > > > > > > > and meta-properties would be represented as maps in the
> maps,
> > > > e.g.
> > > > > > > > >
> > > > > > > > > {name: {first: 'Dave', last: 'Bechberger'}}
> > > > > > > > >
> > > > > > > > > I can't say I've thought through all the implications of
> this
> > > > > though
> > > > > > so
> > > > > > > > it
> > > > > > > > > is an area we would need to explore.
> > > > > > > > >
> > > > > > > >
> > > > > > > > The first implication that comes to mind is that it makes the
> > > > > > assumption
> > > > > > > > the user wants multi/meta-properties as opposed to a single
> > > > > cardinality
> > > > > > > of
> > > > > > > > List in the first case and a Map as a property value in the
> > > second
> > > > > > case.
> > > > > > > I
> > > > > > > > suppose that graphs with a schema could resolve those
> > assumptions
> > > > but
> > > > > > > > graphs that are schemaless would have a problem. The issue
> > could
> > > be
> > > > > > > > resolved by specialized configuration of "g" or per merge()
> > step
> > > > > using
> > > > > > a
> > > > > > > > with() modulator I suppose but that goes into a yet another
> > level
> > > > of
> > > > > > > > implications to consider. I've often wondered if the start
> > point
> > > > for
> > > > > > > > getting types/schema into TP3 without a full rewrite would be
> > in
> > > > this
> > > > > > > form
> > > > > > > > where Gremlin would be given hints as to what to expect as to
> > the
> > > > > types
> > > > > > > of
> > > > > > > > data it might encounter while traversing. Anyway, I'd be
> > hesitant
> > > > to
> > > > > go
> > > > > > > > down paths that don't account for multi/metaproperties well.
> > > They
> > > > > are
> > > > > > > > first class citizens in TP3 (with those hoping for extension
> of
> > > at
> > > > > > least
> > > > > > > > multiproperties to edges) and while I find them a constant
> > > > annoyance
> > > > > > for
> > > > > > > so
> > > > > > > > many reasons, we're kinda stuck with them.
> > > > > > > >
> > > > > > >
> > > > > > > I agree we need to account for multi/meta properties as 1st
> class
> > > > > > > citizens.  This is my current thinking on the syntax for the
> > > > situations
> > > > > > we
> > > > > > > have laid out so far:
> > > > > > >
> > > > > > > *Merge*
> > > > > > > g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age':
> 29})
> > > > > > >
> > > > > > >
> > > > > > > *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name':
> > > > 'Marko'})
> > > > > > >
> > > > > > >
> > > > > > > *Merge with Meta Properties* g.mergeV('name', {T.id: 1}, {T.id:
> > 1,
> > > > > > 'name':
> > > > > > > {'first': Marko', 'last': 'R'})
> > > > > > >
> > > > > > >
> > > > > > > *Merge with Multi Properties * g.mergeV('name', {T.id: 1},
> {T.id:
> > > 1,
> > > > > > > 'lang': ['java', 'scala'])
> > > > > > >
> > > > > > >
> > > > > > > *Merge with Single Cardinality* g.mergeV('name', {T.id: 1},
> > {T.id:
> > > 1,
> > > > > > > 'lang': 'java')
> > > > > > >
> > > > > > >
> > > > > > > *Merge with List Cardinality* g.mergeV('name', {T.id: 1},
> {T.id:
> > 1,
> > > > > > 'lang':
> > > > > > > ['java', 'scala', 'java'])
> > > > > > >
> > > > > > >
> > > > > > > *Merge with Set Cardinality * g.mergeV('name', {T.id: 1},
> {T.id:
> > 1,
> > > > > > 'lang':
> > > > > > > ['java', 'scala'])
> > > > > > >
> > > > > > > Since in a mergeV() scenario we are only ever adding whatever
> > > values
> > > > > are
> > > > > > > passed in there would be no need to specify the cardinality of
> > the
> > > > > > property
> > > > > > > being added.  If they wanted to add a value to an existing
> > property
> > > > > then
> > > > > > > the current property() method would still be available on the
> > > output
> > > > > > > traversal. e.g.
> > > > > > > g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java', 'scala',
> > > > > > > 'java']).property(Cardinality.list, 'lang, 'java')
> > > > > > >
> > > > > > >
> > > > > > > *Merge with stream * g.inject([{id: 1, label: 'person', 'name':
> > > > > 'marko'},
> > > > > > > {id: 2, label: 'person', 'name': 'josh'}]).
> > > > > > >   mergeV(values('label'), valueMap('id'), valueMap())
> > > > > > >
> > > > > > >
> > > > > > > *Merge with reporting added or updated side effect*
> > > g.mergeV('name',
> > > > > > > {'name': 'marko'}, {'name': 'marko', 'age': 29}).
> > > > > > >   with(WithOptions.sideEffectLabel, 'a').
> > > > > > >   project('vertex', 'added').
> > > > > > >     by().
> > > > > > >     by(cap('a'))
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > 2. How would users set the T.id on creation? would that
> T.id
> > > just
> > > > > be
> > > > > > a
> > > > > > > > key
> > > > > > > > > in the first Map argument?
> > > > > > > > >
> > > > > > > > > Yes, I was thinking that T.id would be the key name, e.g.:
> > > > > > > > >
> > > > > > > > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
> > > > > > > > >
> > > > > > > >
> > > > > > > > ok - I can't say I see a problem with that atm.
> > > > > > > >
> > > > > > > >
> > > > > > > > > 3. I do like the general idea of a match on multiple
> > properties
> > > > for
> > > > > > the
> > > > > > > > > first argument as a convenience but wonder about the
> > > specificity
> > > > of
> > > > > > > this
> > > > > > > > > API a bit as it focuses heavily on equality - I suppose
> > that's
> > > > most
> > > > > > > cases
> > > > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > > > >
> > > > > > > > > In most cases I've seen use exact matches on the vertex or
> > > > edge.  I
> > > > > > > think
> > > > > > > > > it might be best to keep this straightforward as any
> complex
> > > edge
> > > > > > cases
> > > > > > > > > still can perform the same functionality using the
> coalesce()
> > > > > > pattern.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I played around with the idea to use modulators to apply
> > > additional
> > > > > > > > constraints:
> > > > > > > >
> > > > > > > > g.mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > > > >     by(has('age', gt(29))
> > > > > > > >
> > > > > > > > I suppose that's neat but maybe not...we could easily get the
> > > same
> > > > > > thing
> > > > > > > > from:
> > > > > > > >
> > > > > > > > g.V().has('person','age',gt(29)).
> > > > > > > >   mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > > > >
> > > > > > > > which is more readable though it does make it harder for
> > > providers
> > > > to
> > > > > > > > optimize upsert operations if they could make use of the
> has()
> > as
> > > > > part
> > > > > > of
> > > > > > > > that. I think I like has() as part of the main query rather
> > than
> > > > in a
> > > > > > > by()
> > > > > > > > - just thinking out loud.
> > > > > > > >
> > > > > > > >
> > > > > > > Another potential option here would be to allow the second
> > > parameter
> > > > of
> > > > > > > mergeV() accept a traversal.  Not sure how much complexity that
> > > would
> > > > > add
> > > > > > > though
> > > > > > >
> > > > > > > g.V().mergeV('person',
> __.has('person','age',gt(29)).has('state',
> > > > > 'NY'}.,
> > > > > > > {'active': true}).
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > 4. I think your suggestion points to one of the troubles
> > > Gremlin
> > > > > has
> > > > > > > > which
> > > > > > > > > we see with "algorithms" - extending the language with new
> > > steps
> > > > > that
> > > > > > > > > provides a form of "sugar" (e.g. in algorithms we end up
> with
> > > > > > > > > shortestPath() step) pollutes the core language a bit,
> hence
> > my
> > > > > > > > > generalization of "merging" in my link above which fits
> into
> > > the
> > > > > core
> > > > > > > > > Gremlin language style. There is a bigger picture where we
> > are
> > > > > > missing
> > > > > > > > > something in Gremlin that lets us extend the language in
> ways
> > > > that
> > > > > > let
> > > > > > > us
> > > > > > > > > easily introduce new steps that aren't for general purpose.
> > > This
> > > > > > issue
> > > > > > > is
> > > > > > > > > discussed in terms of "algorithms" here:
> > > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I
> > > think
> > > > > we
> > > > > > > > could
> > > > > > > > > see how there might be some "mutation" extension steps that
> > > would
> > > > > > cover
> > > > > > > > > your suggested API, plus batch operations, etc. We need a
> way
> > > to
> > > > > add
> > > > > > > > > "sugar" without it interfering with the consistency of the
> > > core.
> > > > > > > > Obviously
> > > > > > > > > this is a bigger issue but perhaps important to solve to
> > > > implement
> > > > > > > steps
> > > > > > > > in
> > > > > > > > > the fashion you describe.
> > > > > > > > >
> > > > > > > > > I agree that the "algorithm" steps do seem a bit odd in the
> > > core
> > > > > > > language
> > > > > > > > > but I think the need is there.  I'd be interested in
> > furthering
> > > > > this
> > > > > > > > > discussion but I think these "pattern" steps may or may not
> > be
> > > > the
> > > > > > same
> > > > > > > > as
> > > > > > > > > the algorithm steps.  I'll have to think on that.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> > > specify
> > > > > what
> > > > > > > > > element type the first Map argument should be applied to?
> > what
> > > > > about
> > > > > > > > > mergeVP (i.e. vertex property as it too is an element) ?
> > That's
> > > > > > tricky
> > > > > > > > but
> > > > > > > > > I don't think we should miss that. Perhaps merge() could
> be a
> > > > > > "complex
> > > > > > > > > modulator"?? that's a new concept of course, but you would
> do
> > > > > > > > g.V().merge()
> > > > > > > > > and the label and first Map would fold to VertexStartStep
> > (i.e.
> > > > > V())
> > > > > > > for
> > > > > > > > > the lookup and then a MergeStep would follow - thus a
> > "complex
> > > > > > > modulator"
> > > > > > > > > as it does more than just change the behavior of the
> previous
> > > > step
> > > > > -
> > > > > > it
> > > > > > > > > also adds its own. I suppose it could also add has() steps
> > > > followed
> > > > > > by
> > > > > > > > the
> > > > > > > > > MergeStep and then the has() operations would fold in
> > normally
> > > as
> > > > > > they
> > > > > > > do
> > > > > > > > > today. In this way, we can simplify to just one single
> > > > > > > > > merge(String,Map,Map). ??
> > > > > > > > >
> > > > > > > > > I agree that we should also think about how to include
> > > properties
> > > > > in
> > > > > > > this
> > > > > > > > > merge construct.  The reason I was thinking about mergeV()
> > and
> > > > > > mergeE()
> > > > > > > > is
> > > > > > > > > that it follows the same pattern as the already well
> > understood
> > > > > > > > > addV()/addE() steps.  I am a bit wary of trying to
> generalize
> > > > this
> > > > > > down
> > > > > > > > to
> > > > > > > > > a single merge() step as these sorts of complex overloads
> > make
> > > it
> > > > > > hard
> > > > > > > to
> > > > > > > > > figure out which Gremlin step you should use for a
> particular
> > > > > pattern
> > > > > > > > (e.g.
> > > > > > > > > upsert vertex or upsert edge).  One thing that I think
> > > customers
> > > > > find
> > > > > > > > > useful about the addV/addE step is that they are very
> > > > discoverable,
> > > > > > the
> > > > > > > > > name tells me what functionality to expect from that step.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I'm starting to agree with the idea that we have to do
> > something
> > > > like
> > > > > > > > mergeV()/E() as it wouldn't quite work as a start step
> > otherwise.
> > > > We
> > > > > > > > couldn't do:
> > > > > > > >
> > > > > > > > g.merge()
> > > > > > > >
> > > > > > > > as we wouldnt know what sort of Element it applied to. If
> so, I
> > > > > wonder
> > > > > > if
> > > > > > > > it would be better to preserve "merge" for my more general
> > > use-case
> > > > > and
> > > > > > > > prefer upsertV()/E()?
> > > > > > > >
> > > > > > > > Also, perhaps mergeVP() doesn't need to exist as perhaps it
> is
> > an
> > > > > > > uncommon
> > > > > > > > case (compared to V/E()) and we don't really have addVP() as
> an
> > > > > > analogous
> > > > > > > > step. Perhaps existing coalesce() patterns and/or my more
> > general
> > > > > > purpose
> > > > > > > > merge() step would satisfy those situations??
> > > > > > > >
> > > > > > >
> > > > > > > I think that a mergeVP() type step may require a different
> > pattern
> > > > > since
> > > > > > it
> > > > > > > would really need to accept a map of values otherwise it would
> > > > > > essentially
> > > > > > > perform the same function as property()
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > 6. One thing neither my approach nor yours seems to do is
> > tell
> > > > the
> > > > > > user
> > > > > > > > if
> > > > > > > > > they created something or updated something - that's
> another
> > > > thing
> > > > > > I've
> > > > > > > > > seen users want to have in get-or-create. Here again we go
> > > deeper
> > > > > > into
> > > > > > > a
> > > > > > > > > less general step specification as alluded to in 4, but a
> > > merge()
> > > > > > step
> > > > > > > as
> > > > > > > > > proposed in 5, might return [Element,boolean] so as to
> > provide
> > > an
> > > > > > > > indicator
> > > > > > > > > of creation?
> > > > > > > > >
> > > > > > > > > Hmm, I'll have to think about that.  How would returning
> > > multiple
> > > > > > > values
> > > > > > > > > work if we want to chain these together. e.g. Add a vertex
> > > > between
> > > > > > two
> > > > > > > > > edges but make sure the vertices exist?
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > > Added a thought on how to accomplish this above that I'd like
> to
> > > get
> > > > > > > thoughts on.
> > > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > > 7. You were just introducing your ideas here, so perhaps
> you
> > > > > haven't
> > > > > > > > gotten
> > > > > > > > > this far yet, but a shortcoming to doing
> > merge(String,Map,Map)
> > > is
> > > > > > that
> > > > > > > it
> > > > > > > > > leaves open no opportunity to stream a List of Maps to a
> > > merge()
> > > > > for
> > > > > > a
> > > > > > > > form
> > > > > > > > > of batch loading which is mighty common and one of the
> > > variations
> > > > > of
> > > > > > > the
> > > > > > > > > coalesce() pattern that I alluded to at the start of all
> > this.
> > > I
> > > > > > think
> > > > > > > > that
> > > > > > > > > we would want to be sure that we left open the option to do
> > > that
> > > > > > > somehow.
> > > > > > > > > 8. If we had a general purpose merge() step I wonder if it
> > > makes
> > > > > > > > developing
> > > > > > > > > the API as you suggested easier to do?
> > > > > > > > >
> > > > > > > > > Hmm, let me think about that one.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > I will add an item 8 to think about which I didn't mention
> > > before:
> > > > > > > >
> > > > > > > > 8. The signature you suggested for mergeE() is:
> > > > > > > >
> > > > > > > > mergeE(String, Map, Map)
> > > > > > > >      String - The edge label
> > > > > > > >      Map (first) - The properties to match existing edge on
> > > > > > > >      Map (second) - Any additional properties to set if a new
> > > edge
> > > > is
> > > > > > > > created (optional)
> > > > > > > >
> > > > > > > > but does that exactly answer the need? Typically the idea is
> to
> > > > > detect
> > > > > > an
> > > > > > > > edge between two vertices not globally in the graph by way of
> > > some
> > > > > > > > properties. This signature doesn't seem to allow for that as
> it
> > > > > doesn't
> > > > > > > > allow specification of the vertices to test against. Perhaps
> > the
> > > > > answer
> > > > > > > is
> > > > > > > > to use from() and to() modulators?
> > > > > > > >
> > > > > > > > g.mergeE('knows', {'weight':0.5}).
> > > > > > > >   from(V(1)).to(V(2))
> > > > > > > > g.V().has('person','name','marko').
> > > > > > > >   mergeE( 'knows', {'weight':0.5}).
> > > > > > > >     to(V(2))
> > > > > > > > g.V().has('person','name','marko').
> > > > > > > >   mergeE( 'self', {'weight':0.5})
> > > > > > > >
> > > > > > > > That seems to work?
> > > > > > > >
> > > > > > >
> > > > > > > I think that is an elegant solution to that problem.  I like it
> > and
> > > > it
> > > > > > > keeps in line with the way that addE() works.
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <
> > > > > > spmallette@gmail.com
> > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > I like our coalesce() pattern but it is verbose and over
> > time
> > > > it
> > > > > > has
> > > > > > > > gone
> > > > > > > > > > from a simple pattern to one with numerous variations for
> > all
> > > > > > manner
> > > > > > > of
> > > > > > > > > > different sorts of merge-like operations. As such, I do
> > think
> > > > we
> > > > > > > should
> > > > > > > > > > introduce something to cover this pattern.
> > > > > > > > > >
> > > > > > > > > > I like that you used the word "merge" in your description
> > of
> > > > this
> > > > > > as
> > > > > > > it
> > > > > > > > > is
> > > > > > > > > > the word I've liked using. You might want to give a look
> at
> > > my
> > > > > > > proposed
> > > > > > > > > > merge() step from earlier in the year:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> > > > > > > > > >
> > > > > > > > > > I'm just going to dump thoughts as they come regarding
> what
> > > you
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > 1. How would multi/meta-properties fit into the API
> you've
> > > > > > proposed?
> > > > > > > > > > 2. How would users set the T.id on creation? would that
> > T.id
> > > > just
> > > > > > be
> > > > > > > a
> > > > > > > > > key
> > > > > > > > > > in the first Map argument?
> > > > > > > > > > 3. I do like the general idea of a match on multiple
> > > properties
> > > > > for
> > > > > > > the
> > > > > > > > > > first argument as a convenience but wonder about the
> > > > specificity
> > > > > of
> > > > > > > > this
> > > > > > > > > > API a bit as it focuses heavily on equality - I suppose
> > > that's
> > > > > most
> > > > > > > > cases
> > > > > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > > > > > 4. I think your suggestion points to one of the troubles
> > > > Gremlin
> > > > > > has
> > > > > > > > > which
> > > > > > > > > > we see with "algorithms" - extending the language with
> new
> > > > steps
> > > > > > that
> > > > > > > > > > provides a form of "sugar" (e.g. in algorithms we end up
> > with
> > > > > > > > > > shortestPath() step) pollutes the core language a bit,
> > hence
> > > my
> > > > > > > > > > generalization of "merging" in my link above which fits
> > into
> > > > the
> > > > > > core
> > > > > > > > > > Gremlin language style. There is a bigger picture where
> we
> > > are
> > > > > > > missing
> > > > > > > > > > something in Gremlin that lets us extend the language in
> > ways
> > > > > that
> > > > > > > let
> > > > > > > > us
> > > > > > > > > > easily introduce new steps that aren't for general
> purpose.
> > > > This
> > > > > > > issue
> > > > > > > > is
> > > > > > > > > > discussed in terms of "algorithms" here:
> > > > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991
> but I
> > > > think
> > > > > > we
> > > > > > > > > could
> > > > > > > > > > see how there might be some "mutation" extension steps
> that
> > > > would
> > > > > > > cover
> > > > > > > > > > your suggested API, plus batch operations, etc. We need a
> > way
> > > > to
> > > > > > add
> > > > > > > > > > "sugar" without it interfering with the consistency of
> the
> > > > core.
> > > > > > > > > Obviously
> > > > > > > > > > this is a bigger issue but perhaps important to solve to
> > > > > implement
> > > > > > > > steps
> > > > > > > > > in
> > > > > > > > > > the fashion you describe.
> > > > > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> > > > specify
> > > > > > what
> > > > > > > > > > element type the first Map argument should be applied to?
> > > what
> > > > > > about
> > > > > > > > > > mergeVP (i.e. vertex property as it too is an element) ?
> > > That's
> > > > > > > tricky
> > > > > > > > > but
> > > > > > > > > > I don't think we should miss that. Perhaps merge() could
> > be a
> > > > > > > "complex
> > > > > > > > > > modulator"?? that's a new concept of course, but you
> would
> > do
> > > > > > > > > g.V().merge()
> > > > > > > > > > and the label and first Map would fold to VertexStartStep
> > > (i.e.
> > > > > > V())
> > > > > > > > for
> > > > > > > > > > the lookup and then a MergeStep would follow - thus a
> > > "complex
> > > > > > > > modulator"
> > > > > > > > > > as it does more than just change the behavior of the
> > previous
> > > > > step
> > > > > > -
> > > > > > > it
> > > > > > > > > > also adds its own. I suppose it could also add has()
> steps
> > > > > followed
> > > > > > > by
> > > > > > > > > the
> > > > > > > > > > MergeStep and then the has() operations would fold in
> > > normally
> > > > as
> > > > > > > they
> > > > > > > > do
> > > > > > > > > > today. In this way, we can simplify to just one single
> > > > > > > > > > merge(String,Map,Map). ??
> > > > > > > > > > 6. One thing neither my approach nor yours seems to do is
> > > tell
> > > > > the
> > > > > > > user
> > > > > > > > > if
> > > > > > > > > > they created something or updated something - that's
> > another
> > > > > thing
> > > > > > > I've
> > > > > > > > > > seen users want to have in get-or-create. Here again we
> go
> > > > deeper
> > > > > > > into
> > > > > > > > a
> > > > > > > > > > less general step specification as alluded to in 4, but a
> > > > merge()
> > > > > > > step
> > > > > > > > as
> > > > > > > > > > proposed in 5, might return [Element,boolean] so as to
> > > provide
> > > > an
> > > > > > > > > indicator
> > > > > > > > > > of creation?
> > > > > > > > > > 7. You were just introducing your ideas here, so perhaps
> > you
> > > > > > haven't
> > > > > > > > > gotten
> > > > > > > > > > this far yet, but a shortcoming to doing
> > > merge(String,Map,Map)
> > > > is
> > > > > > > that
> > > > > > > > it
> > > > > > > > > > leaves open no opportunity to stream a List of Maps to a
> > > > merge()
> > > > > > for
> > > > > > > a
> > > > > > > > > form
> > > > > > > > > > of batch loading which is mighty common and one of the
> > > > variations
> > > > > > of
> > > > > > > > the
> > > > > > > > > > coalesce() pattern that I alluded to at the start of all
> > > this.
> > > > I
> > > > > > > think
> > > > > > > > > that
> > > > > > > > > > we would want to be sure that we left open the option to
> do
> > > > that
> > > > > > > > somehow.
> > > > > > > > > > 8. If we had a general purpose merge() step I wonder if
> it
> > > > makes
> > > > > > > > > developing
> > > > > > > > > > the API as you suggested easier to do?
> > > > > > > > > >
> > > > > > > > > > I think I'd like to solve the problems you describe in
> your
> > > > post
> > > > > as
> > > > > > > > well
> > > > > > > > > as
> > > > > > > > > > the ones in mine. There is some relation there, but gaps
> as
> > > > well.
> > > > > > > With
> > > > > > > > > more
> > > > > > > > > > discussion here we can figure something out.
> > > > > > > > > >
> > > > > > > > > > Thanks for starting this talk - good one!
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <
> > > > > > > dave@bechberger.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > I've had a few on and off discussions with a few people
> > > here,
> > > > > so
> > > > > > I
> > > > > > > > > wanted
> > > > > > > > > > > to send this out to everyone for feedback.
> > > > > > > > > > >
> > > > > > > > > > > What are people's thoughts on creating a new set of
> steps
> > > > that
> > > > > > > codify
> > > > > > > > > > > common Gremlin best practices?
> > > > > > > > > > >
> > > > > > > > > > > I think there are several common Gremlin patterns where
> > > users
> > > > > > would
> > > > > > > > > > benefit
> > > > > > > > > > > from the additional guidance that these codified steps
> > > > > represent.
> > > > > > > > The
> > > > > > > > > > > first one I would recommend though is codifying the
> > element
> > > > > > > existence
> > > > > > > > > > > pattern into a single Gremlin step, something like:
> > > > > > > > > > >
> > > > > > > > > > > mergeV(String, Map, Map)
> > > > > > > > > > >      String - The vertex label
> > > > > > > > > > >      Map (first) - The properties to match existing
> > > vertices
> > > > on
> > > > > > > > > > >      Map (second) - Any additional properties to set
> if a
> > > new
> > > > > > > vertex
> > > > > > > > is
> > > > > > > > > > > created (optional)
> > > > > > > > > > > mergeE(String, Map, Map)
> > > > > > > > > > >      String - The edge label
> > > > > > > > > > >      Map (first) - The properties to match existing
> edge
> > on
> > > > > > > > > > >      Map (second) - Any additional properties to set
> if a
> > > new
> > > > > > edge
> > > > > > > is
> > > > > > > > > > > created (optional)
> > > > > > > > > > >
> > > > > > > > > > > In each of these cases these steps would perform the
> same
> > > > > upsert
> > > > > > > > > > > functionality as the element existence pattern.
> > > > > > > > > > >
> > > > > > > > > > > Example:
> > > > > > > > > > >
> > > > > > > > > > > g.V().has('person','name','stephen').
> > > > > > > > > > >            fold().
> > > > > > > > > > >            coalesce(unfold(),
> > > > > > > > > > >                     addV('person').
> > > > > > > > > > >                       property('name','stephen').
> > > > > > > > > > >                       property('age',34))
> > > > > > > > > > >
> > > > > > > > > > > would become:
> > > > > > > > > > >
> > > > > > > > > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > > > > > > > > > >
> > > > > > > > > > > I think that this change would be a good addition to
> the
> > > > > language
> > > > > > > for
> > > > > > > > > > > several reasons:
> > > > > > > > > > >
> > > > > > > > > > > * This codifies the best practice for a specific
> > > > action/recipe,
> > > > > > > which
> > > > > > > > > > > reduces the chance that someone uses the pattern
> > > incorrectly
> > > > > > > > > > > * Most complex Gremlin traversals are verbose.
> Reducing
> > > the
> > > > > > amount
> > > > > > > > of
> > > > > > > > > > code
> > > > > > > > > > > that needs to be written and maintained allows for a
> > better
> > > > > > > developer
> > > > > > > > > > > experience.
> > > > > > > > > > > * It will lower the bar of entry for a developer by
> > making
> > > > > these
> > > > > > > > > actions
> > > > > > > > > > > more discoverable.  The more we can help bring these
> > > patterns
> > > > > to
> > > > > > > the
> > > > > > > > > > > forefront of the language via these pattern/meta steps
> > the
> > > > more
> > > > > > we
> > > > > > > > > guide
> > > > > > > > > > > users towards writing better Gremlin faster
> > > > > > > > > > > * This allows DB vendors to optimize for this pattern
> > > > > > > > > > >
> > > > > > > > > > > I know that this would likely be the first step in
> > Gremlin
> > > > that
> > > > > > > > > codifies
> > > > > > > > > > a
> > > > > > > > > > > pattern, so I'd like to get other's thoughts on this?
> > > > > > > > > > >
> > > > > > > > > > > Dave
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by David Bechberger <da...@bechberger.com>.
I agree with Josh's description of what the upsertV() functionality is
intended to be.

I also fully support the simplification provided by the by() modulation
that Stephen suggested, removing the second map.  I think that provides a
much cleaner and easier to comprehend syntax.

With that agreement, I think this does beg the question of if this should
be a new step (upsertV()) or just an additional signature on the addV()
step? I

Stephen alluded to this on the dev list and after thinking about this a bit
I think that I am favor of not adding a step and just adding a new
signature to the existing step (if possible).  Thoughts?


Dave

On Wed, Dec 9, 2020 at 4:33 AM Stephen Mallette <sp...@gmail.com>
wrote:

> Josh, thanks for your thoughts - some responses inline:
>
> On Tue, Dec 8, 2020 at 10:16 PM Josh Perryman <jo...@gmail.com>
> wrote:
>
> > I'll offer some thoughts. I'm seeing upsertV() as an idempotent
> getOrCreate
> > call which always returns a vertex with the label/property values
> specified
> > within the step. It's sort of a declarative pattern: "return this vertex
> to
> > me, find it if you can, create it if you must."
> >
>
> I like this description - I've added it to the gist, though it's a bit at
> odds with Dave's previous post, so we'll consider it a temporary addition
> until he responds.
>
>
> > On that account, I do like the simplification in 1. Repetition shouldn't
> be
> > necessary. In an ideal world, the engine should know the primary
> > identifiers (name or id) and find/create the vertex based on them. Any
> > other included values will be "trued up" as well. But this may be a
> bridge
> > too far for TinkerPop since knowing identifiers may require a specified
> > schema. I'd prefer to omit the third input, but it might be necessary to
> > keep it so that the second input can be for the matching use case.
> >
>
> In my most recent post on gremlin-users I think I came up with a nice way
> to get rid of the second Map. One Map that forms the full list of
> properties for upserting is easier than partitioning two Maps that
> essentially merge together. I imagine it's unlikely that application code
> will have that separation naturally so users will have the added step of
> trying to separate their data into searchable vs "just data". Getting us to
> one Map argument will simplify APIs for us and reduce complexity to users.
> Here is what I'd proposed for those not following over there:
>
> // match on name and age (or perhaps whatever the underlying graph system
> thinks is best?)
> g.upsertV('person', [name:'marko',age:29])
>
> // match on name only
> g.upsertV('person', [name:'marko',age:29]).by('name')
>
> // explicitly match on name and age
> g.upsertV('person', [name:'marko',age:29]).
>   by('name').by('age')
>
> // match on id only
> g.upsertV('person', [(T.id): 100, name:'marko',age:29]).by(T.id)
>
> // match on whatever the by(Traversal) predicate defines
> g.upsertV('person', [name:'marko',age:29]).
>   by(has('name', 'marko'))
>
> // match on id, then update age
> g.upsertV('person', [(T.id): 100, name:'marko']).by(T.id).
>   property('age',29)
>
> With this model, we get one Map argument that represents the complete
> property set to be added/updated to the graph and the user can hint on what
> key they wish to match on using by() where that sort of step modulation
> should be a well understood and familiar concept in Gremlin at this point.
>
> So that means I think 2 should always match or update the additional
> > values. Again, we're specifying the expected result and letting the
> engine
> > figure out best how to return that results and appropriately maintain
> > state.
> >
>
> I again like this description, but we'll see what Dave's thoughts are since
> he's a bit behind on the threads at this point I think.
>
>
> > I'm also presuming that anything not included as inputs to the upsertV()
> > step are then to be handled by following steps. I'm hoping that is a
> > sufficient approach for addressing the multi/meta property use cases
> > brought up in 3.
> >
>
> yeah................it needs more thought. I spent more time thinking on
> this issue yesterday than I have for all the previous posts combined and I
> think it yielded something good in that revised syntax. It's going to take
> more of that kind of elbow grease to dig into these lesser use cases to
> make sure we aren't coding ourselves into corners.
>
>
> > I do like the idea of using modulators (with(), by()) for more
> > sophisticated usage and advanced use cases. Also, the streaming examples
> > are quite elegant allowing for a helpful separation of data and logic.
> >
>
> cool - hope you like the revised syntax I posted then. :)
>
>
> > That's my humble take. This is a very welcome addition to the language
> and
> > I appreciate the thoughtful & collaborative approach to the design
> > considerations.
> >
>
> Thanks again and please keep the thoughts coming. Lots of other interesting
> design discussions seem to be brewing.
>
>
> >
> > Josh
> >
> > On Tue, Dec 8, 2020 at 8:57 AM Stephen Mallette <sp...@gmail.com>
> > wrote:
> >
> > > I started a expanded this discussion to gremlin-users for a wider
> > audience
> > > and the thread is starting to grow:
> > >
> > > https://groups.google.com/g/gremlin-users/c/QBmiOUkA0iI/m/pj5Ukiq6AAAJ
> > >
> > > I guess we'll need to summarize that discussion back here now....
> > >
> > > I did have some more thoughts to hang out there and figured that I
> > wouldn't
> > > convolute the discussion on gremlin-users with it so I will continue
> the
> > > discussion here.
> > >
> > > 1, The very first couple of examples seem wrong (or at least not best
> > > demonstrating the usage):
> > >
> > > g.upsertV('person', [name: 'marko'],
> > >                     [name: 'marko', age: 29])
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1, name: 'Marko'])
> > >
> > > should instead be:
> > >
> > > g.upsertV('person', [name: 'marko'],
> > >                     [age: 29])
> > > g.upsertV('person', [(T.id): 1],
> > >                     [name: 'Marko'])
> > >
> > > 2. I can't recall if we made this clear anywhere but in situations
> where
> > we
> > > "get" rather than "create" do the additional properties act in an
> update
> > > fashion to the element that was found? I think I've been working on the
> > > assumption that it would, though perhaps that is not always desirable?
> > >
> > > 3. We really never settled up how to deal with multi/meta-properties.
> > That
> > > story should be clear so that when we document upsert() we include the
> > > approaches for the fallback patterns that don't meet the 90% of use
> cases
> > > we are targeting and I sense that we're saying that
> meta/multi-properties
> > > don't fit in that bucket. So with that in mind, I don't think that the
> > > following works for metaproperties:
> > >
> > > g.upsertV('person', [(T.id): 1],
> > >                     [name:[acl: 'public'])
> > >
> > > as it doesn't let us set the value for the "name" just the pairs for
> the
> > > meta-properties. I guess a user would have to fall back to:
> > >
> > > g.upsertV('person', [(T.id): 1]).
> > >   property('name','marko','acl','public')
> > >
> > > // or use the additional properties syntax
> > > g.upsertV('person', [(T.id): 1],[name:'marko']).
> > >   properties('name').property('acl','public')
> > >
> > > // or if there were multi-properties then maybe...
> > > g.upsertV('person', [(T.id): 1],[name:'marko']).
> > >   properties('name').hasValue('marko').property('acl','public')
> > >
> > > As for multi-properties, I dont think we should assume that a List
> object
> > > should be interpreted by Gremlin as a multi-property. Perhaps we just
> > rely
> > > on the underlying graph to properly deal with that given a schema or
> the
> > > user falls back to:
> > >
> > > // this ends up as however the graph deals with a List object
> > > g.upsertV('person', [(T.id): 1], [lang: ['java', 'scala', 'java'])
> > >
> > > // this is explicit
> > > g.upsertV('person', [(T.id): 1]).
> > >   property(list, 'lang', 'java').
> > >   property(list, 'lang, 'scala').
> > >   property(list, 'lang', 'java')
> > >
> > > If that makes sense to everyone, I will update the gist.
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Tue, Oct 27, 2020 at 11:51 PM David Bechberger <dave@bechberger.com
> >
> > > wrote:
> > >
> > > > Hello Stephen,
> > > >
> > > > Thanks for making that gist, its much easier to follow along with the
> > > > proposed syntax there.  To your specific comments:
> > > >
> > > > #1 - My only worry with the term upsert is that it is not as widely a
> > > used
> > > > term for this sort of pattern as "Merge" (e.g. SQL, Cypher).
> However I
> > > > don't have a strong opinion on this, so I am fine with either.
> > > > #2 - My only real objective here is to make sure that we make the
> > 80-90%
> > > > case easy and straightforward.  I think that having the fallback
> option
> > > of
> > > > using the current syntax for any complicated edge cases should be
> > > > considered here as well. I'd appreciate your thoughts here as these
> are
> > > > good points you bring up that definitely fall into the 80-90% use
> case.
> > > > #3 - Those points make sense to me, not sure I have anything further
> to
> > > add
> > > > #4 - I don't think I would expect to have the values extracted from
> the
> > > > traversal filters but I'd be interested in other opinions on that.
> > > >
> > > > Thanks,
> > > > Dave
> > > >
> > > > On Thu, Oct 15, 2020 at 5:30 AM Stephen Mallette <
> spmallette@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > It's been a couple weeks since we moved this thread so I went back
> > > > through
> > > > > it and refreshed my mind with what's there. Dave, nice job putting
> > all
> > > > > those examples together. I've recollected them all together and
> have
> > > just
> > > > > been reviewing them in a more formatted style with proper Groovy
> > syntax
> > > > > than what can be accomplished in this mailing list. Anyone who
> cares
> > to
> > > > > review in that form can do so here (i've pasted its markdown
> contents
> > > at
> > > > > the bottom of this post as well):
> > > > >
> > > > >
> https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
> > > > >
> > > > > In this light, I have the following comments and thoughts:
> > > > >
> > > > > 1. I feel more inclined toward saving "merge" for a more
> generalized
> > > step
> > > > > of that name and have thus renamed the examples to use "upsert".
> > > > >
> > > > > 2. I still don't quite feel comfortable with meta/multi-properties
> > > > > examples. Perhaps you could explain further as it's possible I'm
> not
> > > > > thinking things through properly. For meta-properties the example
> is:
> > > > >
> > > > > g.upsertV('person', [(T.id): 1],
> > > > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > > > >
> > > > > So I see that "name" takes a Map here, but how does Gremlin know if
> > > that
> > > > > means "meta-property" or "Map value" and if the former, then how do
> > you
> > > > set
> > > > > the value of "name"?  My question is similar to the multi-property
> > > > examples
> > > > > - using this one:
> > > > >
> > > > > g.upsertV('person', [(T.id): 1],
> > > > >                     [(T.id): 1,lang: ['java', 'scala'])
> > > > >
> > > > > How does Gremlin know if the user means that "lang" should be a
> > > > > "multi-property" with Cardinality.list (or Cardinality.set for that
> > > > matter)
> > > > > or a value of Cardinality.single with a List/Array value? I can
> > perhaps
> > > > > suggest some solutions here but wanted to make sure I wasn't just
> > > > > misunderstanding something.
> > > > >
> > > > > 3. The "upsert with stream" example is:
> > > > >
> > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > > >   upsertV(values(label), valueMap(id), valueMap())
> > > > >
> > > > > That would establish an API of upsertV(Traversal, Traversal,
> > Traversal)
> > > > > which would enable your final thought of your last post with:
> > > > >
> > > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> > > 'NY'),
> > > > >                         [active: true])
> > > > >
> > > > > The open-ended nature of that is neat but comes with some
> complexity
> > > with
> > > > > steps that traditionally take an "open" Traversal (and this step
> now
> > > has
> > > > up
> > > > > to three of them) - perhaps, that's our own fault in some ways. For
> > the
> > > > > "stream" concept, values()/valueMap() won't immediately work that
> way
> > > as
> > > > > they only apply to Element...given today's Gremlin semantics, I
> think
> > > > we'd
> > > > > re-write that as:
> > > > >
> > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > > >   upsertV(select(label), select(id), select(id,label,'name'))
> > > > >
> > > > > Though, even that doesn't quite work because you can't select(T)
> > right
> > > > now
> > > > > which means you'd need to go with:
> > > > >
> > > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > > >          [id: 2, label: 'person', name: 'josh']).
> > > > >   upsertV(select('label'), select('id'),
> select('id','label','name'))
> > > > >
> > > > > not terrible in a way and perhaps we could fix select(T) - can't
> > > remember
> > > > > why that isn't allowed except that select() operates over multiple
> > > scopes
> > > > > and perhaps trying T through those scopes causes trouble.
> > > > >
> > > > > 4. Continuing with the use of Traversal as arguments, let's go back
> > to:
> > > > >
> > > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> > > 'NY'),
> > > > >                         [active: true])
> > > > >
> > > > > I originally had it in my mind to prefer this approach over a Map
> for
> > > the
> > > > > search as it provides a great deal of flexibility and fits the
> common
> > > > > filtering patterns that users follow really well. I suppose the
> > > question
> > > > is
> > > > > what to do in the situation of "create". Would we extract all the
> > > strict
> > > > > equality filter values from the initial has(), thus, in the
> example,
> > > > > "state" but not "age", and create a vertex that has "state=NY,
> > > > > active=true"? That is possible with how things are designed in
> > Gremlin
> > > > > today I think.
> > > > >
> > > > > This syntax does create some conflict with the subject of 3 and
> > streams
> > > > > because the traverser in the streams case is the incoming Map but
> for
> > > > this
> > > > > context it's meant as a filter of V(). Getting too whacky?
> > > > >
> > > > > 5. I like the idea of using a side-effect to capture whether the
> > > element
> > > > > was created or not. that makes sense. I think that can work.
> > > > >
> > > > > ==============================
> > > > > = gist contents
> > > > > ==============================
> > > > >
> > > > > # API
> > > > >
> > > > > ```java
> > > > > upsertV(String label, Map matchOrCreateProperties)
> > > > > upsertV(String label, Map matchOrCreateProperties, Map
> > > > > additionalProperties)
> > > > > upsertV(Traversal label, Traversal matchOrCreateProperties)
> > > > > upsertV(Traversal label, Traversal matchOrCreateProperties,
> Traversal
> > > > > additionalProperties)
> > > > > upsertV(...).
> > > > >   with(WithOptions.sideEffectLabel, 'a')
> > > > >
> > > > > upsertE(String label, Map matchOrCreateProperties)
> > > > > upsertE(Traversal label, Traversal matchOrCreateProperties)
> > > > > upsertE(String label, Map matchOrCreateProperties, Map
> > > > > additionalProperties)
> > > > > upsertE(Traversal label, Traversal matchOrCreateProperties,
> Traversal
> > > > > additionalProperties)
> > > > > upsertE(...).
> > > > >   from(Vertex incidentOut).to(Vertex incidentIn)
> > > > >   with(WithOptions.sideEffectLabel, 'a')
> > > > > ```
> > > > >
> > > > > # Examples
> > > > >
> > > > > ## upsert
> > > > >
> > > > > ```groovy
> > > > > g.upsertV('person', [name: 'marko'],
> > > > >                     [name: 'marko', age: 29])
> > > > > ```
> > > > >
> > > > > ## upsert with id
> > > > >
> > > > > ```groovy
> > > > > g.upsertV('person', [(T.id): 1],
> > > > >                     [(T.id): 1, name: 'Marko'])
> > > > > ```
> > > > >
> > > > > ## upsert with Meta Properties
> > > > >
> > > > > ```groovy
> > > > > g.upsertV('person', [(T.id): 1],
> > > > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > > > > ```
> > > > >
> > > > > ## upsert with Multi Properties
> > > > >
> > > > > ```groovy
> > > > > g.upsertV('person', [(T.id): 1],
> > > > >                     [(T.id): 1,lang: ['java', 'scala'])
> > > > > ```
> > > > >
> > > > > ## upsert with Single Cardinality
> > > > >
> > > > > ```groovy
> > > > > g.upsertV('person', [(T.id): 1],
> > > > >                     [(T.id): 1, lang: 'java')
> > > > > ```
> > > > >
> > > > > ## upsert with List Cardinality
> > > > >
> > > > > ```groovy
> > > > > g.upsertV('person', [(T.id): 1],
> > > > >                     [(T.id): 1, lang: ['java', 'scala', 'java'])
> > > > > ```
> > > > >
> > > > > ## upsert with Set Cardinality
> > > > >
> > > > > ```groovy
> > > > > g.upsertV('person', [(T.id): 1],
> > > > >                     [(T.id): 1, lang: ['java','scala'])
> > > > > ```
> > > > >
> > > > > ## upsert with List Cardinality - and add value to list
> > > > >
> > > > > ```groovy
> > > > > g.upsertV('person', [(T.id): 1],
> > > > >                     [(T.id): 1, lang: ['java','scala','java']).
> > > > >   property(Cardinality.list, 'lang', 'java')
> > > > > ```
> > > > >
> > > > > ## upsert with stream
> > > > >
> > > > > ```groovy
> > > > > // doesn't work with today's Gremlin semantics
> > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > > >   upsertV(values(label), valueMap(id), valueMap())
> > > > >
> > > > > // the following would be more in line with what's possible with
> > > existing
> > > > > Gremlin semantics:
> > > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > > >          [id: 2, label: 'person', name: 'josh']).
> > > > >   upsertV(select('label'), select('id'),
> select('id','label','name'))
> > > > > ```
> > > > >
> > > > > ## upsert with reporting added or updated side effect
> > > > >
> > > > > ```groovy
> > > > > g.upsertV('person', [name: 'marko'],
> > > > >                     [name: 'marko', age: 29]).
> > > > >     with(WithOptions.sideEffectLabel, 'a').
> > > > >   project('vertex', 'added').
> > > > >     by().
> > > > >     by(cap('a'))
> > > > > ```
> > > > >
> > > > > ## upsert Edge assuming a self-relation of "marko"
> > > > >
> > > > > ```groovy
> > > > > g.V().has('person','name','marko').
> > > > >   upsertE( 'self', [weight:0.5])
> > > > > ```
> > > > >
> > > > > ## upsert Edge with incident vertex checks
> > > > >
> > > > > ```groovy
> > > > > g.upsertE('knows', [weight:0.5]).
> > > > >    from(V(1)).to(V(2))
> > > > > g.V().has('person','name','marko').
> > > > >   upsertE( 'knows', [weight:0.5]).
> > > > >     to(V(2))
> > > > > ```
> > > > >
> > > > > ## upsert using a Traversal for the match
> > > > >
> > > > > ```groovy
> > > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> > > 'NY'),
> > > > >                         [active: true])
> > > > > ```
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Sep 28, 2020 at 6:49 PM David Bechberger <
> > dave@bechberger.com>
> > > > > wrote:
> > > > >
> > > > > > So, I've been doing some additional thinking about the ways that
> > this
> > > > > could
> > > > > > work based on the comments below and have put my comments inline.
> > > > > >
> > > > > > Dave
> > > > > >
> > > > > > On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <
> > > spmallette@gmail.com
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > I added some thoughts inline below:
> > > > > > >
> > > > > > > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <
> > > > dave@bechberger.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Thanks for the detailed comments Stephen.  I have addressed
> > them
> > > > > inline
> > > > > > > > below.
> > > > > > > >
> > > > > > > > I did read the proposal from earlier and I think that we are
> in
> > > > close
> > > > > > > > agreement with what we are trying to accomplish.  I also
> fully
> > > > > support
> > > > > > > > Josh's comment on providing a mechanism for submitting a map
> of
> > > > > > > properties
> > > > > > > > as manually unrolling this all right now leads to a lot of
> > > > potential
> > > > > > for
> > > > > > > > error and a long messy traversal.
> > > > > > > >
> > > > > > > > I'm looking forward to this discussion on how to merge these
> > two
> > > > > > > proposals.
> > > > > > > >
> > > > > > > >
> > > > > > > > 1. How would multi/meta-properties fit into the API you've
> > > > proposed?
> > > > > > > >
> > > > > > > > My first thought here is that multi-properties would be
> > > represented
> > > > > as
> > > > > > > > lists in the map, e.g.
> > > > > > > >
> > > > > > > > {names: ['Dave', 'David']}
> > > > > > > >
> > > > > > > > and meta-properties would be represented as maps in the maps,
> > > e.g.
> > > > > > > >
> > > > > > > > {name: {first: 'Dave', last: 'Bechberger'}}
> > > > > > > >
> > > > > > > > I can't say I've thought through all the implications of this
> > > > though
> > > > > so
> > > > > > > it
> > > > > > > > is an area we would need to explore.
> > > > > > > >
> > > > > > >
> > > > > > > The first implication that comes to mind is that it makes the
> > > > > assumption
> > > > > > > the user wants multi/meta-properties as opposed to a single
> > > > cardinality
> > > > > > of
> > > > > > > List in the first case and a Map as a property value in the
> > second
> > > > > case.
> > > > > > I
> > > > > > > suppose that graphs with a schema could resolve those
> assumptions
> > > but
> > > > > > > graphs that are schemaless would have a problem. The issue
> could
> > be
> > > > > > > resolved by specialized configuration of "g" or per merge()
> step
> > > > using
> > > > > a
> > > > > > > with() modulator I suppose but that goes into a yet another
> level
> > > of
> > > > > > > implications to consider. I've often wondered if the start
> point
> > > for
> > > > > > > getting types/schema into TP3 without a full rewrite would be
> in
> > > this
> > > > > > form
> > > > > > > where Gremlin would be given hints as to what to expect as to
> the
> > > > types
> > > > > > of
> > > > > > > data it might encounter while traversing. Anyway, I'd be
> hesitant
> > > to
> > > > go
> > > > > > > down paths that don't account for multi/metaproperties well.
> > They
> > > > are
> > > > > > > first class citizens in TP3 (with those hoping for extension of
> > at
> > > > > least
> > > > > > > multiproperties to edges) and while I find them a constant
> > > annoyance
> > > > > for
> > > > > > so
> > > > > > > many reasons, we're kinda stuck with them.
> > > > > > >
> > > > > >
> > > > > > I agree we need to account for multi/meta properties as 1st class
> > > > > > citizens.  This is my current thinking on the syntax for the
> > > situations
> > > > > we
> > > > > > have laid out so far:
> > > > > >
> > > > > > *Merge*
> > > > > > g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age': 29})
> > > > > >
> > > > > >
> > > > > > *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name':
> > > 'Marko'})
> > > > > >
> > > > > >
> > > > > > *Merge with Meta Properties* g.mergeV('name', {T.id: 1}, {T.id:
> 1,
> > > > > 'name':
> > > > > > {'first': Marko', 'last': 'R'})
> > > > > >
> > > > > >
> > > > > > *Merge with Multi Properties * g.mergeV('name', {T.id: 1}, {T.id:
> > 1,
> > > > > > 'lang': ['java', 'scala'])
> > > > > >
> > > > > >
> > > > > > *Merge with Single Cardinality* g.mergeV('name', {T.id: 1},
> {T.id:
> > 1,
> > > > > > 'lang': 'java')
> > > > > >
> > > > > >
> > > > > > *Merge with List Cardinality* g.mergeV('name', {T.id: 1}, {T.id:
> 1,
> > > > > 'lang':
> > > > > > ['java', 'scala', 'java'])
> > > > > >
> > > > > >
> > > > > > *Merge with Set Cardinality * g.mergeV('name', {T.id: 1}, {T.id:
> 1,
> > > > > 'lang':
> > > > > > ['java', 'scala'])
> > > > > >
> > > > > > Since in a mergeV() scenario we are only ever adding whatever
> > values
> > > > are
> > > > > > passed in there would be no need to specify the cardinality of
> the
> > > > > property
> > > > > > being added.  If they wanted to add a value to an existing
> property
> > > > then
> > > > > > the current property() method would still be available on the
> > output
> > > > > > traversal. e.g.
> > > > > > g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java', 'scala',
> > > > > > 'java']).property(Cardinality.list, 'lang, 'java')
> > > > > >
> > > > > >
> > > > > > *Merge with stream * g.inject([{id: 1, label: 'person', 'name':
> > > > 'marko'},
> > > > > > {id: 2, label: 'person', 'name': 'josh'}]).
> > > > > >   mergeV(values('label'), valueMap('id'), valueMap())
> > > > > >
> > > > > >
> > > > > > *Merge with reporting added or updated side effect*
> > g.mergeV('name',
> > > > > > {'name': 'marko'}, {'name': 'marko', 'age': 29}).
> > > > > >   with(WithOptions.sideEffectLabel, 'a').
> > > > > >   project('vertex', 'added').
> > > > > >     by().
> > > > > >     by(cap('a'))
> > > > > >
> > > > > >
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > 2. How would users set the T.id on creation? would that T.id
> > just
> > > > be
> > > > > a
> > > > > > > key
> > > > > > > > in the first Map argument?
> > > > > > > >
> > > > > > > > Yes, I was thinking that T.id would be the key name, e.g.:
> > > > > > > >
> > > > > > > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
> > > > > > > >
> > > > > > >
> > > > > > > ok - I can't say I see a problem with that atm.
> > > > > > >
> > > > > > >
> > > > > > > > 3. I do like the general idea of a match on multiple
> properties
> > > for
> > > > > the
> > > > > > > > first argument as a convenience but wonder about the
> > specificity
> > > of
> > > > > > this
> > > > > > > > API a bit as it focuses heavily on equality - I suppose
> that's
> > > most
> > > > > > cases
> > > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > > >
> > > > > > > > In most cases I've seen use exact matches on the vertex or
> > > edge.  I
> > > > > > think
> > > > > > > > it might be best to keep this straightforward as any complex
> > edge
> > > > > cases
> > > > > > > > still can perform the same functionality using the coalesce()
> > > > > pattern.
> > > > > > > >
> > > > > > >
> > > > > > > I played around with the idea to use modulators to apply
> > additional
> > > > > > > constraints:
> > > > > > >
> > > > > > > g.mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > > >     by(has('age', gt(29))
> > > > > > >
> > > > > > > I suppose that's neat but maybe not...we could easily get the
> > same
> > > > > thing
> > > > > > > from:
> > > > > > >
> > > > > > > g.V().has('person','age',gt(29)).
> > > > > > >   mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > > >
> > > > > > > which is more readable though it does make it harder for
> > providers
> > > to
> > > > > > > optimize upsert operations if they could make use of the has()
> as
> > > > part
> > > > > of
> > > > > > > that. I think I like has() as part of the main query rather
> than
> > > in a
> > > > > > by()
> > > > > > > - just thinking out loud.
> > > > > > >
> > > > > > >
> > > > > > Another potential option here would be to allow the second
> > parameter
> > > of
> > > > > > mergeV() accept a traversal.  Not sure how much complexity that
> > would
> > > > add
> > > > > > though
> > > > > >
> > > > > > g.V().mergeV('person', __.has('person','age',gt(29)).has('state',
> > > > 'NY'}.,
> > > > > > {'active': true}).
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > > 4. I think your suggestion points to one of the troubles
> > Gremlin
> > > > has
> > > > > > > which
> > > > > > > > we see with "algorithms" - extending the language with new
> > steps
> > > > that
> > > > > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > > > > shortestPath() step) pollutes the core language a bit, hence
> my
> > > > > > > > generalization of "merging" in my link above which fits into
> > the
> > > > core
> > > > > > > > Gremlin language style. There is a bigger picture where we
> are
> > > > > missing
> > > > > > > > something in Gremlin that lets us extend the language in ways
> > > that
> > > > > let
> > > > > > us
> > > > > > > > easily introduce new steps that aren't for general purpose.
> > This
> > > > > issue
> > > > > > is
> > > > > > > > discussed in terms of "algorithms" here:
> > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I
> > think
> > > > we
> > > > > > > could
> > > > > > > > see how there might be some "mutation" extension steps that
> > would
> > > > > cover
> > > > > > > > your suggested API, plus batch operations, etc. We need a way
> > to
> > > > add
> > > > > > > > "sugar" without it interfering with the consistency of the
> > core.
> > > > > > > Obviously
> > > > > > > > this is a bigger issue but perhaps important to solve to
> > > implement
> > > > > > steps
> > > > > > > in
> > > > > > > > the fashion you describe.
> > > > > > > >
> > > > > > > > I agree that the "algorithm" steps do seem a bit odd in the
> > core
> > > > > > language
> > > > > > > > but I think the need is there.  I'd be interested in
> furthering
> > > > this
> > > > > > > > discussion but I think these "pattern" steps may or may not
> be
> > > the
> > > > > same
> > > > > > > as
> > > > > > > > the algorithm steps.  I'll have to think on that.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> > specify
> > > > what
> > > > > > > > element type the first Map argument should be applied to?
> what
> > > > about
> > > > > > > > mergeVP (i.e. vertex property as it too is an element) ?
> That's
> > > > > tricky
> > > > > > > but
> > > > > > > > I don't think we should miss that. Perhaps merge() could be a
> > > > > "complex
> > > > > > > > modulator"?? that's a new concept of course, but you would do
> > > > > > > g.V().merge()
> > > > > > > > and the label and first Map would fold to VertexStartStep
> (i.e.
> > > > V())
> > > > > > for
> > > > > > > > the lookup and then a MergeStep would follow - thus a
> "complex
> > > > > > modulator"
> > > > > > > > as it does more than just change the behavior of the previous
> > > step
> > > > -
> > > > > it
> > > > > > > > also adds its own. I suppose it could also add has() steps
> > > followed
> > > > > by
> > > > > > > the
> > > > > > > > MergeStep and then the has() operations would fold in
> normally
> > as
> > > > > they
> > > > > > do
> > > > > > > > today. In this way, we can simplify to just one single
> > > > > > > > merge(String,Map,Map). ??
> > > > > > > >
> > > > > > > > I agree that we should also think about how to include
> > properties
> > > > in
> > > > > > this
> > > > > > > > merge construct.  The reason I was thinking about mergeV()
> and
> > > > > mergeE()
> > > > > > > is
> > > > > > > > that it follows the same pattern as the already well
> understood
> > > > > > > > addV()/addE() steps.  I am a bit wary of trying to generalize
> > > this
> > > > > down
> > > > > > > to
> > > > > > > > a single merge() step as these sorts of complex overloads
> make
> > it
> > > > > hard
> > > > > > to
> > > > > > > > figure out which Gremlin step you should use for a particular
> > > > pattern
> > > > > > > (e.g.
> > > > > > > > upsert vertex or upsert edge).  One thing that I think
> > customers
> > > > find
> > > > > > > > useful about the addV/addE step is that they are very
> > > discoverable,
> > > > > the
> > > > > > > > name tells me what functionality to expect from that step.
> > > > > > > >
> > > > > > >
> > > > > > > I'm starting to agree with the idea that we have to do
> something
> > > like
> > > > > > > mergeV()/E() as it wouldn't quite work as a start step
> otherwise.
> > > We
> > > > > > > couldn't do:
> > > > > > >
> > > > > > > g.merge()
> > > > > > >
> > > > > > > as we wouldnt know what sort of Element it applied to. If so, I
> > > > wonder
> > > > > if
> > > > > > > it would be better to preserve "merge" for my more general
> > use-case
> > > > and
> > > > > > > prefer upsertV()/E()?
> > > > > > >
> > > > > > > Also, perhaps mergeVP() doesn't need to exist as perhaps it is
> an
> > > > > > uncommon
> > > > > > > case (compared to V/E()) and we don't really have addVP() as an
> > > > > analogous
> > > > > > > step. Perhaps existing coalesce() patterns and/or my more
> general
> > > > > purpose
> > > > > > > merge() step would satisfy those situations??
> > > > > > >
> > > > > >
> > > > > > I think that a mergeVP() type step may require a different
> pattern
> > > > since
> > > > > it
> > > > > > would really need to accept a map of values otherwise it would
> > > > > essentially
> > > > > > perform the same function as property()
> > > > > >
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > 6. One thing neither my approach nor yours seems to do is
> tell
> > > the
> > > > > user
> > > > > > > if
> > > > > > > > they created something or updated something - that's another
> > > thing
> > > > > I've
> > > > > > > > seen users want to have in get-or-create. Here again we go
> > deeper
> > > > > into
> > > > > > a
> > > > > > > > less general step specification as alluded to in 4, but a
> > merge()
> > > > > step
> > > > > > as
> > > > > > > > proposed in 5, might return [Element,boolean] so as to
> provide
> > an
> > > > > > > indicator
> > > > > > > > of creation?
> > > > > > > >
> > > > > > > > Hmm, I'll have to think about that.  How would returning
> > multiple
> > > > > > values
> > > > > > > > work if we want to chain these together. e.g. Add a vertex
> > > between
> > > > > two
> > > > > > > > edges but make sure the vertices exist?
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > > Added a thought on how to accomplish this above that I'd like to
> > get
> > > > > > thoughts on.
> > > > > >
> > > > > >
> > > > > > > >
> > > > > > > > 7. You were just introducing your ideas here, so perhaps you
> > > > haven't
> > > > > > > gotten
> > > > > > > > this far yet, but a shortcoming to doing
> merge(String,Map,Map)
> > is
> > > > > that
> > > > > > it
> > > > > > > > leaves open no opportunity to stream a List of Maps to a
> > merge()
> > > > for
> > > > > a
> > > > > > > form
> > > > > > > > of batch loading which is mighty common and one of the
> > variations
> > > > of
> > > > > > the
> > > > > > > > coalesce() pattern that I alluded to at the start of all
> this.
> > I
> > > > > think
> > > > > > > that
> > > > > > > > we would want to be sure that we left open the option to do
> > that
> > > > > > somehow.
> > > > > > > > 8. If we had a general purpose merge() step I wonder if it
> > makes
> > > > > > > developing
> > > > > > > > the API as you suggested easier to do?
> > > > > > > >
> > > > > > > > Hmm, let me think about that one.
> > > > > > > >
> > > > > > > >
> > > > > > > I will add an item 8 to think about which I didn't mention
> > before:
> > > > > > >
> > > > > > > 8. The signature you suggested for mergeE() is:
> > > > > > >
> > > > > > > mergeE(String, Map, Map)
> > > > > > >      String - The edge label
> > > > > > >      Map (first) - The properties to match existing edge on
> > > > > > >      Map (second) - Any additional properties to set if a new
> > edge
> > > is
> > > > > > > created (optional)
> > > > > > >
> > > > > > > but does that exactly answer the need? Typically the idea is to
> > > > detect
> > > > > an
> > > > > > > edge between two vertices not globally in the graph by way of
> > some
> > > > > > > properties. This signature doesn't seem to allow for that as it
> > > > doesn't
> > > > > > > allow specification of the vertices to test against. Perhaps
> the
> > > > answer
> > > > > > is
> > > > > > > to use from() and to() modulators?
> > > > > > >
> > > > > > > g.mergeE('knows', {'weight':0.5}).
> > > > > > >   from(V(1)).to(V(2))
> > > > > > > g.V().has('person','name','marko').
> > > > > > >   mergeE( 'knows', {'weight':0.5}).
> > > > > > >     to(V(2))
> > > > > > > g.V().has('person','name','marko').
> > > > > > >   mergeE( 'self', {'weight':0.5})
> > > > > > >
> > > > > > > That seems to work?
> > > > > > >
> > > > > >
> > > > > > I think that is an elegant solution to that problem.  I like it
> and
> > > it
> > > > > > keeps in line with the way that addE() works.
> > > > > >
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <
> > > > > spmallette@gmail.com
> > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > I like our coalesce() pattern but it is verbose and over
> time
> > > it
> > > > > has
> > > > > > > gone
> > > > > > > > > from a simple pattern to one with numerous variations for
> all
> > > > > manner
> > > > > > of
> > > > > > > > > different sorts of merge-like operations. As such, I do
> think
> > > we
> > > > > > should
> > > > > > > > > introduce something to cover this pattern.
> > > > > > > > >
> > > > > > > > > I like that you used the word "merge" in your description
> of
> > > this
> > > > > as
> > > > > > it
> > > > > > > > is
> > > > > > > > > the word I've liked using. You might want to give a look at
> > my
> > > > > > proposed
> > > > > > > > > merge() step from earlier in the year:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> > > > > > > > >
> > > > > > > > > I'm just going to dump thoughts as they come regarding what
> > you
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > 1. How would multi/meta-properties fit into the API you've
> > > > > proposed?
> > > > > > > > > 2. How would users set the T.id on creation? would that
> T.id
> > > just
> > > > > be
> > > > > > a
> > > > > > > > key
> > > > > > > > > in the first Map argument?
> > > > > > > > > 3. I do like the general idea of a match on multiple
> > properties
> > > > for
> > > > > > the
> > > > > > > > > first argument as a convenience but wonder about the
> > > specificity
> > > > of
> > > > > > > this
> > > > > > > > > API a bit as it focuses heavily on equality - I suppose
> > that's
> > > > most
> > > > > > > cases
> > > > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > > > > 4. I think your suggestion points to one of the troubles
> > > Gremlin
> > > > > has
> > > > > > > > which
> > > > > > > > > we see with "algorithms" - extending the language with new
> > > steps
> > > > > that
> > > > > > > > > provides a form of "sugar" (e.g. in algorithms we end up
> with
> > > > > > > > > shortestPath() step) pollutes the core language a bit,
> hence
> > my
> > > > > > > > > generalization of "merging" in my link above which fits
> into
> > > the
> > > > > core
> > > > > > > > > Gremlin language style. There is a bigger picture where we
> > are
> > > > > > missing
> > > > > > > > > something in Gremlin that lets us extend the language in
> ways
> > > > that
> > > > > > let
> > > > > > > us
> > > > > > > > > easily introduce new steps that aren't for general purpose.
> > > This
> > > > > > issue
> > > > > > > is
> > > > > > > > > discussed in terms of "algorithms" here:
> > > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I
> > > think
> > > > > we
> > > > > > > > could
> > > > > > > > > see how there might be some "mutation" extension steps that
> > > would
> > > > > > cover
> > > > > > > > > your suggested API, plus batch operations, etc. We need a
> way
> > > to
> > > > > add
> > > > > > > > > "sugar" without it interfering with the consistency of the
> > > core.
> > > > > > > > Obviously
> > > > > > > > > this is a bigger issue but perhaps important to solve to
> > > > implement
> > > > > > > steps
> > > > > > > > in
> > > > > > > > > the fashion you describe.
> > > > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> > > specify
> > > > > what
> > > > > > > > > element type the first Map argument should be applied to?
> > what
> > > > > about
> > > > > > > > > mergeVP (i.e. vertex property as it too is an element) ?
> > That's
> > > > > > tricky
> > > > > > > > but
> > > > > > > > > I don't think we should miss that. Perhaps merge() could
> be a
> > > > > > "complex
> > > > > > > > > modulator"?? that's a new concept of course, but you would
> do
> > > > > > > > g.V().merge()
> > > > > > > > > and the label and first Map would fold to VertexStartStep
> > (i.e.
> > > > > V())
> > > > > > > for
> > > > > > > > > the lookup and then a MergeStep would follow - thus a
> > "complex
> > > > > > > modulator"
> > > > > > > > > as it does more than just change the behavior of the
> previous
> > > > step
> > > > > -
> > > > > > it
> > > > > > > > > also adds its own. I suppose it could also add has() steps
> > > > followed
> > > > > > by
> > > > > > > > the
> > > > > > > > > MergeStep and then the has() operations would fold in
> > normally
> > > as
> > > > > > they
> > > > > > > do
> > > > > > > > > today. In this way, we can simplify to just one single
> > > > > > > > > merge(String,Map,Map). ??
> > > > > > > > > 6. One thing neither my approach nor yours seems to do is
> > tell
> > > > the
> > > > > > user
> > > > > > > > if
> > > > > > > > > they created something or updated something - that's
> another
> > > > thing
> > > > > > I've
> > > > > > > > > seen users want to have in get-or-create. Here again we go
> > > deeper
> > > > > > into
> > > > > > > a
> > > > > > > > > less general step specification as alluded to in 4, but a
> > > merge()
> > > > > > step
> > > > > > > as
> > > > > > > > > proposed in 5, might return [Element,boolean] so as to
> > provide
> > > an
> > > > > > > > indicator
> > > > > > > > > of creation?
> > > > > > > > > 7. You were just introducing your ideas here, so perhaps
> you
> > > > > haven't
> > > > > > > > gotten
> > > > > > > > > this far yet, but a shortcoming to doing
> > merge(String,Map,Map)
> > > is
> > > > > > that
> > > > > > > it
> > > > > > > > > leaves open no opportunity to stream a List of Maps to a
> > > merge()
> > > > > for
> > > > > > a
> > > > > > > > form
> > > > > > > > > of batch loading which is mighty common and one of the
> > > variations
> > > > > of
> > > > > > > the
> > > > > > > > > coalesce() pattern that I alluded to at the start of all
> > this.
> > > I
> > > > > > think
> > > > > > > > that
> > > > > > > > > we would want to be sure that we left open the option to do
> > > that
> > > > > > > somehow.
> > > > > > > > > 8. If we had a general purpose merge() step I wonder if it
> > > makes
> > > > > > > > developing
> > > > > > > > > the API as you suggested easier to do?
> > > > > > > > >
> > > > > > > > > I think I'd like to solve the problems you describe in your
> > > post
> > > > as
> > > > > > > well
> > > > > > > > as
> > > > > > > > > the ones in mine. There is some relation there, but gaps as
> > > well.
> > > > > > With
> > > > > > > > more
> > > > > > > > > discussion here we can figure something out.
> > > > > > > > >
> > > > > > > > > Thanks for starting this talk - good one!
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <
> > > > > > dave@bechberger.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > I've had a few on and off discussions with a few people
> > here,
> > > > so
> > > > > I
> > > > > > > > wanted
> > > > > > > > > > to send this out to everyone for feedback.
> > > > > > > > > >
> > > > > > > > > > What are people's thoughts on creating a new set of steps
> > > that
> > > > > > codify
> > > > > > > > > > common Gremlin best practices?
> > > > > > > > > >
> > > > > > > > > > I think there are several common Gremlin patterns where
> > users
> > > > > would
> > > > > > > > > benefit
> > > > > > > > > > from the additional guidance that these codified steps
> > > > represent.
> > > > > > > The
> > > > > > > > > > first one I would recommend though is codifying the
> element
> > > > > > existence
> > > > > > > > > > pattern into a single Gremlin step, something like:
> > > > > > > > > >
> > > > > > > > > > mergeV(String, Map, Map)
> > > > > > > > > >      String - The vertex label
> > > > > > > > > >      Map (first) - The properties to match existing
> > vertices
> > > on
> > > > > > > > > >      Map (second) - Any additional properties to set if a
> > new
> > > > > > vertex
> > > > > > > is
> > > > > > > > > > created (optional)
> > > > > > > > > > mergeE(String, Map, Map)
> > > > > > > > > >      String - The edge label
> > > > > > > > > >      Map (first) - The properties to match existing edge
> on
> > > > > > > > > >      Map (second) - Any additional properties to set if a
> > new
> > > > > edge
> > > > > > is
> > > > > > > > > > created (optional)
> > > > > > > > > >
> > > > > > > > > > In each of these cases these steps would perform the same
> > > > upsert
> > > > > > > > > > functionality as the element existence pattern.
> > > > > > > > > >
> > > > > > > > > > Example:
> > > > > > > > > >
> > > > > > > > > > g.V().has('person','name','stephen').
> > > > > > > > > >            fold().
> > > > > > > > > >            coalesce(unfold(),
> > > > > > > > > >                     addV('person').
> > > > > > > > > >                       property('name','stephen').
> > > > > > > > > >                       property('age',34))
> > > > > > > > > >
> > > > > > > > > > would become:
> > > > > > > > > >
> > > > > > > > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > > > > > > > > >
> > > > > > > > > > I think that this change would be a good addition to the
> > > > language
> > > > > > for
> > > > > > > > > > several reasons:
> > > > > > > > > >
> > > > > > > > > > * This codifies the best practice for a specific
> > > action/recipe,
> > > > > > which
> > > > > > > > > > reduces the chance that someone uses the pattern
> > incorrectly
> > > > > > > > > > * Most complex Gremlin traversals are verbose.  Reducing
> > the
> > > > > amount
> > > > > > > of
> > > > > > > > > code
> > > > > > > > > > that needs to be written and maintained allows for a
> better
> > > > > > developer
> > > > > > > > > > experience.
> > > > > > > > > > * It will lower the bar of entry for a developer by
> making
> > > > these
> > > > > > > > actions
> > > > > > > > > > more discoverable.  The more we can help bring these
> > patterns
> > > > to
> > > > > > the
> > > > > > > > > > forefront of the language via these pattern/meta steps
> the
> > > more
> > > > > we
> > > > > > > > guide
> > > > > > > > > > users towards writing better Gremlin faster
> > > > > > > > > > * This allows DB vendors to optimize for this pattern
> > > > > > > > > >
> > > > > > > > > > I know that this would likely be the first step in
> Gremlin
> > > that
> > > > > > > > codifies
> > > > > > > > > a
> > > > > > > > > > pattern, so I'd like to get other's thoughts on this?
> > > > > > > > > >
> > > > > > > > > > Dave
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by Stephen Mallette <sp...@gmail.com>.
Josh, thanks for your thoughts - some responses inline:

On Tue, Dec 8, 2020 at 10:16 PM Josh Perryman <jo...@gmail.com>
wrote:

> I'll offer some thoughts. I'm seeing upsertV() as an idempotent getOrCreate
> call which always returns a vertex with the label/property values specified
> within the step. It's sort of a declarative pattern: "return this vertex to
> me, find it if you can, create it if you must."
>

I like this description - I've added it to the gist, though it's a bit at
odds with Dave's previous post, so we'll consider it a temporary addition
until he responds.


> On that account, I do like the simplification in 1. Repetition shouldn't be
> necessary. In an ideal world, the engine should know the primary
> identifiers (name or id) and find/create the vertex based on them. Any
> other included values will be "trued up" as well. But this may be a bridge
> too far for TinkerPop since knowing identifiers may require a specified
> schema. I'd prefer to omit the third input, but it might be necessary to
> keep it so that the second input can be for the matching use case.
>

In my most recent post on gremlin-users I think I came up with a nice way
to get rid of the second Map. One Map that forms the full list of
properties for upserting is easier than partitioning two Maps that
essentially merge together. I imagine it's unlikely that application code
will have that separation naturally so users will have the added step of
trying to separate their data into searchable vs "just data". Getting us to
one Map argument will simplify APIs for us and reduce complexity to users.
Here is what I'd proposed for those not following over there:

// match on name and age (or perhaps whatever the underlying graph system
thinks is best?)
g.upsertV('person', [name:'marko',age:29])

// match on name only
g.upsertV('person', [name:'marko',age:29]).by('name')

// explicitly match on name and age
g.upsertV('person', [name:'marko',age:29]).
  by('name').by('age')

// match on id only
g.upsertV('person', [(T.id): 100, name:'marko',age:29]).by(T.id)

// match on whatever the by(Traversal) predicate defines
g.upsertV('person', [name:'marko',age:29]).
  by(has('name', 'marko'))

// match on id, then update age
g.upsertV('person', [(T.id): 100, name:'marko']).by(T.id).
  property('age',29)

With this model, we get one Map argument that represents the complete
property set to be added/updated to the graph and the user can hint on what
key they wish to match on using by() where that sort of step modulation
should be a well understood and familiar concept in Gremlin at this point.

So that means I think 2 should always match or update the additional
> values. Again, we're specifying the expected result and letting the engine
> figure out best how to return that results and appropriately maintain
> state.
>

I again like this description, but we'll see what Dave's thoughts are since
he's a bit behind on the threads at this point I think.


> I'm also presuming that anything not included as inputs to the upsertV()
> step are then to be handled by following steps. I'm hoping that is a
> sufficient approach for addressing the multi/meta property use cases
> brought up in 3.
>

yeah................it needs more thought. I spent more time thinking on
this issue yesterday than I have for all the previous posts combined and I
think it yielded something good in that revised syntax. It's going to take
more of that kind of elbow grease to dig into these lesser use cases to
make sure we aren't coding ourselves into corners.


> I do like the idea of using modulators (with(), by()) for more
> sophisticated usage and advanced use cases. Also, the streaming examples
> are quite elegant allowing for a helpful separation of data and logic.
>

cool - hope you like the revised syntax I posted then. :)


> That's my humble take. This is a very welcome addition to the language and
> I appreciate the thoughtful & collaborative approach to the design
> considerations.
>

Thanks again and please keep the thoughts coming. Lots of other interesting
design discussions seem to be brewing.


>
> Josh
>
> On Tue, Dec 8, 2020 at 8:57 AM Stephen Mallette <sp...@gmail.com>
> wrote:
>
> > I started a expanded this discussion to gremlin-users for a wider
> audience
> > and the thread is starting to grow:
> >
> > https://groups.google.com/g/gremlin-users/c/QBmiOUkA0iI/m/pj5Ukiq6AAAJ
> >
> > I guess we'll need to summarize that discussion back here now....
> >
> > I did have some more thoughts to hang out there and figured that I
> wouldn't
> > convolute the discussion on gremlin-users with it so I will continue the
> > discussion here.
> >
> > 1, The very first couple of examples seem wrong (or at least not best
> > demonstrating the usage):
> >
> > g.upsertV('person', [name: 'marko'],
> >                     [name: 'marko', age: 29])
> > g.upsertV('person', [(T.id): 1],
> >                     [(T.id): 1, name: 'Marko'])
> >
> > should instead be:
> >
> > g.upsertV('person', [name: 'marko'],
> >                     [age: 29])
> > g.upsertV('person', [(T.id): 1],
> >                     [name: 'Marko'])
> >
> > 2. I can't recall if we made this clear anywhere but in situations where
> we
> > "get" rather than "create" do the additional properties act in an update
> > fashion to the element that was found? I think I've been working on the
> > assumption that it would, though perhaps that is not always desirable?
> >
> > 3. We really never settled up how to deal with multi/meta-properties.
> That
> > story should be clear so that when we document upsert() we include the
> > approaches for the fallback patterns that don't meet the 90% of use cases
> > we are targeting and I sense that we're saying that meta/multi-properties
> > don't fit in that bucket. So with that in mind, I don't think that the
> > following works for metaproperties:
> >
> > g.upsertV('person', [(T.id): 1],
> >                     [name:[acl: 'public'])
> >
> > as it doesn't let us set the value for the "name" just the pairs for the
> > meta-properties. I guess a user would have to fall back to:
> >
> > g.upsertV('person', [(T.id): 1]).
> >   property('name','marko','acl','public')
> >
> > // or use the additional properties syntax
> > g.upsertV('person', [(T.id): 1],[name:'marko']).
> >   properties('name').property('acl','public')
> >
> > // or if there were multi-properties then maybe...
> > g.upsertV('person', [(T.id): 1],[name:'marko']).
> >   properties('name').hasValue('marko').property('acl','public')
> >
> > As for multi-properties, I dont think we should assume that a List object
> > should be interpreted by Gremlin as a multi-property. Perhaps we just
> rely
> > on the underlying graph to properly deal with that given a schema or the
> > user falls back to:
> >
> > // this ends up as however the graph deals with a List object
> > g.upsertV('person', [(T.id): 1], [lang: ['java', 'scala', 'java'])
> >
> > // this is explicit
> > g.upsertV('person', [(T.id): 1]).
> >   property(list, 'lang', 'java').
> >   property(list, 'lang, 'scala').
> >   property(list, 'lang', 'java')
> >
> > If that makes sense to everyone, I will update the gist.
> >
> >
> >
> >
> >
> >
> >
> > On Tue, Oct 27, 2020 at 11:51 PM David Bechberger <da...@bechberger.com>
> > wrote:
> >
> > > Hello Stephen,
> > >
> > > Thanks for making that gist, its much easier to follow along with the
> > > proposed syntax there.  To your specific comments:
> > >
> > > #1 - My only worry with the term upsert is that it is not as widely a
> > used
> > > term for this sort of pattern as "Merge" (e.g. SQL, Cypher).  However I
> > > don't have a strong opinion on this, so I am fine with either.
> > > #2 - My only real objective here is to make sure that we make the
> 80-90%
> > > case easy and straightforward.  I think that having the fallback option
> > of
> > > using the current syntax for any complicated edge cases should be
> > > considered here as well. I'd appreciate your thoughts here as these are
> > > good points you bring up that definitely fall into the 80-90% use case.
> > > #3 - Those points make sense to me, not sure I have anything further to
> > add
> > > #4 - I don't think I would expect to have the values extracted from the
> > > traversal filters but I'd be interested in other opinions on that.
> > >
> > > Thanks,
> > > Dave
> > >
> > > On Thu, Oct 15, 2020 at 5:30 AM Stephen Mallette <spmallette@gmail.com
> >
> > > wrote:
> > >
> > > > It's been a couple weeks since we moved this thread so I went back
> > > through
> > > > it and refreshed my mind with what's there. Dave, nice job putting
> all
> > > > those examples together. I've recollected them all together and have
> > just
> > > > been reviewing them in a more formatted style with proper Groovy
> syntax
> > > > than what can be accomplished in this mailing list. Anyone who cares
> to
> > > > review in that form can do so here (i've pasted its markdown contents
> > at
> > > > the bottom of this post as well):
> > > >
> > > > https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
> > > >
> > > > In this light, I have the following comments and thoughts:
> > > >
> > > > 1. I feel more inclined toward saving "merge" for a more generalized
> > step
> > > > of that name and have thus renamed the examples to use "upsert".
> > > >
> > > > 2. I still don't quite feel comfortable with meta/multi-properties
> > > > examples. Perhaps you could explain further as it's possible I'm not
> > > > thinking things through properly. For meta-properties the example is:
> > > >
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > > >
> > > > So I see that "name" takes a Map here, but how does Gremlin know if
> > that
> > > > means "meta-property" or "Map value" and if the former, then how do
> you
> > > set
> > > > the value of "name"?  My question is similar to the multi-property
> > > examples
> > > > - using this one:
> > > >
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1,lang: ['java', 'scala'])
> > > >
> > > > How does Gremlin know if the user means that "lang" should be a
> > > > "multi-property" with Cardinality.list (or Cardinality.set for that
> > > matter)
> > > > or a value of Cardinality.single with a List/Array value? I can
> perhaps
> > > > suggest some solutions here but wanted to make sure I wasn't just
> > > > misunderstanding something.
> > > >
> > > > 3. The "upsert with stream" example is:
> > > >
> > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > >   upsertV(values(label), valueMap(id), valueMap())
> > > >
> > > > That would establish an API of upsertV(Traversal, Traversal,
> Traversal)
> > > > which would enable your final thought of your last post with:
> > > >
> > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> > 'NY'),
> > > >                         [active: true])
> > > >
> > > > The open-ended nature of that is neat but comes with some complexity
> > with
> > > > steps that traditionally take an "open" Traversal (and this step now
> > has
> > > up
> > > > to three of them) - perhaps, that's our own fault in some ways. For
> the
> > > > "stream" concept, values()/valueMap() won't immediately work that way
> > as
> > > > they only apply to Element...given today's Gremlin semantics, I think
> > > we'd
> > > > re-write that as:
> > > >
> > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > >   upsertV(select(label), select(id), select(id,label,'name'))
> > > >
> > > > Though, even that doesn't quite work because you can't select(T)
> right
> > > now
> > > > which means you'd need to go with:
> > > >
> > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > >          [id: 2, label: 'person', name: 'josh']).
> > > >   upsertV(select('label'), select('id'), select('id','label','name'))
> > > >
> > > > not terrible in a way and perhaps we could fix select(T) - can't
> > remember
> > > > why that isn't allowed except that select() operates over multiple
> > scopes
> > > > and perhaps trying T through those scopes causes trouble.
> > > >
> > > > 4. Continuing with the use of Traversal as arguments, let's go back
> to:
> > > >
> > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> > 'NY'),
> > > >                         [active: true])
> > > >
> > > > I originally had it in my mind to prefer this approach over a Map for
> > the
> > > > search as it provides a great deal of flexibility and fits the common
> > > > filtering patterns that users follow really well. I suppose the
> > question
> > > is
> > > > what to do in the situation of "create". Would we extract all the
> > strict
> > > > equality filter values from the initial has(), thus, in the example,
> > > > "state" but not "age", and create a vertex that has "state=NY,
> > > > active=true"? That is possible with how things are designed in
> Gremlin
> > > > today I think.
> > > >
> > > > This syntax does create some conflict with the subject of 3 and
> streams
> > > > because the traverser in the streams case is the incoming Map but for
> > > this
> > > > context it's meant as a filter of V(). Getting too whacky?
> > > >
> > > > 5. I like the idea of using a side-effect to capture whether the
> > element
> > > > was created or not. that makes sense. I think that can work.
> > > >
> > > > ==============================
> > > > = gist contents
> > > > ==============================
> > > >
> > > > # API
> > > >
> > > > ```java
> > > > upsertV(String label, Map matchOrCreateProperties)
> > > > upsertV(String label, Map matchOrCreateProperties, Map
> > > > additionalProperties)
> > > > upsertV(Traversal label, Traversal matchOrCreateProperties)
> > > > upsertV(Traversal label, Traversal matchOrCreateProperties, Traversal
> > > > additionalProperties)
> > > > upsertV(...).
> > > >   with(WithOptions.sideEffectLabel, 'a')
> > > >
> > > > upsertE(String label, Map matchOrCreateProperties)
> > > > upsertE(Traversal label, Traversal matchOrCreateProperties)
> > > > upsertE(String label, Map matchOrCreateProperties, Map
> > > > additionalProperties)
> > > > upsertE(Traversal label, Traversal matchOrCreateProperties, Traversal
> > > > additionalProperties)
> > > > upsertE(...).
> > > >   from(Vertex incidentOut).to(Vertex incidentIn)
> > > >   with(WithOptions.sideEffectLabel, 'a')
> > > > ```
> > > >
> > > > # Examples
> > > >
> > > > ## upsert
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [name: 'marko'],
> > > >                     [name: 'marko', age: 29])
> > > > ```
> > > >
> > > > ## upsert with id
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, name: 'Marko'])
> > > > ```
> > > >
> > > > ## upsert with Meta Properties
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > > > ```
> > > >
> > > > ## upsert with Multi Properties
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1,lang: ['java', 'scala'])
> > > > ```
> > > >
> > > > ## upsert with Single Cardinality
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: 'java')
> > > > ```
> > > >
> > > > ## upsert with List Cardinality
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: ['java', 'scala', 'java'])
> > > > ```
> > > >
> > > > ## upsert with Set Cardinality
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: ['java','scala'])
> > > > ```
> > > >
> > > > ## upsert with List Cardinality - and add value to list
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: ['java','scala','java']).
> > > >   property(Cardinality.list, 'lang', 'java')
> > > > ```
> > > >
> > > > ## upsert with stream
> > > >
> > > > ```groovy
> > > > // doesn't work with today's Gremlin semantics
> > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > >   upsertV(values(label), valueMap(id), valueMap())
> > > >
> > > > // the following would be more in line with what's possible with
> > existing
> > > > Gremlin semantics:
> > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > >          [id: 2, label: 'person', name: 'josh']).
> > > >   upsertV(select('label'), select('id'), select('id','label','name'))
> > > > ```
> > > >
> > > > ## upsert with reporting added or updated side effect
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [name: 'marko'],
> > > >                     [name: 'marko', age: 29]).
> > > >     with(WithOptions.sideEffectLabel, 'a').
> > > >   project('vertex', 'added').
> > > >     by().
> > > >     by(cap('a'))
> > > > ```
> > > >
> > > > ## upsert Edge assuming a self-relation of "marko"
> > > >
> > > > ```groovy
> > > > g.V().has('person','name','marko').
> > > >   upsertE( 'self', [weight:0.5])
> > > > ```
> > > >
> > > > ## upsert Edge with incident vertex checks
> > > >
> > > > ```groovy
> > > > g.upsertE('knows', [weight:0.5]).
> > > >    from(V(1)).to(V(2))
> > > > g.V().has('person','name','marko').
> > > >   upsertE( 'knows', [weight:0.5]).
> > > >     to(V(2))
> > > > ```
> > > >
> > > > ## upsert using a Traversal for the match
> > > >
> > > > ```groovy
> > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> > 'NY'),
> > > >                         [active: true])
> > > > ```
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, Sep 28, 2020 at 6:49 PM David Bechberger <
> dave@bechberger.com>
> > > > wrote:
> > > >
> > > > > So, I've been doing some additional thinking about the ways that
> this
> > > > could
> > > > > work based on the comments below and have put my comments inline.
> > > > >
> > > > > Dave
> > > > >
> > > > > On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <
> > spmallette@gmail.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > I added some thoughts inline below:
> > > > > >
> > > > > > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <
> > > dave@bechberger.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks for the detailed comments Stephen.  I have addressed
> them
> > > > inline
> > > > > > > below.
> > > > > > >
> > > > > > > I did read the proposal from earlier and I think that we are in
> > > close
> > > > > > > agreement with what we are trying to accomplish.  I also fully
> > > > support
> > > > > > > Josh's comment on providing a mechanism for submitting a map of
> > > > > > properties
> > > > > > > as manually unrolling this all right now leads to a lot of
> > > potential
> > > > > for
> > > > > > > error and a long messy traversal.
> > > > > > >
> > > > > > > I'm looking forward to this discussion on how to merge these
> two
> > > > > > proposals.
> > > > > > >
> > > > > > >
> > > > > > > 1. How would multi/meta-properties fit into the API you've
> > > proposed?
> > > > > > >
> > > > > > > My first thought here is that multi-properties would be
> > represented
> > > > as
> > > > > > > lists in the map, e.g.
> > > > > > >
> > > > > > > {names: ['Dave', 'David']}
> > > > > > >
> > > > > > > and meta-properties would be represented as maps in the maps,
> > e.g.
> > > > > > >
> > > > > > > {name: {first: 'Dave', last: 'Bechberger'}}
> > > > > > >
> > > > > > > I can't say I've thought through all the implications of this
> > > though
> > > > so
> > > > > > it
> > > > > > > is an area we would need to explore.
> > > > > > >
> > > > > >
> > > > > > The first implication that comes to mind is that it makes the
> > > > assumption
> > > > > > the user wants multi/meta-properties as opposed to a single
> > > cardinality
> > > > > of
> > > > > > List in the first case and a Map as a property value in the
> second
> > > > case.
> > > > > I
> > > > > > suppose that graphs with a schema could resolve those assumptions
> > but
> > > > > > graphs that are schemaless would have a problem. The issue could
> be
> > > > > > resolved by specialized configuration of "g" or per merge() step
> > > using
> > > > a
> > > > > > with() modulator I suppose but that goes into a yet another level
> > of
> > > > > > implications to consider. I've often wondered if the start point
> > for
> > > > > > getting types/schema into TP3 without a full rewrite would be in
> > this
> > > > > form
> > > > > > where Gremlin would be given hints as to what to expect as to the
> > > types
> > > > > of
> > > > > > data it might encounter while traversing. Anyway, I'd be hesitant
> > to
> > > go
> > > > > > down paths that don't account for multi/metaproperties well.
> They
> > > are
> > > > > > first class citizens in TP3 (with those hoping for extension of
> at
> > > > least
> > > > > > multiproperties to edges) and while I find them a constant
> > annoyance
> > > > for
> > > > > so
> > > > > > many reasons, we're kinda stuck with them.
> > > > > >
> > > > >
> > > > > I agree we need to account for multi/meta properties as 1st class
> > > > > citizens.  This is my current thinking on the syntax for the
> > situations
> > > > we
> > > > > have laid out so far:
> > > > >
> > > > > *Merge*
> > > > > g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age': 29})
> > > > >
> > > > >
> > > > > *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name':
> > 'Marko'})
> > > > >
> > > > >
> > > > > *Merge with Meta Properties* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'name':
> > > > > {'first': Marko', 'last': 'R'})
> > > > >
> > > > >
> > > > > *Merge with Multi Properties * g.mergeV('name', {T.id: 1}, {T.id:
> 1,
> > > > > 'lang': ['java', 'scala'])
> > > > >
> > > > >
> > > > > *Merge with Single Cardinality* g.mergeV('name', {T.id: 1}, {T.id:
> 1,
> > > > > 'lang': 'java')
> > > > >
> > > > >
> > > > > *Merge with List Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'lang':
> > > > > ['java', 'scala', 'java'])
> > > > >
> > > > >
> > > > > *Merge with Set Cardinality * g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'lang':
> > > > > ['java', 'scala'])
> > > > >
> > > > > Since in a mergeV() scenario we are only ever adding whatever
> values
> > > are
> > > > > passed in there would be no need to specify the cardinality of the
> > > > property
> > > > > being added.  If they wanted to add a value to an existing property
> > > then
> > > > > the current property() method would still be available on the
> output
> > > > > traversal. e.g.
> > > > > g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java', 'scala',
> > > > > 'java']).property(Cardinality.list, 'lang, 'java')
> > > > >
> > > > >
> > > > > *Merge with stream * g.inject([{id: 1, label: 'person', 'name':
> > > 'marko'},
> > > > > {id: 2, label: 'person', 'name': 'josh'}]).
> > > > >   mergeV(values('label'), valueMap('id'), valueMap())
> > > > >
> > > > >
> > > > > *Merge with reporting added or updated side effect*
> g.mergeV('name',
> > > > > {'name': 'marko'}, {'name': 'marko', 'age': 29}).
> > > > >   with(WithOptions.sideEffectLabel, 'a').
> > > > >   project('vertex', 'added').
> > > > >     by().
> > > > >     by(cap('a'))
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > > 2. How would users set the T.id on creation? would that T.id
> just
> > > be
> > > > a
> > > > > > key
> > > > > > > in the first Map argument?
> > > > > > >
> > > > > > > Yes, I was thinking that T.id would be the key name, e.g.:
> > > > > > >
> > > > > > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
> > > > > > >
> > > > > >
> > > > > > ok - I can't say I see a problem with that atm.
> > > > > >
> > > > > >
> > > > > > > 3. I do like the general idea of a match on multiple properties
> > for
> > > > the
> > > > > > > first argument as a convenience but wonder about the
> specificity
> > of
> > > > > this
> > > > > > > API a bit as it focuses heavily on equality - I suppose that's
> > most
> > > > > cases
> > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > >
> > > > > > > In most cases I've seen use exact matches on the vertex or
> > edge.  I
> > > > > think
> > > > > > > it might be best to keep this straightforward as any complex
> edge
> > > > cases
> > > > > > > still can perform the same functionality using the coalesce()
> > > > pattern.
> > > > > > >
> > > > > >
> > > > > > I played around with the idea to use modulators to apply
> additional
> > > > > > constraints:
> > > > > >
> > > > > > g.mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > >     by(has('age', gt(29))
> > > > > >
> > > > > > I suppose that's neat but maybe not...we could easily get the
> same
> > > > thing
> > > > > > from:
> > > > > >
> > > > > > g.V().has('person','age',gt(29)).
> > > > > >   mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > >
> > > > > > which is more readable though it does make it harder for
> providers
> > to
> > > > > > optimize upsert operations if they could make use of the has() as
> > > part
> > > > of
> > > > > > that. I think I like has() as part of the main query rather than
> > in a
> > > > > by()
> > > > > > - just thinking out loud.
> > > > > >
> > > > > >
> > > > > Another potential option here would be to allow the second
> parameter
> > of
> > > > > mergeV() accept a traversal.  Not sure how much complexity that
> would
> > > add
> > > > > though
> > > > >
> > > > > g.V().mergeV('person', __.has('person','age',gt(29)).has('state',
> > > 'NY'}.,
> > > > > {'active': true}).
> > > > >
> > > > >
> > > > > >
> > > > > > > 4. I think your suggestion points to one of the troubles
> Gremlin
> > > has
> > > > > > which
> > > > > > > we see with "algorithms" - extending the language with new
> steps
> > > that
> > > > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > > > shortestPath() step) pollutes the core language a bit, hence my
> > > > > > > generalization of "merging" in my link above which fits into
> the
> > > core
> > > > > > > Gremlin language style. There is a bigger picture where we are
> > > > missing
> > > > > > > something in Gremlin that lets us extend the language in ways
> > that
> > > > let
> > > > > us
> > > > > > > easily introduce new steps that aren't for general purpose.
> This
> > > > issue
> > > > > is
> > > > > > > discussed in terms of "algorithms" here:
> > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I
> think
> > > we
> > > > > > could
> > > > > > > see how there might be some "mutation" extension steps that
> would
> > > > cover
> > > > > > > your suggested API, plus batch operations, etc. We need a way
> to
> > > add
> > > > > > > "sugar" without it interfering with the consistency of the
> core.
> > > > > > Obviously
> > > > > > > this is a bigger issue but perhaps important to solve to
> > implement
> > > > > steps
> > > > > > in
> > > > > > > the fashion you describe.
> > > > > > >
> > > > > > > I agree that the "algorithm" steps do seem a bit odd in the
> core
> > > > > language
> > > > > > > but I think the need is there.  I'd be interested in furthering
> > > this
> > > > > > > discussion but I think these "pattern" steps may or may not be
> > the
> > > > same
> > > > > > as
> > > > > > > the algorithm steps.  I'll have to think on that.
> > > > > >
> > > > > >
> > > > > >
> > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> specify
> > > what
> > > > > > > element type the first Map argument should be applied to? what
> > > about
> > > > > > > mergeVP (i.e. vertex property as it too is an element) ? That's
> > > > tricky
> > > > > > but
> > > > > > > I don't think we should miss that. Perhaps merge() could be a
> > > > "complex
> > > > > > > modulator"?? that's a new concept of course, but you would do
> > > > > > g.V().merge()
> > > > > > > and the label and first Map would fold to VertexStartStep (i.e.
> > > V())
> > > > > for
> > > > > > > the lookup and then a MergeStep would follow - thus a "complex
> > > > > modulator"
> > > > > > > as it does more than just change the behavior of the previous
> > step
> > > -
> > > > it
> > > > > > > also adds its own. I suppose it could also add has() steps
> > followed
> > > > by
> > > > > > the
> > > > > > > MergeStep and then the has() operations would fold in normally
> as
> > > > they
> > > > > do
> > > > > > > today. In this way, we can simplify to just one single
> > > > > > > merge(String,Map,Map). ??
> > > > > > >
> > > > > > > I agree that we should also think about how to include
> properties
> > > in
> > > > > this
> > > > > > > merge construct.  The reason I was thinking about mergeV() and
> > > > mergeE()
> > > > > > is
> > > > > > > that it follows the same pattern as the already well understood
> > > > > > > addV()/addE() steps.  I am a bit wary of trying to generalize
> > this
> > > > down
> > > > > > to
> > > > > > > a single merge() step as these sorts of complex overloads make
> it
> > > > hard
> > > > > to
> > > > > > > figure out which Gremlin step you should use for a particular
> > > pattern
> > > > > > (e.g.
> > > > > > > upsert vertex or upsert edge).  One thing that I think
> customers
> > > find
> > > > > > > useful about the addV/addE step is that they are very
> > discoverable,
> > > > the
> > > > > > > name tells me what functionality to expect from that step.
> > > > > > >
> > > > > >
> > > > > > I'm starting to agree with the idea that we have to do something
> > like
> > > > > > mergeV()/E() as it wouldn't quite work as a start step otherwise.
> > We
> > > > > > couldn't do:
> > > > > >
> > > > > > g.merge()
> > > > > >
> > > > > > as we wouldnt know what sort of Element it applied to. If so, I
> > > wonder
> > > > if
> > > > > > it would be better to preserve "merge" for my more general
> use-case
> > > and
> > > > > > prefer upsertV()/E()?
> > > > > >
> > > > > > Also, perhaps mergeVP() doesn't need to exist as perhaps it is an
> > > > > uncommon
> > > > > > case (compared to V/E()) and we don't really have addVP() as an
> > > > analogous
> > > > > > step. Perhaps existing coalesce() patterns and/or my more general
> > > > purpose
> > > > > > merge() step would satisfy those situations??
> > > > > >
> > > > >
> > > > > I think that a mergeVP() type step may require a different pattern
> > > since
> > > > it
> > > > > would really need to accept a map of values otherwise it would
> > > > essentially
> > > > > perform the same function as property()
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > > 6. One thing neither my approach nor yours seems to do is tell
> > the
> > > > user
> > > > > > if
> > > > > > > they created something or updated something - that's another
> > thing
> > > > I've
> > > > > > > seen users want to have in get-or-create. Here again we go
> deeper
> > > > into
> > > > > a
> > > > > > > less general step specification as alluded to in 4, but a
> merge()
> > > > step
> > > > > as
> > > > > > > proposed in 5, might return [Element,boolean] so as to provide
> an
> > > > > > indicator
> > > > > > > of creation?
> > > > > > >
> > > > > > > Hmm, I'll have to think about that.  How would returning
> multiple
> > > > > values
> > > > > > > work if we want to chain these together. e.g. Add a vertex
> > between
> > > > two
> > > > > > > edges but make sure the vertices exist?
> > > > > > >
> > > > > >
> > > > >
> > > > > Added a thought on how to accomplish this above that I'd like to
> get
> > > > > thoughts on.
> > > > >
> > > > >
> > > > > > >
> > > > > > > 7. You were just introducing your ideas here, so perhaps you
> > > haven't
> > > > > > gotten
> > > > > > > this far yet, but a shortcoming to doing merge(String,Map,Map)
> is
> > > > that
> > > > > it
> > > > > > > leaves open no opportunity to stream a List of Maps to a
> merge()
> > > for
> > > > a
> > > > > > form
> > > > > > > of batch loading which is mighty common and one of the
> variations
> > > of
> > > > > the
> > > > > > > coalesce() pattern that I alluded to at the start of all this.
> I
> > > > think
> > > > > > that
> > > > > > > we would want to be sure that we left open the option to do
> that
> > > > > somehow.
> > > > > > > 8. If we had a general purpose merge() step I wonder if it
> makes
> > > > > > developing
> > > > > > > the API as you suggested easier to do?
> > > > > > >
> > > > > > > Hmm, let me think about that one.
> > > > > > >
> > > > > > >
> > > > > > I will add an item 8 to think about which I didn't mention
> before:
> > > > > >
> > > > > > 8. The signature you suggested for mergeE() is:
> > > > > >
> > > > > > mergeE(String, Map, Map)
> > > > > >      String - The edge label
> > > > > >      Map (first) - The properties to match existing edge on
> > > > > >      Map (second) - Any additional properties to set if a new
> edge
> > is
> > > > > > created (optional)
> > > > > >
> > > > > > but does that exactly answer the need? Typically the idea is to
> > > detect
> > > > an
> > > > > > edge between two vertices not globally in the graph by way of
> some
> > > > > > properties. This signature doesn't seem to allow for that as it
> > > doesn't
> > > > > > allow specification of the vertices to test against. Perhaps the
> > > answer
> > > > > is
> > > > > > to use from() and to() modulators?
> > > > > >
> > > > > > g.mergeE('knows', {'weight':0.5}).
> > > > > >   from(V(1)).to(V(2))
> > > > > > g.V().has('person','name','marko').
> > > > > >   mergeE( 'knows', {'weight':0.5}).
> > > > > >     to(V(2))
> > > > > > g.V().has('person','name','marko').
> > > > > >   mergeE( 'self', {'weight':0.5})
> > > > > >
> > > > > > That seems to work?
> > > > > >
> > > > >
> > > > > I think that is an elegant solution to that problem.  I like it and
> > it
> > > > > keeps in line with the way that addE() works.
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <
> > > > spmallette@gmail.com
> > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I like our coalesce() pattern but it is verbose and over time
> > it
> > > > has
> > > > > > gone
> > > > > > > > from a simple pattern to one with numerous variations for all
> > > > manner
> > > > > of
> > > > > > > > different sorts of merge-like operations. As such, I do think
> > we
> > > > > should
> > > > > > > > introduce something to cover this pattern.
> > > > > > > >
> > > > > > > > I like that you used the word "merge" in your description of
> > this
> > > > as
> > > > > it
> > > > > > > is
> > > > > > > > the word I've liked using. You might want to give a look at
> my
> > > > > proposed
> > > > > > > > merge() step from earlier in the year:
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> > > > > > > >
> > > > > > > > I'm just going to dump thoughts as they come regarding what
> you
> > > > > wrote:
> > > > > > > >
> > > > > > > > 1. How would multi/meta-properties fit into the API you've
> > > > proposed?
> > > > > > > > 2. How would users set the T.id on creation? would that T.id
> > just
> > > > be
> > > > > a
> > > > > > > key
> > > > > > > > in the first Map argument?
> > > > > > > > 3. I do like the general idea of a match on multiple
> properties
> > > for
> > > > > the
> > > > > > > > first argument as a convenience but wonder about the
> > specificity
> > > of
> > > > > > this
> > > > > > > > API a bit as it focuses heavily on equality - I suppose
> that's
> > > most
> > > > > > cases
> > > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > > > 4. I think your suggestion points to one of the troubles
> > Gremlin
> > > > has
> > > > > > > which
> > > > > > > > we see with "algorithms" - extending the language with new
> > steps
> > > > that
> > > > > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > > > > shortestPath() step) pollutes the core language a bit, hence
> my
> > > > > > > > generalization of "merging" in my link above which fits into
> > the
> > > > core
> > > > > > > > Gremlin language style. There is a bigger picture where we
> are
> > > > > missing
> > > > > > > > something in Gremlin that lets us extend the language in ways
> > > that
> > > > > let
> > > > > > us
> > > > > > > > easily introduce new steps that aren't for general purpose.
> > This
> > > > > issue
> > > > > > is
> > > > > > > > discussed in terms of "algorithms" here:
> > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I
> > think
> > > > we
> > > > > > > could
> > > > > > > > see how there might be some "mutation" extension steps that
> > would
> > > > > cover
> > > > > > > > your suggested API, plus batch operations, etc. We need a way
> > to
> > > > add
> > > > > > > > "sugar" without it interfering with the consistency of the
> > core.
> > > > > > > Obviously
> > > > > > > > this is a bigger issue but perhaps important to solve to
> > > implement
> > > > > > steps
> > > > > > > in
> > > > > > > > the fashion you describe.
> > > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> > specify
> > > > what
> > > > > > > > element type the first Map argument should be applied to?
> what
> > > > about
> > > > > > > > mergeVP (i.e. vertex property as it too is an element) ?
> That's
> > > > > tricky
> > > > > > > but
> > > > > > > > I don't think we should miss that. Perhaps merge() could be a
> > > > > "complex
> > > > > > > > modulator"?? that's a new concept of course, but you would do
> > > > > > > g.V().merge()
> > > > > > > > and the label and first Map would fold to VertexStartStep
> (i.e.
> > > > V())
> > > > > > for
> > > > > > > > the lookup and then a MergeStep would follow - thus a
> "complex
> > > > > > modulator"
> > > > > > > > as it does more than just change the behavior of the previous
> > > step
> > > > -
> > > > > it
> > > > > > > > also adds its own. I suppose it could also add has() steps
> > > followed
> > > > > by
> > > > > > > the
> > > > > > > > MergeStep and then the has() operations would fold in
> normally
> > as
> > > > > they
> > > > > > do
> > > > > > > > today. In this way, we can simplify to just one single
> > > > > > > > merge(String,Map,Map). ??
> > > > > > > > 6. One thing neither my approach nor yours seems to do is
> tell
> > > the
> > > > > user
> > > > > > > if
> > > > > > > > they created something or updated something - that's another
> > > thing
> > > > > I've
> > > > > > > > seen users want to have in get-or-create. Here again we go
> > deeper
> > > > > into
> > > > > > a
> > > > > > > > less general step specification as alluded to in 4, but a
> > merge()
> > > > > step
> > > > > > as
> > > > > > > > proposed in 5, might return [Element,boolean] so as to
> provide
> > an
> > > > > > > indicator
> > > > > > > > of creation?
> > > > > > > > 7. You were just introducing your ideas here, so perhaps you
> > > > haven't
> > > > > > > gotten
> > > > > > > > this far yet, but a shortcoming to doing
> merge(String,Map,Map)
> > is
> > > > > that
> > > > > > it
> > > > > > > > leaves open no opportunity to stream a List of Maps to a
> > merge()
> > > > for
> > > > > a
> > > > > > > form
> > > > > > > > of batch loading which is mighty common and one of the
> > variations
> > > > of
> > > > > > the
> > > > > > > > coalesce() pattern that I alluded to at the start of all
> this.
> > I
> > > > > think
> > > > > > > that
> > > > > > > > we would want to be sure that we left open the option to do
> > that
> > > > > > somehow.
> > > > > > > > 8. If we had a general purpose merge() step I wonder if it
> > makes
> > > > > > > developing
> > > > > > > > the API as you suggested easier to do?
> > > > > > > >
> > > > > > > > I think I'd like to solve the problems you describe in your
> > post
> > > as
> > > > > > well
> > > > > > > as
> > > > > > > > the ones in mine. There is some relation there, but gaps as
> > well.
> > > > > With
> > > > > > > more
> > > > > > > > discussion here we can figure something out.
> > > > > > > >
> > > > > > > > Thanks for starting this talk - good one!
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <
> > > > > dave@bechberger.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > I've had a few on and off discussions with a few people
> here,
> > > so
> > > > I
> > > > > > > wanted
> > > > > > > > > to send this out to everyone for feedback.
> > > > > > > > >
> > > > > > > > > What are people's thoughts on creating a new set of steps
> > that
> > > > > codify
> > > > > > > > > common Gremlin best practices?
> > > > > > > > >
> > > > > > > > > I think there are several common Gremlin patterns where
> users
> > > > would
> > > > > > > > benefit
> > > > > > > > > from the additional guidance that these codified steps
> > > represent.
> > > > > > The
> > > > > > > > > first one I would recommend though is codifying the element
> > > > > existence
> > > > > > > > > pattern into a single Gremlin step, something like:
> > > > > > > > >
> > > > > > > > > mergeV(String, Map, Map)
> > > > > > > > >      String - The vertex label
> > > > > > > > >      Map (first) - The properties to match existing
> vertices
> > on
> > > > > > > > >      Map (second) - Any additional properties to set if a
> new
> > > > > vertex
> > > > > > is
> > > > > > > > > created (optional)
> > > > > > > > > mergeE(String, Map, Map)
> > > > > > > > >      String - The edge label
> > > > > > > > >      Map (first) - The properties to match existing edge on
> > > > > > > > >      Map (second) - Any additional properties to set if a
> new
> > > > edge
> > > > > is
> > > > > > > > > created (optional)
> > > > > > > > >
> > > > > > > > > In each of these cases these steps would perform the same
> > > upsert
> > > > > > > > > functionality as the element existence pattern.
> > > > > > > > >
> > > > > > > > > Example:
> > > > > > > > >
> > > > > > > > > g.V().has('person','name','stephen').
> > > > > > > > >            fold().
> > > > > > > > >            coalesce(unfold(),
> > > > > > > > >                     addV('person').
> > > > > > > > >                       property('name','stephen').
> > > > > > > > >                       property('age',34))
> > > > > > > > >
> > > > > > > > > would become:
> > > > > > > > >
> > > > > > > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > > > > > > > >
> > > > > > > > > I think that this change would be a good addition to the
> > > language
> > > > > for
> > > > > > > > > several reasons:
> > > > > > > > >
> > > > > > > > > * This codifies the best practice for a specific
> > action/recipe,
> > > > > which
> > > > > > > > > reduces the chance that someone uses the pattern
> incorrectly
> > > > > > > > > * Most complex Gremlin traversals are verbose.  Reducing
> the
> > > > amount
> > > > > > of
> > > > > > > > code
> > > > > > > > > that needs to be written and maintained allows for a better
> > > > > developer
> > > > > > > > > experience.
> > > > > > > > > * It will lower the bar of entry for a developer by making
> > > these
> > > > > > > actions
> > > > > > > > > more discoverable.  The more we can help bring these
> patterns
> > > to
> > > > > the
> > > > > > > > > forefront of the language via these pattern/meta steps the
> > more
> > > > we
> > > > > > > guide
> > > > > > > > > users towards writing better Gremlin faster
> > > > > > > > > * This allows DB vendors to optimize for this pattern
> > > > > > > > >
> > > > > > > > > I know that this would likely be the first step in Gremlin
> > that
> > > > > > > codifies
> > > > > > > > a
> > > > > > > > > pattern, so I'd like to get other's thoughts on this?
> > > > > > > > >
> > > > > > > > > Dave
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by Josh Perryman <jo...@gmail.com>.
I'll offer some thoughts. I'm seeing upsertV() as an idempotent getOrCreate
call which always returns a vertex with the label/property values specified
within the step. It's sort of a declarative pattern: "return this vertex to
me, find it if you can, create it if you must."

On that account, I do like the simplification in 1. Repetition shouldn't be
necessary. In an ideal world, the engine should know the primary
identifiers (name or id) and find/create the vertex based on them. Any
other included values will be "trued up" as well. But this may be a bridge
too far for TinkerPop since knowing identifiers may require a specified
schema. I'd prefer to omit the third input, but it might be necessary to
keep it so that the second input can be for the matching use case.

So that means I think 2 should always match or update the additional
values. Again, we're specifying the expected result and letting the engine
figure out best how to return that results and appropriately maintain
state.

I'm also presuming that anything not included as inputs to the upsertV()
step are then to be handled by following steps. I'm hoping that is a
sufficient approach for addressing the multi/meta property use cases
brought up in 3.

I do like the idea of using modulators (with(), by()) for more
sophisticated usage and advanced use cases. Also, the streaming examples
are quite elegant allowing for a helpful separation of data and logic.

That's my humble take. This is a very welcome addition to the language and
I appreciate the thoughtful & collaborative approach to the design
considerations.

Josh

On Tue, Dec 8, 2020 at 8:57 AM Stephen Mallette <sp...@gmail.com>
wrote:

> I started a expanded this discussion to gremlin-users for a wider audience
> and the thread is starting to grow:
>
> https://groups.google.com/g/gremlin-users/c/QBmiOUkA0iI/m/pj5Ukiq6AAAJ
>
> I guess we'll need to summarize that discussion back here now....
>
> I did have some more thoughts to hang out there and figured that I wouldn't
> convolute the discussion on gremlin-users with it so I will continue the
> discussion here.
>
> 1, The very first couple of examples seem wrong (or at least not best
> demonstrating the usage):
>
> g.upsertV('person', [name: 'marko'],
>                     [name: 'marko', age: 29])
> g.upsertV('person', [(T.id): 1],
>                     [(T.id): 1, name: 'Marko'])
>
> should instead be:
>
> g.upsertV('person', [name: 'marko'],
>                     [age: 29])
> g.upsertV('person', [(T.id): 1],
>                     [name: 'Marko'])
>
> 2. I can't recall if we made this clear anywhere but in situations where we
> "get" rather than "create" do the additional properties act in an update
> fashion to the element that was found? I think I've been working on the
> assumption that it would, though perhaps that is not always desirable?
>
> 3. We really never settled up how to deal with multi/meta-properties. That
> story should be clear so that when we document upsert() we include the
> approaches for the fallback patterns that don't meet the 90% of use cases
> we are targeting and I sense that we're saying that meta/multi-properties
> don't fit in that bucket. So with that in mind, I don't think that the
> following works for metaproperties:
>
> g.upsertV('person', [(T.id): 1],
>                     [name:[acl: 'public'])
>
> as it doesn't let us set the value for the "name" just the pairs for the
> meta-properties. I guess a user would have to fall back to:
>
> g.upsertV('person', [(T.id): 1]).
>   property('name','marko','acl','public')
>
> // or use the additional properties syntax
> g.upsertV('person', [(T.id): 1],[name:'marko']).
>   properties('name').property('acl','public')
>
> // or if there were multi-properties then maybe...
> g.upsertV('person', [(T.id): 1],[name:'marko']).
>   properties('name').hasValue('marko').property('acl','public')
>
> As for multi-properties, I dont think we should assume that a List object
> should be interpreted by Gremlin as a multi-property. Perhaps we just rely
> on the underlying graph to properly deal with that given a schema or the
> user falls back to:
>
> // this ends up as however the graph deals with a List object
> g.upsertV('person', [(T.id): 1], [lang: ['java', 'scala', 'java'])
>
> // this is explicit
> g.upsertV('person', [(T.id): 1]).
>   property(list, 'lang', 'java').
>   property(list, 'lang, 'scala').
>   property(list, 'lang', 'java')
>
> If that makes sense to everyone, I will update the gist.
>
>
>
>
>
>
>
> On Tue, Oct 27, 2020 at 11:51 PM David Bechberger <da...@bechberger.com>
> wrote:
>
> > Hello Stephen,
> >
> > Thanks for making that gist, its much easier to follow along with the
> > proposed syntax there.  To your specific comments:
> >
> > #1 - My only worry with the term upsert is that it is not as widely a
> used
> > term for this sort of pattern as "Merge" (e.g. SQL, Cypher).  However I
> > don't have a strong opinion on this, so I am fine with either.
> > #2 - My only real objective here is to make sure that we make the 80-90%
> > case easy and straightforward.  I think that having the fallback option
> of
> > using the current syntax for any complicated edge cases should be
> > considered here as well. I'd appreciate your thoughts here as these are
> > good points you bring up that definitely fall into the 80-90% use case.
> > #3 - Those points make sense to me, not sure I have anything further to
> add
> > #4 - I don't think I would expect to have the values extracted from the
> > traversal filters but I'd be interested in other opinions on that.
> >
> > Thanks,
> > Dave
> >
> > On Thu, Oct 15, 2020 at 5:30 AM Stephen Mallette <sp...@gmail.com>
> > wrote:
> >
> > > It's been a couple weeks since we moved this thread so I went back
> > through
> > > it and refreshed my mind with what's there. Dave, nice job putting all
> > > those examples together. I've recollected them all together and have
> just
> > > been reviewing them in a more formatted style with proper Groovy syntax
> > > than what can be accomplished in this mailing list. Anyone who cares to
> > > review in that form can do so here (i've pasted its markdown contents
> at
> > > the bottom of this post as well):
> > >
> > > https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
> > >
> > > In this light, I have the following comments and thoughts:
> > >
> > > 1. I feel more inclined toward saving "merge" for a more generalized
> step
> > > of that name and have thus renamed the examples to use "upsert".
> > >
> > > 2. I still don't quite feel comfortable with meta/multi-properties
> > > examples. Perhaps you could explain further as it's possible I'm not
> > > thinking things through properly. For meta-properties the example is:
> > >
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > >
> > > So I see that "name" takes a Map here, but how does Gremlin know if
> that
> > > means "meta-property" or "Map value" and if the former, then how do you
> > set
> > > the value of "name"?  My question is similar to the multi-property
> > examples
> > > - using this one:
> > >
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1,lang: ['java', 'scala'])
> > >
> > > How does Gremlin know if the user means that "lang" should be a
> > > "multi-property" with Cardinality.list (or Cardinality.set for that
> > matter)
> > > or a value of Cardinality.single with a List/Array value? I can perhaps
> > > suggest some solutions here but wanted to make sure I wasn't just
> > > misunderstanding something.
> > >
> > > 3. The "upsert with stream" example is:
> > >
> > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > >          [(id): 2, (label): 'person', name: 'josh']).
> > >   upsertV(values(label), valueMap(id), valueMap())
> > >
> > > That would establish an API of upsertV(Traversal, Traversal, Traversal)
> > > which would enable your final thought of your last post with:
> > >
> > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> 'NY'),
> > >                         [active: true])
> > >
> > > The open-ended nature of that is neat but comes with some complexity
> with
> > > steps that traditionally take an "open" Traversal (and this step now
> has
> > up
> > > to three of them) - perhaps, that's our own fault in some ways. For the
> > > "stream" concept, values()/valueMap() won't immediately work that way
> as
> > > they only apply to Element...given today's Gremlin semantics, I think
> > we'd
> > > re-write that as:
> > >
> > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > >          [(id): 2, (label): 'person', name: 'josh']).
> > >   upsertV(select(label), select(id), select(id,label,'name'))
> > >
> > > Though, even that doesn't quite work because you can't select(T) right
> > now
> > > which means you'd need to go with:
> > >
> > > g.inject([id: 1, label: 'person', name: 'marko'],
> > >          [id: 2, label: 'person', name: 'josh']).
> > >   upsertV(select('label'), select('id'), select('id','label','name'))
> > >
> > > not terrible in a way and perhaps we could fix select(T) - can't
> remember
> > > why that isn't allowed except that select() operates over multiple
> scopes
> > > and perhaps trying T through those scopes causes trouble.
> > >
> > > 4. Continuing with the use of Traversal as arguments, let's go back to:
> > >
> > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> 'NY'),
> > >                         [active: true])
> > >
> > > I originally had it in my mind to prefer this approach over a Map for
> the
> > > search as it provides a great deal of flexibility and fits the common
> > > filtering patterns that users follow really well. I suppose the
> question
> > is
> > > what to do in the situation of "create". Would we extract all the
> strict
> > > equality filter values from the initial has(), thus, in the example,
> > > "state" but not "age", and create a vertex that has "state=NY,
> > > active=true"? That is possible with how things are designed in Gremlin
> > > today I think.
> > >
> > > This syntax does create some conflict with the subject of 3 and streams
> > > because the traverser in the streams case is the incoming Map but for
> > this
> > > context it's meant as a filter of V(). Getting too whacky?
> > >
> > > 5. I like the idea of using a side-effect to capture whether the
> element
> > > was created or not. that makes sense. I think that can work.
> > >
> > > ==============================
> > > = gist contents
> > > ==============================
> > >
> > > # API
> > >
> > > ```java
> > > upsertV(String label, Map matchOrCreateProperties)
> > > upsertV(String label, Map matchOrCreateProperties, Map
> > > additionalProperties)
> > > upsertV(Traversal label, Traversal matchOrCreateProperties)
> > > upsertV(Traversal label, Traversal matchOrCreateProperties, Traversal
> > > additionalProperties)
> > > upsertV(...).
> > >   with(WithOptions.sideEffectLabel, 'a')
> > >
> > > upsertE(String label, Map matchOrCreateProperties)
> > > upsertE(Traversal label, Traversal matchOrCreateProperties)
> > > upsertE(String label, Map matchOrCreateProperties, Map
> > > additionalProperties)
> > > upsertE(Traversal label, Traversal matchOrCreateProperties, Traversal
> > > additionalProperties)
> > > upsertE(...).
> > >   from(Vertex incidentOut).to(Vertex incidentIn)
> > >   with(WithOptions.sideEffectLabel, 'a')
> > > ```
> > >
> > > # Examples
> > >
> > > ## upsert
> > >
> > > ```groovy
> > > g.upsertV('person', [name: 'marko'],
> > >                     [name: 'marko', age: 29])
> > > ```
> > >
> > > ## upsert with id
> > >
> > > ```groovy
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1, name: 'Marko'])
> > > ```
> > >
> > > ## upsert with Meta Properties
> > >
> > > ```groovy
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > > ```
> > >
> > > ## upsert with Multi Properties
> > >
> > > ```groovy
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1,lang: ['java', 'scala'])
> > > ```
> > >
> > > ## upsert with Single Cardinality
> > >
> > > ```groovy
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1, lang: 'java')
> > > ```
> > >
> > > ## upsert with List Cardinality
> > >
> > > ```groovy
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1, lang: ['java', 'scala', 'java'])
> > > ```
> > >
> > > ## upsert with Set Cardinality
> > >
> > > ```groovy
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1, lang: ['java','scala'])
> > > ```
> > >
> > > ## upsert with List Cardinality - and add value to list
> > >
> > > ```groovy
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1, lang: ['java','scala','java']).
> > >   property(Cardinality.list, 'lang', 'java')
> > > ```
> > >
> > > ## upsert with stream
> > >
> > > ```groovy
> > > // doesn't work with today's Gremlin semantics
> > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > >          [(id): 2, (label): 'person', name: 'josh']).
> > >   upsertV(values(label), valueMap(id), valueMap())
> > >
> > > // the following would be more in line with what's possible with
> existing
> > > Gremlin semantics:
> > > g.inject([id: 1, label: 'person', name: 'marko'],
> > >          [id: 2, label: 'person', name: 'josh']).
> > >   upsertV(select('label'), select('id'), select('id','label','name'))
> > > ```
> > >
> > > ## upsert with reporting added or updated side effect
> > >
> > > ```groovy
> > > g.upsertV('person', [name: 'marko'],
> > >                     [name: 'marko', age: 29]).
> > >     with(WithOptions.sideEffectLabel, 'a').
> > >   project('vertex', 'added').
> > >     by().
> > >     by(cap('a'))
> > > ```
> > >
> > > ## upsert Edge assuming a self-relation of "marko"
> > >
> > > ```groovy
> > > g.V().has('person','name','marko').
> > >   upsertE( 'self', [weight:0.5])
> > > ```
> > >
> > > ## upsert Edge with incident vertex checks
> > >
> > > ```groovy
> > > g.upsertE('knows', [weight:0.5]).
> > >    from(V(1)).to(V(2))
> > > g.V().has('person','name','marko').
> > >   upsertE( 'knows', [weight:0.5]).
> > >     to(V(2))
> > > ```
> > >
> > > ## upsert using a Traversal for the match
> > >
> > > ```groovy
> > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> 'NY'),
> > >                         [active: true])
> > > ```
> > >
> > >
> > >
> > >
> > > On Mon, Sep 28, 2020 at 6:49 PM David Bechberger <da...@bechberger.com>
> > > wrote:
> > >
> > > > So, I've been doing some additional thinking about the ways that this
> > > could
> > > > work based on the comments below and have put my comments inline.
> > > >
> > > > Dave
> > > >
> > > > On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <
> spmallette@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > I added some thoughts inline below:
> > > > >
> > > > > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <
> > dave@bechberger.com>
> > > > > wrote:
> > > > >
> > > > > > Thanks for the detailed comments Stephen.  I have addressed them
> > > inline
> > > > > > below.
> > > > > >
> > > > > > I did read the proposal from earlier and I think that we are in
> > close
> > > > > > agreement with what we are trying to accomplish.  I also fully
> > > support
> > > > > > Josh's comment on providing a mechanism for submitting a map of
> > > > > properties
> > > > > > as manually unrolling this all right now leads to a lot of
> > potential
> > > > for
> > > > > > error and a long messy traversal.
> > > > > >
> > > > > > I'm looking forward to this discussion on how to merge these two
> > > > > proposals.
> > > > > >
> > > > > >
> > > > > > 1. How would multi/meta-properties fit into the API you've
> > proposed?
> > > > > >
> > > > > > My first thought here is that multi-properties would be
> represented
> > > as
> > > > > > lists in the map, e.g.
> > > > > >
> > > > > > {names: ['Dave', 'David']}
> > > > > >
> > > > > > and meta-properties would be represented as maps in the maps,
> e.g.
> > > > > >
> > > > > > {name: {first: 'Dave', last: 'Bechberger'}}
> > > > > >
> > > > > > I can't say I've thought through all the implications of this
> > though
> > > so
> > > > > it
> > > > > > is an area we would need to explore.
> > > > > >
> > > > >
> > > > > The first implication that comes to mind is that it makes the
> > > assumption
> > > > > the user wants multi/meta-properties as opposed to a single
> > cardinality
> > > > of
> > > > > List in the first case and a Map as a property value in the second
> > > case.
> > > > I
> > > > > suppose that graphs with a schema could resolve those assumptions
> but
> > > > > graphs that are schemaless would have a problem. The issue could be
> > > > > resolved by specialized configuration of "g" or per merge() step
> > using
> > > a
> > > > > with() modulator I suppose but that goes into a yet another level
> of
> > > > > implications to consider. I've often wondered if the start point
> for
> > > > > getting types/schema into TP3 without a full rewrite would be in
> this
> > > > form
> > > > > where Gremlin would be given hints as to what to expect as to the
> > types
> > > > of
> > > > > data it might encounter while traversing. Anyway, I'd be hesitant
> to
> > go
> > > > > down paths that don't account for multi/metaproperties well.  They
> > are
> > > > > first class citizens in TP3 (with those hoping for extension of at
> > > least
> > > > > multiproperties to edges) and while I find them a constant
> annoyance
> > > for
> > > > so
> > > > > many reasons, we're kinda stuck with them.
> > > > >
> > > >
> > > > I agree we need to account for multi/meta properties as 1st class
> > > > citizens.  This is my current thinking on the syntax for the
> situations
> > > we
> > > > have laid out so far:
> > > >
> > > > *Merge*
> > > > g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age': 29})
> > > >
> > > >
> > > > *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name':
> 'Marko'})
> > > >
> > > >
> > > > *Merge with Meta Properties* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > 'name':
> > > > {'first': Marko', 'last': 'R'})
> > > >
> > > >
> > > > *Merge with Multi Properties * g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'lang': ['java', 'scala'])
> > > >
> > > >
> > > > *Merge with Single Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'lang': 'java')
> > > >
> > > >
> > > > *Merge with List Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > 'lang':
> > > > ['java', 'scala', 'java'])
> > > >
> > > >
> > > > *Merge with Set Cardinality * g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > 'lang':
> > > > ['java', 'scala'])
> > > >
> > > > Since in a mergeV() scenario we are only ever adding whatever values
> > are
> > > > passed in there would be no need to specify the cardinality of the
> > > property
> > > > being added.  If they wanted to add a value to an existing property
> > then
> > > > the current property() method would still be available on the output
> > > > traversal. e.g.
> > > > g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java', 'scala',
> > > > 'java']).property(Cardinality.list, 'lang, 'java')
> > > >
> > > >
> > > > *Merge with stream * g.inject([{id: 1, label: 'person', 'name':
> > 'marko'},
> > > > {id: 2, label: 'person', 'name': 'josh'}]).
> > > >   mergeV(values('label'), valueMap('id'), valueMap())
> > > >
> > > >
> > > > *Merge with reporting added or updated side effect* g.mergeV('name',
> > > > {'name': 'marko'}, {'name': 'marko', 'age': 29}).
> > > >   with(WithOptions.sideEffectLabel, 'a').
> > > >   project('vertex', 'added').
> > > >     by().
> > > >     by(cap('a'))
> > > >
> > > >
> > > >
> > > > >
> > > > >
> > > > > > 2. How would users set the T.id on creation? would that T.id just
> > be
> > > a
> > > > > key
> > > > > > in the first Map argument?
> > > > > >
> > > > > > Yes, I was thinking that T.id would be the key name, e.g.:
> > > > > >
> > > > > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
> > > > > >
> > > > >
> > > > > ok - I can't say I see a problem with that atm.
> > > > >
> > > > >
> > > > > > 3. I do like the general idea of a match on multiple properties
> for
> > > the
> > > > > > first argument as a convenience but wonder about the specificity
> of
> > > > this
> > > > > > API a bit as it focuses heavily on equality - I suppose that's
> most
> > > > cases
> > > > > > for get-or-create, so perhaps that's ok.
> > > > > >
> > > > > > In most cases I've seen use exact matches on the vertex or
> edge.  I
> > > > think
> > > > > > it might be best to keep this straightforward as any complex edge
> > > cases
> > > > > > still can perform the same functionality using the coalesce()
> > > pattern.
> > > > > >
> > > > >
> > > > > I played around with the idea to use modulators to apply additional
> > > > > constraints:
> > > > >
> > > > > g.mergeV('person', {'state':'NY'}, {'active': true}).
> > > > >     by(has('age', gt(29))
> > > > >
> > > > > I suppose that's neat but maybe not...we could easily get the same
> > > thing
> > > > > from:
> > > > >
> > > > > g.V().has('person','age',gt(29)).
> > > > >   mergeV('person', {'state':'NY'}, {'active': true}).
> > > > >
> > > > > which is more readable though it does make it harder for providers
> to
> > > > > optimize upsert operations if they could make use of the has() as
> > part
> > > of
> > > > > that. I think I like has() as part of the main query rather than
> in a
> > > > by()
> > > > > - just thinking out loud.
> > > > >
> > > > >
> > > > Another potential option here would be to allow the second parameter
> of
> > > > mergeV() accept a traversal.  Not sure how much complexity that would
> > add
> > > > though
> > > >
> > > > g.V().mergeV('person', __.has('person','age',gt(29)).has('state',
> > 'NY'}.,
> > > > {'active': true}).
> > > >
> > > >
> > > > >
> > > > > > 4. I think your suggestion points to one of the troubles Gremlin
> > has
> > > > > which
> > > > > > we see with "algorithms" - extending the language with new steps
> > that
> > > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > > shortestPath() step) pollutes the core language a bit, hence my
> > > > > > generalization of "merging" in my link above which fits into the
> > core
> > > > > > Gremlin language style. There is a bigger picture where we are
> > > missing
> > > > > > something in Gremlin that lets us extend the language in ways
> that
> > > let
> > > > us
> > > > > > easily introduce new steps that aren't for general purpose. This
> > > issue
> > > > is
> > > > > > discussed in terms of "algorithms" here:
> > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think
> > we
> > > > > could
> > > > > > see how there might be some "mutation" extension steps that would
> > > cover
> > > > > > your suggested API, plus batch operations, etc. We need a way to
> > add
> > > > > > "sugar" without it interfering with the consistency of the core.
> > > > > Obviously
> > > > > > this is a bigger issue but perhaps important to solve to
> implement
> > > > steps
> > > > > in
> > > > > > the fashion you describe.
> > > > > >
> > > > > > I agree that the "algorithm" steps do seem a bit odd in the core
> > > > language
> > > > > > but I think the need is there.  I'd be interested in furthering
> > this
> > > > > > discussion but I think these "pattern" steps may or may not be
> the
> > > same
> > > > > as
> > > > > > the algorithm steps.  I'll have to think on that.
> > > > >
> > > > >
> > > > >
> > > > > > 5. I suppose that the reason for mergeE and mergeV is to specify
> > what
> > > > > > element type the first Map argument should be applied to? what
> > about
> > > > > > mergeVP (i.e. vertex property as it too is an element) ? That's
> > > tricky
> > > > > but
> > > > > > I don't think we should miss that. Perhaps merge() could be a
> > > "complex
> > > > > > modulator"?? that's a new concept of course, but you would do
> > > > > g.V().merge()
> > > > > > and the label and first Map would fold to VertexStartStep (i.e.
> > V())
> > > > for
> > > > > > the lookup and then a MergeStep would follow - thus a "complex
> > > > modulator"
> > > > > > as it does more than just change the behavior of the previous
> step
> > -
> > > it
> > > > > > also adds its own. I suppose it could also add has() steps
> followed
> > > by
> > > > > the
> > > > > > MergeStep and then the has() operations would fold in normally as
> > > they
> > > > do
> > > > > > today. In this way, we can simplify to just one single
> > > > > > merge(String,Map,Map). ??
> > > > > >
> > > > > > I agree that we should also think about how to include properties
> > in
> > > > this
> > > > > > merge construct.  The reason I was thinking about mergeV() and
> > > mergeE()
> > > > > is
> > > > > > that it follows the same pattern as the already well understood
> > > > > > addV()/addE() steps.  I am a bit wary of trying to generalize
> this
> > > down
> > > > > to
> > > > > > a single merge() step as these sorts of complex overloads make it
> > > hard
> > > > to
> > > > > > figure out which Gremlin step you should use for a particular
> > pattern
> > > > > (e.g.
> > > > > > upsert vertex or upsert edge).  One thing that I think customers
> > find
> > > > > > useful about the addV/addE step is that they are very
> discoverable,
> > > the
> > > > > > name tells me what functionality to expect from that step.
> > > > > >
> > > > >
> > > > > I'm starting to agree with the idea that we have to do something
> like
> > > > > mergeV()/E() as it wouldn't quite work as a start step otherwise.
> We
> > > > > couldn't do:
> > > > >
> > > > > g.merge()
> > > > >
> > > > > as we wouldnt know what sort of Element it applied to. If so, I
> > wonder
> > > if
> > > > > it would be better to preserve "merge" for my more general use-case
> > and
> > > > > prefer upsertV()/E()?
> > > > >
> > > > > Also, perhaps mergeVP() doesn't need to exist as perhaps it is an
> > > > uncommon
> > > > > case (compared to V/E()) and we don't really have addVP() as an
> > > analogous
> > > > > step. Perhaps existing coalesce() patterns and/or my more general
> > > purpose
> > > > > merge() step would satisfy those situations??
> > > > >
> > > >
> > > > I think that a mergeVP() type step may require a different pattern
> > since
> > > it
> > > > would really need to accept a map of values otherwise it would
> > > essentially
> > > > perform the same function as property()
> > > >
> > > >
> > > > >
> > > > >
> > > > > > 6. One thing neither my approach nor yours seems to do is tell
> the
> > > user
> > > > > if
> > > > > > they created something or updated something - that's another
> thing
> > > I've
> > > > > > seen users want to have in get-or-create. Here again we go deeper
> > > into
> > > > a
> > > > > > less general step specification as alluded to in 4, but a merge()
> > > step
> > > > as
> > > > > > proposed in 5, might return [Element,boolean] so as to provide an
> > > > > indicator
> > > > > > of creation?
> > > > > >
> > > > > > Hmm, I'll have to think about that.  How would returning multiple
> > > > values
> > > > > > work if we want to chain these together. e.g. Add a vertex
> between
> > > two
> > > > > > edges but make sure the vertices exist?
> > > > > >
> > > > >
> > > >
> > > > Added a thought on how to accomplish this above that I'd like to get
> > > > thoughts on.
> > > >
> > > >
> > > > > >
> > > > > > 7. You were just introducing your ideas here, so perhaps you
> > haven't
> > > > > gotten
> > > > > > this far yet, but a shortcoming to doing merge(String,Map,Map) is
> > > that
> > > > it
> > > > > > leaves open no opportunity to stream a List of Maps to a merge()
> > for
> > > a
> > > > > form
> > > > > > of batch loading which is mighty common and one of the variations
> > of
> > > > the
> > > > > > coalesce() pattern that I alluded to at the start of all this. I
> > > think
> > > > > that
> > > > > > we would want to be sure that we left open the option to do that
> > > > somehow.
> > > > > > 8. If we had a general purpose merge() step I wonder if it makes
> > > > > developing
> > > > > > the API as you suggested easier to do?
> > > > > >
> > > > > > Hmm, let me think about that one.
> > > > > >
> > > > > >
> > > > > I will add an item 8 to think about which I didn't mention before:
> > > > >
> > > > > 8. The signature you suggested for mergeE() is:
> > > > >
> > > > > mergeE(String, Map, Map)
> > > > >      String - The edge label
> > > > >      Map (first) - The properties to match existing edge on
> > > > >      Map (second) - Any additional properties to set if a new edge
> is
> > > > > created (optional)
> > > > >
> > > > > but does that exactly answer the need? Typically the idea is to
> > detect
> > > an
> > > > > edge between two vertices not globally in the graph by way of some
> > > > > properties. This signature doesn't seem to allow for that as it
> > doesn't
> > > > > allow specification of the vertices to test against. Perhaps the
> > answer
> > > > is
> > > > > to use from() and to() modulators?
> > > > >
> > > > > g.mergeE('knows', {'weight':0.5}).
> > > > >   from(V(1)).to(V(2))
> > > > > g.V().has('person','name','marko').
> > > > >   mergeE( 'knows', {'weight':0.5}).
> > > > >     to(V(2))
> > > > > g.V().has('person','name','marko').
> > > > >   mergeE( 'self', {'weight':0.5})
> > > > >
> > > > > That seems to work?
> > > > >
> > > >
> > > > I think that is an elegant solution to that problem.  I like it and
> it
> > > > keeps in line with the way that addE() works.
> > > >
> > > >
> > > > >
> > > > >
> > > > >
> > > > > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <
> > > spmallette@gmail.com
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > I like our coalesce() pattern but it is verbose and over time
> it
> > > has
> > > > > gone
> > > > > > > from a simple pattern to one with numerous variations for all
> > > manner
> > > > of
> > > > > > > different sorts of merge-like operations. As such, I do think
> we
> > > > should
> > > > > > > introduce something to cover this pattern.
> > > > > > >
> > > > > > > I like that you used the word "merge" in your description of
> this
> > > as
> > > > it
> > > > > > is
> > > > > > > the word I've liked using. You might want to give a look at my
> > > > proposed
> > > > > > > merge() step from earlier in the year:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> > > > > > >
> > > > > > > I'm just going to dump thoughts as they come regarding what you
> > > > wrote:
> > > > > > >
> > > > > > > 1. How would multi/meta-properties fit into the API you've
> > > proposed?
> > > > > > > 2. How would users set the T.id on creation? would that T.id
> just
> > > be
> > > > a
> > > > > > key
> > > > > > > in the first Map argument?
> > > > > > > 3. I do like the general idea of a match on multiple properties
> > for
> > > > the
> > > > > > > first argument as a convenience but wonder about the
> specificity
> > of
> > > > > this
> > > > > > > API a bit as it focuses heavily on equality - I suppose that's
> > most
> > > > > cases
> > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > > 4. I think your suggestion points to one of the troubles
> Gremlin
> > > has
> > > > > > which
> > > > > > > we see with "algorithms" - extending the language with new
> steps
> > > that
> > > > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > > > shortestPath() step) pollutes the core language a bit, hence my
> > > > > > > generalization of "merging" in my link above which fits into
> the
> > > core
> > > > > > > Gremlin language style. There is a bigger picture where we are
> > > > missing
> > > > > > > something in Gremlin that lets us extend the language in ways
> > that
> > > > let
> > > > > us
> > > > > > > easily introduce new steps that aren't for general purpose.
> This
> > > > issue
> > > > > is
> > > > > > > discussed in terms of "algorithms" here:
> > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I
> think
> > > we
> > > > > > could
> > > > > > > see how there might be some "mutation" extension steps that
> would
> > > > cover
> > > > > > > your suggested API, plus batch operations, etc. We need a way
> to
> > > add
> > > > > > > "sugar" without it interfering with the consistency of the
> core.
> > > > > > Obviously
> > > > > > > this is a bigger issue but perhaps important to solve to
> > implement
> > > > > steps
> > > > > > in
> > > > > > > the fashion you describe.
> > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> specify
> > > what
> > > > > > > element type the first Map argument should be applied to? what
> > > about
> > > > > > > mergeVP (i.e. vertex property as it too is an element) ? That's
> > > > tricky
> > > > > > but
> > > > > > > I don't think we should miss that. Perhaps merge() could be a
> > > > "complex
> > > > > > > modulator"?? that's a new concept of course, but you would do
> > > > > > g.V().merge()
> > > > > > > and the label and first Map would fold to VertexStartStep (i.e.
> > > V())
> > > > > for
> > > > > > > the lookup and then a MergeStep would follow - thus a "complex
> > > > > modulator"
> > > > > > > as it does more than just change the behavior of the previous
> > step
> > > -
> > > > it
> > > > > > > also adds its own. I suppose it could also add has() steps
> > followed
> > > > by
> > > > > > the
> > > > > > > MergeStep and then the has() operations would fold in normally
> as
> > > > they
> > > > > do
> > > > > > > today. In this way, we can simplify to just one single
> > > > > > > merge(String,Map,Map). ??
> > > > > > > 6. One thing neither my approach nor yours seems to do is tell
> > the
> > > > user
> > > > > > if
> > > > > > > they created something or updated something - that's another
> > thing
> > > > I've
> > > > > > > seen users want to have in get-or-create. Here again we go
> deeper
> > > > into
> > > > > a
> > > > > > > less general step specification as alluded to in 4, but a
> merge()
> > > > step
> > > > > as
> > > > > > > proposed in 5, might return [Element,boolean] so as to provide
> an
> > > > > > indicator
> > > > > > > of creation?
> > > > > > > 7. You were just introducing your ideas here, so perhaps you
> > > haven't
> > > > > > gotten
> > > > > > > this far yet, but a shortcoming to doing merge(String,Map,Map)
> is
> > > > that
> > > > > it
> > > > > > > leaves open no opportunity to stream a List of Maps to a
> merge()
> > > for
> > > > a
> > > > > > form
> > > > > > > of batch loading which is mighty common and one of the
> variations
> > > of
> > > > > the
> > > > > > > coalesce() pattern that I alluded to at the start of all this.
> I
> > > > think
> > > > > > that
> > > > > > > we would want to be sure that we left open the option to do
> that
> > > > > somehow.
> > > > > > > 8. If we had a general purpose merge() step I wonder if it
> makes
> > > > > > developing
> > > > > > > the API as you suggested easier to do?
> > > > > > >
> > > > > > > I think I'd like to solve the problems you describe in your
> post
> > as
> > > > > well
> > > > > > as
> > > > > > > the ones in mine. There is some relation there, but gaps as
> well.
> > > > With
> > > > > > more
> > > > > > > discussion here we can figure something out.
> > > > > > >
> > > > > > > Thanks for starting this talk - good one!
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <
> > > > dave@bechberger.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I've had a few on and off discussions with a few people here,
> > so
> > > I
> > > > > > wanted
> > > > > > > > to send this out to everyone for feedback.
> > > > > > > >
> > > > > > > > What are people's thoughts on creating a new set of steps
> that
> > > > codify
> > > > > > > > common Gremlin best practices?
> > > > > > > >
> > > > > > > > I think there are several common Gremlin patterns where users
> > > would
> > > > > > > benefit
> > > > > > > > from the additional guidance that these codified steps
> > represent.
> > > > > The
> > > > > > > > first one I would recommend though is codifying the element
> > > > existence
> > > > > > > > pattern into a single Gremlin step, something like:
> > > > > > > >
> > > > > > > > mergeV(String, Map, Map)
> > > > > > > >      String - The vertex label
> > > > > > > >      Map (first) - The properties to match existing vertices
> on
> > > > > > > >      Map (second) - Any additional properties to set if a new
> > > > vertex
> > > > > is
> > > > > > > > created (optional)
> > > > > > > > mergeE(String, Map, Map)
> > > > > > > >      String - The edge label
> > > > > > > >      Map (first) - The properties to match existing edge on
> > > > > > > >      Map (second) - Any additional properties to set if a new
> > > edge
> > > > is
> > > > > > > > created (optional)
> > > > > > > >
> > > > > > > > In each of these cases these steps would perform the same
> > upsert
> > > > > > > > functionality as the element existence pattern.
> > > > > > > >
> > > > > > > > Example:
> > > > > > > >
> > > > > > > > g.V().has('person','name','stephen').
> > > > > > > >            fold().
> > > > > > > >            coalesce(unfold(),
> > > > > > > >                     addV('person').
> > > > > > > >                       property('name','stephen').
> > > > > > > >                       property('age',34))
> > > > > > > >
> > > > > > > > would become:
> > > > > > > >
> > > > > > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > > > > > > >
> > > > > > > > I think that this change would be a good addition to the
> > language
> > > > for
> > > > > > > > several reasons:
> > > > > > > >
> > > > > > > > * This codifies the best practice for a specific
> action/recipe,
> > > > which
> > > > > > > > reduces the chance that someone uses the pattern incorrectly
> > > > > > > > * Most complex Gremlin traversals are verbose.  Reducing the
> > > amount
> > > > > of
> > > > > > > code
> > > > > > > > that needs to be written and maintained allows for a better
> > > > developer
> > > > > > > > experience.
> > > > > > > > * It will lower the bar of entry for a developer by making
> > these
> > > > > > actions
> > > > > > > > more discoverable.  The more we can help bring these patterns
> > to
> > > > the
> > > > > > > > forefront of the language via these pattern/meta steps the
> more
> > > we
> > > > > > guide
> > > > > > > > users towards writing better Gremlin faster
> > > > > > > > * This allows DB vendors to optimize for this pattern
> > > > > > > >
> > > > > > > > I know that this would likely be the first step in Gremlin
> that
> > > > > > codifies
> > > > > > > a
> > > > > > > > pattern, so I'd like to get other's thoughts on this?
> > > > > > > >
> > > > > > > > Dave
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by Stephen Mallette <sp...@gmail.com>.
ah - regarding 2, ok i couldn't remember and i didn't notice where we
talked about that when I scanned the thread. I've updated the gist
accordingly:

https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31





On Tue, Dec 8, 2020 at 10:51 AM David Bechberger <da...@bechberger.com>
wrote:

> For #1 and 3 I agree.  For #2 I thought that we agreed that the properties
> within the upsert were going to be for creation only and that any
> properties that needed to be updated would be done via the standard
> property() syntax after the upsert.  e.g.
>
> g.upsertV('person', [name: 'marko'],
>                     [age: 29]).property('lang', 'java')
>
> If it does not exist then this would create a vertex with the 'name',
> 'age', and 'lang' properties.  If it does exist then it would just update
> the 'lang' property.
>
> Dave
>
> On Tue, Dec 8, 2020 at 5:57 AM Stephen Mallette <sp...@gmail.com>
> wrote:
>
> > I started a expanded this discussion to gremlin-users for a wider
> audience
> > and the thread is starting to grow:
> >
> > https://groups.google.com/g/gremlin-users/c/QBmiOUkA0iI/m/pj5Ukiq6AAAJ
> >
> > I guess we'll need to summarize that discussion back here now....
> >
> > I did have some more thoughts to hang out there and figured that I
> wouldn't
> > convolute the discussion on gremlin-users with it so I will continue the
> > discussion here.
> >
> > 1, The very first couple of examples seem wrong (or at least not best
> > demonstrating the usage):
> >
> > g.upsertV('person', [name: 'marko'],
> >                     [name: 'marko', age: 29])
> > g.upsertV('person', [(T.id): 1],
> >                     [(T.id): 1, name: 'Marko'])
> >
> > should instead be:
> >
> > g.upsertV('person', [name: 'marko'],
> >                     [age: 29])
> > g.upsertV('person', [(T.id): 1],
> >                     [name: 'Marko'])
> >
> > 2. I can't recall if we made this clear anywhere but in situations where
> we
> > "get" rather than "create" do the additional properties act in an update
> > fashion to the element that was found? I think I've been working on the
> > assumption that it would, though perhaps that is not always desirable?
> >
> > 3. We really never settled up how to deal with multi/meta-properties.
> That
> > story should be clear so that when we document upsert() we include the
> > approaches for the fallback patterns that don't meet the 90% of use cases
> > we are targeting and I sense that we're saying that meta/multi-properties
> > don't fit in that bucket. So with that in mind, I don't think that the
> > following works for metaproperties:
> >
> > g.upsertV('person', [(T.id): 1],
> >                     [name:[acl: 'public'])
> >
> > as it doesn't let us set the value for the "name" just the pairs for the
> > meta-properties. I guess a user would have to fall back to:
> >
> > g.upsertV('person', [(T.id): 1]).
> >   property('name','marko','acl','public')
> >
> > // or use the additional properties syntax
> > g.upsertV('person', [(T.id): 1],[name:'marko']).
> >   properties('name').property('acl','public')
> >
> > // or if there were multi-properties then maybe...
> > g.upsertV('person', [(T.id): 1],[name:'marko']).
> >   properties('name').hasValue('marko').property('acl','public')
> >
> > As for multi-properties, I dont think we should assume that a List object
> > should be interpreted by Gremlin as a multi-property. Perhaps we just
> rely
> > on the underlying graph to properly deal with that given a schema or the
> > user falls back to:
> >
> > // this ends up as however the graph deals with a List object
> > g.upsertV('person', [(T.id): 1], [lang: ['java', 'scala', 'java'])
> >
> > // this is explicit
> > g.upsertV('person', [(T.id): 1]).
> >   property(list, 'lang', 'java').
> >   property(list, 'lang, 'scala').
> >   property(list, 'lang', 'java')
> >
> > If that makes sense to everyone, I will update the gist.
> >
> >
> >
> >
> >
> >
> >
> > On Tue, Oct 27, 2020 at 11:51 PM David Bechberger <da...@bechberger.com>
> > wrote:
> >
> > > Hello Stephen,
> > >
> > > Thanks for making that gist, its much easier to follow along with the
> > > proposed syntax there.  To your specific comments:
> > >
> > > #1 - My only worry with the term upsert is that it is not as widely a
> > used
> > > term for this sort of pattern as "Merge" (e.g. SQL, Cypher).  However I
> > > don't have a strong opinion on this, so I am fine with either.
> > > #2 - My only real objective here is to make sure that we make the
> 80-90%
> > > case easy and straightforward.  I think that having the fallback option
> > of
> > > using the current syntax for any complicated edge cases should be
> > > considered here as well. I'd appreciate your thoughts here as these are
> > > good points you bring up that definitely fall into the 80-90% use case.
> > > #3 - Those points make sense to me, not sure I have anything further to
> > add
> > > #4 - I don't think I would expect to have the values extracted from the
> > > traversal filters but I'd be interested in other opinions on that.
> > >
> > > Thanks,
> > > Dave
> > >
> > > On Thu, Oct 15, 2020 at 5:30 AM Stephen Mallette <spmallette@gmail.com
> >
> > > wrote:
> > >
> > > > It's been a couple weeks since we moved this thread so I went back
> > > through
> > > > it and refreshed my mind with what's there. Dave, nice job putting
> all
> > > > those examples together. I've recollected them all together and have
> > just
> > > > been reviewing them in a more formatted style with proper Groovy
> syntax
> > > > than what can be accomplished in this mailing list. Anyone who cares
> to
> > > > review in that form can do so here (i've pasted its markdown contents
> > at
> > > > the bottom of this post as well):
> > > >
> > > > https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
> > > >
> > > > In this light, I have the following comments and thoughts:
> > > >
> > > > 1. I feel more inclined toward saving "merge" for a more generalized
> > step
> > > > of that name and have thus renamed the examples to use "upsert".
> > > >
> > > > 2. I still don't quite feel comfortable with meta/multi-properties
> > > > examples. Perhaps you could explain further as it's possible I'm not
> > > > thinking things through properly. For meta-properties the example is:
> > > >
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > > >
> > > > So I see that "name" takes a Map here, but how does Gremlin know if
> > that
> > > > means "meta-property" or "Map value" and if the former, then how do
> you
> > > set
> > > > the value of "name"?  My question is similar to the multi-property
> > > examples
> > > > - using this one:
> > > >
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1,lang: ['java', 'scala'])
> > > >
> > > > How does Gremlin know if the user means that "lang" should be a
> > > > "multi-property" with Cardinality.list (or Cardinality.set for that
> > > matter)
> > > > or a value of Cardinality.single with a List/Array value? I can
> perhaps
> > > > suggest some solutions here but wanted to make sure I wasn't just
> > > > misunderstanding something.
> > > >
> > > > 3. The "upsert with stream" example is:
> > > >
> > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > >   upsertV(values(label), valueMap(id), valueMap())
> > > >
> > > > That would establish an API of upsertV(Traversal, Traversal,
> Traversal)
> > > > which would enable your final thought of your last post with:
> > > >
> > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> > 'NY'),
> > > >                         [active: true])
> > > >
> > > > The open-ended nature of that is neat but comes with some complexity
> > with
> > > > steps that traditionally take an "open" Traversal (and this step now
> > has
> > > up
> > > > to three of them) - perhaps, that's our own fault in some ways. For
> the
> > > > "stream" concept, values()/valueMap() won't immediately work that way
> > as
> > > > they only apply to Element...given today's Gremlin semantics, I think
> > > we'd
> > > > re-write that as:
> > > >
> > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > >   upsertV(select(label), select(id), select(id,label,'name'))
> > > >
> > > > Though, even that doesn't quite work because you can't select(T)
> right
> > > now
> > > > which means you'd need to go with:
> > > >
> > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > >          [id: 2, label: 'person', name: 'josh']).
> > > >   upsertV(select('label'), select('id'), select('id','label','name'))
> > > >
> > > > not terrible in a way and perhaps we could fix select(T) - can't
> > remember
> > > > why that isn't allowed except that select() operates over multiple
> > scopes
> > > > and perhaps trying T through those scopes causes trouble.
> > > >
> > > > 4. Continuing with the use of Traversal as arguments, let's go back
> to:
> > > >
> > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> > 'NY'),
> > > >                         [active: true])
> > > >
> > > > I originally had it in my mind to prefer this approach over a Map for
> > the
> > > > search as it provides a great deal of flexibility and fits the common
> > > > filtering patterns that users follow really well. I suppose the
> > question
> > > is
> > > > what to do in the situation of "create". Would we extract all the
> > strict
> > > > equality filter values from the initial has(), thus, in the example,
> > > > "state" but not "age", and create a vertex that has "state=NY,
> > > > active=true"? That is possible with how things are designed in
> Gremlin
> > > > today I think.
> > > >
> > > > This syntax does create some conflict with the subject of 3 and
> streams
> > > > because the traverser in the streams case is the incoming Map but for
> > > this
> > > > context it's meant as a filter of V(). Getting too whacky?
> > > >
> > > > 5. I like the idea of using a side-effect to capture whether the
> > element
> > > > was created or not. that makes sense. I think that can work.
> > > >
> > > > ==============================
> > > > = gist contents
> > > > ==============================
> > > >
> > > > # API
> > > >
> > > > ```java
> > > > upsertV(String label, Map matchOrCreateProperties)
> > > > upsertV(String label, Map matchOrCreateProperties, Map
> > > > additionalProperties)
> > > > upsertV(Traversal label, Traversal matchOrCreateProperties)
> > > > upsertV(Traversal label, Traversal matchOrCreateProperties, Traversal
> > > > additionalProperties)
> > > > upsertV(...).
> > > >   with(WithOptions.sideEffectLabel, 'a')
> > > >
> > > > upsertE(String label, Map matchOrCreateProperties)
> > > > upsertE(Traversal label, Traversal matchOrCreateProperties)
> > > > upsertE(String label, Map matchOrCreateProperties, Map
> > > > additionalProperties)
> > > > upsertE(Traversal label, Traversal matchOrCreateProperties, Traversal
> > > > additionalProperties)
> > > > upsertE(...).
> > > >   from(Vertex incidentOut).to(Vertex incidentIn)
> > > >   with(WithOptions.sideEffectLabel, 'a')
> > > > ```
> > > >
> > > > # Examples
> > > >
> > > > ## upsert
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [name: 'marko'],
> > > >                     [name: 'marko', age: 29])
> > > > ```
> > > >
> > > > ## upsert with id
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, name: 'Marko'])
> > > > ```
> > > >
> > > > ## upsert with Meta Properties
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > > > ```
> > > >
> > > > ## upsert with Multi Properties
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1,lang: ['java', 'scala'])
> > > > ```
> > > >
> > > > ## upsert with Single Cardinality
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: 'java')
> > > > ```
> > > >
> > > > ## upsert with List Cardinality
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: ['java', 'scala', 'java'])
> > > > ```
> > > >
> > > > ## upsert with Set Cardinality
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: ['java','scala'])
> > > > ```
> > > >
> > > > ## upsert with List Cardinality - and add value to list
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: ['java','scala','java']).
> > > >   property(Cardinality.list, 'lang', 'java')
> > > > ```
> > > >
> > > > ## upsert with stream
> > > >
> > > > ```groovy
> > > > // doesn't work with today's Gremlin semantics
> > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > >   upsertV(values(label), valueMap(id), valueMap())
> > > >
> > > > // the following would be more in line with what's possible with
> > existing
> > > > Gremlin semantics:
> > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > >          [id: 2, label: 'person', name: 'josh']).
> > > >   upsertV(select('label'), select('id'), select('id','label','name'))
> > > > ```
> > > >
> > > > ## upsert with reporting added or updated side effect
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [name: 'marko'],
> > > >                     [name: 'marko', age: 29]).
> > > >     with(WithOptions.sideEffectLabel, 'a').
> > > >   project('vertex', 'added').
> > > >     by().
> > > >     by(cap('a'))
> > > > ```
> > > >
> > > > ## upsert Edge assuming a self-relation of "marko"
> > > >
> > > > ```groovy
> > > > g.V().has('person','name','marko').
> > > >   upsertE( 'self', [weight:0.5])
> > > > ```
> > > >
> > > > ## upsert Edge with incident vertex checks
> > > >
> > > > ```groovy
> > > > g.upsertE('knows', [weight:0.5]).
> > > >    from(V(1)).to(V(2))
> > > > g.V().has('person','name','marko').
> > > >   upsertE( 'knows', [weight:0.5]).
> > > >     to(V(2))
> > > > ```
> > > >
> > > > ## upsert using a Traversal for the match
> > > >
> > > > ```groovy
> > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> > 'NY'),
> > > >                         [active: true])
> > > > ```
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, Sep 28, 2020 at 6:49 PM David Bechberger <
> dave@bechberger.com>
> > > > wrote:
> > > >
> > > > > So, I've been doing some additional thinking about the ways that
> this
> > > > could
> > > > > work based on the comments below and have put my comments inline.
> > > > >
> > > > > Dave
> > > > >
> > > > > On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <
> > spmallette@gmail.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > I added some thoughts inline below:
> > > > > >
> > > > > > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <
> > > dave@bechberger.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks for the detailed comments Stephen.  I have addressed
> them
> > > > inline
> > > > > > > below.
> > > > > > >
> > > > > > > I did read the proposal from earlier and I think that we are in
> > > close
> > > > > > > agreement with what we are trying to accomplish.  I also fully
> > > > support
> > > > > > > Josh's comment on providing a mechanism for submitting a map of
> > > > > > properties
> > > > > > > as manually unrolling this all right now leads to a lot of
> > > potential
> > > > > for
> > > > > > > error and a long messy traversal.
> > > > > > >
> > > > > > > I'm looking forward to this discussion on how to merge these
> two
> > > > > > proposals.
> > > > > > >
> > > > > > >
> > > > > > > 1. How would multi/meta-properties fit into the API you've
> > > proposed?
> > > > > > >
> > > > > > > My first thought here is that multi-properties would be
> > represented
> > > > as
> > > > > > > lists in the map, e.g.
> > > > > > >
> > > > > > > {names: ['Dave', 'David']}
> > > > > > >
> > > > > > > and meta-properties would be represented as maps in the maps,
> > e.g.
> > > > > > >
> > > > > > > {name: {first: 'Dave', last: 'Bechberger'}}
> > > > > > >
> > > > > > > I can't say I've thought through all the implications of this
> > > though
> > > > so
> > > > > > it
> > > > > > > is an area we would need to explore.
> > > > > > >
> > > > > >
> > > > > > The first implication that comes to mind is that it makes the
> > > > assumption
> > > > > > the user wants multi/meta-properties as opposed to a single
> > > cardinality
> > > > > of
> > > > > > List in the first case and a Map as a property value in the
> second
> > > > case.
> > > > > I
> > > > > > suppose that graphs with a schema could resolve those assumptions
> > but
> > > > > > graphs that are schemaless would have a problem. The issue could
> be
> > > > > > resolved by specialized configuration of "g" or per merge() step
> > > using
> > > > a
> > > > > > with() modulator I suppose but that goes into a yet another level
> > of
> > > > > > implications to consider. I've often wondered if the start point
> > for
> > > > > > getting types/schema into TP3 without a full rewrite would be in
> > this
> > > > > form
> > > > > > where Gremlin would be given hints as to what to expect as to the
> > > types
> > > > > of
> > > > > > data it might encounter while traversing. Anyway, I'd be hesitant
> > to
> > > go
> > > > > > down paths that don't account for multi/metaproperties well.
> They
> > > are
> > > > > > first class citizens in TP3 (with those hoping for extension of
> at
> > > > least
> > > > > > multiproperties to edges) and while I find them a constant
> > annoyance
> > > > for
> > > > > so
> > > > > > many reasons, we're kinda stuck with them.
> > > > > >
> > > > >
> > > > > I agree we need to account for multi/meta properties as 1st class
> > > > > citizens.  This is my current thinking on the syntax for the
> > situations
> > > > we
> > > > > have laid out so far:
> > > > >
> > > > > *Merge*
> > > > > g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age': 29})
> > > > >
> > > > >
> > > > > *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name':
> > 'Marko'})
> > > > >
> > > > >
> > > > > *Merge with Meta Properties* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'name':
> > > > > {'first': Marko', 'last': 'R'})
> > > > >
> > > > >
> > > > > *Merge with Multi Properties * g.mergeV('name', {T.id: 1}, {T.id:
> 1,
> > > > > 'lang': ['java', 'scala'])
> > > > >
> > > > >
> > > > > *Merge with Single Cardinality* g.mergeV('name', {T.id: 1}, {T.id:
> 1,
> > > > > 'lang': 'java')
> > > > >
> > > > >
> > > > > *Merge with List Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'lang':
> > > > > ['java', 'scala', 'java'])
> > > > >
> > > > >
> > > > > *Merge with Set Cardinality * g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'lang':
> > > > > ['java', 'scala'])
> > > > >
> > > > > Since in a mergeV() scenario we are only ever adding whatever
> values
> > > are
> > > > > passed in there would be no need to specify the cardinality of the
> > > > property
> > > > > being added.  If they wanted to add a value to an existing property
> > > then
> > > > > the current property() method would still be available on the
> output
> > > > > traversal. e.g.
> > > > > g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java', 'scala',
> > > > > 'java']).property(Cardinality.list, 'lang, 'java')
> > > > >
> > > > >
> > > > > *Merge with stream * g.inject([{id: 1, label: 'person', 'name':
> > > 'marko'},
> > > > > {id: 2, label: 'person', 'name': 'josh'}]).
> > > > >   mergeV(values('label'), valueMap('id'), valueMap())
> > > > >
> > > > >
> > > > > *Merge with reporting added or updated side effect*
> g.mergeV('name',
> > > > > {'name': 'marko'}, {'name': 'marko', 'age': 29}).
> > > > >   with(WithOptions.sideEffectLabel, 'a').
> > > > >   project('vertex', 'added').
> > > > >     by().
> > > > >     by(cap('a'))
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > > 2. How would users set the T.id on creation? would that T.id
> just
> > > be
> > > > a
> > > > > > key
> > > > > > > in the first Map argument?
> > > > > > >
> > > > > > > Yes, I was thinking that T.id would be the key name, e.g.:
> > > > > > >
> > > > > > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
> > > > > > >
> > > > > >
> > > > > > ok - I can't say I see a problem with that atm.
> > > > > >
> > > > > >
> > > > > > > 3. I do like the general idea of a match on multiple properties
> > for
> > > > the
> > > > > > > first argument as a convenience but wonder about the
> specificity
> > of
> > > > > this
> > > > > > > API a bit as it focuses heavily on equality - I suppose that's
> > most
> > > > > cases
> > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > >
> > > > > > > In most cases I've seen use exact matches on the vertex or
> > edge.  I
> > > > > think
> > > > > > > it might be best to keep this straightforward as any complex
> edge
> > > > cases
> > > > > > > still can perform the same functionality using the coalesce()
> > > > pattern.
> > > > > > >
> > > > > >
> > > > > > I played around with the idea to use modulators to apply
> additional
> > > > > > constraints:
> > > > > >
> > > > > > g.mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > >     by(has('age', gt(29))
> > > > > >
> > > > > > I suppose that's neat but maybe not...we could easily get the
> same
> > > > thing
> > > > > > from:
> > > > > >
> > > > > > g.V().has('person','age',gt(29)).
> > > > > >   mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > >
> > > > > > which is more readable though it does make it harder for
> providers
> > to
> > > > > > optimize upsert operations if they could make use of the has() as
> > > part
> > > > of
> > > > > > that. I think I like has() as part of the main query rather than
> > in a
> > > > > by()
> > > > > > - just thinking out loud.
> > > > > >
> > > > > >
> > > > > Another potential option here would be to allow the second
> parameter
> > of
> > > > > mergeV() accept a traversal.  Not sure how much complexity that
> would
> > > add
> > > > > though
> > > > >
> > > > > g.V().mergeV('person', __.has('person','age',gt(29)).has('state',
> > > 'NY'}.,
> > > > > {'active': true}).
> > > > >
> > > > >
> > > > > >
> > > > > > > 4. I think your suggestion points to one of the troubles
> Gremlin
> > > has
> > > > > > which
> > > > > > > we see with "algorithms" - extending the language with new
> steps
> > > that
> > > > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > > > shortestPath() step) pollutes the core language a bit, hence my
> > > > > > > generalization of "merging" in my link above which fits into
> the
> > > core
> > > > > > > Gremlin language style. There is a bigger picture where we are
> > > > missing
> > > > > > > something in Gremlin that lets us extend the language in ways
> > that
> > > > let
> > > > > us
> > > > > > > easily introduce new steps that aren't for general purpose.
> This
> > > > issue
> > > > > is
> > > > > > > discussed in terms of "algorithms" here:
> > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I
> think
> > > we
> > > > > > could
> > > > > > > see how there might be some "mutation" extension steps that
> would
> > > > cover
> > > > > > > your suggested API, plus batch operations, etc. We need a way
> to
> > > add
> > > > > > > "sugar" without it interfering with the consistency of the
> core.
> > > > > > Obviously
> > > > > > > this is a bigger issue but perhaps important to solve to
> > implement
> > > > > steps
> > > > > > in
> > > > > > > the fashion you describe.
> > > > > > >
> > > > > > > I agree that the "algorithm" steps do seem a bit odd in the
> core
> > > > > language
> > > > > > > but I think the need is there.  I'd be interested in furthering
> > > this
> > > > > > > discussion but I think these "pattern" steps may or may not be
> > the
> > > > same
> > > > > > as
> > > > > > > the algorithm steps.  I'll have to think on that.
> > > > > >
> > > > > >
> > > > > >
> > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> specify
> > > what
> > > > > > > element type the first Map argument should be applied to? what
> > > about
> > > > > > > mergeVP (i.e. vertex property as it too is an element) ? That's
> > > > tricky
> > > > > > but
> > > > > > > I don't think we should miss that. Perhaps merge() could be a
> > > > "complex
> > > > > > > modulator"?? that's a new concept of course, but you would do
> > > > > > g.V().merge()
> > > > > > > and the label and first Map would fold to VertexStartStep (i.e.
> > > V())
> > > > > for
> > > > > > > the lookup and then a MergeStep would follow - thus a "complex
> > > > > modulator"
> > > > > > > as it does more than just change the behavior of the previous
> > step
> > > -
> > > > it
> > > > > > > also adds its own. I suppose it could also add has() steps
> > followed
> > > > by
> > > > > > the
> > > > > > > MergeStep and then the has() operations would fold in normally
> as
> > > > they
> > > > > do
> > > > > > > today. In this way, we can simplify to just one single
> > > > > > > merge(String,Map,Map). ??
> > > > > > >
> > > > > > > I agree that we should also think about how to include
> properties
> > > in
> > > > > this
> > > > > > > merge construct.  The reason I was thinking about mergeV() and
> > > > mergeE()
> > > > > > is
> > > > > > > that it follows the same pattern as the already well understood
> > > > > > > addV()/addE() steps.  I am a bit wary of trying to generalize
> > this
> > > > down
> > > > > > to
> > > > > > > a single merge() step as these sorts of complex overloads make
> it
> > > > hard
> > > > > to
> > > > > > > figure out which Gremlin step you should use for a particular
> > > pattern
> > > > > > (e.g.
> > > > > > > upsert vertex or upsert edge).  One thing that I think
> customers
> > > find
> > > > > > > useful about the addV/addE step is that they are very
> > discoverable,
> > > > the
> > > > > > > name tells me what functionality to expect from that step.
> > > > > > >
> > > > > >
> > > > > > I'm starting to agree with the idea that we have to do something
> > like
> > > > > > mergeV()/E() as it wouldn't quite work as a start step otherwise.
> > We
> > > > > > couldn't do:
> > > > > >
> > > > > > g.merge()
> > > > > >
> > > > > > as we wouldnt know what sort of Element it applied to. If so, I
> > > wonder
> > > > if
> > > > > > it would be better to preserve "merge" for my more general
> use-case
> > > and
> > > > > > prefer upsertV()/E()?
> > > > > >
> > > > > > Also, perhaps mergeVP() doesn't need to exist as perhaps it is an
> > > > > uncommon
> > > > > > case (compared to V/E()) and we don't really have addVP() as an
> > > > analogous
> > > > > > step. Perhaps existing coalesce() patterns and/or my more general
> > > > purpose
> > > > > > merge() step would satisfy those situations??
> > > > > >
> > > > >
> > > > > I think that a mergeVP() type step may require a different pattern
> > > since
> > > > it
> > > > > would really need to accept a map of values otherwise it would
> > > > essentially
> > > > > perform the same function as property()
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > > 6. One thing neither my approach nor yours seems to do is tell
> > the
> > > > user
> > > > > > if
> > > > > > > they created something or updated something - that's another
> > thing
> > > > I've
> > > > > > > seen users want to have in get-or-create. Here again we go
> deeper
> > > > into
> > > > > a
> > > > > > > less general step specification as alluded to in 4, but a
> merge()
> > > > step
> > > > > as
> > > > > > > proposed in 5, might return [Element,boolean] so as to provide
> an
> > > > > > indicator
> > > > > > > of creation?
> > > > > > >
> > > > > > > Hmm, I'll have to think about that.  How would returning
> multiple
> > > > > values
> > > > > > > work if we want to chain these together. e.g. Add a vertex
> > between
> > > > two
> > > > > > > edges but make sure the vertices exist?
> > > > > > >
> > > > > >
> > > > >
> > > > > Added a thought on how to accomplish this above that I'd like to
> get
> > > > > thoughts on.
> > > > >
> > > > >
> > > > > > >
> > > > > > > 7. You were just introducing your ideas here, so perhaps you
> > > haven't
> > > > > > gotten
> > > > > > > this far yet, but a shortcoming to doing merge(String,Map,Map)
> is
> > > > that
> > > > > it
> > > > > > > leaves open no opportunity to stream a List of Maps to a
> merge()
> > > for
> > > > a
> > > > > > form
> > > > > > > of batch loading which is mighty common and one of the
> variations
> > > of
> > > > > the
> > > > > > > coalesce() pattern that I alluded to at the start of all this.
> I
> > > > think
> > > > > > that
> > > > > > > we would want to be sure that we left open the option to do
> that
> > > > > somehow.
> > > > > > > 8. If we had a general purpose merge() step I wonder if it
> makes
> > > > > > developing
> > > > > > > the API as you suggested easier to do?
> > > > > > >
> > > > > > > Hmm, let me think about that one.
> > > > > > >
> > > > > > >
> > > > > > I will add an item 8 to think about which I didn't mention
> before:
> > > > > >
> > > > > > 8. The signature you suggested for mergeE() is:
> > > > > >
> > > > > > mergeE(String, Map, Map)
> > > > > >      String - The edge label
> > > > > >      Map (first) - The properties to match existing edge on
> > > > > >      Map (second) - Any additional properties to set if a new
> edge
> > is
> > > > > > created (optional)
> > > > > >
> > > > > > but does that exactly answer the need? Typically the idea is to
> > > detect
> > > > an
> > > > > > edge between two vertices not globally in the graph by way of
> some
> > > > > > properties. This signature doesn't seem to allow for that as it
> > > doesn't
> > > > > > allow specification of the vertices to test against. Perhaps the
> > > answer
> > > > > is
> > > > > > to use from() and to() modulators?
> > > > > >
> > > > > > g.mergeE('knows', {'weight':0.5}).
> > > > > >   from(V(1)).to(V(2))
> > > > > > g.V().has('person','name','marko').
> > > > > >   mergeE( 'knows', {'weight':0.5}).
> > > > > >     to(V(2))
> > > > > > g.V().has('person','name','marko').
> > > > > >   mergeE( 'self', {'weight':0.5})
> > > > > >
> > > > > > That seems to work?
> > > > > >
> > > > >
> > > > > I think that is an elegant solution to that problem.  I like it and
> > it
> > > > > keeps in line with the way that addE() works.
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <
> > > > spmallette@gmail.com
> > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I like our coalesce() pattern but it is verbose and over time
> > it
> > > > has
> > > > > > gone
> > > > > > > > from a simple pattern to one with numerous variations for all
> > > > manner
> > > > > of
> > > > > > > > different sorts of merge-like operations. As such, I do think
> > we
> > > > > should
> > > > > > > > introduce something to cover this pattern.
> > > > > > > >
> > > > > > > > I like that you used the word "merge" in your description of
> > this
> > > > as
> > > > > it
> > > > > > > is
> > > > > > > > the word I've liked using. You might want to give a look at
> my
> > > > > proposed
> > > > > > > > merge() step from earlier in the year:
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> > > > > > > >
> > > > > > > > I'm just going to dump thoughts as they come regarding what
> you
> > > > > wrote:
> > > > > > > >
> > > > > > > > 1. How would multi/meta-properties fit into the API you've
> > > > proposed?
> > > > > > > > 2. How would users set the T.id on creation? would that T.id
> > just
> > > > be
> > > > > a
> > > > > > > key
> > > > > > > > in the first Map argument?
> > > > > > > > 3. I do like the general idea of a match on multiple
> properties
> > > for
> > > > > the
> > > > > > > > first argument as a convenience but wonder about the
> > specificity
> > > of
> > > > > > this
> > > > > > > > API a bit as it focuses heavily on equality - I suppose
> that's
> > > most
> > > > > > cases
> > > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > > > 4. I think your suggestion points to one of the troubles
> > Gremlin
> > > > has
> > > > > > > which
> > > > > > > > we see with "algorithms" - extending the language with new
> > steps
> > > > that
> > > > > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > > > > shortestPath() step) pollutes the core language a bit, hence
> my
> > > > > > > > generalization of "merging" in my link above which fits into
> > the
> > > > core
> > > > > > > > Gremlin language style. There is a bigger picture where we
> are
> > > > > missing
> > > > > > > > something in Gremlin that lets us extend the language in ways
> > > that
> > > > > let
> > > > > > us
> > > > > > > > easily introduce new steps that aren't for general purpose.
> > This
> > > > > issue
> > > > > > is
> > > > > > > > discussed in terms of "algorithms" here:
> > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I
> > think
> > > > we
> > > > > > > could
> > > > > > > > see how there might be some "mutation" extension steps that
> > would
> > > > > cover
> > > > > > > > your suggested API, plus batch operations, etc. We need a way
> > to
> > > > add
> > > > > > > > "sugar" without it interfering with the consistency of the
> > core.
> > > > > > > Obviously
> > > > > > > > this is a bigger issue but perhaps important to solve to
> > > implement
> > > > > > steps
> > > > > > > in
> > > > > > > > the fashion you describe.
> > > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> > specify
> > > > what
> > > > > > > > element type the first Map argument should be applied to?
> what
> > > > about
> > > > > > > > mergeVP (i.e. vertex property as it too is an element) ?
> That's
> > > > > tricky
> > > > > > > but
> > > > > > > > I don't think we should miss that. Perhaps merge() could be a
> > > > > "complex
> > > > > > > > modulator"?? that's a new concept of course, but you would do
> > > > > > > g.V().merge()
> > > > > > > > and the label and first Map would fold to VertexStartStep
> (i.e.
> > > > V())
> > > > > > for
> > > > > > > > the lookup and then a MergeStep would follow - thus a
> "complex
> > > > > > modulator"
> > > > > > > > as it does more than just change the behavior of the previous
> > > step
> > > > -
> > > > > it
> > > > > > > > also adds its own. I suppose it could also add has() steps
> > > followed
> > > > > by
> > > > > > > the
> > > > > > > > MergeStep and then the has() operations would fold in
> normally
> > as
> > > > > they
> > > > > > do
> > > > > > > > today. In this way, we can simplify to just one single
> > > > > > > > merge(String,Map,Map). ??
> > > > > > > > 6. One thing neither my approach nor yours seems to do is
> tell
> > > the
> > > > > user
> > > > > > > if
> > > > > > > > they created something or updated something - that's another
> > > thing
> > > > > I've
> > > > > > > > seen users want to have in get-or-create. Here again we go
> > deeper
> > > > > into
> > > > > > a
> > > > > > > > less general step specification as alluded to in 4, but a
> > merge()
> > > > > step
> > > > > > as
> > > > > > > > proposed in 5, might return [Element,boolean] so as to
> provide
> > an
> > > > > > > indicator
> > > > > > > > of creation?
> > > > > > > > 7. You were just introducing your ideas here, so perhaps you
> > > > haven't
> > > > > > > gotten
> > > > > > > > this far yet, but a shortcoming to doing
> merge(String,Map,Map)
> > is
> > > > > that
> > > > > > it
> > > > > > > > leaves open no opportunity to stream a List of Maps to a
> > merge()
> > > > for
> > > > > a
> > > > > > > form
> > > > > > > > of batch loading which is mighty common and one of the
> > variations
> > > > of
> > > > > > the
> > > > > > > > coalesce() pattern that I alluded to at the start of all
> this.
> > I
> > > > > think
> > > > > > > that
> > > > > > > > we would want to be sure that we left open the option to do
> > that
> > > > > > somehow.
> > > > > > > > 8. If we had a general purpose merge() step I wonder if it
> > makes
> > > > > > > developing
> > > > > > > > the API as you suggested easier to do?
> > > > > > > >
> > > > > > > > I think I'd like to solve the problems you describe in your
> > post
> > > as
> > > > > > well
> > > > > > > as
> > > > > > > > the ones in mine. There is some relation there, but gaps as
> > well.
> > > > > With
> > > > > > > more
> > > > > > > > discussion here we can figure something out.
> > > > > > > >
> > > > > > > > Thanks for starting this talk - good one!
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <
> > > > > dave@bechberger.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > I've had a few on and off discussions with a few people
> here,
> > > so
> > > > I
> > > > > > > wanted
> > > > > > > > > to send this out to everyone for feedback.
> > > > > > > > >
> > > > > > > > > What are people's thoughts on creating a new set of steps
> > that
> > > > > codify
> > > > > > > > > common Gremlin best practices?
> > > > > > > > >
> > > > > > > > > I think there are several common Gremlin patterns where
> users
> > > > would
> > > > > > > > benefit
> > > > > > > > > from the additional guidance that these codified steps
> > > represent.
> > > > > > The
> > > > > > > > > first one I would recommend though is codifying the element
> > > > > existence
> > > > > > > > > pattern into a single Gremlin step, something like:
> > > > > > > > >
> > > > > > > > > mergeV(String, Map, Map)
> > > > > > > > >      String - The vertex label
> > > > > > > > >      Map (first) - The properties to match existing
> vertices
> > on
> > > > > > > > >      Map (second) - Any additional properties to set if a
> new
> > > > > vertex
> > > > > > is
> > > > > > > > > created (optional)
> > > > > > > > > mergeE(String, Map, Map)
> > > > > > > > >      String - The edge label
> > > > > > > > >      Map (first) - The properties to match existing edge on
> > > > > > > > >      Map (second) - Any additional properties to set if a
> new
> > > > edge
> > > > > is
> > > > > > > > > created (optional)
> > > > > > > > >
> > > > > > > > > In each of these cases these steps would perform the same
> > > upsert
> > > > > > > > > functionality as the element existence pattern.
> > > > > > > > >
> > > > > > > > > Example:
> > > > > > > > >
> > > > > > > > > g.V().has('person','name','stephen').
> > > > > > > > >            fold().
> > > > > > > > >            coalesce(unfold(),
> > > > > > > > >                     addV('person').
> > > > > > > > >                       property('name','stephen').
> > > > > > > > >                       property('age',34))
> > > > > > > > >
> > > > > > > > > would become:
> > > > > > > > >
> > > > > > > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > > > > > > > >
> > > > > > > > > I think that this change would be a good addition to the
> > > language
> > > > > for
> > > > > > > > > several reasons:
> > > > > > > > >
> > > > > > > > > * This codifies the best practice for a specific
> > action/recipe,
> > > > > which
> > > > > > > > > reduces the chance that someone uses the pattern
> incorrectly
> > > > > > > > > * Most complex Gremlin traversals are verbose.  Reducing
> the
> > > > amount
> > > > > > of
> > > > > > > > code
> > > > > > > > > that needs to be written and maintained allows for a better
> > > > > developer
> > > > > > > > > experience.
> > > > > > > > > * It will lower the bar of entry for a developer by making
> > > these
> > > > > > > actions
> > > > > > > > > more discoverable.  The more we can help bring these
> patterns
> > > to
> > > > > the
> > > > > > > > > forefront of the language via these pattern/meta steps the
> > more
> > > > we
> > > > > > > guide
> > > > > > > > > users towards writing better Gremlin faster
> > > > > > > > > * This allows DB vendors to optimize for this pattern
> > > > > > > > >
> > > > > > > > > I know that this would likely be the first step in Gremlin
> > that
> > > > > > > codifies
> > > > > > > > a
> > > > > > > > > pattern, so I'd like to get other's thoughts on this?
> > > > > > > > >
> > > > > > > > > Dave
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by David Bechberger <da...@bechberger.com>.
For #1 and 3 I agree.  For #2 I thought that we agreed that the properties
within the upsert were going to be for creation only and that any
properties that needed to be updated would be done via the standard
property() syntax after the upsert.  e.g.

g.upsertV('person', [name: 'marko'],
                    [age: 29]).property('lang', 'java')

If it does not exist then this would create a vertex with the 'name',
'age', and 'lang' properties.  If it does exist then it would just update
the 'lang' property.

Dave

On Tue, Dec 8, 2020 at 5:57 AM Stephen Mallette <sp...@gmail.com>
wrote:

> I started a expanded this discussion to gremlin-users for a wider audience
> and the thread is starting to grow:
>
> https://groups.google.com/g/gremlin-users/c/QBmiOUkA0iI/m/pj5Ukiq6AAAJ
>
> I guess we'll need to summarize that discussion back here now....
>
> I did have some more thoughts to hang out there and figured that I wouldn't
> convolute the discussion on gremlin-users with it so I will continue the
> discussion here.
>
> 1, The very first couple of examples seem wrong (or at least not best
> demonstrating the usage):
>
> g.upsertV('person', [name: 'marko'],
>                     [name: 'marko', age: 29])
> g.upsertV('person', [(T.id): 1],
>                     [(T.id): 1, name: 'Marko'])
>
> should instead be:
>
> g.upsertV('person', [name: 'marko'],
>                     [age: 29])
> g.upsertV('person', [(T.id): 1],
>                     [name: 'Marko'])
>
> 2. I can't recall if we made this clear anywhere but in situations where we
> "get" rather than "create" do the additional properties act in an update
> fashion to the element that was found? I think I've been working on the
> assumption that it would, though perhaps that is not always desirable?
>
> 3. We really never settled up how to deal with multi/meta-properties. That
> story should be clear so that when we document upsert() we include the
> approaches for the fallback patterns that don't meet the 90% of use cases
> we are targeting and I sense that we're saying that meta/multi-properties
> don't fit in that bucket. So with that in mind, I don't think that the
> following works for metaproperties:
>
> g.upsertV('person', [(T.id): 1],
>                     [name:[acl: 'public'])
>
> as it doesn't let us set the value for the "name" just the pairs for the
> meta-properties. I guess a user would have to fall back to:
>
> g.upsertV('person', [(T.id): 1]).
>   property('name','marko','acl','public')
>
> // or use the additional properties syntax
> g.upsertV('person', [(T.id): 1],[name:'marko']).
>   properties('name').property('acl','public')
>
> // or if there were multi-properties then maybe...
> g.upsertV('person', [(T.id): 1],[name:'marko']).
>   properties('name').hasValue('marko').property('acl','public')
>
> As for multi-properties, I dont think we should assume that a List object
> should be interpreted by Gremlin as a multi-property. Perhaps we just rely
> on the underlying graph to properly deal with that given a schema or the
> user falls back to:
>
> // this ends up as however the graph deals with a List object
> g.upsertV('person', [(T.id): 1], [lang: ['java', 'scala', 'java'])
>
> // this is explicit
> g.upsertV('person', [(T.id): 1]).
>   property(list, 'lang', 'java').
>   property(list, 'lang, 'scala').
>   property(list, 'lang', 'java')
>
> If that makes sense to everyone, I will update the gist.
>
>
>
>
>
>
>
> On Tue, Oct 27, 2020 at 11:51 PM David Bechberger <da...@bechberger.com>
> wrote:
>
> > Hello Stephen,
> >
> > Thanks for making that gist, its much easier to follow along with the
> > proposed syntax there.  To your specific comments:
> >
> > #1 - My only worry with the term upsert is that it is not as widely a
> used
> > term for this sort of pattern as "Merge" (e.g. SQL, Cypher).  However I
> > don't have a strong opinion on this, so I am fine with either.
> > #2 - My only real objective here is to make sure that we make the 80-90%
> > case easy and straightforward.  I think that having the fallback option
> of
> > using the current syntax for any complicated edge cases should be
> > considered here as well. I'd appreciate your thoughts here as these are
> > good points you bring up that definitely fall into the 80-90% use case.
> > #3 - Those points make sense to me, not sure I have anything further to
> add
> > #4 - I don't think I would expect to have the values extracted from the
> > traversal filters but I'd be interested in other opinions on that.
> >
> > Thanks,
> > Dave
> >
> > On Thu, Oct 15, 2020 at 5:30 AM Stephen Mallette <sp...@gmail.com>
> > wrote:
> >
> > > It's been a couple weeks since we moved this thread so I went back
> > through
> > > it and refreshed my mind with what's there. Dave, nice job putting all
> > > those examples together. I've recollected them all together and have
> just
> > > been reviewing them in a more formatted style with proper Groovy syntax
> > > than what can be accomplished in this mailing list. Anyone who cares to
> > > review in that form can do so here (i've pasted its markdown contents
> at
> > > the bottom of this post as well):
> > >
> > > https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
> > >
> > > In this light, I have the following comments and thoughts:
> > >
> > > 1. I feel more inclined toward saving "merge" for a more generalized
> step
> > > of that name and have thus renamed the examples to use "upsert".
> > >
> > > 2. I still don't quite feel comfortable with meta/multi-properties
> > > examples. Perhaps you could explain further as it's possible I'm not
> > > thinking things through properly. For meta-properties the example is:
> > >
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > >
> > > So I see that "name" takes a Map here, but how does Gremlin know if
> that
> > > means "meta-property" or "Map value" and if the former, then how do you
> > set
> > > the value of "name"?  My question is similar to the multi-property
> > examples
> > > - using this one:
> > >
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1,lang: ['java', 'scala'])
> > >
> > > How does Gremlin know if the user means that "lang" should be a
> > > "multi-property" with Cardinality.list (or Cardinality.set for that
> > matter)
> > > or a value of Cardinality.single with a List/Array value? I can perhaps
> > > suggest some solutions here but wanted to make sure I wasn't just
> > > misunderstanding something.
> > >
> > > 3. The "upsert with stream" example is:
> > >
> > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > >          [(id): 2, (label): 'person', name: 'josh']).
> > >   upsertV(values(label), valueMap(id), valueMap())
> > >
> > > That would establish an API of upsertV(Traversal, Traversal, Traversal)
> > > which would enable your final thought of your last post with:
> > >
> > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> 'NY'),
> > >                         [active: true])
> > >
> > > The open-ended nature of that is neat but comes with some complexity
> with
> > > steps that traditionally take an "open" Traversal (and this step now
> has
> > up
> > > to three of them) - perhaps, that's our own fault in some ways. For the
> > > "stream" concept, values()/valueMap() won't immediately work that way
> as
> > > they only apply to Element...given today's Gremlin semantics, I think
> > we'd
> > > re-write that as:
> > >
> > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > >          [(id): 2, (label): 'person', name: 'josh']).
> > >   upsertV(select(label), select(id), select(id,label,'name'))
> > >
> > > Though, even that doesn't quite work because you can't select(T) right
> > now
> > > which means you'd need to go with:
> > >
> > > g.inject([id: 1, label: 'person', name: 'marko'],
> > >          [id: 2, label: 'person', name: 'josh']).
> > >   upsertV(select('label'), select('id'), select('id','label','name'))
> > >
> > > not terrible in a way and perhaps we could fix select(T) - can't
> remember
> > > why that isn't allowed except that select() operates over multiple
> scopes
> > > and perhaps trying T through those scopes causes trouble.
> > >
> > > 4. Continuing with the use of Traversal as arguments, let's go back to:
> > >
> > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> 'NY'),
> > >                         [active: true])
> > >
> > > I originally had it in my mind to prefer this approach over a Map for
> the
> > > search as it provides a great deal of flexibility and fits the common
> > > filtering patterns that users follow really well. I suppose the
> question
> > is
> > > what to do in the situation of "create". Would we extract all the
> strict
> > > equality filter values from the initial has(), thus, in the example,
> > > "state" but not "age", and create a vertex that has "state=NY,
> > > active=true"? That is possible with how things are designed in Gremlin
> > > today I think.
> > >
> > > This syntax does create some conflict with the subject of 3 and streams
> > > because the traverser in the streams case is the incoming Map but for
> > this
> > > context it's meant as a filter of V(). Getting too whacky?
> > >
> > > 5. I like the idea of using a side-effect to capture whether the
> element
> > > was created or not. that makes sense. I think that can work.
> > >
> > > ==============================
> > > = gist contents
> > > ==============================
> > >
> > > # API
> > >
> > > ```java
> > > upsertV(String label, Map matchOrCreateProperties)
> > > upsertV(String label, Map matchOrCreateProperties, Map
> > > additionalProperties)
> > > upsertV(Traversal label, Traversal matchOrCreateProperties)
> > > upsertV(Traversal label, Traversal matchOrCreateProperties, Traversal
> > > additionalProperties)
> > > upsertV(...).
> > >   with(WithOptions.sideEffectLabel, 'a')
> > >
> > > upsertE(String label, Map matchOrCreateProperties)
> > > upsertE(Traversal label, Traversal matchOrCreateProperties)
> > > upsertE(String label, Map matchOrCreateProperties, Map
> > > additionalProperties)
> > > upsertE(Traversal label, Traversal matchOrCreateProperties, Traversal
> > > additionalProperties)
> > > upsertE(...).
> > >   from(Vertex incidentOut).to(Vertex incidentIn)
> > >   with(WithOptions.sideEffectLabel, 'a')
> > > ```
> > >
> > > # Examples
> > >
> > > ## upsert
> > >
> > > ```groovy
> > > g.upsertV('person', [name: 'marko'],
> > >                     [name: 'marko', age: 29])
> > > ```
> > >
> > > ## upsert with id
> > >
> > > ```groovy
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1, name: 'Marko'])
> > > ```
> > >
> > > ## upsert with Meta Properties
> > >
> > > ```groovy
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > > ```
> > >
> > > ## upsert with Multi Properties
> > >
> > > ```groovy
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1,lang: ['java', 'scala'])
> > > ```
> > >
> > > ## upsert with Single Cardinality
> > >
> > > ```groovy
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1, lang: 'java')
> > > ```
> > >
> > > ## upsert with List Cardinality
> > >
> > > ```groovy
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1, lang: ['java', 'scala', 'java'])
> > > ```
> > >
> > > ## upsert with Set Cardinality
> > >
> > > ```groovy
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1, lang: ['java','scala'])
> > > ```
> > >
> > > ## upsert with List Cardinality - and add value to list
> > >
> > > ```groovy
> > > g.upsertV('person', [(T.id): 1],
> > >                     [(T.id): 1, lang: ['java','scala','java']).
> > >   property(Cardinality.list, 'lang', 'java')
> > > ```
> > >
> > > ## upsert with stream
> > >
> > > ```groovy
> > > // doesn't work with today's Gremlin semantics
> > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > >          [(id): 2, (label): 'person', name: 'josh']).
> > >   upsertV(values(label), valueMap(id), valueMap())
> > >
> > > // the following would be more in line with what's possible with
> existing
> > > Gremlin semantics:
> > > g.inject([id: 1, label: 'person', name: 'marko'],
> > >          [id: 2, label: 'person', name: 'josh']).
> > >   upsertV(select('label'), select('id'), select('id','label','name'))
> > > ```
> > >
> > > ## upsert with reporting added or updated side effect
> > >
> > > ```groovy
> > > g.upsertV('person', [name: 'marko'],
> > >                     [name: 'marko', age: 29]).
> > >     with(WithOptions.sideEffectLabel, 'a').
> > >   project('vertex', 'added').
> > >     by().
> > >     by(cap('a'))
> > > ```
> > >
> > > ## upsert Edge assuming a self-relation of "marko"
> > >
> > > ```groovy
> > > g.V().has('person','name','marko').
> > >   upsertE( 'self', [weight:0.5])
> > > ```
> > >
> > > ## upsert Edge with incident vertex checks
> > >
> > > ```groovy
> > > g.upsertE('knows', [weight:0.5]).
> > >    from(V(1)).to(V(2))
> > > g.V().has('person','name','marko').
> > >   upsertE( 'knows', [weight:0.5]).
> > >     to(V(2))
> > > ```
> > >
> > > ## upsert using a Traversal for the match
> > >
> > > ```groovy
> > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> 'NY'),
> > >                         [active: true])
> > > ```
> > >
> > >
> > >
> > >
> > > On Mon, Sep 28, 2020 at 6:49 PM David Bechberger <da...@bechberger.com>
> > > wrote:
> > >
> > > > So, I've been doing some additional thinking about the ways that this
> > > could
> > > > work based on the comments below and have put my comments inline.
> > > >
> > > > Dave
> > > >
> > > > On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <
> spmallette@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > I added some thoughts inline below:
> > > > >
> > > > > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <
> > dave@bechberger.com>
> > > > > wrote:
> > > > >
> > > > > > Thanks for the detailed comments Stephen.  I have addressed them
> > > inline
> > > > > > below.
> > > > > >
> > > > > > I did read the proposal from earlier and I think that we are in
> > close
> > > > > > agreement with what we are trying to accomplish.  I also fully
> > > support
> > > > > > Josh's comment on providing a mechanism for submitting a map of
> > > > > properties
> > > > > > as manually unrolling this all right now leads to a lot of
> > potential
> > > > for
> > > > > > error and a long messy traversal.
> > > > > >
> > > > > > I'm looking forward to this discussion on how to merge these two
> > > > > proposals.
> > > > > >
> > > > > >
> > > > > > 1. How would multi/meta-properties fit into the API you've
> > proposed?
> > > > > >
> > > > > > My first thought here is that multi-properties would be
> represented
> > > as
> > > > > > lists in the map, e.g.
> > > > > >
> > > > > > {names: ['Dave', 'David']}
> > > > > >
> > > > > > and meta-properties would be represented as maps in the maps,
> e.g.
> > > > > >
> > > > > > {name: {first: 'Dave', last: 'Bechberger'}}
> > > > > >
> > > > > > I can't say I've thought through all the implications of this
> > though
> > > so
> > > > > it
> > > > > > is an area we would need to explore.
> > > > > >
> > > > >
> > > > > The first implication that comes to mind is that it makes the
> > > assumption
> > > > > the user wants multi/meta-properties as opposed to a single
> > cardinality
> > > > of
> > > > > List in the first case and a Map as a property value in the second
> > > case.
> > > > I
> > > > > suppose that graphs with a schema could resolve those assumptions
> but
> > > > > graphs that are schemaless would have a problem. The issue could be
> > > > > resolved by specialized configuration of "g" or per merge() step
> > using
> > > a
> > > > > with() modulator I suppose but that goes into a yet another level
> of
> > > > > implications to consider. I've often wondered if the start point
> for
> > > > > getting types/schema into TP3 without a full rewrite would be in
> this
> > > > form
> > > > > where Gremlin would be given hints as to what to expect as to the
> > types
> > > > of
> > > > > data it might encounter while traversing. Anyway, I'd be hesitant
> to
> > go
> > > > > down paths that don't account for multi/metaproperties well.  They
> > are
> > > > > first class citizens in TP3 (with those hoping for extension of at
> > > least
> > > > > multiproperties to edges) and while I find them a constant
> annoyance
> > > for
> > > > so
> > > > > many reasons, we're kinda stuck with them.
> > > > >
> > > >
> > > > I agree we need to account for multi/meta properties as 1st class
> > > > citizens.  This is my current thinking on the syntax for the
> situations
> > > we
> > > > have laid out so far:
> > > >
> > > > *Merge*
> > > > g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age': 29})
> > > >
> > > >
> > > > *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name':
> 'Marko'})
> > > >
> > > >
> > > > *Merge with Meta Properties* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > 'name':
> > > > {'first': Marko', 'last': 'R'})
> > > >
> > > >
> > > > *Merge with Multi Properties * g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'lang': ['java', 'scala'])
> > > >
> > > >
> > > > *Merge with Single Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'lang': 'java')
> > > >
> > > >
> > > > *Merge with List Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > 'lang':
> > > > ['java', 'scala', 'java'])
> > > >
> > > >
> > > > *Merge with Set Cardinality * g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > 'lang':
> > > > ['java', 'scala'])
> > > >
> > > > Since in a mergeV() scenario we are only ever adding whatever values
> > are
> > > > passed in there would be no need to specify the cardinality of the
> > > property
> > > > being added.  If they wanted to add a value to an existing property
> > then
> > > > the current property() method would still be available on the output
> > > > traversal. e.g.
> > > > g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java', 'scala',
> > > > 'java']).property(Cardinality.list, 'lang, 'java')
> > > >
> > > >
> > > > *Merge with stream * g.inject([{id: 1, label: 'person', 'name':
> > 'marko'},
> > > > {id: 2, label: 'person', 'name': 'josh'}]).
> > > >   mergeV(values('label'), valueMap('id'), valueMap())
> > > >
> > > >
> > > > *Merge with reporting added or updated side effect* g.mergeV('name',
> > > > {'name': 'marko'}, {'name': 'marko', 'age': 29}).
> > > >   with(WithOptions.sideEffectLabel, 'a').
> > > >   project('vertex', 'added').
> > > >     by().
> > > >     by(cap('a'))
> > > >
> > > >
> > > >
> > > > >
> > > > >
> > > > > > 2. How would users set the T.id on creation? would that T.id just
> > be
> > > a
> > > > > key
> > > > > > in the first Map argument?
> > > > > >
> > > > > > Yes, I was thinking that T.id would be the key name, e.g.:
> > > > > >
> > > > > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
> > > > > >
> > > > >
> > > > > ok - I can't say I see a problem with that atm.
> > > > >
> > > > >
> > > > > > 3. I do like the general idea of a match on multiple properties
> for
> > > the
> > > > > > first argument as a convenience but wonder about the specificity
> of
> > > > this
> > > > > > API a bit as it focuses heavily on equality - I suppose that's
> most
> > > > cases
> > > > > > for get-or-create, so perhaps that's ok.
> > > > > >
> > > > > > In most cases I've seen use exact matches on the vertex or
> edge.  I
> > > > think
> > > > > > it might be best to keep this straightforward as any complex edge
> > > cases
> > > > > > still can perform the same functionality using the coalesce()
> > > pattern.
> > > > > >
> > > > >
> > > > > I played around with the idea to use modulators to apply additional
> > > > > constraints:
> > > > >
> > > > > g.mergeV('person', {'state':'NY'}, {'active': true}).
> > > > >     by(has('age', gt(29))
> > > > >
> > > > > I suppose that's neat but maybe not...we could easily get the same
> > > thing
> > > > > from:
> > > > >
> > > > > g.V().has('person','age',gt(29)).
> > > > >   mergeV('person', {'state':'NY'}, {'active': true}).
> > > > >
> > > > > which is more readable though it does make it harder for providers
> to
> > > > > optimize upsert operations if they could make use of the has() as
> > part
> > > of
> > > > > that. I think I like has() as part of the main query rather than
> in a
> > > > by()
> > > > > - just thinking out loud.
> > > > >
> > > > >
> > > > Another potential option here would be to allow the second parameter
> of
> > > > mergeV() accept a traversal.  Not sure how much complexity that would
> > add
> > > > though
> > > >
> > > > g.V().mergeV('person', __.has('person','age',gt(29)).has('state',
> > 'NY'}.,
> > > > {'active': true}).
> > > >
> > > >
> > > > >
> > > > > > 4. I think your suggestion points to one of the troubles Gremlin
> > has
> > > > > which
> > > > > > we see with "algorithms" - extending the language with new steps
> > that
> > > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > > shortestPath() step) pollutes the core language a bit, hence my
> > > > > > generalization of "merging" in my link above which fits into the
> > core
> > > > > > Gremlin language style. There is a bigger picture where we are
> > > missing
> > > > > > something in Gremlin that lets us extend the language in ways
> that
> > > let
> > > > us
> > > > > > easily introduce new steps that aren't for general purpose. This
> > > issue
> > > > is
> > > > > > discussed in terms of "algorithms" here:
> > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think
> > we
> > > > > could
> > > > > > see how there might be some "mutation" extension steps that would
> > > cover
> > > > > > your suggested API, plus batch operations, etc. We need a way to
> > add
> > > > > > "sugar" without it interfering with the consistency of the core.
> > > > > Obviously
> > > > > > this is a bigger issue but perhaps important to solve to
> implement
> > > > steps
> > > > > in
> > > > > > the fashion you describe.
> > > > > >
> > > > > > I agree that the "algorithm" steps do seem a bit odd in the core
> > > > language
> > > > > > but I think the need is there.  I'd be interested in furthering
> > this
> > > > > > discussion but I think these "pattern" steps may or may not be
> the
> > > same
> > > > > as
> > > > > > the algorithm steps.  I'll have to think on that.
> > > > >
> > > > >
> > > > >
> > > > > > 5. I suppose that the reason for mergeE and mergeV is to specify
> > what
> > > > > > element type the first Map argument should be applied to? what
> > about
> > > > > > mergeVP (i.e. vertex property as it too is an element) ? That's
> > > tricky
> > > > > but
> > > > > > I don't think we should miss that. Perhaps merge() could be a
> > > "complex
> > > > > > modulator"?? that's a new concept of course, but you would do
> > > > > g.V().merge()
> > > > > > and the label and first Map would fold to VertexStartStep (i.e.
> > V())
> > > > for
> > > > > > the lookup and then a MergeStep would follow - thus a "complex
> > > > modulator"
> > > > > > as it does more than just change the behavior of the previous
> step
> > -
> > > it
> > > > > > also adds its own. I suppose it could also add has() steps
> followed
> > > by
> > > > > the
> > > > > > MergeStep and then the has() operations would fold in normally as
> > > they
> > > > do
> > > > > > today. In this way, we can simplify to just one single
> > > > > > merge(String,Map,Map). ??
> > > > > >
> > > > > > I agree that we should also think about how to include properties
> > in
> > > > this
> > > > > > merge construct.  The reason I was thinking about mergeV() and
> > > mergeE()
> > > > > is
> > > > > > that it follows the same pattern as the already well understood
> > > > > > addV()/addE() steps.  I am a bit wary of trying to generalize
> this
> > > down
> > > > > to
> > > > > > a single merge() step as these sorts of complex overloads make it
> > > hard
> > > > to
> > > > > > figure out which Gremlin step you should use for a particular
> > pattern
> > > > > (e.g.
> > > > > > upsert vertex or upsert edge).  One thing that I think customers
> > find
> > > > > > useful about the addV/addE step is that they are very
> discoverable,
> > > the
> > > > > > name tells me what functionality to expect from that step.
> > > > > >
> > > > >
> > > > > I'm starting to agree with the idea that we have to do something
> like
> > > > > mergeV()/E() as it wouldn't quite work as a start step otherwise.
> We
> > > > > couldn't do:
> > > > >
> > > > > g.merge()
> > > > >
> > > > > as we wouldnt know what sort of Element it applied to. If so, I
> > wonder
> > > if
> > > > > it would be better to preserve "merge" for my more general use-case
> > and
> > > > > prefer upsertV()/E()?
> > > > >
> > > > > Also, perhaps mergeVP() doesn't need to exist as perhaps it is an
> > > > uncommon
> > > > > case (compared to V/E()) and we don't really have addVP() as an
> > > analogous
> > > > > step. Perhaps existing coalesce() patterns and/or my more general
> > > purpose
> > > > > merge() step would satisfy those situations??
> > > > >
> > > >
> > > > I think that a mergeVP() type step may require a different pattern
> > since
> > > it
> > > > would really need to accept a map of values otherwise it would
> > > essentially
> > > > perform the same function as property()
> > > >
> > > >
> > > > >
> > > > >
> > > > > > 6. One thing neither my approach nor yours seems to do is tell
> the
> > > user
> > > > > if
> > > > > > they created something or updated something - that's another
> thing
> > > I've
> > > > > > seen users want to have in get-or-create. Here again we go deeper
> > > into
> > > > a
> > > > > > less general step specification as alluded to in 4, but a merge()
> > > step
> > > > as
> > > > > > proposed in 5, might return [Element,boolean] so as to provide an
> > > > > indicator
> > > > > > of creation?
> > > > > >
> > > > > > Hmm, I'll have to think about that.  How would returning multiple
> > > > values
> > > > > > work if we want to chain these together. e.g. Add a vertex
> between
> > > two
> > > > > > edges but make sure the vertices exist?
> > > > > >
> > > > >
> > > >
> > > > Added a thought on how to accomplish this above that I'd like to get
> > > > thoughts on.
> > > >
> > > >
> > > > > >
> > > > > > 7. You were just introducing your ideas here, so perhaps you
> > haven't
> > > > > gotten
> > > > > > this far yet, but a shortcoming to doing merge(String,Map,Map) is
> > > that
> > > > it
> > > > > > leaves open no opportunity to stream a List of Maps to a merge()
> > for
> > > a
> > > > > form
> > > > > > of batch loading which is mighty common and one of the variations
> > of
> > > > the
> > > > > > coalesce() pattern that I alluded to at the start of all this. I
> > > think
> > > > > that
> > > > > > we would want to be sure that we left open the option to do that
> > > > somehow.
> > > > > > 8. If we had a general purpose merge() step I wonder if it makes
> > > > > developing
> > > > > > the API as you suggested easier to do?
> > > > > >
> > > > > > Hmm, let me think about that one.
> > > > > >
> > > > > >
> > > > > I will add an item 8 to think about which I didn't mention before:
> > > > >
> > > > > 8. The signature you suggested for mergeE() is:
> > > > >
> > > > > mergeE(String, Map, Map)
> > > > >      String - The edge label
> > > > >      Map (first) - The properties to match existing edge on
> > > > >      Map (second) - Any additional properties to set if a new edge
> is
> > > > > created (optional)
> > > > >
> > > > > but does that exactly answer the need? Typically the idea is to
> > detect
> > > an
> > > > > edge between two vertices not globally in the graph by way of some
> > > > > properties. This signature doesn't seem to allow for that as it
> > doesn't
> > > > > allow specification of the vertices to test against. Perhaps the
> > answer
> > > > is
> > > > > to use from() and to() modulators?
> > > > >
> > > > > g.mergeE('knows', {'weight':0.5}).
> > > > >   from(V(1)).to(V(2))
> > > > > g.V().has('person','name','marko').
> > > > >   mergeE( 'knows', {'weight':0.5}).
> > > > >     to(V(2))
> > > > > g.V().has('person','name','marko').
> > > > >   mergeE( 'self', {'weight':0.5})
> > > > >
> > > > > That seems to work?
> > > > >
> > > >
> > > > I think that is an elegant solution to that problem.  I like it and
> it
> > > > keeps in line with the way that addE() works.
> > > >
> > > >
> > > > >
> > > > >
> > > > >
> > > > > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <
> > > spmallette@gmail.com
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > I like our coalesce() pattern but it is verbose and over time
> it
> > > has
> > > > > gone
> > > > > > > from a simple pattern to one with numerous variations for all
> > > manner
> > > > of
> > > > > > > different sorts of merge-like operations. As such, I do think
> we
> > > > should
> > > > > > > introduce something to cover this pattern.
> > > > > > >
> > > > > > > I like that you used the word "merge" in your description of
> this
> > > as
> > > > it
> > > > > > is
> > > > > > > the word I've liked using. You might want to give a look at my
> > > > proposed
> > > > > > > merge() step from earlier in the year:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> > > > > > >
> > > > > > > I'm just going to dump thoughts as they come regarding what you
> > > > wrote:
> > > > > > >
> > > > > > > 1. How would multi/meta-properties fit into the API you've
> > > proposed?
> > > > > > > 2. How would users set the T.id on creation? would that T.id
> just
> > > be
> > > > a
> > > > > > key
> > > > > > > in the first Map argument?
> > > > > > > 3. I do like the general idea of a match on multiple properties
> > for
> > > > the
> > > > > > > first argument as a convenience but wonder about the
> specificity
> > of
> > > > > this
> > > > > > > API a bit as it focuses heavily on equality - I suppose that's
> > most
> > > > > cases
> > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > > 4. I think your suggestion points to one of the troubles
> Gremlin
> > > has
> > > > > > which
> > > > > > > we see with "algorithms" - extending the language with new
> steps
> > > that
> > > > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > > > shortestPath() step) pollutes the core language a bit, hence my
> > > > > > > generalization of "merging" in my link above which fits into
> the
> > > core
> > > > > > > Gremlin language style. There is a bigger picture where we are
> > > > missing
> > > > > > > something in Gremlin that lets us extend the language in ways
> > that
> > > > let
> > > > > us
> > > > > > > easily introduce new steps that aren't for general purpose.
> This
> > > > issue
> > > > > is
> > > > > > > discussed in terms of "algorithms" here:
> > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I
> think
> > > we
> > > > > > could
> > > > > > > see how there might be some "mutation" extension steps that
> would
> > > > cover
> > > > > > > your suggested API, plus batch operations, etc. We need a way
> to
> > > add
> > > > > > > "sugar" without it interfering with the consistency of the
> core.
> > > > > > Obviously
> > > > > > > this is a bigger issue but perhaps important to solve to
> > implement
> > > > > steps
> > > > > > in
> > > > > > > the fashion you describe.
> > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> specify
> > > what
> > > > > > > element type the first Map argument should be applied to? what
> > > about
> > > > > > > mergeVP (i.e. vertex property as it too is an element) ? That's
> > > > tricky
> > > > > > but
> > > > > > > I don't think we should miss that. Perhaps merge() could be a
> > > > "complex
> > > > > > > modulator"?? that's a new concept of course, but you would do
> > > > > > g.V().merge()
> > > > > > > and the label and first Map would fold to VertexStartStep (i.e.
> > > V())
> > > > > for
> > > > > > > the lookup and then a MergeStep would follow - thus a "complex
> > > > > modulator"
> > > > > > > as it does more than just change the behavior of the previous
> > step
> > > -
> > > > it
> > > > > > > also adds its own. I suppose it could also add has() steps
> > followed
> > > > by
> > > > > > the
> > > > > > > MergeStep and then the has() operations would fold in normally
> as
> > > > they
> > > > > do
> > > > > > > today. In this way, we can simplify to just one single
> > > > > > > merge(String,Map,Map). ??
> > > > > > > 6. One thing neither my approach nor yours seems to do is tell
> > the
> > > > user
> > > > > > if
> > > > > > > they created something or updated something - that's another
> > thing
> > > > I've
> > > > > > > seen users want to have in get-or-create. Here again we go
> deeper
> > > > into
> > > > > a
> > > > > > > less general step specification as alluded to in 4, but a
> merge()
> > > > step
> > > > > as
> > > > > > > proposed in 5, might return [Element,boolean] so as to provide
> an
> > > > > > indicator
> > > > > > > of creation?
> > > > > > > 7. You were just introducing your ideas here, so perhaps you
> > > haven't
> > > > > > gotten
> > > > > > > this far yet, but a shortcoming to doing merge(String,Map,Map)
> is
> > > > that
> > > > > it
> > > > > > > leaves open no opportunity to stream a List of Maps to a
> merge()
> > > for
> > > > a
> > > > > > form
> > > > > > > of batch loading which is mighty common and one of the
> variations
> > > of
> > > > > the
> > > > > > > coalesce() pattern that I alluded to at the start of all this.
> I
> > > > think
> > > > > > that
> > > > > > > we would want to be sure that we left open the option to do
> that
> > > > > somehow.
> > > > > > > 8. If we had a general purpose merge() step I wonder if it
> makes
> > > > > > developing
> > > > > > > the API as you suggested easier to do?
> > > > > > >
> > > > > > > I think I'd like to solve the problems you describe in your
> post
> > as
> > > > > well
> > > > > > as
> > > > > > > the ones in mine. There is some relation there, but gaps as
> well.
> > > > With
> > > > > > more
> > > > > > > discussion here we can figure something out.
> > > > > > >
> > > > > > > Thanks for starting this talk - good one!
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <
> > > > dave@bechberger.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I've had a few on and off discussions with a few people here,
> > so
> > > I
> > > > > > wanted
> > > > > > > > to send this out to everyone for feedback.
> > > > > > > >
> > > > > > > > What are people's thoughts on creating a new set of steps
> that
> > > > codify
> > > > > > > > common Gremlin best practices?
> > > > > > > >
> > > > > > > > I think there are several common Gremlin patterns where users
> > > would
> > > > > > > benefit
> > > > > > > > from the additional guidance that these codified steps
> > represent.
> > > > > The
> > > > > > > > first one I would recommend though is codifying the element
> > > > existence
> > > > > > > > pattern into a single Gremlin step, something like:
> > > > > > > >
> > > > > > > > mergeV(String, Map, Map)
> > > > > > > >      String - The vertex label
> > > > > > > >      Map (first) - The properties to match existing vertices
> on
> > > > > > > >      Map (second) - Any additional properties to set if a new
> > > > vertex
> > > > > is
> > > > > > > > created (optional)
> > > > > > > > mergeE(String, Map, Map)
> > > > > > > >      String - The edge label
> > > > > > > >      Map (first) - The properties to match existing edge on
> > > > > > > >      Map (second) - Any additional properties to set if a new
> > > edge
> > > > is
> > > > > > > > created (optional)
> > > > > > > >
> > > > > > > > In each of these cases these steps would perform the same
> > upsert
> > > > > > > > functionality as the element existence pattern.
> > > > > > > >
> > > > > > > > Example:
> > > > > > > >
> > > > > > > > g.V().has('person','name','stephen').
> > > > > > > >            fold().
> > > > > > > >            coalesce(unfold(),
> > > > > > > >                     addV('person').
> > > > > > > >                       property('name','stephen').
> > > > > > > >                       property('age',34))
> > > > > > > >
> > > > > > > > would become:
> > > > > > > >
> > > > > > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > > > > > > >
> > > > > > > > I think that this change would be a good addition to the
> > language
> > > > for
> > > > > > > > several reasons:
> > > > > > > >
> > > > > > > > * This codifies the best practice for a specific
> action/recipe,
> > > > which
> > > > > > > > reduces the chance that someone uses the pattern incorrectly
> > > > > > > > * Most complex Gremlin traversals are verbose.  Reducing the
> > > amount
> > > > > of
> > > > > > > code
> > > > > > > > that needs to be written and maintained allows for a better
> > > > developer
> > > > > > > > experience.
> > > > > > > > * It will lower the bar of entry for a developer by making
> > these
> > > > > > actions
> > > > > > > > more discoverable.  The more we can help bring these patterns
> > to
> > > > the
> > > > > > > > forefront of the language via these pattern/meta steps the
> more
> > > we
> > > > > > guide
> > > > > > > > users towards writing better Gremlin faster
> > > > > > > > * This allows DB vendors to optimize for this pattern
> > > > > > > >
> > > > > > > > I know that this would likely be the first step in Gremlin
> that
> > > > > > codifies
> > > > > > > a
> > > > > > > > pattern, so I'd like to get other's thoughts on this?
> > > > > > > >
> > > > > > > > Dave
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by Stephen Mallette <sp...@gmail.com>.
I started a expanded this discussion to gremlin-users for a wider audience
and the thread is starting to grow:

https://groups.google.com/g/gremlin-users/c/QBmiOUkA0iI/m/pj5Ukiq6AAAJ

I guess we'll need to summarize that discussion back here now....

I did have some more thoughts to hang out there and figured that I wouldn't
convolute the discussion on gremlin-users with it so I will continue the
discussion here.

1, The very first couple of examples seem wrong (or at least not best
demonstrating the usage):

g.upsertV('person', [name: 'marko'],
                    [name: 'marko', age: 29])
g.upsertV('person', [(T.id): 1],
                    [(T.id): 1, name: 'Marko'])

should instead be:

g.upsertV('person', [name: 'marko'],
                    [age: 29])
g.upsertV('person', [(T.id): 1],
                    [name: 'Marko'])

2. I can't recall if we made this clear anywhere but in situations where we
"get" rather than "create" do the additional properties act in an update
fashion to the element that was found? I think I've been working on the
assumption that it would, though perhaps that is not always desirable?

3. We really never settled up how to deal with multi/meta-properties. That
story should be clear so that when we document upsert() we include the
approaches for the fallback patterns that don't meet the 90% of use cases
we are targeting and I sense that we're saying that meta/multi-properties
don't fit in that bucket. So with that in mind, I don't think that the
following works for metaproperties:

g.upsertV('person', [(T.id): 1],
                    [name:[acl: 'public'])

as it doesn't let us set the value for the "name" just the pairs for the
meta-properties. I guess a user would have to fall back to:

g.upsertV('person', [(T.id): 1]).
  property('name','marko','acl','public')

// or use the additional properties syntax
g.upsertV('person', [(T.id): 1],[name:'marko']).
  properties('name').property('acl','public')

// or if there were multi-properties then maybe...
g.upsertV('person', [(T.id): 1],[name:'marko']).
  properties('name').hasValue('marko').property('acl','public')

As for multi-properties, I dont think we should assume that a List object
should be interpreted by Gremlin as a multi-property. Perhaps we just rely
on the underlying graph to properly deal with that given a schema or the
user falls back to:

// this ends up as however the graph deals with a List object
g.upsertV('person', [(T.id): 1], [lang: ['java', 'scala', 'java'])

// this is explicit
g.upsertV('person', [(T.id): 1]).
  property(list, 'lang', 'java').
  property(list, 'lang, 'scala').
  property(list, 'lang', 'java')

If that makes sense to everyone, I will update the gist.







On Tue, Oct 27, 2020 at 11:51 PM David Bechberger <da...@bechberger.com>
wrote:

> Hello Stephen,
>
> Thanks for making that gist, its much easier to follow along with the
> proposed syntax there.  To your specific comments:
>
> #1 - My only worry with the term upsert is that it is not as widely a used
> term for this sort of pattern as "Merge" (e.g. SQL, Cypher).  However I
> don't have a strong opinion on this, so I am fine with either.
> #2 - My only real objective here is to make sure that we make the 80-90%
> case easy and straightforward.  I think that having the fallback option of
> using the current syntax for any complicated edge cases should be
> considered here as well. I'd appreciate your thoughts here as these are
> good points you bring up that definitely fall into the 80-90% use case.
> #3 - Those points make sense to me, not sure I have anything further to add
> #4 - I don't think I would expect to have the values extracted from the
> traversal filters but I'd be interested in other opinions on that.
>
> Thanks,
> Dave
>
> On Thu, Oct 15, 2020 at 5:30 AM Stephen Mallette <sp...@gmail.com>
> wrote:
>
> > It's been a couple weeks since we moved this thread so I went back
> through
> > it and refreshed my mind with what's there. Dave, nice job putting all
> > those examples together. I've recollected them all together and have just
> > been reviewing them in a more formatted style with proper Groovy syntax
> > than what can be accomplished in this mailing list. Anyone who cares to
> > review in that form can do so here (i've pasted its markdown contents at
> > the bottom of this post as well):
> >
> > https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
> >
> > In this light, I have the following comments and thoughts:
> >
> > 1. I feel more inclined toward saving "merge" for a more generalized step
> > of that name and have thus renamed the examples to use "upsert".
> >
> > 2. I still don't quite feel comfortable with meta/multi-properties
> > examples. Perhaps you could explain further as it's possible I'm not
> > thinking things through properly. For meta-properties the example is:
> >
> > g.upsertV('person', [(T.id): 1],
> >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> >
> > So I see that "name" takes a Map here, but how does Gremlin know if that
> > means "meta-property" or "Map value" and if the former, then how do you
> set
> > the value of "name"?  My question is similar to the multi-property
> examples
> > - using this one:
> >
> > g.upsertV('person', [(T.id): 1],
> >                     [(T.id): 1,lang: ['java', 'scala'])
> >
> > How does Gremlin know if the user means that "lang" should be a
> > "multi-property" with Cardinality.list (or Cardinality.set for that
> matter)
> > or a value of Cardinality.single with a List/Array value? I can perhaps
> > suggest some solutions here but wanted to make sure I wasn't just
> > misunderstanding something.
> >
> > 3. The "upsert with stream" example is:
> >
> > g.inject([(id): 1, (label): 'person', name: 'marko'],
> >          [(id): 2, (label): 'person', name: 'josh']).
> >   upsertV(values(label), valueMap(id), valueMap())
> >
> > That would establish an API of upsertV(Traversal, Traversal, Traversal)
> > which would enable your final thought of your last post with:
> >
> > g.V().upsertV('person', __.has('person','age',gt(29)).has('state', 'NY'),
> >                         [active: true])
> >
> > The open-ended nature of that is neat but comes with some complexity with
> > steps that traditionally take an "open" Traversal (and this step now has
> up
> > to three of them) - perhaps, that's our own fault in some ways. For the
> > "stream" concept, values()/valueMap() won't immediately work that way as
> > they only apply to Element...given today's Gremlin semantics, I think
> we'd
> > re-write that as:
> >
> > g.inject([(id): 1, (label): 'person', name: 'marko'],
> >          [(id): 2, (label): 'person', name: 'josh']).
> >   upsertV(select(label), select(id), select(id,label,'name'))
> >
> > Though, even that doesn't quite work because you can't select(T) right
> now
> > which means you'd need to go with:
> >
> > g.inject([id: 1, label: 'person', name: 'marko'],
> >          [id: 2, label: 'person', name: 'josh']).
> >   upsertV(select('label'), select('id'), select('id','label','name'))
> >
> > not terrible in a way and perhaps we could fix select(T) - can't remember
> > why that isn't allowed except that select() operates over multiple scopes
> > and perhaps trying T through those scopes causes trouble.
> >
> > 4. Continuing with the use of Traversal as arguments, let's go back to:
> >
> > g.V().upsertV('person', __.has('person','age',gt(29)).has('state', 'NY'),
> >                         [active: true])
> >
> > I originally had it in my mind to prefer this approach over a Map for the
> > search as it provides a great deal of flexibility and fits the common
> > filtering patterns that users follow really well. I suppose the question
> is
> > what to do in the situation of "create". Would we extract all the strict
> > equality filter values from the initial has(), thus, in the example,
> > "state" but not "age", and create a vertex that has "state=NY,
> > active=true"? That is possible with how things are designed in Gremlin
> > today I think.
> >
> > This syntax does create some conflict with the subject of 3 and streams
> > because the traverser in the streams case is the incoming Map but for
> this
> > context it's meant as a filter of V(). Getting too whacky?
> >
> > 5. I like the idea of using a side-effect to capture whether the element
> > was created or not. that makes sense. I think that can work.
> >
> > ==============================
> > = gist contents
> > ==============================
> >
> > # API
> >
> > ```java
> > upsertV(String label, Map matchOrCreateProperties)
> > upsertV(String label, Map matchOrCreateProperties, Map
> > additionalProperties)
> > upsertV(Traversal label, Traversal matchOrCreateProperties)
> > upsertV(Traversal label, Traversal matchOrCreateProperties, Traversal
> > additionalProperties)
> > upsertV(...).
> >   with(WithOptions.sideEffectLabel, 'a')
> >
> > upsertE(String label, Map matchOrCreateProperties)
> > upsertE(Traversal label, Traversal matchOrCreateProperties)
> > upsertE(String label, Map matchOrCreateProperties, Map
> > additionalProperties)
> > upsertE(Traversal label, Traversal matchOrCreateProperties, Traversal
> > additionalProperties)
> > upsertE(...).
> >   from(Vertex incidentOut).to(Vertex incidentIn)
> >   with(WithOptions.sideEffectLabel, 'a')
> > ```
> >
> > # Examples
> >
> > ## upsert
> >
> > ```groovy
> > g.upsertV('person', [name: 'marko'],
> >                     [name: 'marko', age: 29])
> > ```
> >
> > ## upsert with id
> >
> > ```groovy
> > g.upsertV('person', [(T.id): 1],
> >                     [(T.id): 1, name: 'Marko'])
> > ```
> >
> > ## upsert with Meta Properties
> >
> > ```groovy
> > g.upsertV('person', [(T.id): 1],
> >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > ```
> >
> > ## upsert with Multi Properties
> >
> > ```groovy
> > g.upsertV('person', [(T.id): 1],
> >                     [(T.id): 1,lang: ['java', 'scala'])
> > ```
> >
> > ## upsert with Single Cardinality
> >
> > ```groovy
> > g.upsertV('person', [(T.id): 1],
> >                     [(T.id): 1, lang: 'java')
> > ```
> >
> > ## upsert with List Cardinality
> >
> > ```groovy
> > g.upsertV('person', [(T.id): 1],
> >                     [(T.id): 1, lang: ['java', 'scala', 'java'])
> > ```
> >
> > ## upsert with Set Cardinality
> >
> > ```groovy
> > g.upsertV('person', [(T.id): 1],
> >                     [(T.id): 1, lang: ['java','scala'])
> > ```
> >
> > ## upsert with List Cardinality - and add value to list
> >
> > ```groovy
> > g.upsertV('person', [(T.id): 1],
> >                     [(T.id): 1, lang: ['java','scala','java']).
> >   property(Cardinality.list, 'lang', 'java')
> > ```
> >
> > ## upsert with stream
> >
> > ```groovy
> > // doesn't work with today's Gremlin semantics
> > g.inject([(id): 1, (label): 'person', name: 'marko'],
> >          [(id): 2, (label): 'person', name: 'josh']).
> >   upsertV(values(label), valueMap(id), valueMap())
> >
> > // the following would be more in line with what's possible with existing
> > Gremlin semantics:
> > g.inject([id: 1, label: 'person', name: 'marko'],
> >          [id: 2, label: 'person', name: 'josh']).
> >   upsertV(select('label'), select('id'), select('id','label','name'))
> > ```
> >
> > ## upsert with reporting added or updated side effect
> >
> > ```groovy
> > g.upsertV('person', [name: 'marko'],
> >                     [name: 'marko', age: 29]).
> >     with(WithOptions.sideEffectLabel, 'a').
> >   project('vertex', 'added').
> >     by().
> >     by(cap('a'))
> > ```
> >
> > ## upsert Edge assuming a self-relation of "marko"
> >
> > ```groovy
> > g.V().has('person','name','marko').
> >   upsertE( 'self', [weight:0.5])
> > ```
> >
> > ## upsert Edge with incident vertex checks
> >
> > ```groovy
> > g.upsertE('knows', [weight:0.5]).
> >    from(V(1)).to(V(2))
> > g.V().has('person','name','marko').
> >   upsertE( 'knows', [weight:0.5]).
> >     to(V(2))
> > ```
> >
> > ## upsert using a Traversal for the match
> >
> > ```groovy
> > g.V().upsertV('person', __.has('person','age',gt(29)).has('state', 'NY'),
> >                         [active: true])
> > ```
> >
> >
> >
> >
> > On Mon, Sep 28, 2020 at 6:49 PM David Bechberger <da...@bechberger.com>
> > wrote:
> >
> > > So, I've been doing some additional thinking about the ways that this
> > could
> > > work based on the comments below and have put my comments inline.
> > >
> > > Dave
> > >
> > > On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <spmallette@gmail.com
> >
> > > wrote:
> > >
> > > > I added some thoughts inline below:
> > > >
> > > > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <
> dave@bechberger.com>
> > > > wrote:
> > > >
> > > > > Thanks for the detailed comments Stephen.  I have addressed them
> > inline
> > > > > below.
> > > > >
> > > > > I did read the proposal from earlier and I think that we are in
> close
> > > > > agreement with what we are trying to accomplish.  I also fully
> > support
> > > > > Josh's comment on providing a mechanism for submitting a map of
> > > > properties
> > > > > as manually unrolling this all right now leads to a lot of
> potential
> > > for
> > > > > error and a long messy traversal.
> > > > >
> > > > > I'm looking forward to this discussion on how to merge these two
> > > > proposals.
> > > > >
> > > > >
> > > > > 1. How would multi/meta-properties fit into the API you've
> proposed?
> > > > >
> > > > > My first thought here is that multi-properties would be represented
> > as
> > > > > lists in the map, e.g.
> > > > >
> > > > > {names: ['Dave', 'David']}
> > > > >
> > > > > and meta-properties would be represented as maps in the maps, e.g.
> > > > >
> > > > > {name: {first: 'Dave', last: 'Bechberger'}}
> > > > >
> > > > > I can't say I've thought through all the implications of this
> though
> > so
> > > > it
> > > > > is an area we would need to explore.
> > > > >
> > > >
> > > > The first implication that comes to mind is that it makes the
> > assumption
> > > > the user wants multi/meta-properties as opposed to a single
> cardinality
> > > of
> > > > List in the first case and a Map as a property value in the second
> > case.
> > > I
> > > > suppose that graphs with a schema could resolve those assumptions but
> > > > graphs that are schemaless would have a problem. The issue could be
> > > > resolved by specialized configuration of "g" or per merge() step
> using
> > a
> > > > with() modulator I suppose but that goes into a yet another level of
> > > > implications to consider. I've often wondered if the start point for
> > > > getting types/schema into TP3 without a full rewrite would be in this
> > > form
> > > > where Gremlin would be given hints as to what to expect as to the
> types
> > > of
> > > > data it might encounter while traversing. Anyway, I'd be hesitant to
> go
> > > > down paths that don't account for multi/metaproperties well.  They
> are
> > > > first class citizens in TP3 (with those hoping for extension of at
> > least
> > > > multiproperties to edges) and while I find them a constant annoyance
> > for
> > > so
> > > > many reasons, we're kinda stuck with them.
> > > >
> > >
> > > I agree we need to account for multi/meta properties as 1st class
> > > citizens.  This is my current thinking on the syntax for the situations
> > we
> > > have laid out so far:
> > >
> > > *Merge*
> > > g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age': 29})
> > >
> > >
> > > *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name': 'Marko'})
> > >
> > >
> > > *Merge with Meta Properties* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > 'name':
> > > {'first': Marko', 'last': 'R'})
> > >
> > >
> > > *Merge with Multi Properties * g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > 'lang': ['java', 'scala'])
> > >
> > >
> > > *Merge with Single Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > 'lang': 'java')
> > >
> > >
> > > *Merge with List Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > 'lang':
> > > ['java', 'scala', 'java'])
> > >
> > >
> > > *Merge with Set Cardinality * g.mergeV('name', {T.id: 1}, {T.id: 1,
> > 'lang':
> > > ['java', 'scala'])
> > >
> > > Since in a mergeV() scenario we are only ever adding whatever values
> are
> > > passed in there would be no need to specify the cardinality of the
> > property
> > > being added.  If they wanted to add a value to an existing property
> then
> > > the current property() method would still be available on the output
> > > traversal. e.g.
> > > g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java', 'scala',
> > > 'java']).property(Cardinality.list, 'lang, 'java')
> > >
> > >
> > > *Merge with stream * g.inject([{id: 1, label: 'person', 'name':
> 'marko'},
> > > {id: 2, label: 'person', 'name': 'josh'}]).
> > >   mergeV(values('label'), valueMap('id'), valueMap())
> > >
> > >
> > > *Merge with reporting added or updated side effect* g.mergeV('name',
> > > {'name': 'marko'}, {'name': 'marko', 'age': 29}).
> > >   with(WithOptions.sideEffectLabel, 'a').
> > >   project('vertex', 'added').
> > >     by().
> > >     by(cap('a'))
> > >
> > >
> > >
> > > >
> > > >
> > > > > 2. How would users set the T.id on creation? would that T.id just
> be
> > a
> > > > key
> > > > > in the first Map argument?
> > > > >
> > > > > Yes, I was thinking that T.id would be the key name, e.g.:
> > > > >
> > > > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
> > > > >
> > > >
> > > > ok - I can't say I see a problem with that atm.
> > > >
> > > >
> > > > > 3. I do like the general idea of a match on multiple properties for
> > the
> > > > > first argument as a convenience but wonder about the specificity of
> > > this
> > > > > API a bit as it focuses heavily on equality - I suppose that's most
> > > cases
> > > > > for get-or-create, so perhaps that's ok.
> > > > >
> > > > > In most cases I've seen use exact matches on the vertex or edge.  I
> > > think
> > > > > it might be best to keep this straightforward as any complex edge
> > cases
> > > > > still can perform the same functionality using the coalesce()
> > pattern.
> > > > >
> > > >
> > > > I played around with the idea to use modulators to apply additional
> > > > constraints:
> > > >
> > > > g.mergeV('person', {'state':'NY'}, {'active': true}).
> > > >     by(has('age', gt(29))
> > > >
> > > > I suppose that's neat but maybe not...we could easily get the same
> > thing
> > > > from:
> > > >
> > > > g.V().has('person','age',gt(29)).
> > > >   mergeV('person', {'state':'NY'}, {'active': true}).
> > > >
> > > > which is more readable though it does make it harder for providers to
> > > > optimize upsert operations if they could make use of the has() as
> part
> > of
> > > > that. I think I like has() as part of the main query rather than in a
> > > by()
> > > > - just thinking out loud.
> > > >
> > > >
> > > Another potential option here would be to allow the second parameter of
> > > mergeV() accept a traversal.  Not sure how much complexity that would
> add
> > > though
> > >
> > > g.V().mergeV('person', __.has('person','age',gt(29)).has('state',
> 'NY'}.,
> > > {'active': true}).
> > >
> > >
> > > >
> > > > > 4. I think your suggestion points to one of the troubles Gremlin
> has
> > > > which
> > > > > we see with "algorithms" - extending the language with new steps
> that
> > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > shortestPath() step) pollutes the core language a bit, hence my
> > > > > generalization of "merging" in my link above which fits into the
> core
> > > > > Gremlin language style. There is a bigger picture where we are
> > missing
> > > > > something in Gremlin that lets us extend the language in ways that
> > let
> > > us
> > > > > easily introduce new steps that aren't for general purpose. This
> > issue
> > > is
> > > > > discussed in terms of "algorithms" here:
> > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think
> we
> > > > could
> > > > > see how there might be some "mutation" extension steps that would
> > cover
> > > > > your suggested API, plus batch operations, etc. We need a way to
> add
> > > > > "sugar" without it interfering with the consistency of the core.
> > > > Obviously
> > > > > this is a bigger issue but perhaps important to solve to implement
> > > steps
> > > > in
> > > > > the fashion you describe.
> > > > >
> > > > > I agree that the "algorithm" steps do seem a bit odd in the core
> > > language
> > > > > but I think the need is there.  I'd be interested in furthering
> this
> > > > > discussion but I think these "pattern" steps may or may not be the
> > same
> > > > as
> > > > > the algorithm steps.  I'll have to think on that.
> > > >
> > > >
> > > >
> > > > > 5. I suppose that the reason for mergeE and mergeV is to specify
> what
> > > > > element type the first Map argument should be applied to? what
> about
> > > > > mergeVP (i.e. vertex property as it too is an element) ? That's
> > tricky
> > > > but
> > > > > I don't think we should miss that. Perhaps merge() could be a
> > "complex
> > > > > modulator"?? that's a new concept of course, but you would do
> > > > g.V().merge()
> > > > > and the label and first Map would fold to VertexStartStep (i.e.
> V())
> > > for
> > > > > the lookup and then a MergeStep would follow - thus a "complex
> > > modulator"
> > > > > as it does more than just change the behavior of the previous step
> -
> > it
> > > > > also adds its own. I suppose it could also add has() steps followed
> > by
> > > > the
> > > > > MergeStep and then the has() operations would fold in normally as
> > they
> > > do
> > > > > today. In this way, we can simplify to just one single
> > > > > merge(String,Map,Map). ??
> > > > >
> > > > > I agree that we should also think about how to include properties
> in
> > > this
> > > > > merge construct.  The reason I was thinking about mergeV() and
> > mergeE()
> > > > is
> > > > > that it follows the same pattern as the already well understood
> > > > > addV()/addE() steps.  I am a bit wary of trying to generalize this
> > down
> > > > to
> > > > > a single merge() step as these sorts of complex overloads make it
> > hard
> > > to
> > > > > figure out which Gremlin step you should use for a particular
> pattern
> > > > (e.g.
> > > > > upsert vertex or upsert edge).  One thing that I think customers
> find
> > > > > useful about the addV/addE step is that they are very discoverable,
> > the
> > > > > name tells me what functionality to expect from that step.
> > > > >
> > > >
> > > > I'm starting to agree with the idea that we have to do something like
> > > > mergeV()/E() as it wouldn't quite work as a start step otherwise. We
> > > > couldn't do:
> > > >
> > > > g.merge()
> > > >
> > > > as we wouldnt know what sort of Element it applied to. If so, I
> wonder
> > if
> > > > it would be better to preserve "merge" for my more general use-case
> and
> > > > prefer upsertV()/E()?
> > > >
> > > > Also, perhaps mergeVP() doesn't need to exist as perhaps it is an
> > > uncommon
> > > > case (compared to V/E()) and we don't really have addVP() as an
> > analogous
> > > > step. Perhaps existing coalesce() patterns and/or my more general
> > purpose
> > > > merge() step would satisfy those situations??
> > > >
> > >
> > > I think that a mergeVP() type step may require a different pattern
> since
> > it
> > > would really need to accept a map of values otherwise it would
> > essentially
> > > perform the same function as property()
> > >
> > >
> > > >
> > > >
> > > > > 6. One thing neither my approach nor yours seems to do is tell the
> > user
> > > > if
> > > > > they created something or updated something - that's another thing
> > I've
> > > > > seen users want to have in get-or-create. Here again we go deeper
> > into
> > > a
> > > > > less general step specification as alluded to in 4, but a merge()
> > step
> > > as
> > > > > proposed in 5, might return [Element,boolean] so as to provide an
> > > > indicator
> > > > > of creation?
> > > > >
> > > > > Hmm, I'll have to think about that.  How would returning multiple
> > > values
> > > > > work if we want to chain these together. e.g. Add a vertex between
> > two
> > > > > edges but make sure the vertices exist?
> > > > >
> > > >
> > >
> > > Added a thought on how to accomplish this above that I'd like to get
> > > thoughts on.
> > >
> > >
> > > > >
> > > > > 7. You were just introducing your ideas here, so perhaps you
> haven't
> > > > gotten
> > > > > this far yet, but a shortcoming to doing merge(String,Map,Map) is
> > that
> > > it
> > > > > leaves open no opportunity to stream a List of Maps to a merge()
> for
> > a
> > > > form
> > > > > of batch loading which is mighty common and one of the variations
> of
> > > the
> > > > > coalesce() pattern that I alluded to at the start of all this. I
> > think
> > > > that
> > > > > we would want to be sure that we left open the option to do that
> > > somehow.
> > > > > 8. If we had a general purpose merge() step I wonder if it makes
> > > > developing
> > > > > the API as you suggested easier to do?
> > > > >
> > > > > Hmm, let me think about that one.
> > > > >
> > > > >
> > > > I will add an item 8 to think about which I didn't mention before:
> > > >
> > > > 8. The signature you suggested for mergeE() is:
> > > >
> > > > mergeE(String, Map, Map)
> > > >      String - The edge label
> > > >      Map (first) - The properties to match existing edge on
> > > >      Map (second) - Any additional properties to set if a new edge is
> > > > created (optional)
> > > >
> > > > but does that exactly answer the need? Typically the idea is to
> detect
> > an
> > > > edge between two vertices not globally in the graph by way of some
> > > > properties. This signature doesn't seem to allow for that as it
> doesn't
> > > > allow specification of the vertices to test against. Perhaps the
> answer
> > > is
> > > > to use from() and to() modulators?
> > > >
> > > > g.mergeE('knows', {'weight':0.5}).
> > > >   from(V(1)).to(V(2))
> > > > g.V().has('person','name','marko').
> > > >   mergeE( 'knows', {'weight':0.5}).
> > > >     to(V(2))
> > > > g.V().has('person','name','marko').
> > > >   mergeE( 'self', {'weight':0.5})
> > > >
> > > > That seems to work?
> > > >
> > >
> > > I think that is an elegant solution to that problem.  I like it and it
> > > keeps in line with the way that addE() works.
> > >
> > >
> > > >
> > > >
> > > >
> > > > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <
> > spmallette@gmail.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > I like our coalesce() pattern but it is verbose and over time it
> > has
> > > > gone
> > > > > > from a simple pattern to one with numerous variations for all
> > manner
> > > of
> > > > > > different sorts of merge-like operations. As such, I do think we
> > > should
> > > > > > introduce something to cover this pattern.
> > > > > >
> > > > > > I like that you used the word "merge" in your description of this
> > as
> > > it
> > > > > is
> > > > > > the word I've liked using. You might want to give a look at my
> > > proposed
> > > > > > merge() step from earlier in the year:
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> > > > > >
> > > > > > I'm just going to dump thoughts as they come regarding what you
> > > wrote:
> > > > > >
> > > > > > 1. How would multi/meta-properties fit into the API you've
> > proposed?
> > > > > > 2. How would users set the T.id on creation? would that T.id just
> > be
> > > a
> > > > > key
> > > > > > in the first Map argument?
> > > > > > 3. I do like the general idea of a match on multiple properties
> for
> > > the
> > > > > > first argument as a convenience but wonder about the specificity
> of
> > > > this
> > > > > > API a bit as it focuses heavily on equality - I suppose that's
> most
> > > > cases
> > > > > > for get-or-create, so perhaps that's ok.
> > > > > > 4. I think your suggestion points to one of the troubles Gremlin
> > has
> > > > > which
> > > > > > we see with "algorithms" - extending the language with new steps
> > that
> > > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > > shortestPath() step) pollutes the core language a bit, hence my
> > > > > > generalization of "merging" in my link above which fits into the
> > core
> > > > > > Gremlin language style. There is a bigger picture where we are
> > > missing
> > > > > > something in Gremlin that lets us extend the language in ways
> that
> > > let
> > > > us
> > > > > > easily introduce new steps that aren't for general purpose. This
> > > issue
> > > > is
> > > > > > discussed in terms of "algorithms" here:
> > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think
> > we
> > > > > could
> > > > > > see how there might be some "mutation" extension steps that would
> > > cover
> > > > > > your suggested API, plus batch operations, etc. We need a way to
> > add
> > > > > > "sugar" without it interfering with the consistency of the core.
> > > > > Obviously
> > > > > > this is a bigger issue but perhaps important to solve to
> implement
> > > > steps
> > > > > in
> > > > > > the fashion you describe.
> > > > > > 5. I suppose that the reason for mergeE and mergeV is to specify
> > what
> > > > > > element type the first Map argument should be applied to? what
> > about
> > > > > > mergeVP (i.e. vertex property as it too is an element) ? That's
> > > tricky
> > > > > but
> > > > > > I don't think we should miss that. Perhaps merge() could be a
> > > "complex
> > > > > > modulator"?? that's a new concept of course, but you would do
> > > > > g.V().merge()
> > > > > > and the label and first Map would fold to VertexStartStep (i.e.
> > V())
> > > > for
> > > > > > the lookup and then a MergeStep would follow - thus a "complex
> > > > modulator"
> > > > > > as it does more than just change the behavior of the previous
> step
> > -
> > > it
> > > > > > also adds its own. I suppose it could also add has() steps
> followed
> > > by
> > > > > the
> > > > > > MergeStep and then the has() operations would fold in normally as
> > > they
> > > > do
> > > > > > today. In this way, we can simplify to just one single
> > > > > > merge(String,Map,Map). ??
> > > > > > 6. One thing neither my approach nor yours seems to do is tell
> the
> > > user
> > > > > if
> > > > > > they created something or updated something - that's another
> thing
> > > I've
> > > > > > seen users want to have in get-or-create. Here again we go deeper
> > > into
> > > > a
> > > > > > less general step specification as alluded to in 4, but a merge()
> > > step
> > > > as
> > > > > > proposed in 5, might return [Element,boolean] so as to provide an
> > > > > indicator
> > > > > > of creation?
> > > > > > 7. You were just introducing your ideas here, so perhaps you
> > haven't
> > > > > gotten
> > > > > > this far yet, but a shortcoming to doing merge(String,Map,Map) is
> > > that
> > > > it
> > > > > > leaves open no opportunity to stream a List of Maps to a merge()
> > for
> > > a
> > > > > form
> > > > > > of batch loading which is mighty common and one of the variations
> > of
> > > > the
> > > > > > coalesce() pattern that I alluded to at the start of all this. I
> > > think
> > > > > that
> > > > > > we would want to be sure that we left open the option to do that
> > > > somehow.
> > > > > > 8. If we had a general purpose merge() step I wonder if it makes
> > > > > developing
> > > > > > the API as you suggested easier to do?
> > > > > >
> > > > > > I think I'd like to solve the problems you describe in your post
> as
> > > > well
> > > > > as
> > > > > > the ones in mine. There is some relation there, but gaps as well.
> > > With
> > > > > more
> > > > > > discussion here we can figure something out.
> > > > > >
> > > > > > Thanks for starting this talk - good one!
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <
> > > dave@bechberger.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I've had a few on and off discussions with a few people here,
> so
> > I
> > > > > wanted
> > > > > > > to send this out to everyone for feedback.
> > > > > > >
> > > > > > > What are people's thoughts on creating a new set of steps that
> > > codify
> > > > > > > common Gremlin best practices?
> > > > > > >
> > > > > > > I think there are several common Gremlin patterns where users
> > would
> > > > > > benefit
> > > > > > > from the additional guidance that these codified steps
> represent.
> > > > The
> > > > > > > first one I would recommend though is codifying the element
> > > existence
> > > > > > > pattern into a single Gremlin step, something like:
> > > > > > >
> > > > > > > mergeV(String, Map, Map)
> > > > > > >      String - The vertex label
> > > > > > >      Map (first) - The properties to match existing vertices on
> > > > > > >      Map (second) - Any additional properties to set if a new
> > > vertex
> > > > is
> > > > > > > created (optional)
> > > > > > > mergeE(String, Map, Map)
> > > > > > >      String - The edge label
> > > > > > >      Map (first) - The properties to match existing edge on
> > > > > > >      Map (second) - Any additional properties to set if a new
> > edge
> > > is
> > > > > > > created (optional)
> > > > > > >
> > > > > > > In each of these cases these steps would perform the same
> upsert
> > > > > > > functionality as the element existence pattern.
> > > > > > >
> > > > > > > Example:
> > > > > > >
> > > > > > > g.V().has('person','name','stephen').
> > > > > > >            fold().
> > > > > > >            coalesce(unfold(),
> > > > > > >                     addV('person').
> > > > > > >                       property('name','stephen').
> > > > > > >                       property('age',34))
> > > > > > >
> > > > > > > would become:
> > > > > > >
> > > > > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > > > > > >
> > > > > > > I think that this change would be a good addition to the
> language
> > > for
> > > > > > > several reasons:
> > > > > > >
> > > > > > > * This codifies the best practice for a specific action/recipe,
> > > which
> > > > > > > reduces the chance that someone uses the pattern incorrectly
> > > > > > > * Most complex Gremlin traversals are verbose.  Reducing the
> > amount
> > > > of
> > > > > > code
> > > > > > > that needs to be written and maintained allows for a better
> > > developer
> > > > > > > experience.
> > > > > > > * It will lower the bar of entry for a developer by making
> these
> > > > > actions
> > > > > > > more discoverable.  The more we can help bring these patterns
> to
> > > the
> > > > > > > forefront of the language via these pattern/meta steps the more
> > we
> > > > > guide
> > > > > > > users towards writing better Gremlin faster
> > > > > > > * This allows DB vendors to optimize for this pattern
> > > > > > >
> > > > > > > I know that this would likely be the first step in Gremlin that
> > > > > codifies
> > > > > > a
> > > > > > > pattern, so I'd like to get other's thoughts on this?
> > > > > > >
> > > > > > > Dave
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by David Bechberger <da...@bechberger.com>.
Hello Stephen,

Thanks for making that gist, its much easier to follow along with the
proposed syntax there.  To your specific comments:

#1 - My only worry with the term upsert is that it is not as widely a used
term for this sort of pattern as "Merge" (e.g. SQL, Cypher).  However I
don't have a strong opinion on this, so I am fine with either.
#2 - My only real objective here is to make sure that we make the 80-90%
case easy and straightforward.  I think that having the fallback option of
using the current syntax for any complicated edge cases should be
considered here as well. I'd appreciate your thoughts here as these are
good points you bring up that definitely fall into the 80-90% use case.
#3 - Those points make sense to me, not sure I have anything further to add
#4 - I don't think I would expect to have the values extracted from the
traversal filters but I'd be interested in other opinions on that.

Thanks,
Dave

On Thu, Oct 15, 2020 at 5:30 AM Stephen Mallette <sp...@gmail.com>
wrote:

> It's been a couple weeks since we moved this thread so I went back through
> it and refreshed my mind with what's there. Dave, nice job putting all
> those examples together. I've recollected them all together and have just
> been reviewing them in a more formatted style with proper Groovy syntax
> than what can be accomplished in this mailing list. Anyone who cares to
> review in that form can do so here (i've pasted its markdown contents at
> the bottom of this post as well):
>
> https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
>
> In this light, I have the following comments and thoughts:
>
> 1. I feel more inclined toward saving "merge" for a more generalized step
> of that name and have thus renamed the examples to use "upsert".
>
> 2. I still don't quite feel comfortable with meta/multi-properties
> examples. Perhaps you could explain further as it's possible I'm not
> thinking things through properly. For meta-properties the example is:
>
> g.upsertV('person', [(T.id): 1],
>                     [(T.id): 1, name:[first: Marko', last: 'R'])
>
> So I see that "name" takes a Map here, but how does Gremlin know if that
> means "meta-property" or "Map value" and if the former, then how do you set
> the value of "name"?  My question is similar to the multi-property examples
> - using this one:
>
> g.upsertV('person', [(T.id): 1],
>                     [(T.id): 1,lang: ['java', 'scala'])
>
> How does Gremlin know if the user means that "lang" should be a
> "multi-property" with Cardinality.list (or Cardinality.set for that matter)
> or a value of Cardinality.single with a List/Array value? I can perhaps
> suggest some solutions here but wanted to make sure I wasn't just
> misunderstanding something.
>
> 3. The "upsert with stream" example is:
>
> g.inject([(id): 1, (label): 'person', name: 'marko'],
>          [(id): 2, (label): 'person', name: 'josh']).
>   upsertV(values(label), valueMap(id), valueMap())
>
> That would establish an API of upsertV(Traversal, Traversal, Traversal)
> which would enable your final thought of your last post with:
>
> g.V().upsertV('person', __.has('person','age',gt(29)).has('state', 'NY'),
>                         [active: true])
>
> The open-ended nature of that is neat but comes with some complexity with
> steps that traditionally take an "open" Traversal (and this step now has up
> to three of them) - perhaps, that's our own fault in some ways. For the
> "stream" concept, values()/valueMap() won't immediately work that way as
> they only apply to Element...given today's Gremlin semantics, I think we'd
> re-write that as:
>
> g.inject([(id): 1, (label): 'person', name: 'marko'],
>          [(id): 2, (label): 'person', name: 'josh']).
>   upsertV(select(label), select(id), select(id,label,'name'))
>
> Though, even that doesn't quite work because you can't select(T) right now
> which means you'd need to go with:
>
> g.inject([id: 1, label: 'person', name: 'marko'],
>          [id: 2, label: 'person', name: 'josh']).
>   upsertV(select('label'), select('id'), select('id','label','name'))
>
> not terrible in a way and perhaps we could fix select(T) - can't remember
> why that isn't allowed except that select() operates over multiple scopes
> and perhaps trying T through those scopes causes trouble.
>
> 4. Continuing with the use of Traversal as arguments, let's go back to:
>
> g.V().upsertV('person', __.has('person','age',gt(29)).has('state', 'NY'),
>                         [active: true])
>
> I originally had it in my mind to prefer this approach over a Map for the
> search as it provides a great deal of flexibility and fits the common
> filtering patterns that users follow really well. I suppose the question is
> what to do in the situation of "create". Would we extract all the strict
> equality filter values from the initial has(), thus, in the example,
> "state" but not "age", and create a vertex that has "state=NY,
> active=true"? That is possible with how things are designed in Gremlin
> today I think.
>
> This syntax does create some conflict with the subject of 3 and streams
> because the traverser in the streams case is the incoming Map but for this
> context it's meant as a filter of V(). Getting too whacky?
>
> 5. I like the idea of using a side-effect to capture whether the element
> was created or not. that makes sense. I think that can work.
>
> ==============================
> = gist contents
> ==============================
>
> # API
>
> ```java
> upsertV(String label, Map matchOrCreateProperties)
> upsertV(String label, Map matchOrCreateProperties, Map
> additionalProperties)
> upsertV(Traversal label, Traversal matchOrCreateProperties)
> upsertV(Traversal label, Traversal matchOrCreateProperties, Traversal
> additionalProperties)
> upsertV(...).
>   with(WithOptions.sideEffectLabel, 'a')
>
> upsertE(String label, Map matchOrCreateProperties)
> upsertE(Traversal label, Traversal matchOrCreateProperties)
> upsertE(String label, Map matchOrCreateProperties, Map
> additionalProperties)
> upsertE(Traversal label, Traversal matchOrCreateProperties, Traversal
> additionalProperties)
> upsertE(...).
>   from(Vertex incidentOut).to(Vertex incidentIn)
>   with(WithOptions.sideEffectLabel, 'a')
> ```
>
> # Examples
>
> ## upsert
>
> ```groovy
> g.upsertV('person', [name: 'marko'],
>                     [name: 'marko', age: 29])
> ```
>
> ## upsert with id
>
> ```groovy
> g.upsertV('person', [(T.id): 1],
>                     [(T.id): 1, name: 'Marko'])
> ```
>
> ## upsert with Meta Properties
>
> ```groovy
> g.upsertV('person', [(T.id): 1],
>                     [(T.id): 1, name:[first: Marko', last: 'R'])
> ```
>
> ## upsert with Multi Properties
>
> ```groovy
> g.upsertV('person', [(T.id): 1],
>                     [(T.id): 1,lang: ['java', 'scala'])
> ```
>
> ## upsert with Single Cardinality
>
> ```groovy
> g.upsertV('person', [(T.id): 1],
>                     [(T.id): 1, lang: 'java')
> ```
>
> ## upsert with List Cardinality
>
> ```groovy
> g.upsertV('person', [(T.id): 1],
>                     [(T.id): 1, lang: ['java', 'scala', 'java'])
> ```
>
> ## upsert with Set Cardinality
>
> ```groovy
> g.upsertV('person', [(T.id): 1],
>                     [(T.id): 1, lang: ['java','scala'])
> ```
>
> ## upsert with List Cardinality - and add value to list
>
> ```groovy
> g.upsertV('person', [(T.id): 1],
>                     [(T.id): 1, lang: ['java','scala','java']).
>   property(Cardinality.list, 'lang', 'java')
> ```
>
> ## upsert with stream
>
> ```groovy
> // doesn't work with today's Gremlin semantics
> g.inject([(id): 1, (label): 'person', name: 'marko'],
>          [(id): 2, (label): 'person', name: 'josh']).
>   upsertV(values(label), valueMap(id), valueMap())
>
> // the following would be more in line with what's possible with existing
> Gremlin semantics:
> g.inject([id: 1, label: 'person', name: 'marko'],
>          [id: 2, label: 'person', name: 'josh']).
>   upsertV(select('label'), select('id'), select('id','label','name'))
> ```
>
> ## upsert with reporting added or updated side effect
>
> ```groovy
> g.upsertV('person', [name: 'marko'],
>                     [name: 'marko', age: 29]).
>     with(WithOptions.sideEffectLabel, 'a').
>   project('vertex', 'added').
>     by().
>     by(cap('a'))
> ```
>
> ## upsert Edge assuming a self-relation of "marko"
>
> ```groovy
> g.V().has('person','name','marko').
>   upsertE( 'self', [weight:0.5])
> ```
>
> ## upsert Edge with incident vertex checks
>
> ```groovy
> g.upsertE('knows', [weight:0.5]).
>    from(V(1)).to(V(2))
> g.V().has('person','name','marko').
>   upsertE( 'knows', [weight:0.5]).
>     to(V(2))
> ```
>
> ## upsert using a Traversal for the match
>
> ```groovy
> g.V().upsertV('person', __.has('person','age',gt(29)).has('state', 'NY'),
>                         [active: true])
> ```
>
>
>
>
> On Mon, Sep 28, 2020 at 6:49 PM David Bechberger <da...@bechberger.com>
> wrote:
>
> > So, I've been doing some additional thinking about the ways that this
> could
> > work based on the comments below and have put my comments inline.
> >
> > Dave
> >
> > On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <sp...@gmail.com>
> > wrote:
> >
> > > I added some thoughts inline below:
> > >
> > > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <da...@bechberger.com>
> > > wrote:
> > >
> > > > Thanks for the detailed comments Stephen.  I have addressed them
> inline
> > > > below.
> > > >
> > > > I did read the proposal from earlier and I think that we are in close
> > > > agreement with what we are trying to accomplish.  I also fully
> support
> > > > Josh's comment on providing a mechanism for submitting a map of
> > > properties
> > > > as manually unrolling this all right now leads to a lot of potential
> > for
> > > > error and a long messy traversal.
> > > >
> > > > I'm looking forward to this discussion on how to merge these two
> > > proposals.
> > > >
> > > >
> > > > 1. How would multi/meta-properties fit into the API you've proposed?
> > > >
> > > > My first thought here is that multi-properties would be represented
> as
> > > > lists in the map, e.g.
> > > >
> > > > {names: ['Dave', 'David']}
> > > >
> > > > and meta-properties would be represented as maps in the maps, e.g.
> > > >
> > > > {name: {first: 'Dave', last: 'Bechberger'}}
> > > >
> > > > I can't say I've thought through all the implications of this though
> so
> > > it
> > > > is an area we would need to explore.
> > > >
> > >
> > > The first implication that comes to mind is that it makes the
> assumption
> > > the user wants multi/meta-properties as opposed to a single cardinality
> > of
> > > List in the first case and a Map as a property value in the second
> case.
> > I
> > > suppose that graphs with a schema could resolve those assumptions but
> > > graphs that are schemaless would have a problem. The issue could be
> > > resolved by specialized configuration of "g" or per merge() step using
> a
> > > with() modulator I suppose but that goes into a yet another level of
> > > implications to consider. I've often wondered if the start point for
> > > getting types/schema into TP3 without a full rewrite would be in this
> > form
> > > where Gremlin would be given hints as to what to expect as to the types
> > of
> > > data it might encounter while traversing. Anyway, I'd be hesitant to go
> > > down paths that don't account for multi/metaproperties well.  They are
> > > first class citizens in TP3 (with those hoping for extension of at
> least
> > > multiproperties to edges) and while I find them a constant annoyance
> for
> > so
> > > many reasons, we're kinda stuck with them.
> > >
> >
> > I agree we need to account for multi/meta properties as 1st class
> > citizens.  This is my current thinking on the syntax for the situations
> we
> > have laid out so far:
> >
> > *Merge*
> > g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age': 29})
> >
> >
> > *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name': 'Marko'})
> >
> >
> > *Merge with Meta Properties* g.mergeV('name', {T.id: 1}, {T.id: 1,
> 'name':
> > {'first': Marko', 'last': 'R'})
> >
> >
> > *Merge with Multi Properties * g.mergeV('name', {T.id: 1}, {T.id: 1,
> > 'lang': ['java', 'scala'])
> >
> >
> > *Merge with Single Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > 'lang': 'java')
> >
> >
> > *Merge with List Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1,
> 'lang':
> > ['java', 'scala', 'java'])
> >
> >
> > *Merge with Set Cardinality * g.mergeV('name', {T.id: 1}, {T.id: 1,
> 'lang':
> > ['java', 'scala'])
> >
> > Since in a mergeV() scenario we are only ever adding whatever values are
> > passed in there would be no need to specify the cardinality of the
> property
> > being added.  If they wanted to add a value to an existing property then
> > the current property() method would still be available on the output
> > traversal. e.g.
> > g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java', 'scala',
> > 'java']).property(Cardinality.list, 'lang, 'java')
> >
> >
> > *Merge with stream * g.inject([{id: 1, label: 'person', 'name': 'marko'},
> > {id: 2, label: 'person', 'name': 'josh'}]).
> >   mergeV(values('label'), valueMap('id'), valueMap())
> >
> >
> > *Merge with reporting added or updated side effect* g.mergeV('name',
> > {'name': 'marko'}, {'name': 'marko', 'age': 29}).
> >   with(WithOptions.sideEffectLabel, 'a').
> >   project('vertex', 'added').
> >     by().
> >     by(cap('a'))
> >
> >
> >
> > >
> > >
> > > > 2. How would users set the T.id on creation? would that T.id just be
> a
> > > key
> > > > in the first Map argument?
> > > >
> > > > Yes, I was thinking that T.id would be the key name, e.g.:
> > > >
> > > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
> > > >
> > >
> > > ok - I can't say I see a problem with that atm.
> > >
> > >
> > > > 3. I do like the general idea of a match on multiple properties for
> the
> > > > first argument as a convenience but wonder about the specificity of
> > this
> > > > API a bit as it focuses heavily on equality - I suppose that's most
> > cases
> > > > for get-or-create, so perhaps that's ok.
> > > >
> > > > In most cases I've seen use exact matches on the vertex or edge.  I
> > think
> > > > it might be best to keep this straightforward as any complex edge
> cases
> > > > still can perform the same functionality using the coalesce()
> pattern.
> > > >
> > >
> > > I played around with the idea to use modulators to apply additional
> > > constraints:
> > >
> > > g.mergeV('person', {'state':'NY'}, {'active': true}).
> > >     by(has('age', gt(29))
> > >
> > > I suppose that's neat but maybe not...we could easily get the same
> thing
> > > from:
> > >
> > > g.V().has('person','age',gt(29)).
> > >   mergeV('person', {'state':'NY'}, {'active': true}).
> > >
> > > which is more readable though it does make it harder for providers to
> > > optimize upsert operations if they could make use of the has() as part
> of
> > > that. I think I like has() as part of the main query rather than in a
> > by()
> > > - just thinking out loud.
> > >
> > >
> > Another potential option here would be to allow the second parameter of
> > mergeV() accept a traversal.  Not sure how much complexity that would add
> > though
> >
> > g.V().mergeV('person', __.has('person','age',gt(29)).has('state', 'NY'}.,
> > {'active': true}).
> >
> >
> > >
> > > > 4. I think your suggestion points to one of the troubles Gremlin has
> > > which
> > > > we see with "algorithms" - extending the language with new steps that
> > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > shortestPath() step) pollutes the core language a bit, hence my
> > > > generalization of "merging" in my link above which fits into the core
> > > > Gremlin language style. There is a bigger picture where we are
> missing
> > > > something in Gremlin that lets us extend the language in ways that
> let
> > us
> > > > easily introduce new steps that aren't for general purpose. This
> issue
> > is
> > > > discussed in terms of "algorithms" here:
> > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think we
> > > could
> > > > see how there might be some "mutation" extension steps that would
> cover
> > > > your suggested API, plus batch operations, etc. We need a way to add
> > > > "sugar" without it interfering with the consistency of the core.
> > > Obviously
> > > > this is a bigger issue but perhaps important to solve to implement
> > steps
> > > in
> > > > the fashion you describe.
> > > >
> > > > I agree that the "algorithm" steps do seem a bit odd in the core
> > language
> > > > but I think the need is there.  I'd be interested in furthering this
> > > > discussion but I think these "pattern" steps may or may not be the
> same
> > > as
> > > > the algorithm steps.  I'll have to think on that.
> > >
> > >
> > >
> > > > 5. I suppose that the reason for mergeE and mergeV is to specify what
> > > > element type the first Map argument should be applied to? what about
> > > > mergeVP (i.e. vertex property as it too is an element) ? That's
> tricky
> > > but
> > > > I don't think we should miss that. Perhaps merge() could be a
> "complex
> > > > modulator"?? that's a new concept of course, but you would do
> > > g.V().merge()
> > > > and the label and first Map would fold to VertexStartStep (i.e. V())
> > for
> > > > the lookup and then a MergeStep would follow - thus a "complex
> > modulator"
> > > > as it does more than just change the behavior of the previous step -
> it
> > > > also adds its own. I suppose it could also add has() steps followed
> by
> > > the
> > > > MergeStep and then the has() operations would fold in normally as
> they
> > do
> > > > today. In this way, we can simplify to just one single
> > > > merge(String,Map,Map). ??
> > > >
> > > > I agree that we should also think about how to include properties in
> > this
> > > > merge construct.  The reason I was thinking about mergeV() and
> mergeE()
> > > is
> > > > that it follows the same pattern as the already well understood
> > > > addV()/addE() steps.  I am a bit wary of trying to generalize this
> down
> > > to
> > > > a single merge() step as these sorts of complex overloads make it
> hard
> > to
> > > > figure out which Gremlin step you should use for a particular pattern
> > > (e.g.
> > > > upsert vertex or upsert edge).  One thing that I think customers find
> > > > useful about the addV/addE step is that they are very discoverable,
> the
> > > > name tells me what functionality to expect from that step.
> > > >
> > >
> > > I'm starting to agree with the idea that we have to do something like
> > > mergeV()/E() as it wouldn't quite work as a start step otherwise. We
> > > couldn't do:
> > >
> > > g.merge()
> > >
> > > as we wouldnt know what sort of Element it applied to. If so, I wonder
> if
> > > it would be better to preserve "merge" for my more general use-case and
> > > prefer upsertV()/E()?
> > >
> > > Also, perhaps mergeVP() doesn't need to exist as perhaps it is an
> > uncommon
> > > case (compared to V/E()) and we don't really have addVP() as an
> analogous
> > > step. Perhaps existing coalesce() patterns and/or my more general
> purpose
> > > merge() step would satisfy those situations??
> > >
> >
> > I think that a mergeVP() type step may require a different pattern since
> it
> > would really need to accept a map of values otherwise it would
> essentially
> > perform the same function as property()
> >
> >
> > >
> > >
> > > > 6. One thing neither my approach nor yours seems to do is tell the
> user
> > > if
> > > > they created something or updated something - that's another thing
> I've
> > > > seen users want to have in get-or-create. Here again we go deeper
> into
> > a
> > > > less general step specification as alluded to in 4, but a merge()
> step
> > as
> > > > proposed in 5, might return [Element,boolean] so as to provide an
> > > indicator
> > > > of creation?
> > > >
> > > > Hmm, I'll have to think about that.  How would returning multiple
> > values
> > > > work if we want to chain these together. e.g. Add a vertex between
> two
> > > > edges but make sure the vertices exist?
> > > >
> > >
> >
> > Added a thought on how to accomplish this above that I'd like to get
> > thoughts on.
> >
> >
> > > >
> > > > 7. You were just introducing your ideas here, so perhaps you haven't
> > > gotten
> > > > this far yet, but a shortcoming to doing merge(String,Map,Map) is
> that
> > it
> > > > leaves open no opportunity to stream a List of Maps to a merge() for
> a
> > > form
> > > > of batch loading which is mighty common and one of the variations of
> > the
> > > > coalesce() pattern that I alluded to at the start of all this. I
> think
> > > that
> > > > we would want to be sure that we left open the option to do that
> > somehow.
> > > > 8. If we had a general purpose merge() step I wonder if it makes
> > > developing
> > > > the API as you suggested easier to do?
> > > >
> > > > Hmm, let me think about that one.
> > > >
> > > >
> > > I will add an item 8 to think about which I didn't mention before:
> > >
> > > 8. The signature you suggested for mergeE() is:
> > >
> > > mergeE(String, Map, Map)
> > >      String - The edge label
> > >      Map (first) - The properties to match existing edge on
> > >      Map (second) - Any additional properties to set if a new edge is
> > > created (optional)
> > >
> > > but does that exactly answer the need? Typically the idea is to detect
> an
> > > edge between two vertices not globally in the graph by way of some
> > > properties. This signature doesn't seem to allow for that as it doesn't
> > > allow specification of the vertices to test against. Perhaps the answer
> > is
> > > to use from() and to() modulators?
> > >
> > > g.mergeE('knows', {'weight':0.5}).
> > >   from(V(1)).to(V(2))
> > > g.V().has('person','name','marko').
> > >   mergeE( 'knows', {'weight':0.5}).
> > >     to(V(2))
> > > g.V().has('person','name','marko').
> > >   mergeE( 'self', {'weight':0.5})
> > >
> > > That seems to work?
> > >
> >
> > I think that is an elegant solution to that problem.  I like it and it
> > keeps in line with the way that addE() works.
> >
> >
> > >
> > >
> > >
> > > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <
> spmallette@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > I like our coalesce() pattern but it is verbose and over time it
> has
> > > gone
> > > > > from a simple pattern to one with numerous variations for all
> manner
> > of
> > > > > different sorts of merge-like operations. As such, I do think we
> > should
> > > > > introduce something to cover this pattern.
> > > > >
> > > > > I like that you used the word "merge" in your description of this
> as
> > it
> > > > is
> > > > > the word I've liked using. You might want to give a look at my
> > proposed
> > > > > merge() step from earlier in the year:
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> > > > >
> > > > > I'm just going to dump thoughts as they come regarding what you
> > wrote:
> > > > >
> > > > > 1. How would multi/meta-properties fit into the API you've
> proposed?
> > > > > 2. How would users set the T.id on creation? would that T.id just
> be
> > a
> > > > key
> > > > > in the first Map argument?
> > > > > 3. I do like the general idea of a match on multiple properties for
> > the
> > > > > first argument as a convenience but wonder about the specificity of
> > > this
> > > > > API a bit as it focuses heavily on equality - I suppose that's most
> > > cases
> > > > > for get-or-create, so perhaps that's ok.
> > > > > 4. I think your suggestion points to one of the troubles Gremlin
> has
> > > > which
> > > > > we see with "algorithms" - extending the language with new steps
> that
> > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > shortestPath() step) pollutes the core language a bit, hence my
> > > > > generalization of "merging" in my link above which fits into the
> core
> > > > > Gremlin language style. There is a bigger picture where we are
> > missing
> > > > > something in Gremlin that lets us extend the language in ways that
> > let
> > > us
> > > > > easily introduce new steps that aren't for general purpose. This
> > issue
> > > is
> > > > > discussed in terms of "algorithms" here:
> > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think
> we
> > > > could
> > > > > see how there might be some "mutation" extension steps that would
> > cover
> > > > > your suggested API, plus batch operations, etc. We need a way to
> add
> > > > > "sugar" without it interfering with the consistency of the core.
> > > > Obviously
> > > > > this is a bigger issue but perhaps important to solve to implement
> > > steps
> > > > in
> > > > > the fashion you describe.
> > > > > 5. I suppose that the reason for mergeE and mergeV is to specify
> what
> > > > > element type the first Map argument should be applied to? what
> about
> > > > > mergeVP (i.e. vertex property as it too is an element) ? That's
> > tricky
> > > > but
> > > > > I don't think we should miss that. Perhaps merge() could be a
> > "complex
> > > > > modulator"?? that's a new concept of course, but you would do
> > > > g.V().merge()
> > > > > and the label and first Map would fold to VertexStartStep (i.e.
> V())
> > > for
> > > > > the lookup and then a MergeStep would follow - thus a "complex
> > > modulator"
> > > > > as it does more than just change the behavior of the previous step
> -
> > it
> > > > > also adds its own. I suppose it could also add has() steps followed
> > by
> > > > the
> > > > > MergeStep and then the has() operations would fold in normally as
> > they
> > > do
> > > > > today. In this way, we can simplify to just one single
> > > > > merge(String,Map,Map). ??
> > > > > 6. One thing neither my approach nor yours seems to do is tell the
> > user
> > > > if
> > > > > they created something or updated something - that's another thing
> > I've
> > > > > seen users want to have in get-or-create. Here again we go deeper
> > into
> > > a
> > > > > less general step specification as alluded to in 4, but a merge()
> > step
> > > as
> > > > > proposed in 5, might return [Element,boolean] so as to provide an
> > > > indicator
> > > > > of creation?
> > > > > 7. You were just introducing your ideas here, so perhaps you
> haven't
> > > > gotten
> > > > > this far yet, but a shortcoming to doing merge(String,Map,Map) is
> > that
> > > it
> > > > > leaves open no opportunity to stream a List of Maps to a merge()
> for
> > a
> > > > form
> > > > > of batch loading which is mighty common and one of the variations
> of
> > > the
> > > > > coalesce() pattern that I alluded to at the start of all this. I
> > think
> > > > that
> > > > > we would want to be sure that we left open the option to do that
> > > somehow.
> > > > > 8. If we had a general purpose merge() step I wonder if it makes
> > > > developing
> > > > > the API as you suggested easier to do?
> > > > >
> > > > > I think I'd like to solve the problems you describe in your post as
> > > well
> > > > as
> > > > > the ones in mine. There is some relation there, but gaps as well.
> > With
> > > > more
> > > > > discussion here we can figure something out.
> > > > >
> > > > > Thanks for starting this talk - good one!
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <
> > dave@bechberger.com>
> > > > > wrote:
> > > > >
> > > > > > I've had a few on and off discussions with a few people here, so
> I
> > > > wanted
> > > > > > to send this out to everyone for feedback.
> > > > > >
> > > > > > What are people's thoughts on creating a new set of steps that
> > codify
> > > > > > common Gremlin best practices?
> > > > > >
> > > > > > I think there are several common Gremlin patterns where users
> would
> > > > > benefit
> > > > > > from the additional guidance that these codified steps represent.
> > > The
> > > > > > first one I would recommend though is codifying the element
> > existence
> > > > > > pattern into a single Gremlin step, something like:
> > > > > >
> > > > > > mergeV(String, Map, Map)
> > > > > >      String - The vertex label
> > > > > >      Map (first) - The properties to match existing vertices on
> > > > > >      Map (second) - Any additional properties to set if a new
> > vertex
> > > is
> > > > > > created (optional)
> > > > > > mergeE(String, Map, Map)
> > > > > >      String - The edge label
> > > > > >      Map (first) - The properties to match existing edge on
> > > > > >      Map (second) - Any additional properties to set if a new
> edge
> > is
> > > > > > created (optional)
> > > > > >
> > > > > > In each of these cases these steps would perform the same upsert
> > > > > > functionality as the element existence pattern.
> > > > > >
> > > > > > Example:
> > > > > >
> > > > > > g.V().has('person','name','stephen').
> > > > > >            fold().
> > > > > >            coalesce(unfold(),
> > > > > >                     addV('person').
> > > > > >                       property('name','stephen').
> > > > > >                       property('age',34))
> > > > > >
> > > > > > would become:
> > > > > >
> > > > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > > > > >
> > > > > > I think that this change would be a good addition to the language
> > for
> > > > > > several reasons:
> > > > > >
> > > > > > * This codifies the best practice for a specific action/recipe,
> > which
> > > > > > reduces the chance that someone uses the pattern incorrectly
> > > > > > * Most complex Gremlin traversals are verbose.  Reducing the
> amount
> > > of
> > > > > code
> > > > > > that needs to be written and maintained allows for a better
> > developer
> > > > > > experience.
> > > > > > * It will lower the bar of entry for a developer by making these
> > > > actions
> > > > > > more discoverable.  The more we can help bring these patterns to
> > the
> > > > > > forefront of the language via these pattern/meta steps the more
> we
> > > > guide
> > > > > > users towards writing better Gremlin faster
> > > > > > * This allows DB vendors to optimize for this pattern
> > > > > >
> > > > > > I know that this would likely be the first step in Gremlin that
> > > > codifies
> > > > > a
> > > > > > pattern, so I'd like to get other's thoughts on this?
> > > > > >
> > > > > > Dave
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by Stephen Mallette <sp...@gmail.com>.
It's been a couple weeks since we moved this thread so I went back through
it and refreshed my mind with what's there. Dave, nice job putting all
those examples together. I've recollected them all together and have just
been reviewing them in a more formatted style with proper Groovy syntax
than what can be accomplished in this mailing list. Anyone who cares to
review in that form can do so here (i've pasted its markdown contents at
the bottom of this post as well):

https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31

In this light, I have the following comments and thoughts:

1. I feel more inclined toward saving "merge" for a more generalized step
of that name and have thus renamed the examples to use "upsert".

2. I still don't quite feel comfortable with meta/multi-properties
examples. Perhaps you could explain further as it's possible I'm not
thinking things through properly. For meta-properties the example is:

g.upsertV('person', [(T.id): 1],
                    [(T.id): 1, name:[first: Marko', last: 'R'])

So I see that "name" takes a Map here, but how does Gremlin know if that
means "meta-property" or "Map value" and if the former, then how do you set
the value of "name"?  My question is similar to the multi-property examples
- using this one:

g.upsertV('person', [(T.id): 1],
                    [(T.id): 1,lang: ['java', 'scala'])

How does Gremlin know if the user means that "lang" should be a
"multi-property" with Cardinality.list (or Cardinality.set for that matter)
or a value of Cardinality.single with a List/Array value? I can perhaps
suggest some solutions here but wanted to make sure I wasn't just
misunderstanding something.

3. The "upsert with stream" example is:

g.inject([(id): 1, (label): 'person', name: 'marko'],
         [(id): 2, (label): 'person', name: 'josh']).
  upsertV(values(label), valueMap(id), valueMap())

That would establish an API of upsertV(Traversal, Traversal, Traversal)
which would enable your final thought of your last post with:

g.V().upsertV('person', __.has('person','age',gt(29)).has('state', 'NY'),
                        [active: true])

The open-ended nature of that is neat but comes with some complexity with
steps that traditionally take an "open" Traversal (and this step now has up
to three of them) - perhaps, that's our own fault in some ways. For the
"stream" concept, values()/valueMap() won't immediately work that way as
they only apply to Element...given today's Gremlin semantics, I think we'd
re-write that as:

g.inject([(id): 1, (label): 'person', name: 'marko'],
         [(id): 2, (label): 'person', name: 'josh']).
  upsertV(select(label), select(id), select(id,label,'name'))

Though, even that doesn't quite work because you can't select(T) right now
which means you'd need to go with:

g.inject([id: 1, label: 'person', name: 'marko'],
         [id: 2, label: 'person', name: 'josh']).
  upsertV(select('label'), select('id'), select('id','label','name'))

not terrible in a way and perhaps we could fix select(T) - can't remember
why that isn't allowed except that select() operates over multiple scopes
and perhaps trying T through those scopes causes trouble.

4. Continuing with the use of Traversal as arguments, let's go back to:

g.V().upsertV('person', __.has('person','age',gt(29)).has('state', 'NY'),
                        [active: true])

I originally had it in my mind to prefer this approach over a Map for the
search as it provides a great deal of flexibility and fits the common
filtering patterns that users follow really well. I suppose the question is
what to do in the situation of "create". Would we extract all the strict
equality filter values from the initial has(), thus, in the example,
"state" but not "age", and create a vertex that has "state=NY,
active=true"? That is possible with how things are designed in Gremlin
today I think.

This syntax does create some conflict with the subject of 3 and streams
because the traverser in the streams case is the incoming Map but for this
context it's meant as a filter of V(). Getting too whacky?

5. I like the idea of using a side-effect to capture whether the element
was created or not. that makes sense. I think that can work.

==============================
= gist contents
==============================

# API

```java
upsertV(String label, Map matchOrCreateProperties)
upsertV(String label, Map matchOrCreateProperties, Map additionalProperties)
upsertV(Traversal label, Traversal matchOrCreateProperties)
upsertV(Traversal label, Traversal matchOrCreateProperties, Traversal
additionalProperties)
upsertV(...).
  with(WithOptions.sideEffectLabel, 'a')

upsertE(String label, Map matchOrCreateProperties)
upsertE(Traversal label, Traversal matchOrCreateProperties)
upsertE(String label, Map matchOrCreateProperties, Map additionalProperties)
upsertE(Traversal label, Traversal matchOrCreateProperties, Traversal
additionalProperties)
upsertE(...).
  from(Vertex incidentOut).to(Vertex incidentIn)
  with(WithOptions.sideEffectLabel, 'a')
```

# Examples

## upsert

```groovy
g.upsertV('person', [name: 'marko'],
                    [name: 'marko', age: 29])
```

## upsert with id

```groovy
g.upsertV('person', [(T.id): 1],
                    [(T.id): 1, name: 'Marko'])
```

## upsert with Meta Properties

```groovy
g.upsertV('person', [(T.id): 1],
                    [(T.id): 1, name:[first: Marko', last: 'R'])
```

## upsert with Multi Properties

```groovy
g.upsertV('person', [(T.id): 1],
                    [(T.id): 1,lang: ['java', 'scala'])
```

## upsert with Single Cardinality

```groovy
g.upsertV('person', [(T.id): 1],
                    [(T.id): 1, lang: 'java')
```

## upsert with List Cardinality

```groovy
g.upsertV('person', [(T.id): 1],
                    [(T.id): 1, lang: ['java', 'scala', 'java'])
```

## upsert with Set Cardinality

```groovy
g.upsertV('person', [(T.id): 1],
                    [(T.id): 1, lang: ['java','scala'])
```

## upsert with List Cardinality - and add value to list

```groovy
g.upsertV('person', [(T.id): 1],
                    [(T.id): 1, lang: ['java','scala','java']).
  property(Cardinality.list, 'lang', 'java')
```

## upsert with stream

```groovy
// doesn't work with today's Gremlin semantics
g.inject([(id): 1, (label): 'person', name: 'marko'],
         [(id): 2, (label): 'person', name: 'josh']).
  upsertV(values(label), valueMap(id), valueMap())

// the following would be more in line with what's possible with existing
Gremlin semantics:
g.inject([id: 1, label: 'person', name: 'marko'],
         [id: 2, label: 'person', name: 'josh']).
  upsertV(select('label'), select('id'), select('id','label','name'))
```

## upsert with reporting added or updated side effect

```groovy
g.upsertV('person', [name: 'marko'],
                    [name: 'marko', age: 29]).
    with(WithOptions.sideEffectLabel, 'a').
  project('vertex', 'added').
    by().
    by(cap('a'))
```

## upsert Edge assuming a self-relation of "marko"

```groovy
g.V().has('person','name','marko').
  upsertE( 'self', [weight:0.5])
```

## upsert Edge with incident vertex checks

```groovy
g.upsertE('knows', [weight:0.5]).
   from(V(1)).to(V(2))
g.V().has('person','name','marko').
  upsertE( 'knows', [weight:0.5]).
    to(V(2))
```

## upsert using a Traversal for the match

```groovy
g.V().upsertV('person', __.has('person','age',gt(29)).has('state', 'NY'),
                        [active: true])
```




On Mon, Sep 28, 2020 at 6:49 PM David Bechberger <da...@bechberger.com>
wrote:

> So, I've been doing some additional thinking about the ways that this could
> work based on the comments below and have put my comments inline.
>
> Dave
>
> On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <sp...@gmail.com>
> wrote:
>
> > I added some thoughts inline below:
> >
> > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <da...@bechberger.com>
> > wrote:
> >
> > > Thanks for the detailed comments Stephen.  I have addressed them inline
> > > below.
> > >
> > > I did read the proposal from earlier and I think that we are in close
> > > agreement with what we are trying to accomplish.  I also fully support
> > > Josh's comment on providing a mechanism for submitting a map of
> > properties
> > > as manually unrolling this all right now leads to a lot of potential
> for
> > > error and a long messy traversal.
> > >
> > > I'm looking forward to this discussion on how to merge these two
> > proposals.
> > >
> > >
> > > 1. How would multi/meta-properties fit into the API you've proposed?
> > >
> > > My first thought here is that multi-properties would be represented as
> > > lists in the map, e.g.
> > >
> > > {names: ['Dave', 'David']}
> > >
> > > and meta-properties would be represented as maps in the maps, e.g.
> > >
> > > {name: {first: 'Dave', last: 'Bechberger'}}
> > >
> > > I can't say I've thought through all the implications of this though so
> > it
> > > is an area we would need to explore.
> > >
> >
> > The first implication that comes to mind is that it makes the assumption
> > the user wants multi/meta-properties as opposed to a single cardinality
> of
> > List in the first case and a Map as a property value in the second case.
> I
> > suppose that graphs with a schema could resolve those assumptions but
> > graphs that are schemaless would have a problem. The issue could be
> > resolved by specialized configuration of "g" or per merge() step using a
> > with() modulator I suppose but that goes into a yet another level of
> > implications to consider. I've often wondered if the start point for
> > getting types/schema into TP3 without a full rewrite would be in this
> form
> > where Gremlin would be given hints as to what to expect as to the types
> of
> > data it might encounter while traversing. Anyway, I'd be hesitant to go
> > down paths that don't account for multi/metaproperties well.  They are
> > first class citizens in TP3 (with those hoping for extension of at least
> > multiproperties to edges) and while I find them a constant annoyance for
> so
> > many reasons, we're kinda stuck with them.
> >
>
> I agree we need to account for multi/meta properties as 1st class
> citizens.  This is my current thinking on the syntax for the situations we
> have laid out so far:
>
> *Merge*
> g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age': 29})
>
>
> *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name': 'Marko'})
>
>
> *Merge with Meta Properties* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name':
> {'first': Marko', 'last': 'R'})
>
>
> *Merge with Multi Properties * g.mergeV('name', {T.id: 1}, {T.id: 1,
> 'lang': ['java', 'scala'])
>
>
> *Merge with Single Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1,
> 'lang': 'java')
>
>
> *Merge with List Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang':
> ['java', 'scala', 'java'])
>
>
> *Merge with Set Cardinality * g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang':
> ['java', 'scala'])
>
> Since in a mergeV() scenario we are only ever adding whatever values are
> passed in there would be no need to specify the cardinality of the property
> being added.  If they wanted to add a value to an existing property then
> the current property() method would still be available on the output
> traversal. e.g.
> g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java', 'scala',
> 'java']).property(Cardinality.list, 'lang, 'java')
>
>
> *Merge with stream * g.inject([{id: 1, label: 'person', 'name': 'marko'},
> {id: 2, label: 'person', 'name': 'josh'}]).
>   mergeV(values('label'), valueMap('id'), valueMap())
>
>
> *Merge with reporting added or updated side effect* g.mergeV('name',
> {'name': 'marko'}, {'name': 'marko', 'age': 29}).
>   with(WithOptions.sideEffectLabel, 'a').
>   project('vertex', 'added').
>     by().
>     by(cap('a'))
>
>
>
> >
> >
> > > 2. How would users set the T.id on creation? would that T.id just be a
> > key
> > > in the first Map argument?
> > >
> > > Yes, I was thinking that T.id would be the key name, e.g.:
> > >
> > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
> > >
> >
> > ok - I can't say I see a problem with that atm.
> >
> >
> > > 3. I do like the general idea of a match on multiple properties for the
> > > first argument as a convenience but wonder about the specificity of
> this
> > > API a bit as it focuses heavily on equality - I suppose that's most
> cases
> > > for get-or-create, so perhaps that's ok.
> > >
> > > In most cases I've seen use exact matches on the vertex or edge.  I
> think
> > > it might be best to keep this straightforward as any complex edge cases
> > > still can perform the same functionality using the coalesce() pattern.
> > >
> >
> > I played around with the idea to use modulators to apply additional
> > constraints:
> >
> > g.mergeV('person', {'state':'NY'}, {'active': true}).
> >     by(has('age', gt(29))
> >
> > I suppose that's neat but maybe not...we could easily get the same thing
> > from:
> >
> > g.V().has('person','age',gt(29)).
> >   mergeV('person', {'state':'NY'}, {'active': true}).
> >
> > which is more readable though it does make it harder for providers to
> > optimize upsert operations if they could make use of the has() as part of
> > that. I think I like has() as part of the main query rather than in a
> by()
> > - just thinking out loud.
> >
> >
> Another potential option here would be to allow the second parameter of
> mergeV() accept a traversal.  Not sure how much complexity that would add
> though
>
> g.V().mergeV('person', __.has('person','age',gt(29)).has('state', 'NY'}.,
> {'active': true}).
>
>
> >
> > > 4. I think your suggestion points to one of the troubles Gremlin has
> > which
> > > we see with "algorithms" - extending the language with new steps that
> > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > shortestPath() step) pollutes the core language a bit, hence my
> > > generalization of "merging" in my link above which fits into the core
> > > Gremlin language style. There is a bigger picture where we are missing
> > > something in Gremlin that lets us extend the language in ways that let
> us
> > > easily introduce new steps that aren't for general purpose. This issue
> is
> > > discussed in terms of "algorithms" here:
> > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think we
> > could
> > > see how there might be some "mutation" extension steps that would cover
> > > your suggested API, plus batch operations, etc. We need a way to add
> > > "sugar" without it interfering with the consistency of the core.
> > Obviously
> > > this is a bigger issue but perhaps important to solve to implement
> steps
> > in
> > > the fashion you describe.
> > >
> > > I agree that the "algorithm" steps do seem a bit odd in the core
> language
> > > but I think the need is there.  I'd be interested in furthering this
> > > discussion but I think these "pattern" steps may or may not be the same
> > as
> > > the algorithm steps.  I'll have to think on that.
> >
> >
> >
> > > 5. I suppose that the reason for mergeE and mergeV is to specify what
> > > element type the first Map argument should be applied to? what about
> > > mergeVP (i.e. vertex property as it too is an element) ? That's tricky
> > but
> > > I don't think we should miss that. Perhaps merge() could be a "complex
> > > modulator"?? that's a new concept of course, but you would do
> > g.V().merge()
> > > and the label and first Map would fold to VertexStartStep (i.e. V())
> for
> > > the lookup and then a MergeStep would follow - thus a "complex
> modulator"
> > > as it does more than just change the behavior of the previous step - it
> > > also adds its own. I suppose it could also add has() steps followed by
> > the
> > > MergeStep and then the has() operations would fold in normally as they
> do
> > > today. In this way, we can simplify to just one single
> > > merge(String,Map,Map). ??
> > >
> > > I agree that we should also think about how to include properties in
> this
> > > merge construct.  The reason I was thinking about mergeV() and mergeE()
> > is
> > > that it follows the same pattern as the already well understood
> > > addV()/addE() steps.  I am a bit wary of trying to generalize this down
> > to
> > > a single merge() step as these sorts of complex overloads make it hard
> to
> > > figure out which Gremlin step you should use for a particular pattern
> > (e.g.
> > > upsert vertex or upsert edge).  One thing that I think customers find
> > > useful about the addV/addE step is that they are very discoverable, the
> > > name tells me what functionality to expect from that step.
> > >
> >
> > I'm starting to agree with the idea that we have to do something like
> > mergeV()/E() as it wouldn't quite work as a start step otherwise. We
> > couldn't do:
> >
> > g.merge()
> >
> > as we wouldnt know what sort of Element it applied to. If so, I wonder if
> > it would be better to preserve "merge" for my more general use-case and
> > prefer upsertV()/E()?
> >
> > Also, perhaps mergeVP() doesn't need to exist as perhaps it is an
> uncommon
> > case (compared to V/E()) and we don't really have addVP() as an analogous
> > step. Perhaps existing coalesce() patterns and/or my more general purpose
> > merge() step would satisfy those situations??
> >
>
> I think that a mergeVP() type step may require a different pattern since it
> would really need to accept a map of values otherwise it would essentially
> perform the same function as property()
>
>
> >
> >
> > > 6. One thing neither my approach nor yours seems to do is tell the user
> > if
> > > they created something or updated something - that's another thing I've
> > > seen users want to have in get-or-create. Here again we go deeper into
> a
> > > less general step specification as alluded to in 4, but a merge() step
> as
> > > proposed in 5, might return [Element,boolean] so as to provide an
> > indicator
> > > of creation?
> > >
> > > Hmm, I'll have to think about that.  How would returning multiple
> values
> > > work if we want to chain these together. e.g. Add a vertex between two
> > > edges but make sure the vertices exist?
> > >
> >
>
> Added a thought on how to accomplish this above that I'd like to get
> thoughts on.
>
>
> > >
> > > 7. You were just introducing your ideas here, so perhaps you haven't
> > gotten
> > > this far yet, but a shortcoming to doing merge(String,Map,Map) is that
> it
> > > leaves open no opportunity to stream a List of Maps to a merge() for a
> > form
> > > of batch loading which is mighty common and one of the variations of
> the
> > > coalesce() pattern that I alluded to at the start of all this. I think
> > that
> > > we would want to be sure that we left open the option to do that
> somehow.
> > > 8. If we had a general purpose merge() step I wonder if it makes
> > developing
> > > the API as you suggested easier to do?
> > >
> > > Hmm, let me think about that one.
> > >
> > >
> > I will add an item 8 to think about which I didn't mention before:
> >
> > 8. The signature you suggested for mergeE() is:
> >
> > mergeE(String, Map, Map)
> >      String - The edge label
> >      Map (first) - The properties to match existing edge on
> >      Map (second) - Any additional properties to set if a new edge is
> > created (optional)
> >
> > but does that exactly answer the need? Typically the idea is to detect an
> > edge between two vertices not globally in the graph by way of some
> > properties. This signature doesn't seem to allow for that as it doesn't
> > allow specification of the vertices to test against. Perhaps the answer
> is
> > to use from() and to() modulators?
> >
> > g.mergeE('knows', {'weight':0.5}).
> >   from(V(1)).to(V(2))
> > g.V().has('person','name','marko').
> >   mergeE( 'knows', {'weight':0.5}).
> >     to(V(2))
> > g.V().has('person','name','marko').
> >   mergeE( 'self', {'weight':0.5})
> >
> > That seems to work?
> >
>
> I think that is an elegant solution to that problem.  I like it and it
> keeps in line with the way that addE() works.
>
>
> >
> >
> >
> > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <spmallette@gmail.com
> >
> > > wrote:
> > >
> > > > I like our coalesce() pattern but it is verbose and over time it has
> > gone
> > > > from a simple pattern to one with numerous variations for all manner
> of
> > > > different sorts of merge-like operations. As such, I do think we
> should
> > > > introduce something to cover this pattern.
> > > >
> > > > I like that you used the word "merge" in your description of this as
> it
> > > is
> > > > the word I've liked using. You might want to give a look at my
> proposed
> > > > merge() step from earlier in the year:
> > > >
> > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> > > >
> > > > I'm just going to dump thoughts as they come regarding what you
> wrote:
> > > >
> > > > 1. How would multi/meta-properties fit into the API you've proposed?
> > > > 2. How would users set the T.id on creation? would that T.id just be
> a
> > > key
> > > > in the first Map argument?
> > > > 3. I do like the general idea of a match on multiple properties for
> the
> > > > first argument as a convenience but wonder about the specificity of
> > this
> > > > API a bit as it focuses heavily on equality - I suppose that's most
> > cases
> > > > for get-or-create, so perhaps that's ok.
> > > > 4. I think your suggestion points to one of the troubles Gremlin has
> > > which
> > > > we see with "algorithms" - extending the language with new steps that
> > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > shortestPath() step) pollutes the core language a bit, hence my
> > > > generalization of "merging" in my link above which fits into the core
> > > > Gremlin language style. There is a bigger picture where we are
> missing
> > > > something in Gremlin that lets us extend the language in ways that
> let
> > us
> > > > easily introduce new steps that aren't for general purpose. This
> issue
> > is
> > > > discussed in terms of "algorithms" here:
> > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think we
> > > could
> > > > see how there might be some "mutation" extension steps that would
> cover
> > > > your suggested API, plus batch operations, etc. We need a way to add
> > > > "sugar" without it interfering with the consistency of the core.
> > > Obviously
> > > > this is a bigger issue but perhaps important to solve to implement
> > steps
> > > in
> > > > the fashion you describe.
> > > > 5. I suppose that the reason for mergeE and mergeV is to specify what
> > > > element type the first Map argument should be applied to? what about
> > > > mergeVP (i.e. vertex property as it too is an element) ? That's
> tricky
> > > but
> > > > I don't think we should miss that. Perhaps merge() could be a
> "complex
> > > > modulator"?? that's a new concept of course, but you would do
> > > g.V().merge()
> > > > and the label and first Map would fold to VertexStartStep (i.e. V())
> > for
> > > > the lookup and then a MergeStep would follow - thus a "complex
> > modulator"
> > > > as it does more than just change the behavior of the previous step -
> it
> > > > also adds its own. I suppose it could also add has() steps followed
> by
> > > the
> > > > MergeStep and then the has() operations would fold in normally as
> they
> > do
> > > > today. In this way, we can simplify to just one single
> > > > merge(String,Map,Map). ??
> > > > 6. One thing neither my approach nor yours seems to do is tell the
> user
> > > if
> > > > they created something or updated something - that's another thing
> I've
> > > > seen users want to have in get-or-create. Here again we go deeper
> into
> > a
> > > > less general step specification as alluded to in 4, but a merge()
> step
> > as
> > > > proposed in 5, might return [Element,boolean] so as to provide an
> > > indicator
> > > > of creation?
> > > > 7. You were just introducing your ideas here, so perhaps you haven't
> > > gotten
> > > > this far yet, but a shortcoming to doing merge(String,Map,Map) is
> that
> > it
> > > > leaves open no opportunity to stream a List of Maps to a merge() for
> a
> > > form
> > > > of batch loading which is mighty common and one of the variations of
> > the
> > > > coalesce() pattern that I alluded to at the start of all this. I
> think
> > > that
> > > > we would want to be sure that we left open the option to do that
> > somehow.
> > > > 8. If we had a general purpose merge() step I wonder if it makes
> > > developing
> > > > the API as you suggested easier to do?
> > > >
> > > > I think I'd like to solve the problems you describe in your post as
> > well
> > > as
> > > > the ones in mine. There is some relation there, but gaps as well.
> With
> > > more
> > > > discussion here we can figure something out.
> > > >
> > > > Thanks for starting this talk - good one!
> > > >
> > > >
> > > >
> > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <
> dave@bechberger.com>
> > > > wrote:
> > > >
> > > > > I've had a few on and off discussions with a few people here, so I
> > > wanted
> > > > > to send this out to everyone for feedback.
> > > > >
> > > > > What are people's thoughts on creating a new set of steps that
> codify
> > > > > common Gremlin best practices?
> > > > >
> > > > > I think there are several common Gremlin patterns where users would
> > > > benefit
> > > > > from the additional guidance that these codified steps represent.
> > The
> > > > > first one I would recommend though is codifying the element
> existence
> > > > > pattern into a single Gremlin step, something like:
> > > > >
> > > > > mergeV(String, Map, Map)
> > > > >      String - The vertex label
> > > > >      Map (first) - The properties to match existing vertices on
> > > > >      Map (second) - Any additional properties to set if a new
> vertex
> > is
> > > > > created (optional)
> > > > > mergeE(String, Map, Map)
> > > > >      String - The edge label
> > > > >      Map (first) - The properties to match existing edge on
> > > > >      Map (second) - Any additional properties to set if a new edge
> is
> > > > > created (optional)
> > > > >
> > > > > In each of these cases these steps would perform the same upsert
> > > > > functionality as the element existence pattern.
> > > > >
> > > > > Example:
> > > > >
> > > > > g.V().has('person','name','stephen').
> > > > >            fold().
> > > > >            coalesce(unfold(),
> > > > >                     addV('person').
> > > > >                       property('name','stephen').
> > > > >                       property('age',34))
> > > > >
> > > > > would become:
> > > > >
> > > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > > > >
> > > > > I think that this change would be a good addition to the language
> for
> > > > > several reasons:
> > > > >
> > > > > * This codifies the best practice for a specific action/recipe,
> which
> > > > > reduces the chance that someone uses the pattern incorrectly
> > > > > * Most complex Gremlin traversals are verbose.  Reducing the amount
> > of
> > > > code
> > > > > that needs to be written and maintained allows for a better
> developer
> > > > > experience.
> > > > > * It will lower the bar of entry for a developer by making these
> > > actions
> > > > > more discoverable.  The more we can help bring these patterns to
> the
> > > > > forefront of the language via these pattern/meta steps the more we
> > > guide
> > > > > users towards writing better Gremlin faster
> > > > > * This allows DB vendors to optimize for this pattern
> > > > >
> > > > > I know that this would likely be the first step in Gremlin that
> > > codifies
> > > > a
> > > > > pattern, so I'd like to get other's thoughts on this?
> > > > >
> > > > > Dave
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by David Bechberger <da...@bechberger.com>.
So, I've been doing some additional thinking about the ways that this could
work based on the comments below and have put my comments inline.

Dave

On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <sp...@gmail.com>
wrote:

> I added some thoughts inline below:
>
> On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <da...@bechberger.com>
> wrote:
>
> > Thanks for the detailed comments Stephen.  I have addressed them inline
> > below.
> >
> > I did read the proposal from earlier and I think that we are in close
> > agreement with what we are trying to accomplish.  I also fully support
> > Josh's comment on providing a mechanism for submitting a map of
> properties
> > as manually unrolling this all right now leads to a lot of potential for
> > error and a long messy traversal.
> >
> > I'm looking forward to this discussion on how to merge these two
> proposals.
> >
> >
> > 1. How would multi/meta-properties fit into the API you've proposed?
> >
> > My first thought here is that multi-properties would be represented as
> > lists in the map, e.g.
> >
> > {names: ['Dave', 'David']}
> >
> > and meta-properties would be represented as maps in the maps, e.g.
> >
> > {name: {first: 'Dave', last: 'Bechberger'}}
> >
> > I can't say I've thought through all the implications of this though so
> it
> > is an area we would need to explore.
> >
>
> The first implication that comes to mind is that it makes the assumption
> the user wants multi/meta-properties as opposed to a single cardinality of
> List in the first case and a Map as a property value in the second case. I
> suppose that graphs with a schema could resolve those assumptions but
> graphs that are schemaless would have a problem. The issue could be
> resolved by specialized configuration of "g" or per merge() step using a
> with() modulator I suppose but that goes into a yet another level of
> implications to consider. I've often wondered if the start point for
> getting types/schema into TP3 without a full rewrite would be in this form
> where Gremlin would be given hints as to what to expect as to the types of
> data it might encounter while traversing. Anyway, I'd be hesitant to go
> down paths that don't account for multi/metaproperties well.  They are
> first class citizens in TP3 (with those hoping for extension of at least
> multiproperties to edges) and while I find them a constant annoyance for so
> many reasons, we're kinda stuck with them.
>

I agree we need to account for multi/meta properties as 1st class
citizens.  This is my current thinking on the syntax for the situations we
have laid out so far:

*Merge*
g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age': 29})


*Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name': 'Marko'})


*Merge with Meta Properties* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name':
{'first': Marko', 'last': 'R'})


*Merge with Multi Properties * g.mergeV('name', {T.id: 1}, {T.id: 1,
'lang': ['java', 'scala'])


*Merge with Single Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1,
'lang': 'java')


*Merge with List Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang':
['java', 'scala', 'java'])


*Merge with Set Cardinality * g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang':
['java', 'scala'])

Since in a mergeV() scenario we are only ever adding whatever values are
passed in there would be no need to specify the cardinality of the property
being added.  If they wanted to add a value to an existing property then
the current property() method would still be available on the output
traversal. e.g.
g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java', 'scala',
'java']).property(Cardinality.list, 'lang, 'java')


*Merge with stream * g.inject([{id: 1, label: 'person', 'name': 'marko'},
{id: 2, label: 'person', 'name': 'josh'}]).
  mergeV(values('label'), valueMap('id'), valueMap())


*Merge with reporting added or updated side effect* g.mergeV('name',
{'name': 'marko'}, {'name': 'marko', 'age': 29}).
  with(WithOptions.sideEffectLabel, 'a').
  project('vertex', 'added').
    by().
    by(cap('a'))



>
>
> > 2. How would users set the T.id on creation? would that T.id just be a
> key
> > in the first Map argument?
> >
> > Yes, I was thinking that T.id would be the key name, e.g.:
> >
> > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
> >
>
> ok - I can't say I see a problem with that atm.
>
>
> > 3. I do like the general idea of a match on multiple properties for the
> > first argument as a convenience but wonder about the specificity of this
> > API a bit as it focuses heavily on equality - I suppose that's most cases
> > for get-or-create, so perhaps that's ok.
> >
> > In most cases I've seen use exact matches on the vertex or edge.  I think
> > it might be best to keep this straightforward as any complex edge cases
> > still can perform the same functionality using the coalesce() pattern.
> >
>
> I played around with the idea to use modulators to apply additional
> constraints:
>
> g.mergeV('person', {'state':'NY'}, {'active': true}).
>     by(has('age', gt(29))
>
> I suppose that's neat but maybe not...we could easily get the same thing
> from:
>
> g.V().has('person','age',gt(29)).
>   mergeV('person', {'state':'NY'}, {'active': true}).
>
> which is more readable though it does make it harder for providers to
> optimize upsert operations if they could make use of the has() as part of
> that. I think I like has() as part of the main query rather than in a by()
> - just thinking out loud.
>
>
Another potential option here would be to allow the second parameter of
mergeV() accept a traversal.  Not sure how much complexity that would add
though

g.V().mergeV('person', __.has('person','age',gt(29)).has('state', 'NY'}.,
{'active': true}).


>
> > 4. I think your suggestion points to one of the troubles Gremlin has
> which
> > we see with "algorithms" - extending the language with new steps that
> > provides a form of "sugar" (e.g. in algorithms we end up with
> > shortestPath() step) pollutes the core language a bit, hence my
> > generalization of "merging" in my link above which fits into the core
> > Gremlin language style. There is a bigger picture where we are missing
> > something in Gremlin that lets us extend the language in ways that let us
> > easily introduce new steps that aren't for general purpose. This issue is
> > discussed in terms of "algorithms" here:
> > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think we
> could
> > see how there might be some "mutation" extension steps that would cover
> > your suggested API, plus batch operations, etc. We need a way to add
> > "sugar" without it interfering with the consistency of the core.
> Obviously
> > this is a bigger issue but perhaps important to solve to implement steps
> in
> > the fashion you describe.
> >
> > I agree that the "algorithm" steps do seem a bit odd in the core language
> > but I think the need is there.  I'd be interested in furthering this
> > discussion but I think these "pattern" steps may or may not be the same
> as
> > the algorithm steps.  I'll have to think on that.
>
>
>
> > 5. I suppose that the reason for mergeE and mergeV is to specify what
> > element type the first Map argument should be applied to? what about
> > mergeVP (i.e. vertex property as it too is an element) ? That's tricky
> but
> > I don't think we should miss that. Perhaps merge() could be a "complex
> > modulator"?? that's a new concept of course, but you would do
> g.V().merge()
> > and the label and first Map would fold to VertexStartStep (i.e. V()) for
> > the lookup and then a MergeStep would follow - thus a "complex modulator"
> > as it does more than just change the behavior of the previous step - it
> > also adds its own. I suppose it could also add has() steps followed by
> the
> > MergeStep and then the has() operations would fold in normally as they do
> > today. In this way, we can simplify to just one single
> > merge(String,Map,Map). ??
> >
> > I agree that we should also think about how to include properties in this
> > merge construct.  The reason I was thinking about mergeV() and mergeE()
> is
> > that it follows the same pattern as the already well understood
> > addV()/addE() steps.  I am a bit wary of trying to generalize this down
> to
> > a single merge() step as these sorts of complex overloads make it hard to
> > figure out which Gremlin step you should use for a particular pattern
> (e.g.
> > upsert vertex or upsert edge).  One thing that I think customers find
> > useful about the addV/addE step is that they are very discoverable, the
> > name tells me what functionality to expect from that step.
> >
>
> I'm starting to agree with the idea that we have to do something like
> mergeV()/E() as it wouldn't quite work as a start step otherwise. We
> couldn't do:
>
> g.merge()
>
> as we wouldnt know what sort of Element it applied to. If so, I wonder if
> it would be better to preserve "merge" for my more general use-case and
> prefer upsertV()/E()?
>
> Also, perhaps mergeVP() doesn't need to exist as perhaps it is an uncommon
> case (compared to V/E()) and we don't really have addVP() as an analogous
> step. Perhaps existing coalesce() patterns and/or my more general purpose
> merge() step would satisfy those situations??
>

I think that a mergeVP() type step may require a different pattern since it
would really need to accept a map of values otherwise it would essentially
perform the same function as property()


>
>
> > 6. One thing neither my approach nor yours seems to do is tell the user
> if
> > they created something or updated something - that's another thing I've
> > seen users want to have in get-or-create. Here again we go deeper into a
> > less general step specification as alluded to in 4, but a merge() step as
> > proposed in 5, might return [Element,boolean] so as to provide an
> indicator
> > of creation?
> >
> > Hmm, I'll have to think about that.  How would returning multiple values
> > work if we want to chain these together. e.g. Add a vertex between two
> > edges but make sure the vertices exist?
> >
>

Added a thought on how to accomplish this above that I'd like to get
thoughts on.


> >
> > 7. You were just introducing your ideas here, so perhaps you haven't
> gotten
> > this far yet, but a shortcoming to doing merge(String,Map,Map) is that it
> > leaves open no opportunity to stream a List of Maps to a merge() for a
> form
> > of batch loading which is mighty common and one of the variations of the
> > coalesce() pattern that I alluded to at the start of all this. I think
> that
> > we would want to be sure that we left open the option to do that somehow.
> > 8. If we had a general purpose merge() step I wonder if it makes
> developing
> > the API as you suggested easier to do?
> >
> > Hmm, let me think about that one.
> >
> >
> I will add an item 8 to think about which I didn't mention before:
>
> 8. The signature you suggested for mergeE() is:
>
> mergeE(String, Map, Map)
>      String - The edge label
>      Map (first) - The properties to match existing edge on
>      Map (second) - Any additional properties to set if a new edge is
> created (optional)
>
> but does that exactly answer the need? Typically the idea is to detect an
> edge between two vertices not globally in the graph by way of some
> properties. This signature doesn't seem to allow for that as it doesn't
> allow specification of the vertices to test against. Perhaps the answer is
> to use from() and to() modulators?
>
> g.mergeE('knows', {'weight':0.5}).
>   from(V(1)).to(V(2))
> g.V().has('person','name','marko').
>   mergeE( 'knows', {'weight':0.5}).
>     to(V(2))
> g.V().has('person','name','marko').
>   mergeE( 'self', {'weight':0.5})
>
> That seems to work?
>

I think that is an elegant solution to that problem.  I like it and it
keeps in line with the way that addE() works.


>
>
>
> > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <sp...@gmail.com>
> > wrote:
> >
> > > I like our coalesce() pattern but it is verbose and over time it has
> gone
> > > from a simple pattern to one with numerous variations for all manner of
> > > different sorts of merge-like operations. As such, I do think we should
> > > introduce something to cover this pattern.
> > >
> > > I like that you used the word "merge" in your description of this as it
> > is
> > > the word I've liked using. You might want to give a look at my proposed
> > > merge() step from earlier in the year:
> > >
> > >
> > >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> > >
> > > I'm just going to dump thoughts as they come regarding what you wrote:
> > >
> > > 1. How would multi/meta-properties fit into the API you've proposed?
> > > 2. How would users set the T.id on creation? would that T.id just be a
> > key
> > > in the first Map argument?
> > > 3. I do like the general idea of a match on multiple properties for the
> > > first argument as a convenience but wonder about the specificity of
> this
> > > API a bit as it focuses heavily on equality - I suppose that's most
> cases
> > > for get-or-create, so perhaps that's ok.
> > > 4. I think your suggestion points to one of the troubles Gremlin has
> > which
> > > we see with "algorithms" - extending the language with new steps that
> > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > shortestPath() step) pollutes the core language a bit, hence my
> > > generalization of "merging" in my link above which fits into the core
> > > Gremlin language style. There is a bigger picture where we are missing
> > > something in Gremlin that lets us extend the language in ways that let
> us
> > > easily introduce new steps that aren't for general purpose. This issue
> is
> > > discussed in terms of "algorithms" here:
> > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think we
> > could
> > > see how there might be some "mutation" extension steps that would cover
> > > your suggested API, plus batch operations, etc. We need a way to add
> > > "sugar" without it interfering with the consistency of the core.
> > Obviously
> > > this is a bigger issue but perhaps important to solve to implement
> steps
> > in
> > > the fashion you describe.
> > > 5. I suppose that the reason for mergeE and mergeV is to specify what
> > > element type the first Map argument should be applied to? what about
> > > mergeVP (i.e. vertex property as it too is an element) ? That's tricky
> > but
> > > I don't think we should miss that. Perhaps merge() could be a "complex
> > > modulator"?? that's a new concept of course, but you would do
> > g.V().merge()
> > > and the label and first Map would fold to VertexStartStep (i.e. V())
> for
> > > the lookup and then a MergeStep would follow - thus a "complex
> modulator"
> > > as it does more than just change the behavior of the previous step - it
> > > also adds its own. I suppose it could also add has() steps followed by
> > the
> > > MergeStep and then the has() operations would fold in normally as they
> do
> > > today. In this way, we can simplify to just one single
> > > merge(String,Map,Map). ??
> > > 6. One thing neither my approach nor yours seems to do is tell the user
> > if
> > > they created something or updated something - that's another thing I've
> > > seen users want to have in get-or-create. Here again we go deeper into
> a
> > > less general step specification as alluded to in 4, but a merge() step
> as
> > > proposed in 5, might return [Element,boolean] so as to provide an
> > indicator
> > > of creation?
> > > 7. You were just introducing your ideas here, so perhaps you haven't
> > gotten
> > > this far yet, but a shortcoming to doing merge(String,Map,Map) is that
> it
> > > leaves open no opportunity to stream a List of Maps to a merge() for a
> > form
> > > of batch loading which is mighty common and one of the variations of
> the
> > > coalesce() pattern that I alluded to at the start of all this. I think
> > that
> > > we would want to be sure that we left open the option to do that
> somehow.
> > > 8. If we had a general purpose merge() step I wonder if it makes
> > developing
> > > the API as you suggested easier to do?
> > >
> > > I think I'd like to solve the problems you describe in your post as
> well
> > as
> > > the ones in mine. There is some relation there, but gaps as well. With
> > more
> > > discussion here we can figure something out.
> > >
> > > Thanks for starting this talk - good one!
> > >
> > >
> > >
> > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <da...@bechberger.com>
> > > wrote:
> > >
> > > > I've had a few on and off discussions with a few people here, so I
> > wanted
> > > > to send this out to everyone for feedback.
> > > >
> > > > What are people's thoughts on creating a new set of steps that codify
> > > > common Gremlin best practices?
> > > >
> > > > I think there are several common Gremlin patterns where users would
> > > benefit
> > > > from the additional guidance that these codified steps represent.
> The
> > > > first one I would recommend though is codifying the element existence
> > > > pattern into a single Gremlin step, something like:
> > > >
> > > > mergeV(String, Map, Map)
> > > >      String - The vertex label
> > > >      Map (first) - The properties to match existing vertices on
> > > >      Map (second) - Any additional properties to set if a new vertex
> is
> > > > created (optional)
> > > > mergeE(String, Map, Map)
> > > >      String - The edge label
> > > >      Map (first) - The properties to match existing edge on
> > > >      Map (second) - Any additional properties to set if a new edge is
> > > > created (optional)
> > > >
> > > > In each of these cases these steps would perform the same upsert
> > > > functionality as the element existence pattern.
> > > >
> > > > Example:
> > > >
> > > > g.V().has('person','name','stephen').
> > > >            fold().
> > > >            coalesce(unfold(),
> > > >                     addV('person').
> > > >                       property('name','stephen').
> > > >                       property('age',34))
> > > >
> > > > would become:
> > > >
> > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > > >
> > > > I think that this change would be a good addition to the language for
> > > > several reasons:
> > > >
> > > > * This codifies the best practice for a specific action/recipe, which
> > > > reduces the chance that someone uses the pattern incorrectly
> > > > * Most complex Gremlin traversals are verbose.  Reducing the amount
> of
> > > code
> > > > that needs to be written and maintained allows for a better developer
> > > > experience.
> > > > * It will lower the bar of entry for a developer by making these
> > actions
> > > > more discoverable.  The more we can help bring these patterns to the
> > > > forefront of the language via these pattern/meta steps the more we
> > guide
> > > > users towards writing better Gremlin faster
> > > > * This allows DB vendors to optimize for this pattern
> > > >
> > > > I know that this would likely be the first step in Gremlin that
> > codifies
> > > a
> > > > pattern, so I'd like to get other's thoughts on this?
> > > >
> > > > Dave
> > > >
> > >
> >
>

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by Stephen Mallette <sp...@gmail.com>.
I added some thoughts inline below:

On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <da...@bechberger.com>
wrote:

> Thanks for the detailed comments Stephen.  I have addressed them inline
> below.
>
> I did read the proposal from earlier and I think that we are in close
> agreement with what we are trying to accomplish.  I also fully support
> Josh's comment on providing a mechanism for submitting a map of properties
> as manually unrolling this all right now leads to a lot of potential for
> error and a long messy traversal.
>
> I'm looking forward to this discussion on how to merge these two proposals.
>
>
> 1. How would multi/meta-properties fit into the API you've proposed?
>
> My first thought here is that multi-properties would be represented as
> lists in the map, e.g.
>
> {names: ['Dave', 'David']}
>
> and meta-properties would be represented as maps in the maps, e.g.
>
> {name: {first: 'Dave', last: 'Bechberger'}}
>
> I can't say I've thought through all the implications of this though so it
> is an area we would need to explore.
>

The first implication that comes to mind is that it makes the assumption
the user wants multi/meta-properties as opposed to a single cardinality of
List in the first case and a Map as a property value in the second case. I
suppose that graphs with a schema could resolve those assumptions but
graphs that are schemaless would have a problem. The issue could be
resolved by specialized configuration of "g" or per merge() step using a
with() modulator I suppose but that goes into a yet another level of
implications to consider. I've often wondered if the start point for
getting types/schema into TP3 without a full rewrite would be in this form
where Gremlin would be given hints as to what to expect as to the types of
data it might encounter while traversing. Anyway, I'd be hesitant to go
down paths that don't account for multi/metaproperties well.  They are
first class citizens in TP3 (with those hoping for extension of at least
multiproperties to edges) and while I find them a constant annoyance for so
many reasons, we're kinda stuck with them.


> 2. How would users set the T.id on creation? would that T.id just be a key
> in the first Map argument?
>
> Yes, I was thinking that T.id would be the key name, e.g.:
>
> g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
>

ok - I can't say I see a problem with that atm.


> 3. I do like the general idea of a match on multiple properties for the
> first argument as a convenience but wonder about the specificity of this
> API a bit as it focuses heavily on equality - I suppose that's most cases
> for get-or-create, so perhaps that's ok.
>
> In most cases I've seen use exact matches on the vertex or edge.  I think
> it might be best to keep this straightforward as any complex edge cases
> still can perform the same functionality using the coalesce() pattern.
>

I played around with the idea to use modulators to apply additional
constraints:

g.mergeV('person', {'state':'NY'}, {'active': true}).
    by(has('age', gt(29))

I suppose that's neat but maybe not...we could easily get the same thing
from:

g.V().has('person','age',gt(29)).
  mergeV('person', {'state':'NY'}, {'active': true}).

which is more readable though it does make it harder for providers to
optimize upsert operations if they could make use of the has() as part of
that. I think I like has() as part of the main query rather than in a by()
- just thinking out loud.


> 4. I think your suggestion points to one of the troubles Gremlin has which
> we see with "algorithms" - extending the language with new steps that
> provides a form of "sugar" (e.g. in algorithms we end up with
> shortestPath() step) pollutes the core language a bit, hence my
> generalization of "merging" in my link above which fits into the core
> Gremlin language style. There is a bigger picture where we are missing
> something in Gremlin that lets us extend the language in ways that let us
> easily introduce new steps that aren't for general purpose. This issue is
> discussed in terms of "algorithms" here:
> https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think we could
> see how there might be some "mutation" extension steps that would cover
> your suggested API, plus batch operations, etc. We need a way to add
> "sugar" without it interfering with the consistency of the core. Obviously
> this is a bigger issue but perhaps important to solve to implement steps in
> the fashion you describe.
>
> I agree that the "algorithm" steps do seem a bit odd in the core language
> but I think the need is there.  I'd be interested in furthering this
> discussion but I think these "pattern" steps may or may not be the same as
> the algorithm steps.  I'll have to think on that.



> 5. I suppose that the reason for mergeE and mergeV is to specify what
> element type the first Map argument should be applied to? what about
> mergeVP (i.e. vertex property as it too is an element) ? That's tricky but
> I don't think we should miss that. Perhaps merge() could be a "complex
> modulator"?? that's a new concept of course, but you would do g.V().merge()
> and the label and first Map would fold to VertexStartStep (i.e. V()) for
> the lookup and then a MergeStep would follow - thus a "complex modulator"
> as it does more than just change the behavior of the previous step - it
> also adds its own. I suppose it could also add has() steps followed by the
> MergeStep and then the has() operations would fold in normally as they do
> today. In this way, we can simplify to just one single
> merge(String,Map,Map). ??
>
> I agree that we should also think about how to include properties in this
> merge construct.  The reason I was thinking about mergeV() and mergeE() is
> that it follows the same pattern as the already well understood
> addV()/addE() steps.  I am a bit wary of trying to generalize this down to
> a single merge() step as these sorts of complex overloads make it hard to
> figure out which Gremlin step you should use for a particular pattern (e.g.
> upsert vertex or upsert edge).  One thing that I think customers find
> useful about the addV/addE step is that they are very discoverable, the
> name tells me what functionality to expect from that step.
>

I'm starting to agree with the idea that we have to do something like
mergeV()/E() as it wouldn't quite work as a start step otherwise. We
couldn't do:

g.merge()

as we wouldnt know what sort of Element it applied to. If so, I wonder if
it would be better to preserve "merge" for my more general use-case and
prefer upsertV()/E()?

Also, perhaps mergeVP() doesn't need to exist as perhaps it is an uncommon
case (compared to V/E()) and we don't really have addVP() as an analogous
step. Perhaps existing coalesce() patterns and/or my more general purpose
merge() step would satisfy those situations??


> 6. One thing neither my approach nor yours seems to do is tell the user if
> they created something or updated something - that's another thing I've
> seen users want to have in get-or-create. Here again we go deeper into a
> less general step specification as alluded to in 4, but a merge() step as
> proposed in 5, might return [Element,boolean] so as to provide an indicator
> of creation?
>
> Hmm, I'll have to think about that.  How would returning multiple values
> work if we want to chain these together. e.g. Add a vertex between two
> edges but make sure the vertices exist?
>
>
> 7. You were just introducing your ideas here, so perhaps you haven't gotten
> this far yet, but a shortcoming to doing merge(String,Map,Map) is that it
> leaves open no opportunity to stream a List of Maps to a merge() for a form
> of batch loading which is mighty common and one of the variations of the
> coalesce() pattern that I alluded to at the start of all this. I think that
> we would want to be sure that we left open the option to do that somehow.
> 8. If we had a general purpose merge() step I wonder if it makes developing
> the API as you suggested easier to do?
>
> Hmm, let me think about that one.
>
>
I will add an item 8 to think about which I didn't mention before:

8. The signature you suggested for mergeE() is:

mergeE(String, Map, Map)
     String - The edge label
     Map (first) - The properties to match existing edge on
     Map (second) - Any additional properties to set if a new edge is
created (optional)

but does that exactly answer the need? Typically the idea is to detect an
edge between two vertices not globally in the graph by way of some
properties. This signature doesn't seem to allow for that as it doesn't
allow specification of the vertices to test against. Perhaps the answer is
to use from() and to() modulators?

g.mergeE('knows', {'weight':0.5}).
  from(V(1)).to(V(2))
g.V().has('person','name','marko').
  mergeE( 'knows', {'weight':0.5}).
    to(V(2))
g.V().has('person','name','marko').
  mergeE( 'self', {'weight':0.5})

That seems to work?



> On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <sp...@gmail.com>
> wrote:
>
> > I like our coalesce() pattern but it is verbose and over time it has gone
> > from a simple pattern to one with numerous variations for all manner of
> > different sorts of merge-like operations. As such, I do think we should
> > introduce something to cover this pattern.
> >
> > I like that you used the word "merge" in your description of this as it
> is
> > the word I've liked using. You might want to give a look at my proposed
> > merge() step from earlier in the year:
> >
> >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> >
> > I'm just going to dump thoughts as they come regarding what you wrote:
> >
> > 1. How would multi/meta-properties fit into the API you've proposed?
> > 2. How would users set the T.id on creation? would that T.id just be a
> key
> > in the first Map argument?
> > 3. I do like the general idea of a match on multiple properties for the
> > first argument as a convenience but wonder about the specificity of this
> > API a bit as it focuses heavily on equality - I suppose that's most cases
> > for get-or-create, so perhaps that's ok.
> > 4. I think your suggestion points to one of the troubles Gremlin has
> which
> > we see with "algorithms" - extending the language with new steps that
> > provides a form of "sugar" (e.g. in algorithms we end up with
> > shortestPath() step) pollutes the core language a bit, hence my
> > generalization of "merging" in my link above which fits into the core
> > Gremlin language style. There is a bigger picture where we are missing
> > something in Gremlin that lets us extend the language in ways that let us
> > easily introduce new steps that aren't for general purpose. This issue is
> > discussed in terms of "algorithms" here:
> > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think we
> could
> > see how there might be some "mutation" extension steps that would cover
> > your suggested API, plus batch operations, etc. We need a way to add
> > "sugar" without it interfering with the consistency of the core.
> Obviously
> > this is a bigger issue but perhaps important to solve to implement steps
> in
> > the fashion you describe.
> > 5. I suppose that the reason for mergeE and mergeV is to specify what
> > element type the first Map argument should be applied to? what about
> > mergeVP (i.e. vertex property as it too is an element) ? That's tricky
> but
> > I don't think we should miss that. Perhaps merge() could be a "complex
> > modulator"?? that's a new concept of course, but you would do
> g.V().merge()
> > and the label and first Map would fold to VertexStartStep (i.e. V()) for
> > the lookup and then a MergeStep would follow - thus a "complex modulator"
> > as it does more than just change the behavior of the previous step - it
> > also adds its own. I suppose it could also add has() steps followed by
> the
> > MergeStep and then the has() operations would fold in normally as they do
> > today. In this way, we can simplify to just one single
> > merge(String,Map,Map). ??
> > 6. One thing neither my approach nor yours seems to do is tell the user
> if
> > they created something or updated something - that's another thing I've
> > seen users want to have in get-or-create. Here again we go deeper into a
> > less general step specification as alluded to in 4, but a merge() step as
> > proposed in 5, might return [Element,boolean] so as to provide an
> indicator
> > of creation?
> > 7. You were just introducing your ideas here, so perhaps you haven't
> gotten
> > this far yet, but a shortcoming to doing merge(String,Map,Map) is that it
> > leaves open no opportunity to stream a List of Maps to a merge() for a
> form
> > of batch loading which is mighty common and one of the variations of the
> > coalesce() pattern that I alluded to at the start of all this. I think
> that
> > we would want to be sure that we left open the option to do that somehow.
> > 8. If we had a general purpose merge() step I wonder if it makes
> developing
> > the API as you suggested easier to do?
> >
> > I think I'd like to solve the problems you describe in your post as well
> as
> > the ones in mine. There is some relation there, but gaps as well. With
> more
> > discussion here we can figure something out.
> >
> > Thanks for starting this talk - good one!
> >
> >
> >
> > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <da...@bechberger.com>
> > wrote:
> >
> > > I've had a few on and off discussions with a few people here, so I
> wanted
> > > to send this out to everyone for feedback.
> > >
> > > What are people's thoughts on creating a new set of steps that codify
> > > common Gremlin best practices?
> > >
> > > I think there are several common Gremlin patterns where users would
> > benefit
> > > from the additional guidance that these codified steps represent.  The
> > > first one I would recommend though is codifying the element existence
> > > pattern into a single Gremlin step, something like:
> > >
> > > mergeV(String, Map, Map)
> > >      String - The vertex label
> > >      Map (first) - The properties to match existing vertices on
> > >      Map (second) - Any additional properties to set if a new vertex is
> > > created (optional)
> > > mergeE(String, Map, Map)
> > >      String - The edge label
> > >      Map (first) - The properties to match existing edge on
> > >      Map (second) - Any additional properties to set if a new edge is
> > > created (optional)
> > >
> > > In each of these cases these steps would perform the same upsert
> > > functionality as the element existence pattern.
> > >
> > > Example:
> > >
> > > g.V().has('person','name','stephen').
> > >            fold().
> > >            coalesce(unfold(),
> > >                     addV('person').
> > >                       property('name','stephen').
> > >                       property('age',34))
> > >
> > > would become:
> > >
> > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > >
> > > I think that this change would be a good addition to the language for
> > > several reasons:
> > >
> > > * This codifies the best practice for a specific action/recipe, which
> > > reduces the chance that someone uses the pattern incorrectly
> > > * Most complex Gremlin traversals are verbose.  Reducing the amount of
> > code
> > > that needs to be written and maintained allows for a better developer
> > > experience.
> > > * It will lower the bar of entry for a developer by making these
> actions
> > > more discoverable.  The more we can help bring these patterns to the
> > > forefront of the language via these pattern/meta steps the more we
> guide
> > > users towards writing better Gremlin faster
> > > * This allows DB vendors to optimize for this pattern
> > >
> > > I know that this would likely be the first step in Gremlin that
> codifies
> > a
> > > pattern, so I'd like to get other's thoughts on this?
> > >
> > > Dave
> > >
> >
>

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by David Bechberger <da...@bechberger.com>.
Thanks for the detailed comments Stephen.  I have addressed them inline
below.

I did read the proposal from earlier and I think that we are in close
agreement with what we are trying to accomplish.  I also fully support
Josh's comment on providing a mechanism for submitting a map of properties
as manually unrolling this all right now leads to a lot of potential for
error and a long messy traversal.

I'm looking forward to this discussion on how to merge these two proposals.


1. How would multi/meta-properties fit into the API you've proposed?

My first thought here is that multi-properties would be represented as
lists in the map, e.g.

{names: ['Dave', 'David']}

and meta-properties would be represented as maps in the maps, e.g.

{name: {first: 'Dave', last: 'Bechberger'}}

I can't say I've thought through all the implications of this though so it
is an area we would need to explore.

2. How would users set the T.id on creation? would that T.id just be a key
in the first Map argument?

Yes, I was thinking that T.id would be the key name, e.g.:

g.mergeV('name', {T.id: 1}, {'name': 'Marko'})

3. I do like the general idea of a match on multiple properties for the
first argument as a convenience but wonder about the specificity of this
API a bit as it focuses heavily on equality - I suppose that's most cases
for get-or-create, so perhaps that's ok.

In most cases I've seen use exact matches on the vertex or edge.  I think
it might be best to keep this straightforward as any complex edge cases
still can perform the same functionality using the coalesce() pattern.


4. I think your suggestion points to one of the troubles Gremlin has which
we see with "algorithms" - extending the language with new steps that
provides a form of "sugar" (e.g. in algorithms we end up with
shortestPath() step) pollutes the core language a bit, hence my
generalization of "merging" in my link above which fits into the core
Gremlin language style. There is a bigger picture where we are missing
something in Gremlin that lets us extend the language in ways that let us
easily introduce new steps that aren't for general purpose. This issue is
discussed in terms of "algorithms" here:
https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think we could
see how there might be some "mutation" extension steps that would cover
your suggested API, plus batch operations, etc. We need a way to add
"sugar" without it interfering with the consistency of the core. Obviously
this is a bigger issue but perhaps important to solve to implement steps in
the fashion you describe.

I agree that the "algorithm" steps do seem a bit odd in the core language
but I think the need is there.  I'd be interested in furthering this
discussion but I think these "pattern" steps may or may not be the same as
the algorithm steps.  I'll have to think on that.

5. I suppose that the reason for mergeE and mergeV is to specify what
element type the first Map argument should be applied to? what about
mergeVP (i.e. vertex property as it too is an element) ? That's tricky but
I don't think we should miss that. Perhaps merge() could be a "complex
modulator"?? that's a new concept of course, but you would do g.V().merge()
and the label and first Map would fold to VertexStartStep (i.e. V()) for
the lookup and then a MergeStep would follow - thus a "complex modulator"
as it does more than just change the behavior of the previous step - it
also adds its own. I suppose it could also add has() steps followed by the
MergeStep and then the has() operations would fold in normally as they do
today. In this way, we can simplify to just one single
merge(String,Map,Map). ??

I agree that we should also think about how to include properties in this
merge construct.  The reason I was thinking about mergeV() and mergeE() is
that it follows the same pattern as the already well understood
addV()/addE() steps.  I am a bit wary of trying to generalize this down to
a single merge() step as these sorts of complex overloads make it hard to
figure out which Gremlin step you should use for a particular pattern (e.g.
upsert vertex or upsert edge).  One thing that I think customers find
useful about the addV/addE step is that they are very discoverable, the
name tells me what functionality to expect from that step.


6. One thing neither my approach nor yours seems to do is tell the user if
they created something or updated something - that's another thing I've
seen users want to have in get-or-create. Here again we go deeper into a
less general step specification as alluded to in 4, but a merge() step as
proposed in 5, might return [Element,boolean] so as to provide an indicator
of creation?

Hmm, I'll have to think about that.  How would returning multiple values
work if we want to chain these together. e.g. Add a vertex between two
edges but make sure the vertices exist?


7. You were just introducing your ideas here, so perhaps you haven't gotten
this far yet, but a shortcoming to doing merge(String,Map,Map) is that it
leaves open no opportunity to stream a List of Maps to a merge() for a form
of batch loading which is mighty common and one of the variations of the
coalesce() pattern that I alluded to at the start of all this. I think that
we would want to be sure that we left open the option to do that somehow.
8. If we had a general purpose merge() step I wonder if it makes developing
the API as you suggested easier to do?

Hmm, let me think about that one.

On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <sp...@gmail.com>
wrote:

> I like our coalesce() pattern but it is verbose and over time it has gone
> from a simple pattern to one with numerous variations for all manner of
> different sorts of merge-like operations. As such, I do think we should
> introduce something to cover this pattern.
>
> I like that you used the word "merge" in your description of this as it is
> the word I've liked using. You might want to give a look at my proposed
> merge() step from earlier in the year:
>
>
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
>
> I'm just going to dump thoughts as they come regarding what you wrote:
>
> 1. How would multi/meta-properties fit into the API you've proposed?
> 2. How would users set the T.id on creation? would that T.id just be a key
> in the first Map argument?
> 3. I do like the general idea of a match on multiple properties for the
> first argument as a convenience but wonder about the specificity of this
> API a bit as it focuses heavily on equality - I suppose that's most cases
> for get-or-create, so perhaps that's ok.
> 4. I think your suggestion points to one of the troubles Gremlin has which
> we see with "algorithms" - extending the language with new steps that
> provides a form of "sugar" (e.g. in algorithms we end up with
> shortestPath() step) pollutes the core language a bit, hence my
> generalization of "merging" in my link above which fits into the core
> Gremlin language style. There is a bigger picture where we are missing
> something in Gremlin that lets us extend the language in ways that let us
> easily introduce new steps that aren't for general purpose. This issue is
> discussed in terms of "algorithms" here:
> https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think we could
> see how there might be some "mutation" extension steps that would cover
> your suggested API, plus batch operations, etc. We need a way to add
> "sugar" without it interfering with the consistency of the core. Obviously
> this is a bigger issue but perhaps important to solve to implement steps in
> the fashion you describe.
> 5. I suppose that the reason for mergeE and mergeV is to specify what
> element type the first Map argument should be applied to? what about
> mergeVP (i.e. vertex property as it too is an element) ? That's tricky but
> I don't think we should miss that. Perhaps merge() could be a "complex
> modulator"?? that's a new concept of course, but you would do g.V().merge()
> and the label and first Map would fold to VertexStartStep (i.e. V()) for
> the lookup and then a MergeStep would follow - thus a "complex modulator"
> as it does more than just change the behavior of the previous step - it
> also adds its own. I suppose it could also add has() steps followed by the
> MergeStep and then the has() operations would fold in normally as they do
> today. In this way, we can simplify to just one single
> merge(String,Map,Map). ??
> 6. One thing neither my approach nor yours seems to do is tell the user if
> they created something or updated something - that's another thing I've
> seen users want to have in get-or-create. Here again we go deeper into a
> less general step specification as alluded to in 4, but a merge() step as
> proposed in 5, might return [Element,boolean] so as to provide an indicator
> of creation?
> 7. You were just introducing your ideas here, so perhaps you haven't gotten
> this far yet, but a shortcoming to doing merge(String,Map,Map) is that it
> leaves open no opportunity to stream a List of Maps to a merge() for a form
> of batch loading which is mighty common and one of the variations of the
> coalesce() pattern that I alluded to at the start of all this. I think that
> we would want to be sure that we left open the option to do that somehow.
> 8. If we had a general purpose merge() step I wonder if it makes developing
> the API as you suggested easier to do?
>
> I think I'd like to solve the problems you describe in your post as well as
> the ones in mine. There is some relation there, but gaps as well. With more
> discussion here we can figure something out.
>
> Thanks for starting this talk - good one!
>
>
>
> On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <da...@bechberger.com>
> wrote:
>
> > I've had a few on and off discussions with a few people here, so I wanted
> > to send this out to everyone for feedback.
> >
> > What are people's thoughts on creating a new set of steps that codify
> > common Gremlin best practices?
> >
> > I think there are several common Gremlin patterns where users would
> benefit
> > from the additional guidance that these codified steps represent.  The
> > first one I would recommend though is codifying the element existence
> > pattern into a single Gremlin step, something like:
> >
> > mergeV(String, Map, Map)
> >      String - The vertex label
> >      Map (first) - The properties to match existing vertices on
> >      Map (second) - Any additional properties to set if a new vertex is
> > created (optional)
> > mergeE(String, Map, Map)
> >      String - The edge label
> >      Map (first) - The properties to match existing edge on
> >      Map (second) - Any additional properties to set if a new edge is
> > created (optional)
> >
> > In each of these cases these steps would perform the same upsert
> > functionality as the element existence pattern.
> >
> > Example:
> >
> > g.V().has('person','name','stephen').
> >            fold().
> >            coalesce(unfold(),
> >                     addV('person').
> >                       property('name','stephen').
> >                       property('age',34))
> >
> > would become:
> >
> > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> >
> > I think that this change would be a good addition to the language for
> > several reasons:
> >
> > * This codifies the best practice for a specific action/recipe, which
> > reduces the chance that someone uses the pattern incorrectly
> > * Most complex Gremlin traversals are verbose.  Reducing the amount of
> code
> > that needs to be written and maintained allows for a better developer
> > experience.
> > * It will lower the bar of entry for a developer by making these actions
> > more discoverable.  The more we can help bring these patterns to the
> > forefront of the language via these pattern/meta steps the more we guide
> > users towards writing better Gremlin faster
> > * This allows DB vendors to optimize for this pattern
> >
> > I know that this would likely be the first step in Gremlin that codifies
> a
> > pattern, so I'd like to get other's thoughts on this?
> >
> > Dave
> >
>

Re: [DISCUSS] Creating pattern steps to codify best practices

Posted by Stephen Mallette <sp...@gmail.com>.
I like our coalesce() pattern but it is verbose and over time it has gone
from a simple pattern to one with numerous variations for all manner of
different sorts of merge-like operations. As such, I do think we should
introduce something to cover this pattern.

I like that you used the word "merge" in your description of this as it is
the word I've liked using. You might want to give a look at my proposed
merge() step from earlier in the year:

https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E

I'm just going to dump thoughts as they come regarding what you wrote:

1. How would multi/meta-properties fit into the API you've proposed?
2. How would users set the T.id on creation? would that T.id just be a key
in the first Map argument?
3. I do like the general idea of a match on multiple properties for the
first argument as a convenience but wonder about the specificity of this
API a bit as it focuses heavily on equality - I suppose that's most cases
for get-or-create, so perhaps that's ok.
4. I think your suggestion points to one of the troubles Gremlin has which
we see with "algorithms" - extending the language with new steps that
provides a form of "sugar" (e.g. in algorithms we end up with
shortestPath() step) pollutes the core language a bit, hence my
generalization of "merging" in my link above which fits into the core
Gremlin language style. There is a bigger picture where we are missing
something in Gremlin that lets us extend the language in ways that let us
easily introduce new steps that aren't for general purpose. This issue is
discussed in terms of "algorithms" here:
https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think we could
see how there might be some "mutation" extension steps that would cover
your suggested API, plus batch operations, etc. We need a way to add
"sugar" without it interfering with the consistency of the core. Obviously
this is a bigger issue but perhaps important to solve to implement steps in
the fashion you describe.
5. I suppose that the reason for mergeE and mergeV is to specify what
element type the first Map argument should be applied to? what about
mergeVP (i.e. vertex property as it too is an element) ? That's tricky but
I don't think we should miss that. Perhaps merge() could be a "complex
modulator"?? that's a new concept of course, but you would do g.V().merge()
and the label and first Map would fold to VertexStartStep (i.e. V()) for
the lookup and then a MergeStep would follow - thus a "complex modulator"
as it does more than just change the behavior of the previous step - it
also adds its own. I suppose it could also add has() steps followed by the
MergeStep and then the has() operations would fold in normally as they do
today. In this way, we can simplify to just one single
merge(String,Map,Map). ??
6. One thing neither my approach nor yours seems to do is tell the user if
they created something or updated something - that's another thing I've
seen users want to have in get-or-create. Here again we go deeper into a
less general step specification as alluded to in 4, but a merge() step as
proposed in 5, might return [Element,boolean] so as to provide an indicator
of creation?
7. You were just introducing your ideas here, so perhaps you haven't gotten
this far yet, but a shortcoming to doing merge(String,Map,Map) is that it
leaves open no opportunity to stream a List of Maps to a merge() for a form
of batch loading which is mighty common and one of the variations of the
coalesce() pattern that I alluded to at the start of all this. I think that
we would want to be sure that we left open the option to do that somehow.
8. If we had a general purpose merge() step I wonder if it makes developing
the API as you suggested easier to do?

I think I'd like to solve the problems you describe in your post as well as
the ones in mine. There is some relation there, but gaps as well. With more
discussion here we can figure something out.

Thanks for starting this talk - good one!



On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <da...@bechberger.com>
wrote:

> I've had a few on and off discussions with a few people here, so I wanted
> to send this out to everyone for feedback.
>
> What are people's thoughts on creating a new set of steps that codify
> common Gremlin best practices?
>
> I think there are several common Gremlin patterns where users would benefit
> from the additional guidance that these codified steps represent.  The
> first one I would recommend though is codifying the element existence
> pattern into a single Gremlin step, something like:
>
> mergeV(String, Map, Map)
>      String - The vertex label
>      Map (first) - The properties to match existing vertices on
>      Map (second) - Any additional properties to set if a new vertex is
> created (optional)
> mergeE(String, Map, Map)
>      String - The edge label
>      Map (first) - The properties to match existing edge on
>      Map (second) - Any additional properties to set if a new edge is
> created (optional)
>
> In each of these cases these steps would perform the same upsert
> functionality as the element existence pattern.
>
> Example:
>
> g.V().has('person','name','stephen').
>            fold().
>            coalesce(unfold(),
>                     addV('person').
>                       property('name','stephen').
>                       property('age',34))
>
> would become:
>
> g.mergeV('person', {'name': 'stephen'}, {'age', 34})
>
> I think that this change would be a good addition to the language for
> several reasons:
>
> * This codifies the best practice for a specific action/recipe, which
> reduces the chance that someone uses the pattern incorrectly
> * Most complex Gremlin traversals are verbose.  Reducing the amount of code
> that needs to be written and maintained allows for a better developer
> experience.
> * It will lower the bar of entry for a developer by making these actions
> more discoverable.  The more we can help bring these patterns to the
> forefront of the language via these pattern/meta steps the more we guide
> users towards writing better Gremlin faster
> * This allows DB vendors to optimize for this pattern
>
> I know that this would likely be the first step in Gremlin that codifies a
> pattern, so I'd like to get other's thoughts on this?
>
> Dave
>