You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@tinkerpop.apache.org by Stark Arya <sa...@gmail.com> on 2019/10/15 01:42:12 UTC

Should we provide a more general VertexSerializer with GraphBinaryMessageSerializerV1 to be compatible with GraphSONMessageSerializerV3d0

Old post: https://groups.google.com/forum/#!topic/gremlin-users/lLpxd8FZQOQ


In TinkerPop， the client request - response path is:    client-req  -->   server-generateResponse  --> server-response-serializer --> client-response-deserializer -> client-respone-ogm

Two key steps:
1. server-generateResponse:   
    TinkerPop server could  generate results with: ReferenceVertex, ReferenceEdge in most scenarios to reduce network package size and reduce seralizer-deseralizer time,  this works well。
    But in very limited scenarios like people themselves know the result set is few enough (or by limit step) and  people need DetachedVertex, DetachedEdge,..., to get all information in a driect way。
      TinkerPop also provide DetachedFactory to detach any result XXX into detachedXXX, this is good and the Tinkerpop server could generate DetachedXXX in response easily。

2. server-response-serializer / client-response-deserializer: 
     For GraphSONMessageSerializerV3d0,  If server-generateResponse  return ReferenceVertex,  client will get DetachedVertex but properties will be set null；If server-generateResponse  
          return DetachedVertex,  client will get DetachedVertex with properties will not be null。it seem ok and I think This model works well： TinkerPop Server decide if return ReferenceXXX or
          DetachedXXX。serializer / deserializer just transfer the result， Serializer / deserializer should not interfere the server of how the response format should be which will make the system complex。

     But For GraphBinaryMessageSerializerV1， the upper constraint is destroyed。Even  server-generateResponse generate DetachedVertex the client will get ReferenceVertex eventually 。 
     I do not think this is a good choice。It will make TinkerPop server unable to control client’s experience and the provider will extends Serializer by the following code：
 
writer = new GraphBinaryWriter(TypeSerializerRegistry.build()
        .addCustomType(DetachedVertex.class, new DetachedVertexSerializer())
        .addCustomType(DetachedEdge.class, new DetachedEdgeSerializer())
        .addCustomType(DetachedPath.class, new DetachedPathSerializer())
        .create());

reader = new GraphBinaryReader(TypeSerializerRegistry.build()
        .addCustomType(DetachedVertex.class, new DetachedVertexSerializer())
        .addCustomType(DetachedEdge.class, new DetachedEdgeSerializer())
        .addCustomType(DetachedPath.class, new DetachedPathSerializer())
        .create());

    This need reconstruct  gremlin-driver which will be incompatible with the standard gremlin-driver and will take Inconvenience to users；

   So I think we should take careful consideration with properties as the annotations have already been mentioned， and judge Vertex  type when writeValue to determine if write properties
protected Vertex readValue(final ByteBuf buffer, final GraphBinaryReader context) throws SerializationException {
    final Vertex v = new ReferenceVertex(context.read(buffer), 
                                         context.readValue(buffer, String.class, false));
    
    // discard the properties - as we only send "references" this should always be null, but will we change our
    // minds some day????
    context.read(buffer);
    return v;
}
@Override
protected void writeValue(final Vertex value, final ByteBuf buffer, final GraphBinaryWriter context) throws SerializationException {
    context.write(value.id(), buffer);
    context.writeValue(value.label(), buffer, false);
    context.write(null, buffer);
}

Any comments are very welcome， Thanks

Re: Should we provide a more general VertexSerializer with GraphBinaryMessageSerializerV1 to be compatible with GraphSONMessageSerializerV3d0

Posted by Stephen Mallette <sp...@gmail.com>.

Thanks for reposting here...

Your topic is an old one. We've discussed it again and again. Do we return
properties with graph elements or not? For several major versions now, we
have stuck with "do not" for a variety of reasons:

1. There is the reason you mentioned - the scenarios where users are
knowledgeable enough to do this are "very limited", in which case, we're
just adding complexity for a small number of users and the possibility for
the greater number of less knowledgeable users to make mistakes. fwiw, I
don't think we really want to "reduce network package size and reduce
serializer-deserializer time" - that just happens to be a by-product of not
returning properties. In other words, we don't count this as a way to
enhance performance really.
2.If we support properties we increase the complexity and size of the GLV
code bases
3. If we don't do 2, then there is discrepancy between scripts and bytecode
in terms of what they return and folks get confused
4. If we do 2 then we have to educate users about "detachment", in that
these special graph elements aren't mutable.....
5. One of the big reasons to not include properties is
multi/meta-properties - a fat vertex with many properties will be a massive
expense to a server that users will likely fail to account for
6. There's other reasons we've debated over the years - can't remember them
all

For 3.5.0, I'd like to double down on the notion of "no properties"
(ReferenceElement) and make it the standard return method for even embedded
graphs, so TinkerGraph would even return a "Reference" outside of Gremlin
Server. You would have to use Gremlin to work with your graphs no matter
what method of connection you chose. I think we would just install the
ReferenceElementStrategy to the default set of strategies. Once that was
done then all access would be consistent.

The only thing that bothers me with that approach is how to rectify
subgraph() across GLVs if properties are never returned - there was
actually a nice user discussion on that recently:

https://groups.google.com/d/msg/gremlin-users/uharEVL1rsk/3xwC2pM4DAAJ

Since subgraph() produces a Graph instance the graph serializer tries to
basically write a TinkerGraph which writes full vertex/edge instances and
GLVs dont' account for that at all.

> But For GraphBinaryMessageSerializerV1， the upper constraint is
destroyed。Even  server-generateResponse generate DetachedVertex the client
will get ReferenceVertex eventually 。

GraphBinaryMessageSerializerV1 does not behave like the old serializers
because it is taking a step forward in embracing the idea of not returning
properties for graph elements. You noticed the indecision in the code:

>    // discard the properties - as we only send "references" this should
always be null, but will we change our
>    // minds some day????

but I think that comment was more due to subgraph() concerns than about
indecision about the use of "references" for all graph elements.

I think that the need to return properties now is diminished for 3.4.4 as
we've introduced elementMap()

https://github.com/apache/tinkerpop/blob/6447f3ddf2b7198f3725739526c63f21f30aba6e/docs/src/upgrade/release-3.4.x.asciidoc#elementmap-step

That does everything that returning a Vertex or Edge would do but does it
in a way that should be safe for most users. What do you think of
elementMap()? Does that help make things easier for users who want the
entire graph element?

On Mon, Oct 14, 2019 at 10:39 PM Stark Arya <sa...@gmail.com> wrote:

> Old post:
> https://groups.google.com/forum/#!topic/gremlin-users/lLpxd8FZQOQ
>
>
> In TinkerPop， the client request - response path is:    client-req  -->
>  server-generateResponse  --> server-response-serializer -->
> client-response-deserializer -> client-respone-ogm
>
> Two key steps:
> 1. server-generateResponse:
>     TinkerPop server could  generate results with: ReferenceVertex,
> ReferenceEdge in most scenarios to reduce network package size and reduce
> seralizer-deseralizer time,  this works well。
>     But in very limited scenarios like people themselves know the result
> set is few enough (or by limit step) and  people need DetachedVertex,
> DetachedEdge,..., to get all information in a driect way。
>       TinkerPop also provide DetachedFactory to detach any result XXX into
> detachedXXX, this is good and the Tinkerpop server could generate
> DetachedXXX in response easily。
>
> 2. server-response-serializer / client-response-deserializer:
>      For GraphSONMessageSerializerV3d0,  If server-generateResponse
> return ReferenceVertex,  client will get DetachedVertex but properties will
> be set null；If server-generateResponse
>           return DetachedVertex,  client will get DetachedVertex with
> properties will not be null。it seem ok and I think This model works well：
> TinkerPop Server decide if return ReferenceXXX or
>           DetachedXXX。serializer / deserializer just transfer the result，
> Serializer / deserializer should not interfere the server of how the
> response format should be which will make the system complex。
>
>      But For GraphBinaryMessageSerializerV1， the upper constraint is
> destroyed。Even  server-generateResponse generate DetachedVertex the client
> will get ReferenceVertex eventually 。
>      I do not think this is a good choice。It will make TinkerPop server
> unable to control client’s experience and the provider will extends
> Serializer by the following code：
>
> writer = new GraphBinaryWriter(TypeSerializerRegistry.build()
>         .addCustomType(DetachedVertex.class, new
> DetachedVertexSerializer())
>         .addCustomType(DetachedEdge.class, new DetachedEdgeSerializer())
>         .addCustomType(DetachedPath.class, new DetachedPathSerializer())
>         .create());
>
> reader = new GraphBinaryReader(TypeSerializerRegistry.build()
>         .addCustomType(DetachedVertex.class, new
> DetachedVertexSerializer())
>         .addCustomType(DetachedEdge.class, new DetachedEdgeSerializer())
>         .addCustomType(DetachedPath.class, new DetachedPathSerializer())
>         .create());
>
>     This need reconstruct  gremlin-driver which will be incompatible with
> the standard gremlin-driver and will take Inconvenience to users；
>
>    So I think we should take careful consideration with properties as the
> annotations have already been mentioned， and judge Vertex  type when
> writeValue to determine if write properties
> protected Vertex readValue(final ByteBuf buffer, final GraphBinaryReader
> context) throws SerializationException {
>     final Vertex v = new ReferenceVertex(context.read(buffer),
>                                          context.readValue(buffer,
> String.class, false));
>
>     // discard the properties - as we only send "references" this should
> always be null, but will we change our
>     // minds some day????
>     context.read(buffer);
>     return v;
> }
> @Override
> protected void writeValue(final Vertex value, final ByteBuf buffer, final
> GraphBinaryWriter context) throws SerializationException {
>     context.write(value.id(), buffer);
>     context.writeValue(value.label(), buffer, false);
>     context.write(null, buffer);
> }
>
> Any comments are very welcome， Thanks
>