Posted to dev@s2graph.apache.org by DO YUNG YOON <sh...@gmail.com> on 2016/05/17 22:10:00 UTC

[Discuss] Move orderBy, groupBy logic from PostProcess into Graph#getEdges inside.

Hi folks.

The problem is that users must use the JSON representation of a query
result, because groupBy, orderBy, and the merging of multiple query results
into one all reside in PostProcess, which is not meant for that.

My suggestion, in one line: make PostProcess simply build JSON from the
result type of getEdges, and make getEdges build the complete query result,
so that different output formats can be supported easily.

In more detail, query traversal is processed by Graph#getEdges, which
accepts a Query and returns Future[Seq[QueryRequestWithResult]].
On completion of this Future, we run PostProcess#toSimpleVertexArrJson to
convert the Seq[QueryRequestWithResult] into a JSON result.

I think most of the post-processing logic, especially groupBy and orderBy,
needs to be done inside Graph#getEdges, which should then return a better
result type that can contain the grouped, ordered edges. I am suggesting
that PostProcess deal only with building the JSON result from the
already processed result.
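
To make the idea concrete, here is a minimal sketch of what such a result
type could look like. Every name in it (Edge, EdgeGroup, QueryResult,
groupAndOrder) is hypothetical and simplified for illustration; none of
them is the existing s2graph API, and the choice to order by descending
score is only an assumption.

```scala
// Hypothetical sketch: these types are illustrations, not s2graph classes.
object ResultTypeSketch {
  // Simplified stand-in for an edge returned by a traversal.
  final case class Edge(from: String, to: String, score: Double,
                        props: Map[String, String])

  // All edges that share the same values for the groupBy keys.
  final case class EdgeGroup(key: Map[String, String], edges: Seq[Edge]) {
    def scoreSum: Double = edges.map(_.score).sum
  }

  // Format-neutral result: already grouped and ordered, no JSON involved.
  final case class QueryResult(groups: Seq[EdgeGroup])

  // Group edges by the given property names, order edges inside each group
  // by descending score, and order groups by descending total score.
  def groupAndOrder(edges: Seq[Edge], groupByKeys: Seq[String]): QueryResult = {
    val groups = edges
      .groupBy(e => groupByKeys.map(k => k -> e.props.getOrElse(k, "")).toMap)
      .map { case (key, es) => EdgeGroup(key, es.sortWith(_.score > _.score)) }
      .toSeq
      .sortWith(_.scoreSum > _.scoreSum)
    QueryResult(groups)
  }
}
```

With a type like this, PostProcess would only need to walk
QueryResult.groups and emit JSON, without doing any grouping or sorting
itself.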

This means a lot of changes (only because the return type of a core
method changes): refactoring Graph#filterEdges to build the right result
(grouped, ordered, and with each edge carrying all the information that
describes how it will be translated into JSON later).

By separating JSON from the traversal flow, I think we can give users more
choice of result format, such as Thrift, protobuf, etc.
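
One possible shape for this separation is a small writer interface over a
format-neutral result. The sketch below is an assumption, not existing
code: ResultWriter, JsonWriter, and TsvWriter are hypothetical names, the
JSON is hand-rolled (with no string escaping) only to keep the example
dependency-free, and the TSV writer merely stands in for a binary format
such as Thrift or protobuf.

```scala
// Hypothetical sketch: pluggable output formats over one result type.
object WriterSketch {
  final case class Edge(from: String, to: String, score: Double)
  final case class QueryResult(edges: Seq[Edge])

  // A writer turns the format-neutral result into one concrete format.
  trait ResultWriter[Out] {
    def write(result: QueryResult): Out
  }

  // JSON output; a real implementation would use a JSON library and
  // escape string values.
  object JsonWriter extends ResultWriter[String] {
    def write(result: QueryResult): String =
      result.edges
        .map(e => s"""{"from":"${e.from}","to":"${e.to}","score":${e.score}}""")
        .mkString("[", ",", "]")
  }

  // Tab-separated output, standing in for any non-JSON format.
  object TsvWriter extends ResultWriter[String] {
    def write(result: QueryResult): String =
      result.edges
        .map(e => s"${e.from}\t${e.to}\t${e.score}")
        .mkString("\n")
  }
}
```

The traversal code never mentions a format here; the caller picks a writer
after getEdges has produced its result.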

One exceptional case is MultiQuery. It runs one traversal for each
query in the MultiQuery, then aggregates the traversal results into one.
I think this exceptional case can also be removed by providing a better
method to merge the traversal results of different queries into one.
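
As a rough illustration, such a merge could be an ordinary function over
the result type. The sketch below uses hypothetical names, and the merge
policy (collapse edges with the same endpoints, sum their scores, order by
descending score) is an assumption chosen for the example, not the current
MultiQuery behavior.

```scala
// Hypothetical sketch: merging the results of several queries into one.
object MergeSketch {
  final case class Edge(from: String, to: String, score: Double)
  final case class QueryResult(edges: Seq[Edge])

  // Collapse edges that share (from, to) by summing their scores, then
  // order the merged edges by descending score.
  def merge(results: Seq[QueryResult]): QueryResult = {
    val merged = results
      .flatMap(_.edges)
      .groupBy(e => (e.from, e.to))
      .map { case ((from, to), es) => Edge(from, to, es.map(_.score).sum) }
      .toSeq
      .sortWith(_.score > _.score)
    QueryResult(merged)
  }
}
```

With this in place, a MultiQuery is just the individual traversals followed
by merge, and the merged result goes through the same output path as a
single query.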

There will be a few subtasks for this. Please feel free to list more tasks.

1. Define a better return type for getEdges.
2. Refactor getEdges to build the complete result (ordered, grouped,
JSON-writable).

Re: [Discuss] Move orderBy, groupBy logic from PostProcess into Graph#getEdges inside.

Posted by Jun Ki Kim <wi...@gmail.com>.
Hi, DoYoung!

I don't fully understand what you plan for refactoring `getEdges`.
However, it seems to be about how s2graph handles edges so that it can
communicate with other modules or projects, by establishing a kind of
I/O standard.
You mean making the graph traversal result POJO-able. Am I right?
I'm very interested in your suggestion and want to follow your tasks, but
I can't add more subtasks to the list yet.
I will do my best to contribute to your suggestion.

Thanks,

Best regards,
Junki Kim
