You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@s2graph.apache.org by DO YUNG YOON <sh...@gmail.com> on 2016/05/28 06:34:56 UTC

Re: Simplify write-write conflict resolving logic.

Hi JunKi.

I had free time today, and start working on
https://issues.apache.org/jira/browse/S2GRAPH-68.
You can find out where this is going to on
https://github.com/apache/incubator-s2graph/pull/53.
Any feedback and question would be appreciated.

On Thu, Apr 28, 2016 at 12:33 AM DO YUNG YOON <sh...@gmail.com> wrote:

> Hi Jun Ki.
> I just created https://issues.apache.org/jira/browse/S2GRAPH-68 and start
> to work on this.
> I will ask your opinion while I am working on this, so please gives any
> feedback and let's continue discussion.
>
>
> On Fri, Apr 22, 2016 at 9:10 AM Jun Ki Kim <wi...@gmail.com> wrote:
>
>> OK, I see what you said. Sorry for my mis-understanding.
>> More comments makes more readability. That will be very helpful to me.
>> After the your job, I may also try to make another suggestion.
>>
>> Thanks for your correction!
>>
>> Best regards,
>> Junki Kim
>> 2016년 4월 21일 (목) 오후 11:58, DO YUNG YOON <sh...@gmail.com>님이 작성:
>>
>> > Apology on my bad explanation. it is always hard to explain concepts
>> > clearly.
>> >
>> > Point is current logic is over complicated so make others hard to
>> > understand.
>> > What I am planning is start from writing comments with explanation and
>> > assertion. Then ask others if it makes sense, then go for
>> implementation.
>> > JunKi, How about this?
>> >
>> > What's others opinion on this?
>> >
>> > Best Regards.
>> > DO YUNG YOON
>> >
>> > On Wed, Apr 20, 2016 at 5:37 PM Jun Ki Kim <wi...@gmail.com> wrote:
>> >
>> > > Conflict management is the hardest logic to understand how it works. I
>> > > think many people may agree my opinion.
>> > > I don't know well about core module and don't understand well what you
>> > > suggested too. However I totally agree with refactoring the core
>> logic! I
>> > > very welcome to make more readable to me and more comments.
>> > > I will try to understand and refactor the core logic. Please give me
>> more
>> > > wisdom. ^^;
>> > >
>> > > Regards,
>> > > Junki Kim
>> > > 2016년 4월 20일 (수) 오전 9:15, DO YUNG YOON <sh...@gmail.com>님이 작성:
>> > >
>> > > > Current implementation to resolve conflicts on same snapshotEdge is
>> > > > overwhelmingly complicated.
>> > > > I am suggesting refactor this part so we can understand what is
>> going
>> > on
>> > > > for consistency.
>> > > >
>> > > > More importantly, with positive fail probability for
>> > > > RPC(hbase.fail.prob=0.01), test cases actually fail.
>> > > > we can reproduce failure by running `sbt "project s2core"
>> > > > -Dconfig.file=s2rest_play/conf/test.conf test`.
>> > > > What we intended initially here is guarantee to resolve conflicts
>> even
>> > > some
>> > > > RPC into storage fail.
>> > > >
>> > > > I was fixing this issue and quickly realize current code base is way
>> > too
>> > > > complicated and not efficient.
>> > > >
>> > > > here is brief explanation(
>> > > >
>> > > >
>> > >
>> >
>> https://steamshon.gitbooks.io/s2graph-book/content/request_state_diagram.html
>> > > > ).
>> > > >
>> > > > the big picture for this process goes like below.
>> > > >
>> > > > - retry on failure
>> > > >   -- fetch snapshotEdge as oldSN.
>> > > >     --- merge oldSN and requesting edges and build newSN containing
>> > > updated
>> > > > state.
>> > > >     --- lock this snapshotEdge by storing (oldSN, requesting edge as
>> > > > pendingEdge) using checkAndSet(statusCode 0 -> 1).
>> > > >     --- fire RPCs for indexEdge insert + delete
>> mutations(statusCode 1
>> > ->
>> > > > 2).
>> > > >     --- fire RPCs for indexEdge degree increment
>> mutations(statusCode 2
>> > > ->
>> > > > 3).
>> > > >     --- release lock on this snapshotEdge by storing (newSN, None as
>> > > > pendingEdge) using CheckAndSet(statusCode 3 -> 0).
>> > > >
>> > > > I think there is no need to use CheckAndSet on release lock phase
>> since
>> > > it
>> > > > is guaranteed for only one thread can success to lock snapshotEdge.
>> > > >
>> > > > Also fetch snapshotEdge on ever retry produce lots of RPC on same
>> row.
>> > > > Actually, I think there is no need to fetch snapshotEdge over and
>> over
>> > on
>> > > > every retry, once this snapshotEdge is locked.
>> > > >
>> > > > Of course threads that failed to acquire lock, still need to fetch
>> > > > snapshotEdge on retry, but thread who own lock on this snapshotEdge
>> do
>> > > not
>> > > > need to fetch it again. it is guaranteed that no other thread
>> changed
>> > > > snapshotEdge while thread who own lock is retrying on partial
>> failure.
>> > > >
>> > > > Also current implementation use Thread.sleep on Future and throw
>> > > exception
>> > > > inside Future(not return Future.failed(ex)) which does not seems
>> normal
>> > > in
>> > > > asynchronous manner.
>> > > >
>> > > > Finally there is no comments and it is very hard to follow control
>> flow
>> > > for
>> > > > this resolving logic.
>> > > >
>> > > > I think it is worthwhile to refactor conflict resolving logic.
>> > > >
>> > > > What do you guys think?
>> > > >
>> > > > Best Regards.
>> > > > DOYUNG YOON
>> > > >
>> > >
>> >
>>
>

Re: Simplify write-write conflict resolving logic.

Posted by DO YUNG YOON <sh...@gmail.com>.

I also applied changes from above issue on separate branch on gitbook to
explain details.
Anyone who is interested, the details can be found at
https://steamshon.gitbooks.io/s2graph-book/content/v/S2GRAPH-83/request_state_diagram.html
(resolve
https://issues.apache.org/jira/browse/S2GRAPH-83?jql=project%20%3D%20S2GRAPH
)

On Sat, May 28, 2016 at 3:35 PM DO YUNG YOON <sh...@gmail.com> wrote:

> Hi JunKi.
>
> I had free time today, and start working on
> https://issues.apache.org/jira/browse/S2GRAPH-68.
> You can find out where this is going to on
> https://github.com/apache/incubator-s2graph/pull/53.
> Any feedback and question would be appreciated.
>
> On Thu, Apr 28, 2016 at 12:33 AM DO YUNG YOON <sh...@gmail.com> wrote:
>
>> Hi Jun Ki.
>> I just created https://issues.apache.org/jira/browse/S2GRAPH-68 and
>> start to work on this.
>> I will ask your opinion while I am working on this, so please gives any
>> feedback and let's continue discussion.
>>
>>
>> On Fri, Apr 22, 2016 at 9:10 AM Jun Ki Kim <wi...@gmail.com> wrote:
>>
>>> OK, I see what you said. Sorry for my mis-understanding.
>>> More comments makes more readability. That will be very helpful to me.
>>> After the your job, I may also try to make another suggestion.
>>>
>>> Thanks for your correction!
>>>
>>> Best regards,
>>> Junki Kim
>>> 2016년 4월 21일 (목) 오후 11:58, DO YUNG YOON <sh...@gmail.com>님이 작성:
>>>
>>> > Apology on my bad explanation. it is always hard to explain concepts
>>> > clearly.
>>> >
>>> > Point is current logic is over complicated so make others hard to
>>> > understand.
>>> > What I am planning is start from writing comments with explanation and
>>> > assertion. Then ask others if it makes sense, then go for
>>> implementation.
>>> > JunKi, How about this?
>>> >
>>> > What's others opinion on this?
>>> >
>>> > Best Regards.
>>> > DO YUNG YOON
>>> >
>>> > On Wed, Apr 20, 2016 at 5:37 PM Jun Ki Kim <wi...@gmail.com>
>>> wrote:
>>> >
>>> > > Conflict management is the hardest logic to understand how it works.
>>> I
>>> > > think many people may agree my opinion.
>>> > > I don't know well about core module and don't understand well what
>>> you
>>> > > suggested too. However I totally agree with refactoring the core
>>> logic! I
>>> > > very welcome to make more readable to me and more comments.
>>> > > I will try to understand and refactor the core logic. Please give me
>>> more
>>> > > wisdom. ^^;
>>> > >
>>> > > Regards,
>>> > > Junki Kim
>>> > > 2016년 4월 20일 (수) 오전 9:15, DO YUNG YOON <sh...@gmail.com>님이 작성:
>>> > >
>>> > > > Current implementation to resolve conflicts on same snapshotEdge is
>>> > > > overwhelmingly complicated.
>>> > > > I am suggesting refactor this part so we can understand what is
>>> going
>>> > on
>>> > > > for consistency.
>>> > > >
>>> > > > More importantly, with positive fail probability for
>>> > > > RPC(hbase.fail.prob=0.01), test cases actually fail.
>>> > > > we can reproduce failure by running `sbt "project s2core"
>>> > > > -Dconfig.file=s2rest_play/conf/test.conf test`.
>>> > > > What we intended initially here is guarantee to resolve conflicts
>>> even
>>> > > some
>>> > > > RPC into storage fail.
>>> > > >
>>> > > > I was fixing this issue and quickly realize current code base is
>>> way
>>> > too
>>> > > > complicated and not efficient.
>>> > > >
>>> > > > here is brief explanation(
>>> > > >
>>> > > >
>>> > >
>>> >
>>> https://steamshon.gitbooks.io/s2graph-book/content/request_state_diagram.html
>>> > > > ).
>>> > > >
>>> > > > the big picture for this process goes like below.
>>> > > >
>>> > > > - retry on failure
>>> > > >   -- fetch snapshotEdge as oldSN.
>>> > > >     --- merge oldSN and requesting edges and build newSN containing
>>> > > updated
>>> > > > state.
>>> > > >     --- lock this snapshotEdge by storing (oldSN, requesting edge
>>> as
>>> > > > pendingEdge) using checkAndSet(statusCode 0 -> 1).
>>> > > >     --- fire RPCs for indexEdge insert + delete
>>> mutations(statusCode 1
>>> > ->
>>> > > > 2).
>>> > > >     --- fire RPCs for indexEdge degree increment
>>> mutations(statusCode 2
>>> > > ->
>>> > > > 3).
>>> > > >     --- release lock on this snapshotEdge by storing (newSN, None
>>> as
>>> > > > pendingEdge) using CheckAndSet(statusCode 3 -> 0).
>>> > > >
>>> > > > I think there is no need to use CheckAndSet on release lock phase
>>> since
>>> > > it
>>> > > > is guaranteed for only one thread can success to lock snapshotEdge.
>>> > > >
>>> > > > Also fetch snapshotEdge on ever retry produce lots of RPC on same
>>> row.
>>> > > > Actually, I think there is no need to fetch snapshotEdge over and
>>> over
>>> > on
>>> > > > every retry, once this snapshotEdge is locked.
>>> > > >
>>> > > > Of course threads that failed to acquire lock, still need to fetch
>>> > > > snapshotEdge on retry, but thread who own lock on this
>>> snapshotEdge do
>>> > > not
>>> > > > need to fetch it again. it is guaranteed that no other thread
>>> changed
>>> > > > snapshotEdge while thread who own lock is retrying on partial
>>> failure.
>>> > > >
>>> > > > Also current implementation use Thread.sleep on Future and throw
>>> > > exception
>>> > > > inside Future(not return Future.failed(ex)) which does not seems
>>> normal
>>> > > in
>>> > > > asynchronous manner.
>>> > > >
>>> > > > Finally there is no comments and it is very hard to follow control
>>> flow
>>> > > for
>>> > > > this resolving logic.
>>> > > >
>>> > > > I think it is worthwhile to refactor conflict resolving logic.
>>> > > >
>>> > > > What do you guys think?
>>> > > >
>>> > > > Best Regards.
>>> > > > DOYUNG YOON
>>> > > >
>>> > >
>>> >
>>>
>>