Posted to dev@geode.apache.org by "Yang, Dong [GTSUS Non-J&J]" <dy...@ITS.JNJ.com> on 2018/10/18 13:00:47 UTC

About JIRA GEODE-5896

Hi,

I am Dong Yang, and my Apache account is twosand. The way we use GemFire is not a common usage scenario at other companies; it is more like a mixed OLTP and OLAP scenario. The concept is very similar to using the Spark-GemFire connector: we have some server-side functions that shuffle data from the server to the client in a streaming style. We have encountered the thread lock issue in different environments. Previously we used GemFire 8; now we are upgrading to GemFire 9.
About GEODE-5896: this usage is very important for us, and I think the same is true for others who want to use Spark to connect to GemFire. For now we have just patched the client side to force the metadata to be ready before the function is executed. But the proper solution should fix some server-side code.
I can share what I found and where I want to make the fix, and you can review whether it is reasonable or not. It can be fixed by the current Geode team, or I can do it as a contributor.



Thanks,
Dong Yang

Re: About JIRA GEODE-5896

Posted by Bruce Schuchardt <bs...@pivotal.io>.
Submit a PR and it will run precheckin for you.  The precheckin run will 
be much faster than running it yourself.

All you need to do before submitting a pull request is make sure the 
"build" task has no errors.


On 10/25/18 2:42 AM, Yang, Dong [GTSUS Non-J&J] wrote:
>
> Hi
>
> I have finished the code, and the tests show it works. Now I need some
> advice on the precheckin step.
>
> Is it possible to make it faster, with some configuration or tricks?
>
> It has taken 3 hours and is only at 40% progress, and some
> database-related test cases seem to have failed even though nothing
> changed in them.
>
> Thanks
>
> Dong


RE: About JIRA GEODE-5896

Posted by "Yang, Dong [GTSUS Non-J&J]" <dy...@ITS.JNJ.com.INVALID>.
Hi
I have finished the code, and the tests show it works. Now I need some advice on the precheckin step.
Is it possible to make it faster, with some configuration or tricks?
It has taken 3 hours and is only at 40% progress, and some database-related test cases seem to have failed even though nothing changed in them.
[Attached image: image002.jpg]

Thanks
Dong

From: Patrick Rhomberg <pr...@apache.org>
Sent: Tuesday, October 23, 2018 4:46 PM
To: dev@geode.apache.org; Yang, Dong [GTSUS Non-J&J] <dy...@its.jnj.com>
Subject: [EXTERNAL] Re: About JIRA GEODE-5896

> I think I need to finish the test code before creating a pull request.

We have integrations into GitHub that launch precheckin testing in our continuous integration Concourse pipelines.  PR status hooks are updated when tests pass or fail.

Of course, from a philosophical point of view, every bug is the result of insufficient testing coverage, but as long as your PR includes / updates tests that would identify this bug, then opening the PR will cover the rest.

> But as I mentioned above, I need some suggestions from the development team: is my idea suitable, or is there something I missed?

In my mind, this is what the PR is meant to do -- facilitate discussion around immediate proposed changes.  When the PR is opened, the community can review the change set, and if anything jumps out at us, we have the opportunity to shore up any deficiencies then.

If you were looking for a collaborator to help you with a problem that you didn't know how to start, we could figure something out.  But if you believe you have a fix, we'll all look forward to the pull request!




Re: About JIRA GEODE-5896

Posted by Patrick Rhomberg <pr...@apache.org>.
> I think I need to finish the test code before creating a pull request.

We have integrations into GitHub that launch precheckin testing in our
continuous integration Concourse pipelines.  PR status hooks are updated
when tests pass or fail.

Of course, from a philosophical point of view, every bug is the result of
insufficient testing coverage, but as long as your PR includes / updates
tests that would identify this bug, then opening the PR will cover the rest.

> But as I mentioned above, I need some suggestions from the development
> team: is my idea suitable, or is there something I missed?


In my mind, this is what the PR is meant to do -- facilitate discussion
around immediate proposed changes.  When the PR is opened, the community
can review the change set, and if anything jumps out at us, we have the
opportunity to shore up any deficiencies then.

If you were looking for a collaborator to help you with a problem that you
didn't know how to start, we could figure something out.  But if you
believe you have a fix, we'll all look forward to the pull request!

On Tue, Oct 23, 2018 at 2:41 AM, Yang, Dong [GTSUS Non-J&J] <
dyang39@its.jnj.com> wrote:

> Hi, Udo
>
>
>
> I have already forked Geode and committed my code to
> https://github.com/twosand/geode.git, branch feature/GEODE-5896.
>
> I think I need to finish the test code before creating a pull request. But
> I hope I can get some suggestions, or maybe someone can review the code
> changes.
>
>
>
> I did some investigation of the code invocation chain. The attached chart
> shows the whole idea. The problem can be found at the on-server node: a
> FunctionStreamingReplyMessage arrives from the onRegion node, but the
> processor that should exist for it is missing. A
> PartitionedRegionFunctionStreamingAbortMessage can then be sent from this
> point; here we have the sender member and the processorId, and that is
> enough.
>
> The abort message is then received at the on-region node. At this node the
> user-defined function is still running and continuously invokes the
> PartitionedRegionFunctionResultSender.sendResult method to stream results.
> It runs in another thread, so we need a shared variable that can notify
> the sender that the remote processor has already been dropped. The
> PartitionedRegionFunctionStreamingContext class therefore tracks the
> processorId: normally it is placed into a map before sending begins and
> removed after the last send. Once the abort message arrives, the
> processorId is removed, and the next sendResult call can throw an
> exception to end the now-useless function.
>
>
>
> I am trying to follow the GitHub PR workflow and am now writing the test
> code. But as I mentioned above, I need some suggestions from the
> development team: is my idea suitable, or is there something I missed?
>
>
>
> Thanks
>
> Dong

RE: About JIRA GEODE-5896

Posted by "Yang, Dong [GTSUS Non-J&J]" <dy...@ITS.JNJ.com>.
Hi, Udo



I have already forked Geode and committed my code to https://github.com/twosand/geode.git, branch feature/GEODE-5896.

I think I need to finish the test code before creating a pull request. But I hope I can get some suggestions, or maybe someone can review the code changes.



I did some investigation of the code invocation chain. The attached chart shows the whole idea. The problem can be found at the on-server node: a FunctionStreamingReplyMessage arrives from the onRegion node, but the processor that should exist for it is missing. A PartitionedRegionFunctionStreamingAbortMessage can then be sent from this point; here we have the sender member and the processorId, and that is enough.

The abort message is then received at the on-region node. At this node the user-defined function is still running and continuously invokes the PartitionedRegionFunctionResultSender.sendResult method to stream results. It runs in another thread, so we need a shared variable that can notify the sender that the remote processor has already been dropped. The PartitionedRegionFunctionStreamingContext class therefore tracks the processorId: normally it is placed into a map before sending begins and removed after the last send. Once the abort message arrives, the processorId is removed, and the next sendResult call can throw an exception to end the now-useless function.
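
A minimal sketch of the mechanism described above, assuming a concurrent set of active processor ids shared between the message-handling thread and the function thread. The names StreamingContextRegistry, StreamingResultSender, and AbortDemo are hypothetical and only illustrate the idea; they are not the classes in the feature/GEODE-5896 branch or Geode's actual API.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry of processor ids whose remote reply processor is still alive. */
final class StreamingContextRegistry {
  private final Set<Integer> activeProcessorIds = ConcurrentHashMap.newKeySet();

  /** Called before the first result is sent. */
  void register(int processorId) {
    activeProcessorIds.add(processorId);
  }

  /** Called after the last result has been sent. */
  void complete(int processorId) {
    activeProcessorIds.remove(processorId);
  }

  /** Called when an abort message arrives; returns true if the stream was still active. */
  boolean abort(int processorId) {
    return activeProcessorIds.remove(processorId);
  }

  boolean isActive(int processorId) {
    return activeProcessorIds.contains(processorId);
  }
}

/** Hypothetical stand-in for the result sender used by the user-defined function. */
final class StreamingResultSender {
  private final StreamingContextRegistry registry;
  private final int processorId;
  private int sentCount = 0;

  StreamingResultSender(StreamingContextRegistry registry, int processorId) {
    this.registry = registry;
    this.processorId = processorId;
    registry.register(processorId);
  }

  /** Sends one result, or fails fast if the remote processor was dropped. */
  void sendResult(Object result) {
    if (!registry.isActive(processorId)) {
      throw new IllegalStateException(
          "Remote processor " + processorId + " is gone; aborting stream");
    }
    sentCount++; // a real sender would write the result to the network here
  }

  /** Sends the final result and removes the processor id from the registry. */
  void lastResult(Object result) {
    sendResult(result);
    registry.complete(processorId);
  }

  int sentCount() {
    return sentCount;
  }
}

final class AbortDemo {
  public static void main(String[] args) {
    StreamingContextRegistry registry = new StreamingContextRegistry();
    StreamingResultSender sender = new StreamingResultSender(registry, 7);
    sender.sendResult("row-1"); // processor 7 is registered, so this succeeds
    registry.abort(7); // simulates the abort message arriving
    try {
      sender.sendResult("row-2"); // the next send notices the abort
    } catch (IllegalStateException e) {
      System.out.println("Function stopped: " + e.getMessage());
    }
  }
}
```

Because the set comes from ConcurrentHashMap.newKeySet(), the abort handler and the function thread can both touch it without extra locking. The function stops at its next sendResult call rather than at the moment the abort arrives.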



I am trying to follow the GitHub PR workflow and am now writing the test code. But as I mentioned above, I need some suggestions from the development team: is my idea suitable, or is there something I missed?



Thanks

Dong



-----Original Message-----
From: Udo Kohlmeyer <uk...@pivotal.io>
Sent: Monday, October 22, 2018 4:53 PM
To: Yang, Dong [GTSUS Non-J&J] <dy...@its.jnj.com>
Cc: dev@geode.apache.org
Subject: [EXTERNAL] Re: About JIRA GEODE-5896



Hi there Dong Yang,



If you have completed a fix, please submit it via the PR mechanism within GitHub. We will most gladly review and incorporate.



--Udo






Re: About JIRA GEODE-5896

Posted by Udo Kohlmeyer <uk...@pivotal.io>.
Hi there Dong Yang,

If you have completed a fix, please submit it via the PR mechanism
within GitHub. We will most gladly review and incorporate.

--Udo

On 10/18/18 06:00, Yang, Dong [GTSUS Non-J&J] wrote:
> Hi,
>
> I am Dong Yang, and my Apache account is twosand. The way we use GemFire is not a common usage scenario at other companies; it is more like a mixed OLTP and OLAP scenario. The concept is very similar to using the Spark-GemFire connector: we have some server-side functions that shuffle data from the server to the client in a streaming style. We have encountered the thread lock issue in different environments. Previously we used GemFire 8; now we are upgrading to GemFire 9.
> About GEODE-5896: this usage is very important for us, and I think the same is true for others who want to use Spark to connect to GemFire. For now we have just patched the client side to force the metadata to be ready before the function is executed. But the proper solution should fix some server-side code.
> I can share what I found and where I want to make the fix, and you can review whether it is reasonable or not. It can be fixed by the current Geode team, or I can do it as a contributor.
>
>
>
> Thanks,
> Dong Yang
>