Posted to user@spark.apache.org by Udbhav Agarwal <ud...@syncoms.com> on 2016/02/25 14:49:25 UTC

Multiple user operations in spark.

Hi,
I am using GraphX. I am adding a batch of vertices to a graph with around 100,000 vertices and few edges. Adding around 400 vertices takes 7 seconds on one machine with 8 cores and 8 GB of RAM. My problem is that while this addition is running on the graph (named inputGraph), I am not able to access it or query it. Since it is a real-time system, I want the graph to be available to the user at all times. Currently, when I query the graph while vertices are being added, the result comes back only after the addition is over. I have also tried creating and querying another variable, tempInputGraph, which stores the state of inputGraph and is updated whenever the addition process is over. But queries against it are also delayed by the background process.
I have set the number of executors to 8 to match my 8-core system.
Please suggest how I can keep this graph always available to the user even while a background process is running on it.

Thanks,
Udbhav Agarwal



Re: Multiple user operations in spark.

Posted by Sabarish Sasidharan <sa...@gmail.com>.
I don't have a proper answer to this. But as a workaround, if you have 2
independent Spark jobs, you could update one while the other is serving
reads. But it's still not scalable for incessant updates.
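
A rough sketch of the "update one copy while the other serves reads" idea
inside a single application, in case it helps. Everything here is an
assumption for illustration, not something from the original setup: the
object name, the pool names "updates" and "queries", the local[8] master,
and the toy data. It keeps the current graph in an AtomicReference, builds
the next graph on a background thread, and swaps the reference only after
the new graph is materialized. It also switches the scheduler to FAIR mode,
since by default jobs in one SparkContext are scheduled FIFO and a query
submitted behind a long update may wait for it.

```scala
import java.util.concurrent.atomic.AtomicReference

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical example object; names and data are illustrative only.
object SnapshotGraphSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("snapshot-graph-sketch")
      .setMaster("local[8]")                 // assumption: one 8-core machine
      .set("spark.scheduler.mode", "FAIR")   // share cores between concurrent jobs
    val sc = new SparkContext(conf)

    // Toy stand-in for the existing graph.
    val vertices: RDD[(VertexId, String)] =
      sc.parallelize((1L to 1000L).map(id => (id, s"v$id")))
    val edges: RDD[Edge[Int]] = sc.parallelize(Seq(Edge(1L, 2L, 1)))
    val initial = Graph(vertices, edges).cache()
    initial.vertices.count()                 // force materialization

    // Readers always go through this reference.
    val current = new AtomicReference[Graph[String, Int]](initial)

    // Background thread: build the next snapshot, then swap it in.
    val updater = new Thread(new Runnable {
      def run(): Unit = {
        sc.setLocalProperty("spark.scheduler.pool", "updates")
        val old = current.get()
        val added: RDD[(VertexId, String)] =
          sc.parallelize((1001L to 1400L).map(id => (id, s"v$id")))
        val next = Graph(old.vertices.union(added), old.edges).cache()
        next.vertices.count()                // materialize before publishing
        current.set(next)                    // readers see the new graph from here on
      }
    })
    updater.start()

    // Query thread: reads whichever snapshot is current at this moment.
    sc.setLocalProperty("spark.scheduler.pool", "queries")
    val visible = current.get().vertices.count()
    println(s"vertices visible to this query: $visible")

    updater.join()
    sc.stop()
  }
}
```

A query may see either the old or the new snapshot depending on timing, but
never a half-built one. Old snapshots stay cached until unpersisted, so a
long-running service would need to release them once no query uses them.
Whether this removes the delay depends on how much of the machine the update
occupies, so treat it as something to try rather than a guaranteed fix.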

Regards
Sab