Posted to user@spark.apache.org by Deepak Nulu <de...@gmail.com> on 2014/03/02 23:38:35 UTC

Incrementally add/remove vertices in GraphX

Hi,

Is there a way to incrementally add/remove vertices in GraphX? I have read
the documentation and looked at the API, but I don't see a way to
incrementally add/remove vertices in GraphX.

Thanks.

-deepak




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Incrementally-add-remove-vertices-in-GraphX-tp2227.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Incrementally add/remove vertices in GraphX

Posted by Adam Novak <an...@soe.ucsc.edu>.
I would assume that, regardless of the efficiency of such an operation, any
method of adding or removing vertices would need to result in a new graph,
since graphs in GraphX are supposed to be immutable.

It sounds like what you probably want is an efficient
union/subtract/whatever that operates on graphs, returning a graph modified
to include your new vertices or edges, or to remove the ones you wanted to
throw out. Because of the fancy indexing going on inside VertexRDD, I am
not sure how easy it would be to do this on vertices in an efficient way,
without having to rebuild the whole index.

Is the index used in a VertexRDD able to efficiently accommodate insertions?

-Adam


On Mon, Mar 17, 2014 at 9:50 AM, Alessandro Lulli <
alessandro.lulli@gmail.com> wrote:

> Hi All,
>
> Is anybody looking into this?
> I think this is related to the discussion "Are there any plans to
> develop Graphx Streaming?".
>
> Calling union / subtract on a VertexRDD or EdgeRDD creates a new RDD
> but does NOT modify the RDD in the graph.
> Is creating a new graph the only way to add / remove a vertex or an edge?
>
> Thanks
> Alessandro
>

Re: Incrementally add/remove vertices in GraphX

Posted by Alessandro Lulli <al...@gmail.com>.
Hi All,

Is anybody looking into this?
I think this is related to the discussion "Are there any plans to
develop Graphx Streaming?".

Calling union / subtract on a VertexRDD or EdgeRDD creates a new RDD
but does NOT modify the RDD in the graph.
Is creating a new graph the only way to add / remove a vertex or an edge?

Thanks
Alessandro


On Fri, Mar 14, 2014 at 4:32 PM, alelulli <al...@gmail.com>wrote:

> Hi Matei,
>
> Could you please clarify why I must call union before creating the graph?
>
> What is the behavior if I call union / subtract after the creation?
> Are the added / removed vertices processed?
>
> For example, if I am implementing an iterative algorithm and at the 5th
> step I need to add some vertices / edges, can I call union / subtract on
> the VertexRDD, EdgeRDD and Triplets?
>
> Thanks
> Alessandro

Re: Incrementally add/remove vertices in GraphX

Posted by vzaychik <za...@drexel.edu>.
Any updates on GraphX Streaming? There was mention of this about a year ago,
but nothing much since.
Thanks!




---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Incrementally add/remove vertices in GraphX

Posted by mas <ma...@gmail.com>.
Dear All,

Is there any update regarding GraphX streaming? I want to update the graph,
i.e., add vertices and edges after the graph has been created.

Any suggestions or recommendations on how to do that?

Thanks,





Re: Incrementally add/remove vertices in GraphX

Posted by Alessandro Lulli <al...@gmail.com>.
Hi All,

Thanks for your answer.

Regarding GraphX streaming:

   - Is there an issue (or pull request) I can follow to keep track of
   progress?
   - Where can I find a description and details of what will be
   provided?


Thanks for your help and for taking the time to answer my questions.
Alessandro



On Wed, Mar 19, 2014 at 2:43 AM, Ankur Dave <an...@gmail.com> wrote:

> As Matei said, there's currently no support for incrementally adding
> vertices or edges to their respective partitions. Doing this efficiently
> would require extensive modifications to GraphX, so for now, the only
> options are to rebuild the indices on every graph modification, or to use
> the subgraph operator if the modification only involves removing vertices
> and edges.
>
> However, Joey and I are working on GraphX streaming, which is currently in
> the very early stages but eventually will enable this.
>
> Ankur <http://www.ankurdave.com/>
>
>
> On Tue, Mar 18, 2014 at 3:30 PM, Matei Zaharia <ma...@gmail.com>wrote:
>
>> I just meant that you call union() before creating the RDDs that you pass
>> to new Graph(). If you call it after it will produce other RDDs.
>>
>> The Graph() constructor actually shuffles and "indexes" the data to make
>> graph operations efficient, so it's not too easy to add elements after. You
>> could access graph.vertices and graph.edges to build new RDDs, and then
>> call Graph() again to make a new graph. I've CCed Joey and Ankur to see if
>> they have further ideas on how to optimize this. It would be cool to
>> support more efficient union and subtracting of graphs once they've been
>> partitioned by GraphX.
>>
>> Matei
>

Re: Incrementally add/remove vertices in GraphX

Posted by Ankur Dave <an...@gmail.com>.
As Matei said, there's currently no support for incrementally adding
vertices or edges to their respective partitions. Doing this efficiently
would require extensive modifications to GraphX, so for now, the only
options are to rebuild the indices on every graph modification, or to use
the subgraph operator if the modification only involves removing vertices
and edges.
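
For illustration, the removal-only case mentioned above could look roughly like the following. This is a minimal sketch, assuming Spark with GraphX on the classpath and a local master; the object name, the removeVertices helper, and the sample data are hypothetical and not part of GraphX. It relies on Graph.subgraph, which keeps only the vertices that satisfy the vertex predicate and drops edges that touch a removed vertex.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}

object RemoveWithSubgraph {
  // Hypothetical helper: returns a NEW graph without the given vertex ids.
  // Edges whose source or destination was removed are dropped by subgraph.
  def removeVertices(
      graph: Graph[String, Int],
      toRemove: Set[VertexId]): Graph[String, Int] =
    graph.subgraph(vpred = (id, _) => !toRemove.contains(id))

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("remove-with-subgraph").setMaster("local[*]"))
    try {
      val vertices = sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c")))
      val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1)))
      val graph = Graph(vertices, edges)

      // Remove vertex 3; the edge 2 -> 3 disappears with it.
      val smaller = removeVertices(graph, Set(3L))
      println(s"vertices=${smaller.vertices.count()}, edges=${smaller.edges.count()}")
    } finally sc.stop()
  }
}
```

The original graph value is unchanged; subgraph returns a restricted view as a new Graph, which fits the point made earlier in the thread that GraphX graphs are immutable.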

However, Joey and I are working on GraphX streaming, which is currently in
the very early stages but eventually will enable this.

Ankur <http://www.ankurdave.com/>


On Tue, Mar 18, 2014 at 3:30 PM, Matei Zaharia <ma...@gmail.com>wrote:

> I just meant that you call union() before creating the RDDs that you pass
> to new Graph(). If you call it after it will produce other RDDs.
>
> The Graph() constructor actually shuffles and “indexes” the data to make
> graph operations efficient, so it’s not too easy to add elements after. You
> could access graph.vertices and graph.edges to build new RDDs, and then
> call Graph() again to make a new graph. I’ve CCed Joey and Ankur to see if
> they have further ideas on how to optimize this. It would be cool to
> support more efficient union and subtracting of graphs once they’ve been
> partitioned by GraphX.
>
> Matei

Re: Incrementally add/remove vertices in GraphX

Posted by Matei Zaharia <ma...@gmail.com>.
I just meant that you call union() before creating the RDDs that you pass to new Graph(). If you call it after it will produce other RDDs.

The Graph() constructor actually shuffles and “indexes” the data to make graph operations efficient, so it’s not too easy to add elements after. You could access graph.vertices and graph.edges to build new RDDs, and then call Graph() again to make a new graph. I’ve CCed Joey and Ankur to see if they have further ideas on how to optimize this. It would be cool to support more efficient union and subtracting of graphs once they’ve been partitioned by GraphX.
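
As a rough illustration of the rebuild approach described above (read graph.vertices and graph.edges, union in the additions, then construct a new graph), here is a minimal sketch. It assumes Spark with GraphX on the classpath and a local master; the object name, the withAdded helper, and the sample data are hypothetical. Note that every call pays the full cost of re-shuffling and re-indexing.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object RebuildGraph {
  // Hypothetical helper: builds a NEW graph from the old graph's RDDs plus
  // the additions. The input graph is not modified, because RDDs are immutable.
  def withAdded(
      graph: Graph[String, Int],
      newVertices: RDD[(VertexId, String)],
      newEdges: RDD[Edge[Int]]): Graph[String, Int] =
    Graph(graph.vertices.union(newVertices), graph.edges.union(newEdges))

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("rebuild-graph").setMaster("local[*]"))
    try {
      val original = Graph(
        sc.parallelize(Seq((1L, "a"), (2L, "b"))),
        sc.parallelize(Seq(Edge(1L, 2L, 1))))

      // Add vertex 3 and edge 2 -> 3 by constructing a second graph.
      val updated = withAdded(
        original,
        sc.parallelize(Seq((3L, "c"))),
        sc.parallelize(Seq(Edge(2L, 3L, 1))))

      println(s"original: ${original.vertices.count()} vertices, " +
        s"${original.edges.count()} edges")
      println(s"updated:  ${updated.vertices.count()} vertices, " +
        s"${updated.edges.count()} edges")
    } finally sc.stop()
  }
}
```

In an iterative algorithm, the loop would have to carry the returned graph forward (for example in a var) from the step where the additions happen; calling union on graph.vertices alone only yields a separate RDD.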

Matei

On Mar 14, 2014, at 8:32 AM, alelulli <al...@gmail.com> wrote:

> Hi Matei,
> 
> Could you please clarify why I must call union before creating the graph?
>
> What is the behavior if I call union / subtract after the creation?
> Are the added / removed vertices processed?
>
> For example, if I am implementing an iterative algorithm and at the 5th
> step I need to add some vertices / edges, can I call union / subtract on
> the VertexRDD, EdgeRDD and Triplets?
> 
> Thanks
> Alessandro


Re: Incrementally add/remove vertices in GraphX

Posted by alelulli <al...@gmail.com>.
Hi Matei,

Could you please clarify why I must call union before creating the graph?

What is the behavior if I call union / subtract after the creation?
Are the added / removed vertices processed?

For example, if I am implementing an iterative algorithm and at the 5th
step I need to add some vertices / edges, can I call union / subtract on
the VertexRDD, EdgeRDD and Triplets?

Thanks
Alessandro




Re: Incrementally add/remove vertices in GraphX

Posted by Matei Zaharia <ma...@gmail.com>.
Good catch, I’ve fixed those.

On Mar 2, 2014, at 5:25 PM, Nicholas Chammas <ni...@gmail.com> wrote:

> Quick side-note on that page, Matei: Several versions up to and including 0.9.0 are still marked as "unreleased" in JIRA. Dunno if that's intentional (or if it matters any).


Re: Incrementally add/remove vertices in GraphX

Posted by Nicholas Chammas <ni...@gmail.com>.
Quick side-note on that page, Matei: Several versions up to and including
0.9.0 are still marked as "unreleased" in JIRA. Dunno if that's intentional
(or if it matters any).


On Sun, Mar 2, 2014 at 7:52 PM, Matei Zaharia <ma...@gmail.com>wrote:

> You can create a ticket, but note that real-time updates to the graph are
> outside the scope of GraphX right now. It's meant to be a graph analysis
> system, not a graph storage system. I've added it as a component on
> https://spark-project.atlassian.net/browse/SPARK.
>
> Matei

Re: Incrementally add/remove vertices in GraphX

Posted by Matei Zaharia <ma...@gmail.com>.
You can create a ticket, but note that real-time updates to the graph are outside the scope of GraphX right now. It’s meant to be a graph analysis system, not a graph storage system. I’ve added it as a component on https://spark-project.atlassian.net/browse/SPARK.

Matei

On Mar 2, 2014, at 3:32 PM, Deepak Nulu <de...@gmail.com> wrote:

> Hi Matei,
> 
> Thanks for the quick response. Is there a plan to support this? Any ticket I
> can follow? I don't see a GraphX component at
> https://spark-project.atlassian.net; is there a different bug database for
> GraphX?
> 
> Thanks.
> 
> -deepak


Re: Incrementally add/remove vertices in GraphX

Posted by Deepak Nulu <de...@gmail.com>.
Hi Matei,

Thanks for the quick response. Is there a plan to support this? Any ticket I
can follow? I don't see a GraphX component at
https://spark-project.atlassian.net; is there a different bug database for
GraphX?

Thanks.

-deepak





Re: Incrementally add/remove vertices in GraphX

Posted by Matei Zaharia <ma...@gmail.com>.
Right now there isn’t. It’s meant for analysis once you have a graph. If you just need a few vertices at the beginning you could add them to the vertex and edge RDDs using RDD.union() before creating a Graph.
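
A minimal sketch of the union-before-construction suggestion above, for illustration only. It assumes Spark with GraphX on the classpath and a local master; the object name, the build helper, and the sample data are hypothetical. If the same vertex id appears in both inputs, Graph picks one of the attributes arbitrarily, so the inputs here are kept disjoint.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object UnionBeforeGraph {
  // Hypothetical helper: combine the base and extra RDDs first, then build
  // the graph once, so indexing happens a single time.
  def build(
      baseVertices: RDD[(VertexId, String)],
      extraVertices: RDD[(VertexId, String)],
      baseEdges: RDD[Edge[Int]],
      extraEdges: RDD[Edge[Int]]): Graph[String, Int] =
    Graph(baseVertices.union(extraVertices), baseEdges.union(extraEdges))

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("union-before-graph").setMaster("local[*]"))
    try {
      val graph = build(
        sc.parallelize(Seq((1L, "a"), (2L, "b"))),
        sc.parallelize(Seq((3L, "c"))),
        sc.parallelize(Seq(Edge(1L, 2L, 1))),
        sc.parallelize(Seq(Edge(2L, 3L, 1))))
      println(s"vertices=${graph.vertices.count()}, edges=${graph.edges.count()}")
    } finally sc.stop()
  }
}
```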

Matei

On Mar 2, 2014, at 2:38 PM, Deepak Nulu <de...@gmail.com> wrote:

> Hi,
> 
> Is there a way to incrementally add/remove vertices in GraphX? I have read
> the documentation and looked at the API, but I don't see a way to
> incrementally add/remove vertices in GraphX.
> 
> Thanks.
> 
> -deepak