Posted to dev@spark.apache.org by tgensol <th...@gmail.com> on 2016/04/21 20:47:18 UTC

[GRAPHX] Graph Algorithms and Spark

Hi there,

I am working in a group at the University of Michigan, and we are trying to
implement (and first find existing) distributed graph algorithms.

I know Spark, and I found GraphX. I read the docs, but the only algorithm I
found that works with GraphX is Latent Dirichlet Allocation, so I was
wondering why.

Basically, the group wants to implement Minimum Spanning Tree, kNN, and
shortest paths at first.

So my questions are:
Is GraphX stable enough for developing this kind of algorithm on it?
Do you know of algorithms like these that work on top of GraphX? And if not,
why do you think nobody has tried to do it? Is it too hard? Or is it just that
nobody needs it?

Maybe it is only my knowledge of GraphX that is weak, and it is not
possible to implement these algorithms with GraphX.

Thanking you in advance,
Best regards,

Thibaut 



--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/GRAPHX-Graph-Algorithms-and-Spark-tp17301.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: [GRAPHX] Graph Algorithms and Spark

Posted by Denny Lee <de...@gmail.com>.
BTW, we recently had a webinar on GraphFrames at
http://go.databricks.com/graphframes-dataframe-based-graphs-for-apache-spark


Re: [GRAPHX] Graph Algorithms and Spark

Posted by Dimitris Kouzis - Loukas <lo...@gmail.com>.
This thread is good. Maybe it should make it into the docs or the users group.


Re: [GRAPHX] Graph Algorithms and Spark

Posted by Zhan Zhang <zz...@hortonworks.com>.
You can take a look at this blog post from Databricks about GraphFrames:

https://databricks.com/blog/2016/03/03/introducing-graphframes.html

Thanks.

Zhan Zhang


Re: [GRAPHX] Graph Algorithms and Spark

Posted by Robin East <ro...@xense.co.uk>.
Hi

Aside from LDA, which is implemented in MLlib, GraphX has the following built-in algorithms:

   - PageRank/Personalised PageRank
   - Connected Components
   - Strongly Connected Components
   - Triangle Count
   - Shortest Paths
   - Label Propagation

It also implements a version of the Pregel framework, a form of bulk-synchronous parallel processing that is the foundation of most of the above algorithms. We cover other algorithms in our book, and if you search on Google you will find a number of other examples.
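
As a rough illustration of the bulk-synchronous model mentioned above, here is a minimal sketch of single-source shortest paths written as a Pregel-style superstep loop. It is plain Python rather than GraphX code, and it only models the idea: the function name pregel_sssp, the edge-list format, and the sample graph are assumptions made for this example, not part of any Spark API. Edge weights are assumed to be non-negative.

```python
import math

def pregel_sssp(num_vertices, edges, source):
    """Single-source shortest paths in a Pregel-like superstep loop.

    edges: list of (src, dst, weight) directed edges, weights >= 0 (assumed).
    Returns a list of distances indexed by vertex id.
    """
    dist = [math.inf] * num_vertices
    # Initial message: only the source vertex is told "distance 0".
    inbox = {source: 0.0}
    while inbox:
        # Vertex program: keep the smaller of the stored value and the message.
        changed = set()
        for v, msg in inbox.items():
            if msg < dist[v]:
                dist[v] = msg
                changed.add(v)
        # Send phase: vertices that improved offer a distance along out-edges.
        outbox = {}
        for src, dst, w in edges:
            if src in changed and dist[src] + w < dist[dst]:
                candidate = dist[src] + w
                # Merge: messages to the same vertex are combined with min.
                outbox[dst] = min(outbox.get(dst, math.inf), candidate)
        # Barrier: messages sent now are only seen in the next superstep.
        inbox = outbox
    return dist

# Hypothetical 4-vertex graph used only for this example.
edges = [(0, 1, 4.0), (0, 2, 1.0), (2, 1, 2.0), (1, 3, 5.0)]
distances = pregel_sssp(4, edges, source=0)
print(distances)  # [0.0, 3.0, 1.0, 8.0]
```

The loop stops when a superstep produces no messages, which mirrors how a Pregel computation halts once every vertex is inactive. The three pieces (vertex program, send, merge) correspond to the roles such a framework asks the user to fill in.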

-------------------------------------------------------------------------------
Robin East
Spark GraphX in Action Michael Malak and Robin East
Manning Publications Co.
http://www.manning.com/books/spark-graphx-in-action







Re: [GRAPHX] Graph Algorithms and Spark

Posted by Krishna Sankar <ks...@gmail.com>.
Hi,

   1. Yes, GraphX is stable and would be a good choice for implementing your
   algorithms. For a quick intro, you can refer to the GraphX slides from our
   Strata MLlib tutorial: http://goo.gl/Ffq2Az
   2. GraphX already implements algorithms such as PageRank and
   ConnectedComponents[1].
   3. It also has the primitives needed to develop the kind of algorithms you
   are talking about.
   4. For implementing interesting algorithms, the main APIs of interest
   would be the Pregel API and the aggregateMessages API[2]. I am sure you
   will also use the map*, subgraph, and join APIs.
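
To give a feel for what a message-aggregation primitive offers, here is a small sketch of one round of "send along every edge, merge at each vertex", used to find each vertex's lightest incident edge (the first step of a Borůvka-style minimum spanning tree). This is plain Python that only models the idea; the helper names aggregate_messages and lightest_incident_edge and the sample graph are hypothetical, and the real GraphX aggregateMessages is a Scala method with a different signature.

```python
def aggregate_messages(edges, send, merge):
    """Run one round of message passing over every edge.

    send(src, dst, weight) yields (target_vertex, message) pairs;
    merge(a, b) combines two messages addressed to the same vertex.
    Returns {vertex: merged_message} for vertices that received a message.
    """
    result = {}
    for src, dst, w in edges:
        for target, msg in send(src, dst, w):
            if target in result:
                result[target] = merge(result[target], msg)
            else:
                result[target] = msg
    return result

def lightest_incident_edge(edges):
    """For each vertex, pick its minimum-weight incident edge (undirected view)."""
    def send(src, dst, w):
        # Offer the edge to both endpoints; the message is (weight, src, dst).
        yield src, (w, src, dst)
        yield dst, (w, src, dst)
    # Tuples compare by weight first, so min keeps the lightest edge.
    return aggregate_messages(edges, send, min)

# Hypothetical 4-vertex graph used only for this example.
edges = [(0, 1, 4.0), (0, 2, 1.0), (2, 1, 2.0), (1, 3, 5.0)]
lightest = lightest_incident_edge(edges)
for vertex in sorted(lightest):
    print(vertex, lightest[vertex])
```

A full minimum spanning tree would repeat this round, contracting the chosen edges into components each time; that part is left out here to keep the sketch short.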

Cheers
<k/>
[1]
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.graphx.GraphOps
[2]
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.graphx.Graph
