Posted to reviews@spark.apache.org by SherlockYang <gi...@git.apache.org> on 2015/10/16 18:53:35 UTC
[GitHub] spark pull request: [Spark-10994] Add local clustering coefficient...
GitHub user SherlockYang opened a pull request:
https://github.com/apache/spark/pull/9150
[Spark-10994] Add local clustering coefficient computation in GraphX
The local clustering coefficient of a vertex (node) in a graph quantifies how close its neighbours are to being a clique (complete graph).
More specifically, the local clustering coefficient C_i for a vertex v_i is the number of links between the vertices within its neighbourhood divided by the number of links that could possibly exist between them.
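For reference, the definition above is commonly written as follows for an undirected graph. Here N_i denotes the neighbourhood of v_i, k_i = |N_i| its size, and e_jk a link between v_j and v_k. Treating the graph as undirected is an assumption; this description does not say how edge direction is handled.

$$
C_i = \frac{2\,\bigl|\{\, e_{jk} : v_j, v_k \in N_i,\ e_{jk} \in E \,\}\bigr|}{k_i\,(k_i - 1)}
$$

The denominator k_i(k_i - 1)/2 is the number of links a clique on the k_i neighbours would have, so C_i = 1 exactly when the neighbourhood is a clique.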
Duncan J. Watts and Steven Strogatz introduced the measure in 1998 to determine whether a graph is a small-world network.
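To make the definition concrete, here is a minimal sketch in plain Scala (no Spark) that computes the coefficient on a small in-memory edge list. It is not the implementation in this pull request. The object name `LccSketch`, the helper `neighbours`, treating the graph as undirected, and returning 0.0 for vertices with fewer than two neighbours are all assumptions made for illustration.

```Scala
// Illustrative sketch only: not the code proposed in this pull request.
object LccSketch {
  type VertexId = Long

  // Build an undirected adjacency map from an edge list, dropping self-loops.
  def neighbours(edges: Seq[(VertexId, VertexId)]): Map[VertexId, Set[VertexId]] =
    edges
      .filter { case (a, b) => a != b }
      .flatMap { case (a, b) => Seq(a -> b, b -> a) }
      .groupBy(_._1)
      .map { case (v, pairs) => v -> pairs.map(_._2).toSet }

  // C_i = (links among the neighbours of i) / (k_i * (k_i - 1) / 2).
  // Assumption: vertices with fewer than two neighbours get 0.0.
  def localClusteringCoefficient(
      edges: Seq[(VertexId, VertexId)]): Map[VertexId, Double] = {
    val adj = neighbours(edges)
    adj.map { case (v, ns) =>
      val k = ns.size
      if (k < 2) v -> 0.0
      else {
        // Each neighbour-to-neighbour link is counted from both endpoints,
        // so the total is halved. toSeq keeps equal counts from collapsing.
        val links = ns.toSeq.map(n => (adj(n) & ns).size).sum / 2
        v -> links.toDouble / (k * (k - 1) / 2.0)
      }
    }
  }
}
```

For a triangle 1-2-3 with an extra edge 3-4, `LccSketch.localClusteringCoefficient(Seq((1L, 2L), (2L, 3L), (1L, 3L), (3L, 4L)))` gives 1.0 for vertices 1 and 2, one third for vertex 3 (one link among its three neighbours, out of three possible), and 0.0 for vertex 4.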
#### Usage
Here is a usage example for LocalClusteringCoefficient:
```Scala
import org.apache.spark.graphx._
import org.apache.spark._
val conf = new SparkConf().setAppName("testApp")
val sc = new SparkContext(conf)
// Load a graph from an edge list file
val graph = GraphLoader.edgeListFile(sc, "graph.txt").partitionBy(PartitionStrategy.RandomVertexCut)
// Compute the local clustering coefficient of every vertex
val lccGraph = graph.localClusteringCoefficient()
// Print the coefficient of each vertex
val verts = lccGraph.vertices
verts.collect().foreach { case (vid, coefficient) =>
  println(vid + ": " + coefficient)
}
```
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/SherlockYang/spark master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/9150.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #9150
----
commit 20b56b79c630cd07e574cdf1bbc9b36225b7fcab
Author: SherlockYang72 <sh...@gmail.com>
Date: 2015-10-15T16:35:22Z
add local clustering coeffcient computation along with test suite
commit 83529d99740b22529c89d7bb8c5e082ab7706263
Author: SherlockYang72 <sh...@gmail.com>
Date: 2015-10-15T16:45:36Z
revise test suite
commit 169298df66932564a8505a8961ed34ad50fbe6ba
Author: SherlockYang72 <sh...@gmail.com>
Date: 2015-10-16T12:03:05Z
unit test
commit 97c705ed63ccb092724f54d5a63a382a93a88030
Author: SherlockYang72 <sh...@gmail.com>
Date: 2015-10-16T14:41:19Z
=
commit bd8d212ef8d9909a73ae074ff8a43b76d6c4f811
Author: SherlockYang72 <sh...@gmail.com>
Date: 2015-10-16T15:45:04Z
=
commit 18af17ceccf3b0d94e4f1a7c398cb36f15f1a6a1
Author: SherlockYang72 <sh...@gmail.com>
Date: 2015-10-16T16:21:35Z
=
commit a33789f74df41dfca2d6f8548e32fdb67ba6115b
Author: SherlockYang72 <sh...@gmail.com>
Date: 2015-10-16T16:31:49Z
=
commit 8e1a3560991e8fbce473ed37a65091ef0cbfb884
Author: SherlockYang72 <sh...@gmail.com>
Date: 2015-10-16T16:51:51Z
=
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #9150: [Spark-10994] Add clustering coefficient computati...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/9150
[GitHub] spark issue #9150: [Spark-10994] Add clustering coefficient computation in G...
Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/9150
Thanks for the pull request. I'm going through a list of pull requests to cut them down since the sheer number is breaking some of the tooling we have. Due to lack of activity on this pull request, I'm going to push a commit to close it. Feel free to reopen it or create a new one. For this one you might want to consider creating a spark-packages.org package.
[GitHub] spark pull request: [Spark-10994] Add local clustering coefficient...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9150#issuecomment-148770205
Can one of the admins verify this patch?