Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/02/09 16:29:42 UTC
[jira] [Commented] (FLINK-4896) PageRank algorithm for directed graphs
[ https://issues.apache.org/jira/browse/FLINK-4896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15859749#comment-15859749 ]
ASF GitHub Bot commented on FLINK-4896:
---------------------------------------
Github user greghogan commented on the issue:
https://github.com/apache/flink/pull/2733
Benchmarks were run on a c4.xlarge with 4 slots and a 4 GB preallocated TaskManager heap. "EdgeList" measures the time to simplify the graph, a step needed because the library's scatter-gather PageRank ("PageRankSG") requires each vertex to have both incoming and outgoing edges. "PageRank" is the algorithm from this PR.
Algorithm | Scale 16 | Scale 18
------------ | ------------- | -------------
EdgeList | 2537 ms | 8779 ms
PageRank | 9563 ms | 39558 ms
PageRankSG | 11188 ms | 47736 ms
> PageRank algorithm for directed graphs
> --------------------------------------
>
> Key: FLINK-4896
> URL: https://issues.apache.org/jira/browse/FLINK-4896
> Project: Flink
> Issue Type: New Feature
> Components: Gelly
> Affects Versions: 1.2.0
> Reporter: Greg Hogan
> Assignee: Greg Hogan
>
> Gelly includes PageRank implementations for scatter-gather and gather-sum-apply. Both ship with the warning "The implementation assumes that each page has at least one incoming and one outgoing link."
> PageRank is a directed algorithm and sources and sinks are common in directed graphs.
> Sinks drain the total score across the graph, which affects convergence and the balance of the random hop (convergence is not currently a feature of Gelly's PageRanks, as this is a very recent feature from FLINK-3888).
> Sources are handled nicely by the algorithm highlighted on Flink's features page under "Iterations and Delta Iterations" since score deltas are transmitted and a source's score never changes (is always equal to the random hop probability divided by the vertex count).
> https://flink.apache.org/features.html
> We should find an implementation featuring convergence and unrestricted processing of directed graphs and move other implementations to Gelly examples.
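
To make the source and sink behaviour described above concrete, here is a minimal single-machine sketch in plain Java. It is not the Gelly implementation or the code from the PR: the class name `DirectedPageRankSketch`, the method signature, and the `redistributeSinkScore` flag are all hypothetical, and spreading the sinks' score uniformly over all vertices is just one common way to stop the drain, assumed here for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration only; not Gelly's API and not the algorithm from the PR.
class DirectedPageRankSketch {

    /**
     * Runs a fixed number of PageRank iterations over a directed edge list.
     * Each edge is {source, target} with vertex IDs in [0, vertexCount).
     * When redistributeSinkScore is false, the score held by sinks (vertices
     * with no outgoing edges) is dropped each iteration, so the total drains.
     * When true, that score is spread uniformly over all vertices (an assumed
     * strategy, chosen for simplicity).
     */
    static double[] pageRank(int vertexCount, int[][] edges, double damping,
                             int iterations, boolean redistributeSinkScore) {
        int[] outDegree = new int[vertexCount];
        for (int[] edge : edges) {
            outDegree[edge[0]]++;
        }

        double[] score = new double[vertexCount];
        Arrays.fill(score, 1.0 / vertexCount);

        for (int i = 0; i < iterations; i++) {
            // Total score currently sitting on sinks.
            double sinkScore = 0.0;
            for (int v = 0; v < vertexCount; v++) {
                if (outDegree[v] == 0) {
                    sinkScore += score[v];
                }
            }

            // Every vertex receives the random hop share; a source receives nothing else.
            double base = (1.0 - damping) / vertexCount;
            if (redistributeSinkScore) {
                base += damping * sinkScore / vertexCount;
            }

            double[] next = new double[vertexCount];
            Arrays.fill(next, base);
            for (int[] edge : edges) {
                next[edge[1]] += damping * score[edge[0]] / outDegree[edge[0]];
            }
            score = next;
        }
        return score;
    }

    public static void main(String[] args) {
        // Vertex 0 is a source (no incoming edges); vertex 2 is a sink (no outgoing edges).
        int[][] edges = {{0, 1}, {1, 2}, {0, 2}};
        double[] drained = pageRank(3, edges, 0.85, 20, false);
        double[] balanced = pageRank(3, edges, 0.85, 20, true);
        System.out.println("without sink handling: " + Arrays.toString(drained)
                + " sum=" + Arrays.stream(drained).sum());
        System.out.println("with sink handling:    " + Arrays.toString(balanced)
                + " sum=" + Arrays.stream(balanced).sum());
    }
}
```

On this three-vertex graph, the run without sink handling ends with scores summing to less than 1, while the source's score stays at the random hop share `(1 - damping) / vertexCount`. With the assumed uniform redistribution the total stays at 1.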