Posted to issues@flink.apache.org by "Vasia Kalavri (JIRA)" <ji...@apache.org> on 2015/10/26 10:39:27 UTC

[jira] [Commented] (FLINK-2909) Gelly Graph Generators

    [ https://issues.apache.org/jira/browse/FLINK-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973937#comment-14973937 ] 

Vasia Kalavri commented on FLINK-2909:
--------------------------------------

Hi [~greghogan],
thank you for opening this issue :)
Can you give us an idea of your implementation plans for this? Are you planning to implement all the generators you mention in the description as separate algorithms, or are you considering making this a Gelly utility where we can add generators incrementally?
Thank you!

> Gelly Graph Generators
> ----------------------
>
>                 Key: FLINK-2909
>                 URL: https://issues.apache.org/jira/browse/FLINK-2909
>             Project: Flink
>          Issue Type: New Feature
>          Components: Gelly
>    Affects Versions: 1.0
>            Reporter: Greg Hogan
>            Assignee: Greg Hogan
>
> Include a selection of graph generators in Gelly. Generated graphs will be useful for performing scalability, stress, and regression testing as well as benchmarking and comparing algorithms, for both Flink users and developers. Generated data scales to arbitrary size yet is described by a few simple parameters, and it can often substitute for user data or for sharing large files when reporting issues.
> There are multiple categories of graphs, as documented by [NetworkX|https://networkx.github.io/documentation/latest/reference/generators.html] and elsewhere.
> Graphs may be well-defined, e.g. the [Chvátal graph|https://en.wikipedia.org/wiki/Chv%C3%A1tal_graph]. These may be sufficiently small to populate locally.
> Graphs may be scalable, e.g. complete and star graphs. These should use Flink's distributed parallelism.
> Graphs may be stochastic, e.g. [RMat graphs|http://snap.stanford.edu/class/cs224w-readings/chakrabarti04rmat.pdf]. A key consideration is that the graphs should source randomness from a seedable PRNG and generate the same Graph regardless of parallelism.
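
One way to read the last requirement in the description (same Graph regardless of parallelism) is sketched below in plain Python. This is not Gelly code and not the approach proposed in the issue. It only illustrates the general idea of tying randomness to fixed-size blocks of edges rather than to workers. The names BLOCK_SIZE, generate_block and generate_graph are hypothetical, the workers are simulated with a loop, and the edges are drawn uniformly at random rather than from an R-MAT distribution.

```python
import random

BLOCK_SIZE = 1024  # hypothetical: number of edges produced per independently seeded block


def generate_block(seed, block_index, num_vertices, num_edges):
    """Generate the edges of one block from a PRNG seeded by (seed, block_index)."""
    # The PRNG state depends only on the user seed and the block index,
    # never on which worker happens to process the block.
    rng = random.Random(f"{seed}:{block_index}")
    start = block_index * BLOCK_SIZE
    end = min(start + BLOCK_SIZE, num_edges)
    # Uniform random endpoints (assumption: a stand-in for a real model such as R-MAT).
    return [(rng.randrange(num_vertices), rng.randrange(num_vertices))
            for _ in range(start, end)]


def generate_graph(seed, num_vertices, num_edges, parallelism):
    """Simulate `parallelism` workers; block i is assigned to worker i % parallelism."""
    num_blocks = (num_edges + BLOCK_SIZE - 1) // BLOCK_SIZE
    edges = []
    for worker in range(parallelism):
        for block_index in range(worker, num_blocks, parallelism):
            edges.extend(generate_block(seed, block_index, num_vertices, num_edges))
    # Sort so that the result is comparable independent of worker order.
    return sorted(edges)
```

Because each block is seeded independently, calling generate_graph with the same seed and sizes but a different parallelism value should yield the same sorted edge list, while changing the seed yields a different graph. The block size becomes part of the graph's definition in this scheme, so it has to stay fixed across runs.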



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)