Posted to dev@spark.apache.org by RJ Nowling <rn...@gmail.com> on 2014/08/27 23:17:59 UTC
[GraphX] JIRA / PR to fix breakage in GraphGenerator.logNormalGraph in PR #720
Hi all,
PR #720 <https://github.com/apache/spark/pull/720> made multiple changes
to GraphGenerator.logNormalGraph including:
- Replacing the calls to the functions that generate random vertices and
edges with in-line implementations that use different equations. Based on
my reading of the Pregel paper, I believe the in-line implementations are
incorrect.
- Hard-coding the RNG seeds, so the method now generates the same graph
for a given number of vertices, edges, mu, and sigma. The user cannot
override the seed or ask for a randomly generated one.
- Making a backwards-incompatible change to the logNormalGraph signature
by introducing a new required parameter.
- Failing to update the Scaladoc and programming guide for the API changes.
- Adding a synthetic benchmark to the examples.
I submitted JIRA SPARK-3263
<https://issues.apache.org/jira/browse/SPARK-3263> and PR #2168
<https://github.com/apache/spark/pull/2168> to revert some of these changes
and fix usage of the RNGs:
- Removes the in-line implementations and calls the original vertex / edge
generation functions again
- Adds an optional seed parameter for deterministic behavior (when
desired)
- Keeps the number of partitions parameter that was added.
- Keeps compatibility with the synthetic benchmark example
- Maintains backwards-compatible API
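
To make the optional-seed idea concrete, here is a minimal standalone sketch. It is not the code in PR #2168 and does not use Spark. The object name SeedSketch, the helper sampleOutDegrees, the default values, and the use of -1 as a "pick a random seed" sentinel are all assumptions made for illustration.

```scala
import scala.util.Random

object SeedSketch {
  // Hypothetical helper, NOT the GraphX API: draws one log-normally
  // distributed out-degree per vertex, capped at numVertices.
  // Every parameter after numVertices has a default, so callers that pass
  // only numVertices keep compiling when the seed parameter is added.
  def sampleOutDegrees(
      numVertices: Int,
      mu: Double = 4.0,     // illustrative default
      sigma: Double = 1.3,  // illustrative default
      seed: Long = -1L      // assumed sentinel: -1 means "choose a random seed"
  ): Vector[Int] = {
    // Use the caller's seed when one is given, otherwise draw a fresh one.
    val baseSeed = if (seed == -1L) new Random().nextLong() else seed
    val rng = new Random(baseSeed)
    Vector.fill(numVertices) {
      // exp(mu + sigma * Z) with Z ~ N(0, 1) is a log-normal sample.
      val degree = math.exp(mu + sigma * rng.nextGaussian())
      math.min(degree.round, numVertices.toLong).toInt
    }
  }
}
```

Calling SeedSketch.sampleOutDegrees(100, seed = 42L) twice returns the same degrees, while SeedSketch.sampleOutDegrees(100) draws a new seed on every call.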
I would appreciate feedback and people taking a look. :)
Thanks!
RJ
--
em rnowling@gmail.com
c 954.496.2314