Posted to issues@flink.apache.org by greghogan <gi...@git.apache.org> on 2017/07/07 14:00:38 UTC

[GitHub] flink pull request #4282: [FLINK-7019] [gelly] Rework parallelism in Gelly a...

GitHub user greghogan opened a pull request:

    https://github.com/apache/flink/pull/4282

    [FLINK-7019] [gelly] Rework parallelism in Gelly algorithms and examples

    Flink job parallelism is set with ExecutionConfig#setParallelism or with -p on the command line. The Gelly algorithms JaccardIndex, AdamicAdar, TriangleListing, and ClusteringCoefficient have intermediate operators that generate output quadratic in the size of the input. These algorithms may need to run with high parallelism, but applying that parallelism to every operator is wasteful, which is why "little parallelism" was introduced.
    
    This can be simplified by moving the parallelism parameter to the new common base class, with the rule of thumb that all normal (small-output) operators use the algorithm parallelism. The asymptotically large operators default to the job parallelism, and the algorithm parallelism itself defaults to the job parallelism.
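
The rule of thumb described above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this pull request: the class AlgorithmBase, the sentinel PARALLELISM_DEFAULT, and the two helper methods are hypothetical names chosen for the example, and no Flink classes are used.

```java
public class ParallelismSketch {
    /** Hypothetical sentinel meaning "not set, inherit the job parallelism". */
    public static final int PARALLELISM_DEFAULT = -1;

    /** Hypothetical stand-in for a common base class holding the parameter. */
    public static class AlgorithmBase {
        private int parallelism = PARALLELISM_DEFAULT;

        /** Sets the algorithm parallelism used by normal (small-output) operators. */
        public AlgorithmBase setParallelism(int parallelism) {
            if (parallelism != PARALLELISM_DEFAULT && parallelism < 1) {
                throw new IllegalArgumentException("parallelism must be positive");
            }
            this.parallelism = parallelism;
            return this;
        }

        /** Small-output operators use the algorithm parallelism, falling back to the job's. */
        public int smallOperatorParallelism(int jobParallelism) {
            return parallelism == PARALLELISM_DEFAULT ? jobParallelism : parallelism;
        }

        /** Asymptotically large operators always follow the job parallelism. */
        public int largeOperatorParallelism(int jobParallelism) {
            return jobParallelism;
        }
    }

    public static void main(String[] args) {
        int jobParallelism = 64; // assumed value, e.g. what -p would supply
        AlgorithmBase algorithm = new AlgorithmBase().setParallelism(4);
        System.out.println("small operators: " + algorithm.smallOperatorParallelism(jobParallelism));
        System.out.println("large operators: " + algorithm.largeOperatorParallelism(jobParallelism));
    }
}
```

Under these assumptions, a job can run with a high parallelism for the quadratic-output operators while the remaining operators of the algorithm use a smaller, explicitly configured value.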

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/greghogan/flink 7019_rework_parallelism_in_gelly_algorithms_and_examples

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4282.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4282
    
----
commit 105ddee81892b46f5e9ab0b42bbc9cf206a71414
Author: Greg Hogan <co...@greghogan.com>
Date:   2017-06-26T14:21:50Z

    [FLINK-7019] [gelly] Rework parallelism in Gelly algorithms and examples
    
    Flink job parallelism is set with ExecutionConfig#setParallelism or with
    -p on the command line. The Gelly algorithms JaccardIndex, AdamicAdar,
    TriangleListing, and ClusteringCoefficient have intermediate operators
    that generate output quadratic in the size of the input. These
    algorithms may need to run with high parallelism, but applying that
    parallelism to every operator is wasteful, which is why "little
    parallelism" was introduced.
    
    This can be simplified by moving the parallelism parameter to the new
    common base class, with the rule of thumb that all normal (small-output)
    operators use the algorithm parallelism. The asymptotically large
    operators default to the job parallelism, and the algorithm parallelism
    itself defaults to the job parallelism.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #4282: [FLINK-7019] [gelly] Rework parallelism in Gelly a...

Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/4282

