You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by greghogan <gi...@git.apache.org> on 2016/03/09 21:24:52 UTC

[GitHub] flink pull request: [FLINK-3589] Allow setting Operator parallelis...

GitHub user greghogan opened a pull request:

    https://github.com/apache/flink/pull/1778

    [FLINK-3589] Allow setting Operator parallelism to default value

    Adds the public field ExecutionConfig.PARALLELISM_UNKNOWN as a flag
    value to indicate that the default parallelism should be used.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/greghogan/flink 3589_allow_setting_operator_parallelism_to_default_value

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1778.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1778
    
----
commit 8c573e22cf74c9445df130bbbbbb80d69cbd3649
Author: Greg Hogan <co...@greghogan.com>
Date:   2016-03-09T17:41:45Z

    [FLINK-3589] Allow setting Operator parallelism to default value
    
    Adds the public field ExecutionConfig.PARALLELISM_UNKNOWN as a flag
    value to indicate that the default parallelism should be used.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-3589] Allow setting Operator parallelis...

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1778#discussion_r58394697
  
    --- Diff: flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java ---
    @@ -64,10 +64,15 @@
     
     	/**
     	 * The constant to use for the parallelism, if the system should use the number
    -	 *  of currently available slots.
    +	 * of currently available slots.
     	 */
     	public static final int PARALLELISM_AUTO_MAX = Integer.MAX_VALUE;
     
    +	/**
    +	 * The flag value indicating an unknown or unset parallelism.
    +	 */
    +	public static final int PARALLELISM_UNKNOWN = -1;
    --- End diff --
    
    The parallelism may be overridden elsewhere and we are not resetting it back to the default. This flag simply indicates a value for which the parallelism will not be overridden.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-3589] Allow setting Operator parallelis...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/1778#issuecomment-221960487
  
    Couldn't that be equally well checked in the program that calls `setParallelism(...)`?
    
    Just asking because this is handled only in very few parts of the DataSet API anyways (not at all handled in streaming) and seems a bit of a special case.
    
    You may notice, I am starting to ask this question about "what needs to be part of the API, what can be part of the user program" quite a bit lately ;-) Trying to be conscious about complexity and commitment to maintaining all the API contructs.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-3589] Allow setting Operator parallelis...

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on the pull request:

    https://github.com/apache/flink/pull/1778#issuecomment-221965805
  
    It's good to have you back :)
    
    I think we can get by with only `PARALLELISM_DEFAULT` since `java.operators.Operator.setParallelism(int)` is only called by the Scala and Python APIs. I'll open a ticket.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-3589] Allow setting Operator parallelis...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the pull request:

    https://github.com/apache/flink/pull/1778#issuecomment-209330203
  
    The changes look correct. If you could add a small test to verify that passing PARALLELISM_UNKNOWN doesn't modify the parallelism we're probably good to go.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-3589] Allow setting Operator parallelis...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the pull request:

    https://github.com/apache/flink/pull/1778#issuecomment-194756703
  
    passing the default to Operator.setParallelism() will result in an exception, intended?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-3589] Allow setting Operator parallelis...

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on the pull request:

    https://github.com/apache/flink/pull/1778#issuecomment-221597445
  
    `PARALLELISM_UNKNOWN` is a no-op which leaves the parallelism unchanged. This is a useful default for batch algorithms such as `JaccardIndex` for which parallelism can be optionally configured. If the user does not specify a parallelism we do not want to override the parallelism of the `ExecutionEnvironment` as would happen if we used `PARALLELISM_DEFAULT`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-3589] Allow setting Operator parallelis...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1778#discussion_r55653990
  
    --- Diff: flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java ---
    @@ -64,10 +64,15 @@
     
     	/**
     	 * The constant to use for the parallelism, if the system should use the number
    -	 *  of currently available slots.
    +	 * of currently available slots.
     	 */
     	public static final int PARALLELISM_AUTO_MAX = Integer.MAX_VALUE;
     
    +	/**
    +	 * The flag value indicating an unknown or unset parallelism.
    +	 */
    +	public static final int PARALLELISM_UNKNOWN = -1;
    --- End diff --
    
    i would prefer calling it parallelism_default.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-3589] Allow setting Operator parallelis...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the pull request:

    https://github.com/apache/flink/pull/1778#issuecomment-209330425
  
    and a rebase since github says there are conflicts ,)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-3589] Allow setting Operator parallelis...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/1778#issuecomment-221590336
  
    I just stumbled across this in the code.
    I am a bit confused by the two different flags `PARALLELISM_DEFAULT` and `PARALLELISM_UNKNOWN`. What is the purpose of the later?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-3589] Allow setting Operator parallelis...

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on the pull request:

    https://github.com/apache/flink/pull/1778#issuecomment-205368211
  
    Perhaps there should be two flags, PARALLELISM_DEFAULT = -1 to reset the parallelism to the system default, and PARALLELISM_UNKNOWN = -2 which leaves the parallelism unchanged.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-3589] Allow setting Operator parallelis...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/1778


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---