You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by greghogan <gi...@git.apache.org> on 2017/07/16 16:22:41 UTC

[GitHub] flink pull request #4350: [FLINK-7204] [core] CombineHint.NONE

GitHub user greghogan opened a pull request:

    https://github.com/apache/flink/pull/4350

    [FLINK-7204] [core] CombineHint.NONE

    FLINK-3477 added a hash-combine preceding the reducer configured with CombineHint.HASH or CombineHint.SORT (default). In some cases it may be useful to disable the combiner in ReduceNode by specifying a new CombineHint.NONE value.
    
    We don't currently have unit tests for this part of the code but I'm happy to discuss and add some.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/greghogan/flink 7204_combinehint_none

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4350.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4350
    
----
commit d08284a61d3af58cb28b47a3190c4dc31b62fa8b
Author: Greg Hogan <co...@greghogan.com>
Date:   2017-07-16T11:24:59Z

    [FLINK-7204] [core] CombineHint.NONE
    
    FLINK-3477 added a hash-combine preceding the reducer configured with
    CombineHint.HASH or CombineHint.SORT (default). In some cases it may be
    useful to disable the combiner in ReduceNode by specifying a new
    CombineHint.NONE value.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4350: [FLINK-7204] [core] CombineHint.NONE

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on the issue:

    https://github.com/apache/flink/pull/4350
  
    @fhueske updated and manually verified. I had misinterpreted `DriverStrategy` to imply that `SORTED_REDUCE` would not use a combiner.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #4350: [FLINK-7204] [core] CombineHint.NONE

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/4350


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4350: [FLINK-7204] [core] CombineHint.NONE

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on the issue:

    https://github.com/apache/flink/pull/4350
  
    Test updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4350: [FLINK-7204] [core] CombineHint.NONE

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on the issue:

    https://github.com/apache/flink/pull/4350
  
    +1 to merge


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4350: [FLINK-7204] [core] CombineHint.NONE

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on the issue:

    https://github.com/apache/flink/pull/4350
  
    Hi @greghogan, the changes look good and the PR is ready to be merged.
    
    If you like, you can simplify the test before merging by using field position keys (as in `testGroupedReduceWithFieldPositionKey()`) instead of a `KeySelector`. That would remove the key extractor and projector map operators and make the plan a bit easier to verify.
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4350: [FLINK-7204] [core] CombineHint.NONE

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on the issue:

    https://github.com/apache/flink/pull/4350
  
    Hi @greghogan, I think this is definitely a valuable improvement.
    However, I looked at the plan of a query with a `ReduceFunction` with `CombineHint.None` and found that it still has a combine node:
    
    ```
    [IN] -FWD-> [COMBINE(SORTED_REDUCE)] -HASH-> [REDUCE(SORTED_REDUCE)] -> [OUT]
    ```
    
    I think we should
    1. set `combinerStrategy` to `DriverStrategy.NONE` in `ReduceNode` for `CombineHint.NONE`.
    2. change `ReduceProperties.instantiate()` with a check for `combinerStrategy == NONE` and not add a node for the combiner in that case.
    3. add a test in `ReduceCompilationTest` similar to `testGroupedReduceWithHint()` to check that the resulting plan has no combine node.
    
    Best, Fabian


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---