You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by StephanEwen <gi...@git.apache.org> on 2014/11/17 01:38:43 UTC

[GitHub] incubator-flink pull request: [FLINK-1237] Add support for custom ...

GitHub user StephanEwen opened a pull request:

    https://github.com/apache/incubator-flink/pull/207

    [FLINK-1237] Add support for custom partitioners

      - Functions: GroupReduce, Reduce, Aggregate on UnsortedGrouping, SortedGrouping (Java API & Scala API)
      - Manual partition on DataSet (Java API & Scala API)
      - Distinct operations provide semantic properties for preservation of distinctified fields
      - Tests for pushown (or not pushdown) of custom partitionings and forced rebalancing
      - Tests for GlobalProperties matching of partitionings
      - Caching of generated requested data properties for unary operators

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink partitioner

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-flink/pull/207.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #207
    
----
commit 8a955e51959fbf3a3028496f809f89b60c4e7945
Author: Stephan Ewen <se...@apache.org>
Date:   2014-11-17T00:04:12Z

    [FLINK-1233] Fix flaky AggregateITCase

commit 8d97af04ebf3bd530d8d0382744b5261615f18b3
Author: Stephan Ewen <se...@apache.org>
Date:   2014-11-13T15:26:07Z

    [FLINK-1237] Add support for custom partitioners
      - Functions: GroupReduce, Reduce, Aggregate on UnsortedGrouping, SortedGrouping (Java API & Scala API)
      - Manual partition on DataSet (Java API & Scala API)
      - Distinct operations provide semantic properties for preservation of distinctified fields
      - Tests for pushown (or not pushdown) of custom partitionings and forced rebalancing
      - Tests for GlobalProperties matching of partitionings
      - Caching of generated requested data properties for unary operators

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-flink pull request: [FLINK-1237] Add support for custom ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/incubator-flink/pull/207#discussion_r20421929
  
    --- Diff: flink-core/src/main/java/org/apache/flink/api/common/functions/Partitioner.java ---
    @@ -0,0 +1,36 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.api.common.functions;
    +
    +/**
    + * Function to implement a custom partition assignment for keys.
    + * 
    + * @param <K> The type of the key to be partitioned.
    + */
    +public interface Partitioner<K> extends java.io.Serializable {
    --- End diff --
    
    Does it make sense to add another method ``getMaxNumPartititions()`` to check at plan construction time if the the DOP is high enough for the partitioner? Otherwise the program will fail at runtime, if I'm not mistaken.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-flink pull request: [FLINK-1237] Add support for custom ...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/incubator-flink/pull/207#issuecomment-63277862
  
    The partitioner gets the number of positions with every call. So it can be written to respect that number...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-flink pull request: [FLINK-1237] Add support for custom ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-flink/pull/207


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-flink pull request: [FLINK-1237] Add support for custom ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on the pull request:

    https://github.com/apache/incubator-flink/pull/207#issuecomment-63283090
  
    Oh, yes. Sure.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-flink pull request: [FLINK-1237] Add support for custom ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on the pull request:

    https://github.com/apache/incubator-flink/pull/207#issuecomment-63271747
  
    Documentation needs to be adapted. 
    The difference of Partitioner and KeySelector might confuse users and should be explained.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---