Posted to issues@flink.apache.org by "Fabian Hueske (JIRA)" <ji...@apache.org> on 2014/11/19 22:17:35 UTC

[jira] [Created] (FLINK-1259) FilterFunction can modify data

Fabian Hueske created FLINK-1259:
------------------------------------

             Summary: FilterFunction can modify data
                 Key: FLINK-1259
                 URL: https://issues.apache.org/jira/browse/FLINK-1259
             Project: Flink
          Issue Type: Bug
          Components: Java API, Optimizer, Scala API
    Affects Versions: 0.7.0-incubating
            Reporter: Fabian Hueske


The FilterFunction returns a boolean for each input record, which determines whether the record is kept or dropped.
However, the function can also modify the input record, and that modification is visible downstream if the record is not filtered out.

The optimizer assumes that a FilterFunction does not change the data, i.e., it assumes that a Filter preserves physical data properties (orders, partitionings, etc.). In the future, Filters might also be pushed down. If the function actually changes its incoming records, these assumptions can result in semantically incorrect programs.
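
To illustrate the problem, here is a minimal sketch that runs without Flink. The FilterFunction interface, the Rec type, and the applyFilter helper are hypothetical stand-ins defined only for this example and are not Flink's actual classes. The sketch assumes the same object that the function saw is passed on, and shows a filter that keeps every record but negates the key, so that input sorted by key is no longer sorted afterwards.

```java
import java.util.ArrayList;
import java.util.List;

public class MutatingFilterDemo {

    // Hypothetical stand-in for a filter function, so the sketch runs without Flink.
    interface FilterFunction<T> {
        boolean filter(T value);
    }

    // Hypothetical mutable record type with a single key field.
    static final class Rec {
        int key;
        Rec(int key) { this.key = key; }
    }

    // Applies the filter and passes on the very object the function saw.
    static List<Rec> applyFilter(List<Rec> input, FilterFunction<Rec> f) {
        List<Rec> out = new ArrayList<>();
        for (Rec r : input) {
            if (f.filter(r)) {
                out.add(r);
            }
        }
        return out;
    }

    // Checks whether the records are in ascending key order.
    static boolean isSortedByKey(List<Rec> recs) {
        for (int i = 1; i < recs.size(); i++) {
            if (recs.get(i - 1).key > recs.get(i).key) {
                return false;
            }
        }
        return true;
    }

    // Builds input sorted by key, runs a mutating filter, and reports
    // whether the output is still sorted.
    static boolean outputStillSorted() {
        List<Rec> sorted = new ArrayList<>();
        for (int k = 1; k <= 5; k++) {
            sorted.add(new Rec(k));
        }
        // Keeps every record, but negates the key as a side effect.
        FilterFunction<Rec> mutating = r -> { r.key = -r.key; return true; };
        return isSortedByKey(applyFilter(sorted, mutating));
    }

    public static void main(String[] args) {
        System.out.println("sorted after filter: " + outputStillSorted());
    }
}
```

An optimizer that skips a re-sort after this filter, on the assumption that the order is preserved, would hand unsorted data to the next operator.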

Possible solutions are:
- document the requirements (and hope that users read them and behave nicely)
- hand the function a copy, which it can modify but which is not passed on (might confuse users). This could also be integrated with the mutable/immutable runtime switch (FLINK-1005)
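
The second option could look roughly like the following sketch. It is not Flink code: the FilterFunction interface, the Rec type with its copy() method, and the applyFilterWithCopy helper are hypothetical names introduced for illustration. The function receives a copy of each record, and only the untouched original is passed on, so any modification made inside the function is discarded.

```java
import java.util.ArrayList;
import java.util.List;

public class CopyingFilterDemo {

    // Hypothetical stand-in for a filter function, so the sketch runs without Flink.
    interface FilterFunction<T> {
        boolean filter(T value);
    }

    // Hypothetical mutable record type that knows how to copy itself.
    static final class Rec {
        int key;
        Rec(int key) { this.key = key; }
        Rec copy() { return new Rec(key); }
    }

    // Hands a copy to the function and forwards the original record,
    // so side effects inside the function never reach the output.
    static List<Rec> applyFilterWithCopy(List<Rec> input, FilterFunction<Rec> f) {
        List<Rec> out = new ArrayList<>();
        for (Rec r : input) {
            if (f.filter(r.copy())) {
                out.add(r);
            }
        }
        return out;
    }

    // Runs a mutating filter over keys 1..5 and returns the keys of the output.
    static List<Integer> keysAfterFilter() {
        List<Rec> input = new ArrayList<>();
        for (int k = 1; k <= 5; k++) {
            input.add(new Rec(k));
        }
        // Negates the key, then keeps records whose negated key is below -2.
        FilterFunction<Rec> mutating = r -> { r.key = -r.key; return r.key < -2; };
        List<Integer> keys = new ArrayList<>();
        for (Rec r : applyFilterWithCopy(input, mutating)) {
            keys.add(r.key);
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(keysAfterFilter());
    }
}
```

The output keeps the original keys, so the order of the input is preserved. The trade-off is one extra copy per record, and a user who expects the modification to take effect will see it silently dropped.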



