Posted to issues@flink.apache.org by "Pattarawat Chormai (JIRA)" <ji...@apache.org> on 2017/04/16 20:58:42 UTC
[jira] [Commented] (FLINK-2032) Migrate integration tests from temp output files to collect()

    [ https://issues.apache.org/jira/browse/FLINK-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970524#comment-15970524 ] 

Pattarawat Chormai commented on FLINK-2032:
-------------------------------------------

Hi all,

I searched GitHub using [1] and found several tests that haven't been refactored to use _collect_ yet:

{code}
flink-streaming-scala/src/test/scala/org/apache/flink/streaming/api/scala/StreamingOperatorsITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/functions/ClosureCleanerITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/io/ScalaCsvReaderWithPOJOITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/AggregateITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/CoGroupITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/DistinctITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/ExamplesITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/FilterITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/FirstNITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/FlatMapITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/JoinITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/MapITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/OuterJoinITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/PartitionITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/ReduceITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/runtime/ScalaSpecialTypesITCase.scala

flink-connectors/flink-avro/src/test/java/org/apache/flink/api/io/avro/AvroPojoTest.java
flink-connectors/flink-hadoop-compatibility/src/test/java/org/apache/flink/test/hadoopcompatibility/mapred/HadoopMapFunctionITCase.java
flink-connectors/flink-hadoop-compatibility/src/test/java/org/apache/flink/test/hadoopcompatibility/mapred/HadoopReduceCombineFunctionITCase.java
flink-connectors/flink-hadoop-compatibility/src/test/java/org/apache/flink/test/hadoopcompatibility/mapred/HadoopReduceFunctionITCase.java
flink-libraries/flink-cep/src/test/java/org/apache/flink/cep/CEPITCase.java
flink-libraries/flink-gelly-examples/src/test/java/org/apache/flink/graph/test/examples/IncrementalSSSPITCase.java
flink-tests/src/test/java/org/apache/flink/test/iterative/aggregators/AggregatorsITCase.java
flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/DataSinkITCase.java
{code}

I suggest creating two additional subtasks, one for Scala and one for Java, and I can help finish them. What do you think?

[1] https://github.com/apache/flink/search?p=5&q=TemporaryFolder+write&type=&utf8=%E2%9C%93

> Migrate integration tests from temp output files to collect()
> -------------------------------------------------------------
>
>                 Key: FLINK-2032
>                 URL: https://issues.apache.org/jira/browse/FLINK-2032
>             Project: Flink
>          Issue Type: Task
>          Components: Tests
>    Affects Versions: 0.9
>            Reporter: Fabian Hueske
>            Priority: Minor
>              Labels: starter
>
> Most of Flink's integration tests that execute full Flink programs and check their results are implemented by writing results to a temporary output file and comparing the content of the file to a provided set of expected Strings. Flink's test utils make this quite comfortable and hide a lot of the complexity of this approach. Nonetheless, this approach has a few drawbacks:
> - increased latency by going through disk
> - comparison is on String representation of objects
> - depends on the file system
> Since Flink's {{collect()}} feature was added, the temp file approach is not the best approach anymore. Instead, tests can collect the result of a Flink program directly as objects and compare these against a set of expected objects.
> It would be good to migrate the existing test base to use {{collect()}} instead of temporary output files.
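
The difference between the two test styles described above can be sketched in plain Java. This is only an illustrative stand-in and uses no Flink classes: the names CollectVsTempFileSketch, WordCount, runProgram, resultViaTempFile and resultViaCollect are hypothetical, and runProgram merely imitates a program under test. In an actual Flink test, the second style would presumably call {{collect()}} on the result data set instead of returning a list directly.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Plain-Java stand-in for the two test styles; no Flink API is used here.
class CollectVsTempFileSketch {

    // Hypothetical result type produced by the program under test.
    static final class WordCount {
        final String word;
        final int count;

        WordCount(String word, int count) {
            this.word = word;
            this.count = count;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof WordCount)) {
                return false;
            }
            WordCount other = (WordCount) o;
            return count == other.count && word.equals(other.word);
        }

        @Override
        public int hashCode() {
            return Objects.hash(word, count);
        }

        @Override
        public String toString() {
            return word + "," + count;
        }
    }

    // Stand-in for executing a full program (assumption: fixed results).
    static List<WordCount> runProgram() {
        return Arrays.asList(new WordCount("flink", 2), new WordCount("test", 1));
    }

    // Old style: write results to a temp file, read them back as sorted Strings.
    static List<String> resultViaTempFile() {
        try {
            Path out = Files.createTempFile("result", ".txt");
            try {
                List<String> lines = new ArrayList<>();
                for (WordCount wc : runProgram()) {
                    lines.add(wc.toString());
                }
                Files.write(out, lines);
                List<String> read = new ArrayList<>(Files.readAllLines(out));
                Collections.sort(read);
                return read;
            } finally {
                Files.deleteIfExists(out);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // New style: obtain the results directly as objects, without touching disk.
    static List<WordCount> resultViaCollect() {
        return new ArrayList<>(runProgram());
    }

    public static void main(String[] args) {
        // String comparison depends on toString() and on the file system.
        List<String> expectedLines = Arrays.asList("flink,2", "test,1");
        System.out.println(resultViaTempFile().equals(expectedLines));

        // Object comparison relies on equals() only.
        List<WordCount> expected =
                Arrays.asList(new WordCount("flink", 2), new WordCount("test", 1));
        System.out.println(resultViaCollect().equals(expected));
    }
}
```

The file-based variant passes only as long as toString() keeps its format and the file system behaves, whereas the object-based variant depends only on equals(), which matches the drawbacks listed in the description.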



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)