Posted to dev@flink.apache.org by Viktor Rosenfeld <vi...@tu-berlin.de> on 2014/11/05 13:44:01 UTC

Unit testing Flink programs / DataSet operations

Hi everybody,

I have the following test case prototype and I want to verify that sum()
actually computes the sum.

    @Test
    public void shouldComputeSum() throws Exception {
        // given
        ExecutionEnvironment env =
                ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple1<Long>> input = env.fromElements(
                new Tuple1<Long>(1L),
                new Tuple1<Long>(2L),
                new Tuple1<Long>(3L));

        // when
        DataSet<Tuple1<Long>> result = input.sum(0);

        // then
        // verify that result is 6
    }

I found AggregateTranslationTest where a program plan is created and then
the sink is accessed to verify some structure on the output operator. Using
this as a starting point, I wrote the following code:

        // verify that the result is 6
        OutputFormat<Tuple1<Long>> outputFormat =
                mock(OutputFormat.class, withSettings().serializable());
        result.output(outputFormat);
        env.execute("ComputeCountTest");
        verify(outputFormat).writeRecord(new Tuple1<Long>(6L));

I encountered a few problems:

- I can't run this test code from the flink-java module because
env.execute() requires flink-clients, which leads to a circular dependency.

- The outputFormat needs to be serializable. Luckily, Mockito supports this,
even though its authors consider it a code smell (which is debatable).

- It doesn't actually work. Mockito prints:

    Wanted but not invoked:
    outputFormat.writeRecord((6));
    -> at org.apache.flink.api.java.operator.MyAggregateOperatorTest.shouldComputeSum(MyAggregateOperatorTest.java:31)
    Actually, there were zero interactions with this mock.

I suspect env.execute() is non-blocking and that there's a race condition.
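
Another possible explanation, stated here as an assumption rather than
anything confirmed in this thread: since the output format has to be
serializable, the framework may hand a deserialized copy to the running job.
In that case writeRecord() is called on the copy, and the mock held by the
test never sees it. The sketch below shows that effect with plain Java
serialization, with no Flink or Mockito involved. RecordingSink and
roundTrip() are hypothetical names introduced only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a serializable mock: it only records what it is given.
class RecordingSink implements Serializable {
    private static final long serialVersionUID = 1L;

    final List<Long> records = new ArrayList<>();

    void writeRecord(long value) {
        records.add(value);
    }
}

class SerializedCopyDemo {

    // Serializes and deserializes an object, the way a framework might when it
    // ships a user-supplied object to the code that actually runs the job.
    // (Assumption for illustration; this is not Flink code.)
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T object) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(object);
            }
            ByteArrayInputStream source = new ByteArrayInputStream(bytes.toByteArray());
            try (ObjectInputStream in = new ObjectInputStream(source)) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        RecordingSink original = new RecordingSink();
        RecordingSink copy = roundTrip(original);

        // Only the copy is used by the "job".
        copy.writeRecord(6L);

        System.out.println("copy saw:     " + copy.records);
        System.out.println("original saw: " + original.records);
    }
}
```

Running main() shows [6] for the copy and [] for the original, which would be
consistent with the "zero interactions with this mock" message above.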

Executing a whole Flink program is probably too heavyweight for a unit test,
but I wanted to use it as a starting point. I also found two other ways
to test operator code, but I'm not sure which is preferred:

- MapTest: invokes a Map operator on a collection using
MockInvokable.createAndExecute()

- MapOperatorTest: invokes a Map operator op on a collection using
op.executeOnCollection()

So, my question is basically whether there's a best practice in the Flink code
base for writing a unit test similar to the one above.

Best,
Viktor



--
View this message in context: http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Unit-testing-Flink-programs-DataSet-operations-tp2371.html
Sent from the Apache Flink (Incubator) Mailing List archive. mailing list archive at Nabble.com.

Re: Unit testing Flink programs / DataSet operations

Posted by Stephan Ewen <se...@apache.org>.
You can have a look at this example, which grabs the output for verification:

https://github.com/apache/incubator-flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/broadcastvars/BroadcastVarInitializationITCase.java

On Wed, Nov 5, 2014 at 2:13 PM, Viktor Rosenfeld <
viktor.rosenfeld@tu-berlin.de> wrote:

> Hi Stephan,
>
>
> Stephan Ewen wrote
> > Why don't you simply run this program and verify that the result is 6?
>
> You mean verify by hand? I want to automate that.
>
>
> > You can use the "LocalCollectionOutputFormat" to collect the results (in
> > your case the one value) and compare it.
>
> Thanks, that's what I was looking for!
>
> Best,
> Viktor

Re: Unit testing Flink programs / DataSet operations

Posted by Viktor Rosenfeld <vi...@tu-berlin.de>.
Hi Stephan,


Stephan Ewen wrote
> Why don't you simply run this program and verify that the result is 6?

You mean verify by hand? I want to automate that.


> You can use the "LocalCollectionOutputFormat" to collect the results (in
> your case the one value) and compare it.

Thanks, that's what I was looking for!

Best,
Viktor




Re: Unit testing Flink programs / DataSet operations

Posted by Stephan Ewen <se...@apache.org>.
Hey!

Why don't you simply run this program and verify that the result is 6?

You can use the "LocalCollectionOutputFormat" to collect the results (in
your case the one value) and compare it.
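
The idea of collecting results into a local collection can be sketched in
plain Java. This is an illustrative stand-in, not the implementation of
Flink's LocalCollectionOutputFormat; CollectingSink, REGISTRY, and roundTrip()
are hypothetical names. It assumes the sink may be serialized before it runs,
so the sink carries only an id and looks up the caller's collection in a
static registry inside the same JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sink: the only serialized state is an id. The target collection
// lives in a static registry, so a deserialized copy in the same JVM still
// writes into the collection that the test holds.
class CollectingSink<T> implements Serializable {
    private static final long serialVersionUID = 1L;

    private static final Map<Integer, Collection<?>> REGISTRY = new ConcurrentHashMap<>();
    private static final AtomicInteger NEXT_ID = new AtomicInteger();

    private final int id;

    CollectingSink(Collection<T> target) {
        this.id = NEXT_ID.getAndIncrement();
        // Sketch only: entries are never removed from the registry.
        REGISTRY.put(id, target);
    }

    @SuppressWarnings("unchecked")
    void writeRecord(T record) {
        Collection<T> target = (Collection<T>) REGISTRY.get(id);
        target.add(record);
    }
}

class CollectingSinkDemo {

    // Simulates shipping the sink to the code that runs the job.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T object) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(object);
            }
            ByteArrayInputStream source = new ByteArrayInputStream(bytes.toByteArray());
            try (ObjectInputStream in = new ObjectInputStream(source)) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        List<Long> collected = new ArrayList<>();
        CollectingSink<Long> sink = new CollectingSink<>(collected);

        // The "job" only ever sees a deserialized copy of the sink.
        CollectingSink<Long> shipped = roundTrip(sink);
        shipped.writeRecord(1L + 2L + 3L);

        // The test can now compare the collected values with the expected sum.
        System.out.println(collected);
    }
}
```

With Flink on the classpath, the corresponding test would presumably pass a
LocalCollectionOutputFormat wrapping a list to result.output(...), call
env.execute(), and then assert on the contents of that list.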

Stephan


