Posted to issues@flink.apache.org by azagrebin <gi...@git.apache.org> on 2018/05/02 09:58:09 UTC
[GitHub] flink pull request #5947: [FLINK-8978] Stateful generic stream job upgrade e...
GitHub user azagrebin opened a pull request:
https://github.com/apache/flink/pull/5947
[FLINK-8978] Stateful generic stream job upgrade e2e test
## What is the purpose of the change
An e2e test for upgrading a generic stateful job and recovering its state based on operator uid.
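As a rough illustration of why uid-based recovery tolerates reordered and newly added operators, here is a toy model in plain Java. It is not Flink's restore logic: the `restore` helper and the uids `stateMap1`, `stateMap2`, and `stateMap3` are hypothetical names chosen for this sketch, and state is reduced to a string per uid.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UidRestoreSketch {

    /**
     * Toy model: looks up savepoint state by operator uid, independent of
     * the operator's position in the job. Operators whose uid is not in the
     * savepoint get null, i.e. they start without restored state.
     */
    static Map<String, String> restore(Map<String, String> savepoint, List<String> newJobUids) {
        Map<String, String> restored = new LinkedHashMap<>();
        for (String uid : newJobUids) {
            restored.put(uid, savepoint.get(uid));
        }
        return restored;
    }

    public static void main(String[] args) {
        // "original" variant: two stateful map operators (hypothetical uids).
        Map<String, String> savepoint = new LinkedHashMap<>();
        savepoint.put("stateMap1", "state-of-map-1");
        savepoint.put("stateMap2", "state-of-map-2");

        // "upgraded" variant: the two operators swap places and a third is added.
        List<String> upgraded = Arrays.asList("stateMap2", "stateMap1", "stateMap3");

        for (Map.Entry<String, String> e : restore(savepoint, upgraded).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

In this model the two existing operators get their own state back despite the swap, and the added operator starts empty.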
## Brief change log
- extract `DataStreamAllroundTestJobFactory` from `DataStreamAllroundTestProgram` to reuse generic components and their cli config.
- extract some code snippets into functions from `test_scripts/test_resume_savepoint.sh` to `test_scripts/common.sh` to reuse them in this new test
- add `flink-stream-stateful-job-upgrade-test` module to flink e2e tests
- add `StatefulStreamJobUpgradeTestProgram` which is constructed from `DataStreamAllroundTestJobFactory`
- add `test_scripts/test_stateful_stream_job_upgrade.sh` to `run-nightly-tests.sh`
## Verifying this change
Run from the Flink repo root:
```bash
$ mvn clean package
$ FLINK_DIR=build-target flink-end-to-end-tests/test-scripts/test_stateful_stream_job_upgrade.sh
```
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
- The S3 file system connector: (no)
## Documentation
- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (JavaDocs)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/azagrebin/flink FLINK-8978
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/5947.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #5947
----
commit 4282b9cee00dafd4963a3a9902e63cc0fa77d385
Author: Andrey Zagrebin <an...@...>
Date: 2018-04-30T18:25:53Z
[FLINK-8978] Stateful generic stream job upgrade e2e test
----
---
[GitHub] flink issue #5947: [FLINK-8978] Stateful generic stream job upgrade e2e test
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on the issue:
https://github.com/apache/flink/pull/5947
LGTM 👍 Will merge this.
---
[GitHub] flink pull request #5947: [FLINK-8978] Stateful generic stream job upgrade e...
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on a diff in the pull request:
https://github.com/apache/flink/pull/5947#discussion_r185537704
--- Diff: flink-end-to-end-tests/flink-stream-stateful-job-upgrade-test/src/main/java/org/apache/flink/streaming/tests/StatefulStreamJobUpgradeTestProgram.java ---
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.tests;
+
+import org.apache.flink.api.common.functions.JoinFunction;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer;
+import org.apache.flink.api.java.utils.ParameterTool;
+import org.apache.flink.configuration.ConfigOption;
+import org.apache.flink.configuration.ConfigOptions;
+import org.apache.flink.streaming.api.datastream.KeyedStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.functions.sink.PrintSinkFunction;
+import org.apache.flink.streaming.tests.artificialstate.eventpayload.ComplexPayload;
+
+import static org.apache.flink.streaming.tests.DataStreamAllroundTestJobFactory.createArtificialKeyedStateMapper;
+import static org.apache.flink.streaming.tests.DataStreamAllroundTestJobFactory.createEventSource;
+import static org.apache.flink.streaming.tests.DataStreamAllroundTestJobFactory.createSemanticsCheckMapper;
+import static org.apache.flink.streaming.tests.DataStreamAllroundTestJobFactory.createTimestampExtractor;
+import static org.apache.flink.streaming.tests.DataStreamAllroundTestJobFactory.setupEnvironment;
+
+import java.util.Collections;
+import java.util.List;
+
+/**
+ * Test upgrade of generic stateful job for Flink's DataStream API operators and primitives.
+ *
+ * <p>The job is constructed of generic components from {@link DataStreamAllroundTestJobFactory}.
+ * The gaol is to test successful state restoration after taking savepoint and recovery with new job version.
+ * It can be configured with '--test.job.variant' to run different variants of it:
+ * <ul>
+ * <li><b>original:</b> includes 2 custom stateful map operators</li>
+ * <li><b>upgraded:</b> changes order of 2 custom stateful map operators and adds one more</li>
+ * </ul>
+ */
--- End diff --
I think the comment on the job classes should mention that all possible configuration options are documented in the comment of `DataStreamAllroundTestJobFactory`, so that users can easily find them.
---
[GitHub] flink issue #5947: [FLINK-8978] Stateful generic stream job upgrade e2e test
Posted by azagrebin <gi...@git.apache.org>.
Github user azagrebin commented on the issue:
https://github.com/apache/flink/pull/5947
Thanks for the review and the good points, @StefanRRichter.
I updated the PR to address the comments.
The resume state e2e test also checks operator <-> state correspondence upon state restoration now.
---
[GitHub] flink pull request #5947: [FLINK-8978] Stateful generic stream job upgrade e...
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on a diff in the pull request:
https://github.com/apache/flink/pull/5947#discussion_r185537151
--- Diff: flink-end-to-end-tests/flink-stream-stateful-job-upgrade-test/src/main/java/org/apache/flink/streaming/tests/StatefulStreamJobUpgradeTestProgram.java ---
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.tests;
+
+import org.apache.flink.api.common.functions.JoinFunction;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer;
+import org.apache.flink.api.java.utils.ParameterTool;
+import org.apache.flink.configuration.ConfigOption;
+import org.apache.flink.configuration.ConfigOptions;
+import org.apache.flink.streaming.api.datastream.KeyedStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.functions.sink.PrintSinkFunction;
+import org.apache.flink.streaming.tests.artificialstate.eventpayload.ComplexPayload;
+
+import static org.apache.flink.streaming.tests.DataStreamAllroundTestJobFactory.createArtificialKeyedStateMapper;
+import static org.apache.flink.streaming.tests.DataStreamAllroundTestJobFactory.createEventSource;
+import static org.apache.flink.streaming.tests.DataStreamAllroundTestJobFactory.createSemanticsCheckMapper;
+import static org.apache.flink.streaming.tests.DataStreamAllroundTestJobFactory.createTimestampExtractor;
+import static org.apache.flink.streaming.tests.DataStreamAllroundTestJobFactory.setupEnvironment;
+
+import java.util.Collections;
+import java.util.List;
+
+/**
+ * Test upgrade of generic stateful job for Flink's DataStream API operators and primitives.
+ *
+ * <p>The job is constructed of generic components from {@link DataStreamAllroundTestJobFactory}.
+ * The gaol is to test successful state restoration after taking savepoint and recovery with new job version.
--- End diff --
typo `gaol`
---
[GitHub] flink issue #5947: [FLINK-8978] Stateful generic stream job upgrade e2e test
Posted by azagrebin <gi...@git.apache.org>.
Github user azagrebin commented on the issue:
https://github.com/apache/flink/pull/5947
Agreed, I added a constant check of the previous non-null state to its update method.
---
[GitHub] flink pull request #5947: [FLINK-8978] Stateful generic stream job upgrade e...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/flink/pull/5947
---
[GitHub] flink issue #5947: [FLINK-8978] Stateful generic stream job upgrade e2e test
Posted by azagrebin <gi...@git.apache.org>.
Github user azagrebin commented on the issue:
https://github.com/apache/flink/pull/5947
cc @StefanRRichter
---
[GitHub] flink pull request #5947: [FLINK-8978] Stateful generic stream job upgrade e...
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on a diff in the pull request:
https://github.com/apache/flink/pull/5947#discussion_r185541401
--- Diff: flink-end-to-end-tests/flink-stream-stateful-job-upgrade-test/src/main/java/org/apache/flink/streaming/tests/StatefulStreamJobUpgradeTestProgram.java ---
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.tests;
+
+import org.apache.flink.api.common.functions.JoinFunction;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer;
+import org.apache.flink.api.java.utils.ParameterTool;
+import org.apache.flink.configuration.ConfigOption;
+import org.apache.flink.configuration.ConfigOptions;
+import org.apache.flink.streaming.api.datastream.KeyedStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.functions.sink.PrintSinkFunction;
+import org.apache.flink.streaming.tests.artificialstate.eventpayload.ComplexPayload;
+
+import static org.apache.flink.streaming.tests.DataStreamAllroundTestJobFactory.createArtificialKeyedStateMapper;
+import static org.apache.flink.streaming.tests.DataStreamAllroundTestJobFactory.createEventSource;
+import static org.apache.flink.streaming.tests.DataStreamAllroundTestJobFactory.createSemanticsCheckMapper;
+import static org.apache.flink.streaming.tests.DataStreamAllroundTestJobFactory.createTimestampExtractor;
+import static org.apache.flink.streaming.tests.DataStreamAllroundTestJobFactory.setupEnvironment;
+
+import java.util.Collections;
+import java.util.List;
+
+/**
+ * Test upgrade of generic stateful job for Flink's DataStream API operators and primitives.
+ *
+ * <p>The job is constructed of generic components from {@link DataStreamAllroundTestJobFactory}.
+ * The gaol is to test successful state restoration after taking savepoint and recovery with new job version.
+ * It can be configured with '--test.job.variant' to run different variants of it:
+ * <ul>
+ * <li><b>original:</b> includes 2 custom stateful map operators</li>
+ * <li><b>upgraded:</b> changes order of 2 custom stateful map operators and adds one more</li>
+ * </ul>
+ */
+public class StatefulStreamJobUpgradeTestProgram {
+ private static final String TEST_JOB_VARIANT_ORIGINAL = "original";
+ private static final String TEST_JOB_VARIANT_UPGRADED = "upgraded";
+
+ private static final JoinFunction<Event, ComplexPayload, ComplexPayload> SIMPLE_STATE_UPDATE =
+ (Event first, ComplexPayload second) -> new ComplexPayload(first);
+ private static final JoinFunction<Event, ComplexPayload, ComplexPayload> LAST_EVENT_STATE_UPDATE =
+ (Event first, ComplexPayload second) ->
+ (second != null && first.getEventTime() <= second.getEventTime()) ? second : new ComplexPayload(first);
+
+ private static final ConfigOption<String> TEST_JOB_VARIANT = ConfigOptions
+ .key("test.job.variant")
+ .defaultValue(TEST_JOB_VARIANT_ORIGINAL)
+ .withDescription(String.format("This configures the job variant to test. Can be '%s' or '%s'",
+ TEST_JOB_VARIANT_ORIGINAL, TEST_JOB_VARIANT_UPGRADED));
+
+ public static void main(String[] args) throws Exception {
+ final ParameterTool pt = ParameterTool.fromArgs(args);
+
+ final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+
+ setupEnvironment(env, pt);
+
+ KeyedStream<Event, Integer> source = env.addSource(createEventSource(pt))
+ .assignTimestampsAndWatermarks(createTimestampExtractor(pt))
+ .keyBy(Event::getKey);
+
+ List<TypeSerializer<ComplexPayload>> stateSer =
+ Collections.singletonList(new KryoSerializer<>(ComplexPayload.class, env.getConfig()));
+
+ boolean isOriginal = pt.get(TEST_JOB_VARIANT.key()).equals(TEST_JOB_VARIANT_ORIGINAL);
--- End diff --
We could throw an `IllegalArgumentException` for any unexpected string.
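A minimal sketch of that validation in plain Java, independent of `ParameterTool`. The class `JobVariantCheck` and the helper `isOriginal` are hypothetical names for illustration, not code from the PR.

```java
import java.util.Arrays;
import java.util.List;

public class JobVariantCheck {
    static final String ORIGINAL = "original";
    static final String UPGRADED = "upgraded";
    static final List<String> KNOWN_VARIANTS = Arrays.asList(ORIGINAL, UPGRADED);

    /** Returns true for "original", false for "upgraded", and rejects anything else. */
    static boolean isOriginal(String variant) {
        if (variant == null || !KNOWN_VARIANTS.contains(variant)) {
            throw new IllegalArgumentException(
                "Unknown test job variant '" + variant + "', expected one of " + KNOWN_VARIANTS);
        }
        return ORIGINAL.equals(variant);
    }

    public static void main(String[] args) {
        System.out.println(isOriginal("original"));
        System.out.println(isOriginal("upgraded"));
        try {
            isOriginal("upgrded"); // misspelled on purpose
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a plain `equals` check, a misspelled variant would silently run as the upgraded job. Failing fast makes the misconfiguration visible in the test script.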
---
[GitHub] flink pull request #5947: [FLINK-8978] Stateful generic stream job upgrade e...
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on a diff in the pull request:
https://github.com/apache/flink/pull/5947#discussion_r185840213
--- Diff: flink-end-to-end-tests/flink-datastream-allround-test/src/main/java/org/apache/flink/streaming/tests/artificialstate/eventpayload/ArtificialValueStateBuilder.java ---
@@ -33,21 +33,28 @@
private static final long serialVersionUID = -1205814329756790916L;
private transient ValueState<STATE> valueState;
+ private transient boolean afterRestoration;
private final TypeSerializer<STATE> typeSerializer;
private final JoinFunction<IN, STATE, STATE> stateValueGenerator;
+ private final RestoredStateVerifier<STATE> restoredStateVerifier;
public ArtificialValueStateBuilder(
String stateName,
JoinFunction<IN, STATE, STATE> stateValueGenerator,
- TypeSerializer<STATE> typeSerializer) {
-
+ TypeSerializer<STATE> typeSerializer,
+ RestoredStateVerifier<STATE> restoredStateVerifier) {
super(stateName);
this.typeSerializer = typeSerializer;
this.stateValueGenerator = stateValueGenerator;
+ this.restoredStateVerifier = restoredStateVerifier;
}
@Override
public void artificialStateForElement(IN event) throws Exception {
+ if (afterRestoration) {
--- End diff --
I find this way of checking the state rather invasive and not completely thorough. There is now a pretty tight coupling between creating artificial state and checking something about it on restore. In particular, the point at which the check happens is now hardcoded. This makes it harder to reuse the classes in further test jobs that we might want to build with them. Can't we use an approach that is based more on composition? For example, wrap the state builder in a state checker? This also does just one check, so if the input element has a key that we never encountered, the state is `null` and there might be no check.
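One possible shape for such a composition, sketched in plain Java without Flink types. Everything here is an assumption for illustration: `KeyedStateBuilder`, `LastTimeBuilder`, `CheckingBuilder`, and `SketchEvent` are hypothetical stand-ins, not the classes in the PR. The checker wraps any builder and verifies the stored state for every element before delegating the update.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiPredicate;

public class StateCheckerSketch {

    /** Hypothetical, simplified stand-in for an artificial keyed state builder. */
    interface KeyedStateBuilder<IN, STATE> {
        STATE current(IN event); // state stored for the event's key, or null for a new key
        void update(IN event);   // derive and store new state from the event
    }

    /** Toy event: a key plus a timestamp. */
    static final class SketchEvent {
        final int key;
        final long time;

        SketchEvent(int key, long time) {
            this.key = key;
            this.time = time;
        }
    }

    /** Plain builder: keeps the latest timestamp per key and knows nothing about checks. */
    static final class LastTimeBuilder implements KeyedStateBuilder<SketchEvent, Long> {
        private final Map<Integer, Long> state = new HashMap<>();

        @Override
        public Long current(SketchEvent e) {
            return state.get(e.key);
        }

        @Override
        public void update(SketchEvent e) {
            state.merge(e.key, e.time, Math::max);
        }
    }

    /** Decorator: verifies the stored state for every element, then delegates. */
    static final class CheckingBuilder<IN, STATE> implements KeyedStateBuilder<IN, STATE> {
        private final KeyedStateBuilder<IN, STATE> delegate;
        private final BiPredicate<IN, STATE> check;

        CheckingBuilder(KeyedStateBuilder<IN, STATE> delegate, BiPredicate<IN, STATE> check) {
            this.delegate = delegate;
            this.check = check;
        }

        @Override
        public STATE current(IN e) {
            return delegate.current(e);
        }

        @Override
        public void update(IN e) {
            STATE stored = delegate.current(e);
            // A key that was never seen has no state yet, so there is nothing to verify.
            if (stored != null && !check.test(e, stored)) {
                throw new IllegalStateException("Unexpected state " + stored + " for incoming element");
            }
            delegate.update(e);
        }
    }

    public static void main(String[] args) {
        KeyedStateBuilder<SketchEvent, Long> builder = new CheckingBuilder<SketchEvent, Long>(
            new LastTimeBuilder(), (e, last) -> last <= e.time);

        builder.update(new SketchEvent(1, 10L)); // new key, nothing to check
        builder.update(new SketchEvent(1, 20L)); // stored 10 <= 20, check passes
        try {
            builder.update(new SketchEvent(1, 5L)); // stored 20 > 5, check fails
        } catch (IllegalStateException ex) {
            System.out.println("check failed: " + ex.getMessage());
        }
    }
}
```

Because the checker only depends on the builder interface, the plain builder stays reusable in jobs that do not need verification, and the timing of the check is no longer hardcoded in the builder itself.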
---
[GitHub] flink pull request #5947: [FLINK-8978] Stateful generic stream job upgrade e...
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on a diff in the pull request:
https://github.com/apache/flink/pull/5947#discussion_r185840683
--- Diff: flink-end-to-end-tests/flink-datastream-allround-test/src/main/java/org/apache/flink/streaming/tests/artificialstate/eventpayload/ArtificialValueStateBuilder.java ---
@@ -33,21 +33,28 @@
private static final long serialVersionUID = -1205814329756790916L;
private transient ValueState<STATE> valueState;
+ private transient boolean afterRestoration;
private final TypeSerializer<STATE> typeSerializer;
private final JoinFunction<IN, STATE, STATE> stateValueGenerator;
+ private final RestoredStateVerifier<STATE> restoredStateVerifier;
public ArtificialValueStateBuilder(
String stateName,
JoinFunction<IN, STATE, STATE> stateValueGenerator,
- TypeSerializer<STATE> typeSerializer) {
-
+ TypeSerializer<STATE> typeSerializer,
+ RestoredStateVerifier<STATE> restoredStateVerifier) {
super(stateName);
this.typeSerializer = typeSerializer;
this.stateValueGenerator = stateValueGenerator;
+ this.restoredStateVerifier = restoredStateVerifier;
}
@Override
public void artificialStateForElement(IN event) throws Exception {
+ if (afterRestoration) {
--- End diff --
As this is a test job, I think it might not hurt to just check every element after a restore.
---
[GitHub] flink issue #5947: [FLINK-8978] Stateful generic stream job upgrade e2e test
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on the issue:
https://github.com/apache/flink/pull/5947
Please also double-check: it seems some files are missing a license header.
---