You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by kl0u <gi...@git.apache.org> on 2018/04/25 11:57:42 UTC
[GitHub] flink pull request #5910: [FLINK-8841] [state] Remove HashMapSerializer and ...
GitHub user kl0u opened a pull request:
https://github.com/apache/flink/pull/5910
[FLINK-8841] [state] Remove HashMapSerializer and use MapSerializer instead.
## What is the purpose of the change
So far we had the `MapSerializer` and the `HashMapSerializer`. The two had almost identical code and the second was only used on the `HeapStateBackend`/`FSStateBackend` when creating a `MapState`. This PR removes the `HashMapSerializer` and replaces its uses with the `MapSerializer`. It also guarantees backwards compatibility.
## Brief change log
It introduces the `MigrationUtil` as an inner class of the `InstantiationUtil`. This class contains mapping between deprecated/deleted serializers and their replacements.
Also the removal of the `HashMapSerializer` uniformizes a bit the `HeapMapState` and the `RocksDBMapState`.
## Verifying this change
Added the `HeapKeyedStateBackendSnapshotMigrationTest#testMapStateMigrationAfterHashMapSerRemoval()`.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: yes
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
- The S3 file system connector: no
## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable
R @StefanRRichter or @aljoscha
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/kl0u/flink map-serializer-inv
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/5910.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #5910
----
commit 7de83de4765384080cea6d94b64a81e1584ce82e
Author: kkloudas <kk...@...>
Date: 2018-04-24T12:48:34Z
[FLINK-8841] Remove HashMapSerializer and use MapSerializer instead.
----
---
[GitHub] flink issue #5910: [FLINK-8841] [state] Remove HashMapSerializer and use Map...
Posted by bowenli86 <gi...@git.apache.org>.
Github user bowenli86 commented on the issue:
https://github.com/apache/flink/pull/5910
+1
---
[GitHub] flink pull request #5910: [FLINK-8841] [state] Remove HashMapSerializer and ...
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on a diff in the pull request:
https://github.com/apache/flink/pull/5910#discussion_r184304412
--- Diff: flink-core/src/main/java/org/apache/flink/util/InstantiationUtil.java ---
@@ -221,6 +226,52 @@ protected ObjectStreamClass readClassDescriptor() throws IOException, ClassNotFo
}
}
+ /**
+ * A mapping between the full path of a deprecated serializer and its equivalent.
+ * These mappings are hardcoded and fixed.
+ *
+ * <p>IMPORTANT: mappings can be removed after 1 release as there will be a "migration path".
+ * As an example, a serializer is removed in 1.5-SNAPSHOT, then the mapping should be added for 1.5,
+ * and it can be removed in 1.6, as the path would be Flink-{< 1.5} -> Flink-1.5 -> Flink-{>= 1.6}.
+ */
+ private enum MigrationUtil {
+
+ // To add a new mapping just pick a name and add an entry as the following:
+
+ GENERIC_DATA_ARRAY_SERIALIZER(
+ "org.apache.avro.generic.GenericData$Array",
+ ObjectStreamClass.lookup(KryoRegistrationSerializerConfigSnapshot.DummyRegisteredClass.class)),
+ HASH_MAP_SERIALIZER(
+ "org.apache.flink.runtime.state.HashMapSerializer",
+ ObjectStreamClass.lookup(MapSerializer.class)); // added in 1.5
+
+ /** An internal unmoddifiable map containing the mappings between deprecated and new serialzers. */
+ private static final Map<String, ObjectStreamClass> equivalenceMap = Collections.unmodifiableMap(initMap());
--- End diff --
If it is static final, I suggest to use capital letter and underscores for the variable name.
---
[GitHub] flink issue #5910: [FLINK-8841] [state] Remove HashMapSerializer and use Map...
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on the issue:
https://github.com/apache/flink/pull/5910
LGTM 👍
---
[GitHub] flink pull request #5910: [FLINK-8841] [state] Remove HashMapSerializer and ...
Posted by yuqi1129 <gi...@git.apache.org>.
Github user yuqi1129 commented on a diff in the pull request:
https://github.com/apache/flink/pull/5910#discussion_r184044453
--- Diff: flink-core/src/main/java/org/apache/flink/util/InstantiationUtil.java ---
@@ -221,6 +226,52 @@ protected ObjectStreamClass readClassDescriptor() throws IOException, ClassNotFo
}
}
+ /**
+ * A mapping between the full path of a deprecated serializer and its equivalent.
+ * These mappings are hardcoded and fixed.
+ *
+ * <p>IMPORTANT: mappings can be removed after 1 release as there will be a "migration path".
+ * As an example, a serializer is removed in 1.5-SNAPSHOT, then the mapping should be added for 1.5,
+ * and it can be removed in 1.6, as the path would be Flink-{< 1.5} -> Flink-1.5 -> Flink-{>= 1.6}.
+ */
+ private enum MigrationUtil {
+
+ // To add a new mapping just pick a name and add an entry as the following:
+
+ GENERIC_DATA_ARRAY_SERIALIZER(
+ "org.apache.avro.generic.GenericData$Array",
+ ObjectStreamClass.lookup(KryoRegistrationSerializerConfigSnapshot.DummyRegisteredClass.class)),
+ HASH_MAP_SERIALIZER(
+ "org.apache.flink.runtime.state.HashMapSerializer",
+ ObjectStreamClass.lookup(MapSerializer.class)); // added in 1.5
+
+ /** An internal unmoddifiable map containing the mappings between deprecated and new serialzers. */
--- End diff --
`unmoddifiable`, `serialzers `
spelling mistake
---
[GitHub] flink pull request #5910: [FLINK-8841] [state] Remove HashMapSerializer and ...
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on a diff in the pull request:
https://github.com/apache/flink/pull/5910#discussion_r184304664
--- Diff: flink-core/src/main/java/org/apache/flink/util/InstantiationUtil.java ---
@@ -221,6 +226,52 @@ protected ObjectStreamClass readClassDescriptor() throws IOException, ClassNotFo
}
}
+ /**
+ * A mapping between the full path of a deprecated serializer and its equivalent.
+ * These mappings are hardcoded and fixed.
+ *
+ * <p>IMPORTANT: mappings can be removed after 1 release as there will be a "migration path".
+ * As an example, a serializer is removed in 1.5-SNAPSHOT, then the mapping should be added for 1.5,
+ * and it can be removed in 1.6, as the path would be Flink-{< 1.5} -> Flink-1.5 -> Flink-{>= 1.6}.
+ */
+ private enum MigrationUtil {
+
+ // To add a new mapping just pick a name and add an entry as the following:
+
+ GENERIC_DATA_ARRAY_SERIALIZER(
+ "org.apache.avro.generic.GenericData$Array",
+ ObjectStreamClass.lookup(KryoRegistrationSerializerConfigSnapshot.DummyRegisteredClass.class)),
+ HASH_MAP_SERIALIZER(
+ "org.apache.flink.runtime.state.HashMapSerializer",
+ ObjectStreamClass.lookup(MapSerializer.class)); // added in 1.5
+
+ /** An internal unmoddifiable map containing the mappings between deprecated and new serialzers. */
+ private static final Map<String, ObjectStreamClass> equivalenceMap = Collections.unmodifiableMap(initMap());
+
+ /** The full name of the class of the old serializer. */
+ private final String oldSerializer;
+
+ /** The serialization descriptor of the class of the new serializer. */
+ private final ObjectStreamClass newSerializer;
--- End diff --
Here I suggest to add `StreamClass` to the name.
---
[GitHub] flink pull request #5910: [FLINK-8841] [state] Remove HashMapSerializer and ...
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on a diff in the pull request:
https://github.com/apache/flink/pull/5910#discussion_r184304800
--- Diff: flink-core/src/main/java/org/apache/flink/util/InstantiationUtil.java ---
@@ -221,6 +226,52 @@ protected ObjectStreamClass readClassDescriptor() throws IOException, ClassNotFo
}
}
+ /**
+ * A mapping between the full path of a deprecated serializer and its equivalent.
+ * These mappings are hardcoded and fixed.
+ *
+ * <p>IMPORTANT: mappings can be removed after 1 release as there will be a "migration path".
+ * As an example, a serializer is removed in 1.5-SNAPSHOT, then the mapping should be added for 1.5,
+ * and it can be removed in 1.6, as the path would be Flink-{< 1.5} -> Flink-1.5 -> Flink-{>= 1.6}.
+ */
+ private enum MigrationUtil {
+
+ // To add a new mapping just pick a name and add an entry as the following:
+
+ GENERIC_DATA_ARRAY_SERIALIZER(
+ "org.apache.avro.generic.GenericData$Array",
+ ObjectStreamClass.lookup(KryoRegistrationSerializerConfigSnapshot.DummyRegisteredClass.class)),
+ HASH_MAP_SERIALIZER(
+ "org.apache.flink.runtime.state.HashMapSerializer",
+ ObjectStreamClass.lookup(MapSerializer.class)); // added in 1.5
+
+ /** An internal unmoddifiable map containing the mappings between deprecated and new serialzers. */
+ private static final Map<String, ObjectStreamClass> equivalenceMap = Collections.unmodifiableMap(initMap());
+
+ /** The full name of the class of the old serializer. */
+ private final String oldSerializer;
+
+ /** The serialization descriptor of the class of the new serializer. */
+ private final ObjectStreamClass newSerializer;
+
+ MigrationUtil(final String oldSerializer, final ObjectStreamClass newSerializer) {
--- End diff --
the `final` in on these parameters is just noise
---
[GitHub] flink pull request #5910: [FLINK-8841] [state] Remove HashMapSerializer and ...
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on a diff in the pull request:
https://github.com/apache/flink/pull/5910#discussion_r184304555
--- Diff: flink-core/src/main/java/org/apache/flink/util/InstantiationUtil.java ---
@@ -221,6 +226,52 @@ protected ObjectStreamClass readClassDescriptor() throws IOException, ClassNotFo
}
}
+ /**
+ * A mapping between the full path of a deprecated serializer and its equivalent.
+ * These mappings are hardcoded and fixed.
+ *
+ * <p>IMPORTANT: mappings can be removed after 1 release as there will be a "migration path".
+ * As an example, a serializer is removed in 1.5-SNAPSHOT, then the mapping should be added for 1.5,
+ * and it can be removed in 1.6, as the path would be Flink-{< 1.5} -> Flink-1.5 -> Flink-{>= 1.6}.
+ */
+ private enum MigrationUtil {
+
+ // To add a new mapping just pick a name and add an entry as the following:
+
+ GENERIC_DATA_ARRAY_SERIALIZER(
+ "org.apache.avro.generic.GenericData$Array",
+ ObjectStreamClass.lookup(KryoRegistrationSerializerConfigSnapshot.DummyRegisteredClass.class)),
+ HASH_MAP_SERIALIZER(
+ "org.apache.flink.runtime.state.HashMapSerializer",
+ ObjectStreamClass.lookup(MapSerializer.class)); // added in 1.5
+
+ /** An internal unmoddifiable map containing the mappings between deprecated and new serialzers. */
+ private static final Map<String, ObjectStreamClass> equivalenceMap = Collections.unmodifiableMap(initMap());
+
+ /** The full name of the class of the old serializer. */
+ private final String oldSerializer;
--- End diff --
mabye add `name` to the variable name to show it is a string, not a serializer.
---
[GitHub] flink pull request #5910: [FLINK-8841] [state] Remove HashMapSerializer and ...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/flink/pull/5910
---
[GitHub] flink issue #5910: [FLINK-8841] [state] Remove HashMapSerializer and use Map...
Posted by kl0u <gi...@git.apache.org>.
Github user kl0u commented on the issue:
https://github.com/apache/flink/pull/5910
Thanks @yuqi1129, @bowenli86 and @StefanRRichter for the reviews. I integrated your comments. If you are done with reviewing, I will merge it as soon as Travis gives the green light.
---