Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/05/29 07:14:43 UTC
[GitHub] [flink] twentyworld commented on a change in pull request #12287: [FLINK-16077][docs-zh] Translate Custom State Serialization page into Chinese

twentyworld commented on a change in pull request #12287:
URL: https://github.com/apache/flink/pull/12287#discussion_r432296638



##########
File path: docs/dev/stream/state/custom_serialization.zh.md
##########
@@ -116,110 +102,71 @@ public abstract class TypeSerializer<T> {
 }
 {% endhighlight %}
 </div>
+序列化器的 `TypeSerializerSnapshot` 包含序列化器的结构信息，以及恢复序列化器所需要的其他附加信息。序列化器的快照读写逻辑在 `writeSnapshot` 以及 `readSnapshot` 中进行实现。
+
+需要注意的是快照本身的格式可能也需要随时间发生变化（比如，往快照中增加更多序列化器的信息）。为了方便，快照携带版本号，可以通过 `getCurrentVersion` 获取当前版本。在恢复的时候，从 savepoint 读取到快照后，`readSnapshot` 会调用对应版本的实现方法。
+
+在恢复时，检测序列化器格式是否发生变化的逻辑应该在 `resolveSchemaCompatibility` 中实现，该方法接收新的序列化器作为参数。该方法返回一个 `TypeSerializerSchemaCompatibility` 表示兼容性的结果，该结果有如下几种：
+ 1. **`TypeSerializerSchemaCompatibility.compatibleAsIs()`**: 该结果表明新的序列化器是兼容的，意味着新序列化器和之前的序列化器拥有相同的格式。也可能是在 `resolveSchemaCompatibility` 中对新序列化器继续重新配置所达到的。
+ 2. **`TypeSerializerSchemaCompatibility.compatibleAfterMigration()`**: 该结果表示新旧序列化器的格式不同，不过可以使用之前的序列化器反序列化，然后再用新序列化器进行序列化，从而进行迁移。
+ 3. **`TypeSerializerSchemaCompatibility.incompatible()`**: 该结果表示新旧序列化器的格式不同，且无法进行迁移。
 
-A serializer's `TypeSerializerSnapshot` is a point-in-time information that serves as the single source of truth about
-the state serializer's write schema, as well as any additional information mandatory to restore a serializer that
-would be identical to the given point-in-time. The logic about what should be written and read at restore time
-as the serializer snapshot is defined in the `writeSnapshot` and `readSnapshot` methods.
-
-Note that the snapshot's own write schema may also need to change over time (e.g. when you wish to add more information
-about the serializer to the snapshot). To facilitate this, snapshots are versioned, with the current version
-number defined in the `getCurrentVersion` method. On restore, when the serializer snapshot is read from savepoints,
-the version of the schema in which the snapshot was written in will be provided to the `readSnapshot` method so that
-the read implementation can handle different versions.
-
-At restore time, the logic that detects whether or not the new serializer's schema has changed should be implemented in
-the `resolveSchemaCompatibility` method. When previous registered state is registered again with new serializers in the
-restored execution of an operator, the new serializer is provided to the previous serializer's snapshot via this method.
-This method returns a `TypeSerializerSchemaCompatibility` representing the result of the compatibility resolution,
-which can be one of the following:
-
- 1. **`TypeSerializerSchemaCompatibility.compatibleAsIs()`**: this result signals that the new serializer is compatible,
- meaning that the new serializer has identical schema with the previous serializer. It is possible that the new
- serializer has been reconfigured in the `resolveSchemaCompatibility` method so that it is compatible.
- 2. **`TypeSerializerSchemaCompatibility.compatibleAfterMigration()`**: this result signals that the new serializer has a
- different serialization schema, and it is possible to migrate from the old schema by using the previous serializer
- (which recognizes the old schema) to read bytes into state objects, and then rewriting the object back to bytes with
- the new serializer (which recognizes the new schema). 
- 3. **`TypeSerializerSchemaCompatibility.incompatible()`**: this result signals that the new serializer has a
- different serialization schema, but it is not possible to migrate from the old schema.
-
-The last bit of detail is how the previous serializer is obtained in the case that migration is required.
-Another important role of a serializer's `TypeSerializerSnapshot` is that it serves as a factory to restore
-the previous serializer. More specifically, the `TypeSerializerSnapshot` should implement the `restoreSerializer` method
-to instantiate a serializer instance that recognizes the previous serializer's schema and configuration, and can therefore
-safely read data written by the previous serializer.
+最后一点细节在于需要进行 state 迁移时如何获取之前的序列化器。`TypeSerializerSnapshot` 的另一个重要作用是可以构造之前的序列化器。更具体的说，`TypeSerializerSnapshot` 应该实现 `restoreSerializer` 方法，该方法返回一个可以安全读取之前序列化器写出数据的序列化器。
 
 ### How Flink interacts with the `TypeSerializer` and `TypeSerializerSnapshot` abstractions
 
-To wrap up, this section concludes how Flink, or more specifically the state backends, interact with the
-abstractions. The interaction is slightly different depending on the state backend, but this is orthogonal
-to the implementation of state serializers and their serializer snapshots.
+总结一下，本节总结了 Flink，或者更具体的说状态后端，是如何与这些抽象交互。根据状态后端的不同，交互方式略有不同，但这与序列化器以及序列化器快照的实现是正交的。

Review comment:
       ```suggestion
   总结一下，本节阐述了 Flink，或者更具体的说状态后端，是如何与这些抽象交互。根据状态后端的不同，交互方式略有不同，但这与序列化器以及序列化器快照的实现是正交的。
   ```
   It reads a little oddly to use "总结" (summarize) twice in the same sentence here.



