You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/10/28 04:15:55 UTC

[GitHub] [flink] SteNicholas opened a new pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

SteNicholas opened a new pull request #13813:
URL: https://github.com/apache/flink/pull/13813


   ## What is the purpose of the change
   
   *Currently Flink can only handle schemas up to a size of 65535 bytes, and `AvroSerializerSnapshot` cannot handle large schema. If size of schame exceed 65535 bytes, the `UTFDataFormatException` could be thrown. `AvroSerializerSnapshot` should introduce a V3 that uses `StringValue.writeString()` and `StringValue.readString()` to handle the string encoding.*
   
   ## Brief change log
   
     - *`AvroSerializerSnapshot` introduces a V3 that uses `StringValue.writeString()` in `writeSnapshot()` and `StringValue.readString()` in `readSnapshot()`.*
   
   ## Verifying this change
   
     - *`AvroSerializerSnapshotTest` adds the test method `largeSchemaWithExceedV2ThresholdSizeShouldResultInCompatibleV3Serializer` to verify the test case whether the large schema which exceeds a threshold size of 65535 bytes is compatible with the new version 3 of `AvroSerializerSnapshot`.*
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (yes / **no**)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**)
     - The serializers: (yes / **no** / don't know)
     - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
     - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (yes / **no**)
     - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-717706259


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8459",
       "triggerID" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8598",
       "triggerID" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e36847952532a6b199d1af5e2b90c5416d7f89ba",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "e36847952532a6b199d1af5e2b90c5416d7f89ba",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * c9d7074043ea0035f3655b4865b3248e4d70afdc Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8598) 
   * e36847952532a6b199d1af5e2b90c5416d7f89ba UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
flinkbot commented on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-717706259


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 61014eba30b67fc21a965f876d9484dd3873e14d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-717706259


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8459",
       "triggerID" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8598",
       "triggerID" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 61014eba30b67fc21a965f876d9484dd3873e14d Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8459) 
   * c9d7074043ea0035f3655b4865b3248e4d70afdc Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8598) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] SteNicholas commented on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
SteNicholas commented on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-723849408


   > I pushed some changes to your branch, I hope that's ok:
   > 
   > * I added one prerequisite commit that adds testing infra for testing past serializer snapshots. I also created a snapshot with v2 before bumping the version
   > * then the second commit (from @SteNicholas) just needs to do the changes and the existing test verifies that we can still restore
   > * when we bump the version the next time we would again have to create a snapshot before
   > 
   > Strictly speaking, we wouldn't have to test restoring from old versions here because that's covered by `AvroSerializerUpgradeTest` but it's good to be extra sure and `AvroSerializerUpgradeTest` is a bit coarse grained because we only create new snapshots per Flink version and not every time we bump a serializer snapshot version.
   > 
   > cc @StephanEwen
   
   @aljoscha I have pushed one commit to exclude the serializer snapshot from license. Please help to review this pull request again.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-717706259


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8459",
       "triggerID" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8598",
       "triggerID" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e36847952532a6b199d1af5e2b90c5416d7f89ba",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9105",
       "triggerID" : "e36847952532a6b199d1af5e2b90c5416d7f89ba",
       "triggerType" : "PUSH"
     }, {
       "hash" : "07d9add05898c11fae9742df43a41d9f7b48916b",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "07d9add05898c11fae9742df43a41d9f7b48916b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e36847952532a6b199d1af5e2b90c5416d7f89ba Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9105) 
   * 07d9add05898c11fae9742df43a41d9f7b48916b UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-717706259


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8459",
       "triggerID" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8598",
       "triggerID" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e36847952532a6b199d1af5e2b90c5416d7f89ba",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9105",
       "triggerID" : "e36847952532a6b199d1af5e2b90c5416d7f89ba",
       "triggerType" : "PUSH"
     }, {
       "hash" : "07d9add05898c11fae9742df43a41d9f7b48916b",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9338",
       "triggerID" : "07d9add05898c11fae9742df43a41d9f7b48916b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e36847952532a6b199d1af5e2b90c5416d7f89ba Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9105) 
   * 07d9add05898c11fae9742df43a41d9f7b48916b Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9338) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-717706259


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8459",
       "triggerID" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 61014eba30b67fc21a965f876d9484dd3873e14d Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8459) 
   * c9d7074043ea0035f3655b4865b3248e4d70afdc UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] SteNicholas commented on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
SteNicholas commented on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-723849213


   @aljoscha I have pushed one commit to exclude the serializer snapshot from license. Please help to review this pull request again.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] SteNicholas edited a comment on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
SteNicholas edited a comment on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-718676236


   > Thanks for taking a look at this.
   > 
   > The feature code is fine, but I think the tests need some improvements.
   > 
   > * This misses adding a V2 config snapshot to the test resources and ensuring we can resume from there.
   > * The test for large schema re-engineers the V2 method in the tests.
   >   
   >   * This is not necessary. We don't care whether the previous version fails, we only want the new version to succeed
   >   * Rebuilding functionality in tests and testing that does not help tp guard anything in the production code, so it is ineffective.
   
   @StephanEwen , thanks for your detailed review. I have followed up your comments about the improvements of tests. Please review again if you are available. cc @AHeise 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] aljoscha commented on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
aljoscha commented on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-724002195


   Merged!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] SteNicholas edited a comment on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
SteNicholas edited a comment on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-720805792


   > ome shared versioning scheme of the snapshot metadata framework (in `flink-core`), which It also removes previous serializer snaphot test data.
   
   @StephanEwen I thought the version 3 need to be considered in read() of TypeSerializerSnapshotSerializationUtil in flink-core, therefore I changes some shared versioning scheme of the snapshot metadata framework. I **used the latest version of serializer snaphot to generate serializer snaphot test data, which previously used the version 2**. In this pull request, I didn't remove previous serializer snaphot test data.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] StephanEwen commented on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
StephanEwen commented on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-717772823


   Thanks for taking a look at this.
   
   The feature code is fine, but I think the tests need some improvements.
   
     - This misses adding a V2 config snapshot to the test resources and ensuring we can resume from there.
   
     - The test for large schema re-engineers the V2 method in the tests.
         - This is not necessary. We don't care whether the previous version fails, we only want the new version to succeed
         - Rebuilding functionality in tests and testing that does not help tp guard anything in the production code, so it is ineffective. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-717706259


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8459",
       "triggerID" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8598",
       "triggerID" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * c9d7074043ea0035f3655b4865b3248e4d70afdc Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8598) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] aljoscha commented on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
aljoscha commented on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-722450412


   I pushed some changes to your branch, I hope that's ok:
    - I added one prerequisite commit that adds testing infra for testing past serializer snapshots. I also created a snapshot with v2 before bumping the version
   - then the second commit (from @SteNicholas) just needs to do the changes and the existing test verifies that we can still restore
   - when we bump the version the next time we would again have to create a snapshot before
   
   Strictly speaking, we wouldn't have to test restoring from old versions here because that's covered by `AvroSerializerUpgradeTest` but it's good to be extra sure and `AvroSerializerUpgradeTest` is a bit coarse grained because we only create new snapshots per Flink version and not every time we bump a serializer snapshot version.
   
   cc @StephanEwen 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] SteNicholas commented on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
SteNicholas commented on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-718676236


   > Thanks for taking a look at this.
   > 
   > The feature code is fine, but I think the tests need some improvements.
   > 
   > * This misses adding a V2 config snapshot to the test resources and ensuring we can resume from there.
   > * The test for large schema re-engineers the V2 method in the tests.
   >   
   >   * This is not necessary. We don't care whether the previous version fails, we only want the new version to succeed
   >   * Rebuilding functionality in tests and testing that does not help tp guard anything in the production code, so it is ineffective.
   
   @StephanEwen , thanks for your detailed review. I have followed up your comments about the improvements of tests. Please review again if you are available.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-717706259


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8459",
       "triggerID" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8598",
       "triggerID" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e36847952532a6b199d1af5e2b90c5416d7f89ba",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9105",
       "triggerID" : "e36847952532a6b199d1af5e2b90c5416d7f89ba",
       "triggerType" : "PUSH"
     }, {
       "hash" : "07d9add05898c11fae9742df43a41d9f7b48916b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9338",
       "triggerID" : "07d9add05898c11fae9742df43a41d9f7b48916b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 07d9add05898c11fae9742df43a41d9f7b48916b Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9338) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
flinkbot commented on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-717685369


   Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress of the review.
   
   
   ## Automated Checks
   Last check on commit 61014eba30b67fc21a965f876d9484dd3873e14d (Wed Oct 28 04:18:58 UTC 2020)
   
   **Warnings:**
    * No documentation files were touched! Remember to keep the Flink docs up to date!
   
   
   <sub>Mention the bot in a comment to re-run the automated checks.</sub>
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process.<details>
    The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
    - `@flinkbot approve all` to approve all aspects
    - `@flinkbot approve-until architecture` to approve everything until `architecture`
    - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
    - `@flinkbot disapprove architecture` to remove an approval you gave earlier
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] StephanEwen commented on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
StephanEwen commented on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-720472340


   I think we need to clarify what actually needs to be done first.
   
   This PR changes some shared versioning scheme of the snapshot metadata framework (in `flink-core`), which It also removes previous serializer snaphot test data.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] SteNicholas removed a comment on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
SteNicholas removed a comment on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-723849213


   @aljoscha I have pushed one commit to exclude the serializer snapshot from license. Please help to review this pull request again.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] SteNicholas commented on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
SteNicholas commented on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-717684935


   @AHeise , could you please help to review this pull request if you are available?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] aljoscha closed pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
aljoscha closed pull request #13813:
URL: https://github.com/apache/flink/pull/13813


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-717706259


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8459",
       "triggerID" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 61014eba30b67fc21a965f876d9484dd3873e14d Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8459) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-717706259


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8459",
       "triggerID" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8598",
       "triggerID" : "c9d7074043ea0035f3655b4865b3248e4d70afdc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e36847952532a6b199d1af5e2b90c5416d7f89ba",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9105",
       "triggerID" : "e36847952532a6b199d1af5e2b90c5416d7f89ba",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e36847952532a6b199d1af5e2b90c5416d7f89ba Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9105) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] SteNicholas commented on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
SteNicholas commented on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-720805792


   > ome shared versioning scheme of the snapshot metadata framework (in `flink-core`), which It also removes previous serializer snaphot test data.
   
   I used the latest version of serializer snaphot to generate serializer snaphot test data, which previously used the version 2.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] SteNicholas edited a comment on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
SteNicholas edited a comment on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-720805792


   > ome shared versioning scheme of the snapshot metadata framework (in `flink-core`), which It also removes previous serializer snaphot test data.
   
   @StephanEwen I thought the version 3 need to be considered in read() of TypeSerializerSnapshotSerializationUtil in flink-core, therefore I changes some shared versioning scheme of the snapshot metadata framework. I used the latest version of serializer snaphot to generate serializer snaphot test data, which previously used the version 2. In this pull request, I didn't remove previous serializer snaphot test data. The previous serializer snaphot test data exists:
   https://github.com/SteNicholas/flink/blob/avro-serializer-snapshot/flink-formats/flink-avro/src/test/resources/generic-avro-serializer-1.11/serializer-snapshot
   https://github.com/SteNicholas/flink/blob/avro-serializer-snapshot/flink-formats/flink-avro/src/test/resources/specific-avro-serializer-1.11/serializer-snapshot


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13813:
URL: https://github.com/apache/flink/pull/13813#issuecomment-717706259


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8459",
       "triggerID" : "61014eba30b67fc21a965f876d9484dd3873e14d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 61014eba30b67fc21a965f876d9484dd3873e14d Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8459) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org